# System Architecture

## Overview

FUJ Management follows a **pipeline architecture** where data flows from external sources (Google Sheets, Fio Bank) through processing scripts into a web dashboard. There is no central database: Google Sheets serves as the persistent data store, and the Flask app renders views by fetching and processing data on every request.

## Component Architecture

```
┌──────────────────────────────────────────────────┐
│              EXTERNAL DATA SOURCES               │
│                                                  │
│  ┌──────────────────┐  ┌──────────────────────┐  │
│  │ Attendance Sheet │  │  Fio Bank Account    │  │
│  │ (Google Sheets)  │  │                      │  │
│  │                  │  │  ┌────────────────┐  │  │
│  │ ID: 1E2e_gT...   │  │  │ REST API       │  │  │
│  │                  │  │  │ (JSON, w/token)│  │  │
│  │ CSV export (pub) │  │  ├────────────────┤  │  │
│  │                  │  │  │ Transparent    │  │  │
│  └────────┬─────────┘  │  │ page (HTML)    │  │  │
│           │            │  └───────┬────────┘  │  │
│           │            └──────────┼───────────┘  │
└───────────┼───────────────────────┼──────────────┘
            │                       │
─ ─ ─ ─ ─ ─ ┼ ─ ─ DATA INGESTION ─ ─┼ ─ ─ ─ ─ ─
            │                       │
┌───────────▼──────┐    ┌───────────▼──────────┐
│  attendance.py   │    │    fio_utils.py      │
│                  │    │                      │
│ fetch_csv()      │    │ fetch_transactions() │
│ parse_dates()    │    │ FioTableParser       │
│ group_by_month() │    │ parse_czech_amount() │
│ calculate_fee()  │    │ parse_czech_date()   │
│ get_members()    │    │                      │
│ get_members_     │    │ API + HTML fallback  │
│   with_fees()    │    │                      │
└───────────┬──────┘    └───────────┬──────────┘
            │                       │
─ ─ ─ ─ ─ ─ ┼ ─ ─ ─ PROCESSING ─ ─ ─┼ ─ ─ ─ ─ ─
            │                       │
            │         ┌─────────────▼──────────┐
            │         │ sync_fio_to_sheets.py  │ ──▶ Payments Sheet
            │         │                        │     (Google Sheets)
            │         │ generate_sync_id()     │
            │         │ sort_sheet_by_date()   │
            │         │ get_sheets_service()   │
            │         └────────────────────────┘
            │                       │
            │         ┌─────────────▼──────────┐
            │         │ infer_payments.py      │ ──▶ Writes back to
            │         │                        │     Payments Sheet
            │         │ infer Person/Purpose/  │
            │         │ Amount for empty rows  │
            │         └────────────────────────┘
            │                       │
            │    ┌──────────────────▼──────────┐
            │    │       czech_utils.py        │
            │    │                             │
            │    │ normalize() - strip         │
            │    │   diacritics                │
            │    │ parse_month_references()    │
            │    │ CZECH_MONTHS dict           │
            │    └─────────────────────────────┘
            │                       │
─ ─ ─ ─ ─ ─ ┼ ─ ─ RECONCILIATION ─ ─┼ ─ ─ ─ ─ ─
            │                       │
  ┌─────────▼───────────────────────▼───────────┐
  │              match_payments.py              │
  │                                             │
  │  _build_name_variants() - name matching     │
  │  match_members() - fuzzy match              │
  │  infer_transaction_details()                │
  │  fetch_sheet_data() - read payments         │
  │  fetch_exceptions() - fee overrides         │
  │  reconcile() - CORE ENGINE                  │
  │  print_report() - CLI output                │
  └──────────────────────┬──────────────────────┘
                         │
─ ─ ─ ─ ─ ─ PRESENTATION ┼ ─ ─ ─ ─ ─ ─ ─ ─ ─
                         │
  ┌──────────────────────▼──────────────────────┐
  │               app.py (Flask)                │
  │                                             │
  │  GET /          → redirect to /fees         │
  │  GET /fees      → fees.html                 │
  │  GET /reconcile → reconcile.html            │
  │  GET /payments  → payments.html             │
  │  GET /qr        → PNG QR code (SPD format)  │
  └─────────────────────────────────────────────┘
```

## Module Dependency Graph

```
app.py
├── attendance.py
│   └── (stdlib: csv, urllib, datetime)
└── match_payments.py
    ├── attendance.py
    ├── czech_utils.py
    │   └── (stdlib: re, unicodedata)
    └── sync_fio_to_sheets.py (for get_sheets_service, DEFAULT_SPREADSHEET_ID)
        └── fio_utils.py
            └── (stdlib: json, urllib, html.parser, datetime)

infer_payments.py
├── sync_fio_to_sheets.py
├── match_payments.py
└── attendance.py

calculate_fees.py
└── attendance.py
```

### Import Relationships

| Module | Imports from |
|--------|-------------|
| `app.py` | `attendance` (`get_members_with_fees`, `SHEET_ID`), `match_payments` (`reconcile`, `fetch_sheet_data`, `fetch_exceptions`, `normalize`, `DEFAULT_SPREADSHEET_ID`) |
| `match_payments.py` | `attendance` (`get_members_with_fees`), `czech_utils` (`normalize`, `parse_month_references`), `sync_fio_to_sheets` (`get_sheets_service`, `DEFAULT_SPREADSHEET_ID`) |
| `infer_payments.py` | `sync_fio_to_sheets` (`get_sheets_service`, `DEFAULT_SPREADSHEET_ID`), `match_payments` (`infer_transaction_details`), `attendance` (`get_members_with_fees`) |
| `sync_fio_to_sheets.py` | `fio_utils` (`fetch_transactions`) |
| `calculate_fees.py` | `attendance` (`get_members_with_fees`) |

## Data Flow Patterns

### Pattern 1: Sync & Enrich (Batch Pipeline)

This is the primary workflow for keeping the payments ledger up to date:

```
     1. make sync                    2. make infer
┌──────┐    ┌──────────┐      ┌──────────┐    ┌──────────┐
│ Fio  │───▶│ Payments │      │ Payments │───▶│ Payments │
│ Bank │    │ Sheet    │      │ Sheet    │    │ Sheet    │
└──────┘    │ (append) │      │ (read)   │    │ (update) │
            └──────────┘      └──────────┘    └──────────┘

- Fetches last 30 days        - Reads empty Person/Purpose rows
- SHA-256 dedup prevents      - Uses name matching + Czech month
  duplicate entries             parsing to auto-fill
                              - Marks uncertain matches with [?]
```

### Pattern 2: Real-Time Rendering (Web Dashboard)

Every web request triggers a fresh data fetch; there is no caching layer:

```
Browser Request → Flask Route → Fetch (Google Sheets API/CSV) → Process → Render HTML
                                  │                               │
                                  │ attendance.py                 │ reconcile()
                                  │ fetch_sheet_data()            │ or direct
                                  │ fetch_exceptions()            │ formatting
                                  ▼                               ▼
                             ~1-3 seconds                    Template with
                            (network I/O)                    inline CSS + JS
```

### Pattern 3: QR Code Generation (On-Demand)

```
Browser clicks "Pay" → GET /qr?account=...&amount=...&message=... → SPD QR PNG
                          │
                          qrcode lib generates
                          in-memory PNG
```

## Key Design Patterns

### 1. Google Sheets as Database

Instead of a traditional database, the system uses two Google Sheets:

| Sheet | Purpose | Access Method |
|-------|---------|---------------|
| Attendance Sheet (`1E2e_gT...`) | Member names, tiers, practice dates, attendance marks | Public CSV export (no auth needed) |
| Payments Sheet (`1Om0YPo...`) | Bank transactions with Person/Purpose annotations | Google Sheets API (service account auth) |

**Trade-offs**:

- ✅ Non-technical users can view and edit data directly
- ✅ No database setup or maintenance
- ✅ Built-in audit trail (Google Sheets version history)
- ❌ Every page load incurs 1-3s of API latency
- ❌ No complex queries or indexing
- ❌ Rate limits on Google Sheets API

### 2. Dual-Mode Bank Access

`fio_utils.py` implements a transparent fallback pattern:

```python
import os

def fetch_transactions(date_from, date_to):
    token = os.environ.get("FIO_API_TOKEN", "").strip()
    if token:
        return fetch_transactions_api(token, date_from, date_to)  # Structured JSON
    # No token: fall back to the public transparent-account page (arguments elided)
    return fetch_transactions_transparent(...)                    # HTML scraping
```

The API provides richer data (sender account numbers, stable bank IDs) but requires a token. The transparent page is always available but lacks some fields.

### 3. Name Matching with Confidence Levels

The reconciliation engine uses a multi-tier matching strategy:

| Priority | Method | Confidence | Example |
|----------|--------|-----------|---------|
| 1 | Full name match | `auto` | "František Vrbík" in message |
| 2 | Both first + last name (any order) | `auto` | "Vrbík František" |
| 3 | Nickname match | `auto` | "(Štrúdl)" from member list |
| 4 | Last name only (≥4 chars, not common) | `review` | "Vrbík" alone |
| 5 | First name only (≥3 chars) | `review` | "František" alone |

When both `auto` and `review` matches exist, `review` matches are discarded. This prevents false positives from generic first names.

### 4. Exception System

Fee overrides are managed through an `exceptions` sheet tab in the Payments Google Sheet:

| Column | Content |
|--------|---------|
| Name | Member name |
| Period | Month (YYYY-MM) |
| Amount | Overridden fee in CZK |
| Note | Reason for the exception |

Exceptions are applied during reconciliation, replacing the attendance-calculated fee with the manually specified amount.

### 5. Render-Time Performance Tracking

Every page includes a performance breakdown:

```python
import time

from flask import g

# `app` is the Flask instance defined in app.py
@app.before_request
def start_timer():
    g.start_time = time.perf_counter()
    g.steps = []

def record_step(name):
    g.steps.append((name, time.perf_counter()))
```

The footer displays total render time and, on click, reveals a detailed breakdown (e.g., `fetch_members:0.892s | fetch_payments:1.205s | reconcile:0.003s | render:0.015s`).
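
`record_step` stores absolute `perf_counter()` readings, so the per-step durations in the footer have to be derived by subtracting each reading from the one before it. A minimal sketch of that conversion, assuming `g.steps` holds `(name, timestamp)` tuples as in the snippet above; `format_breakdown` is a hypothetical helper name and may not match how `app.py` actually builds the footer string:

```python
def format_breakdown(start_time, steps):
    """Convert absolute (name, timestamp) readings into per-step durations.

    Hypothetical helper: each step's duration is measured from the previous
    reading (or from start_time for the first step).
    """
    parts = []
    previous = start_time
    for name, timestamp in steps:
        parts.append(f"{name}:{timestamp - previous:.3f}s")
        previous = timestamp
    return " | ".join(parts)


# Example with hand-picked readings instead of real perf_counter() values
print(format_breakdown(0.0, [("fetch_members", 0.5), ("reconcile", 2.0)]))
# prints "fetch_members:0.500s | reconcile:1.500s"
```

Inside a request this would be called as `format_breakdown(g.start_time, g.steps)`. Total render time is then the last reading minus `g.start_time`.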
## Security Considerations

| Concern | Mitigation |
|---------|-----------|
| PII in git | `.secret/` is gitignored; all data is fetched at runtime |
| Google API credentials | Service account JSON stored in `.secret/`, mounted as Docker secret |
| Bank API token | Passed via `FIO_API_TOKEN` environment variable, never committed |
| Web app authentication | **None currently**: the app has no auth layer |
| CSRF protection | **None currently**: Flask provides none by default, and no POST routes exist |

## Scalability Notes

This system is purpose-built for a small club (~20-40 members). It makes deliberate trade-offs favoring simplicity over scale:

- **No caching**: Every page load fetches live data from Google Sheets (1-3s latency). For a single-user admin dashboard, this is acceptable.
- **No background workers**: Sync and inference are manual `make` commands, not scheduled jobs.
- **No database**: Google Sheets handles tens of members and hundreds of transactions with ease.
- **Single-process Flask**: The built-in development server runs directly in production (via Docker). This is intentional for the use case: it is a personal tool, not a public service.

---

*Architecture documentation generated from comprehensive code analysis on 2026-03-03.*