docs: experiment with generated documentation, let's keep it in git for now
2026-03-11 11:57:30 +01:00
parent e83d6af1f5
commit 9b99f6d33b
17 changed files with 2367 additions and 0 deletions

@@ -0,0 +1,268 @@
# System Architecture
## Overview
FUJ Management follows a **pipeline architecture** where data flows from external sources (Google Sheets, Fio Bank) through processing scripts into a web dashboard. There is no central database — Google Sheets serves as the persistent data store, and the Flask app renders views by fetching and processing data on every request.
## Component Architecture
```
┌─────────────────────────────────────────────────┐
│              EXTERNAL DATA SOURCES              │
│                                                 │
│  ┌──────────────────┐ ┌──────────────────────┐  │
│  │ Attendance Sheet │ │ Fio Bank Account     │  │
│  │ (Google Sheets)  │ │                      │  │
│  │                  │ │  ┌────────────────┐  │  │
│  │ ID: 1E2e_gT...   │ │  │ REST API       │  │  │
│  │                  │ │  │ (JSON, w/token)│  │  │
│  │ CSV export (pub) │ │  ├────────────────┤  │  │
│  │                  │ │  │ Transparent    │  │  │
│  └────────┬─────────┘ │  │ page (HTML)    │  │  │
│           │           │  └───────┬────────┘  │  │
│           │           └──────────┼───────────┘  │
└───────────┼──────────────────────┼──────────────┘
            │                      │
─ ─ ─ ─ ─ ─ ┼ ─ ─ DATA INGESTION ─ ┼ ─ ─ ─ ─ ─ ─ ─
            │                      │
┌───────────▼──────┐   ┌───────────▼──────────┐
│ attendance.py    │   │ fio_utils.py         │
│                  │   │                      │
│ fetch_csv()      │   │ fetch_transactions() │
│ parse_dates()    │   │ FioTableParser       │
│ group_by_month() │   │ parse_czech_amount() │
│ calculate_fee()  │   │ parse_czech_date()   │
│ get_members()    │   │                      │
│ get_members_     │   │ API + HTML fallback  │
│   with_fees()    │   │                      │
└───────────┬──────┘   └───────────┬──────────┘
            │                      │
─ ─ ─ ─ ─ ─ ┼ ─ ─ PROCESSING ─ ─ ─ ┼ ─ ─ ─ ─ ─ ─ ─
            │                      │
            │        ┌─────────────▼──────────┐
            │        │ sync_fio_to_sheets.py  │ ──▶ Payments Sheet
            │        │                        │     (Google Sheets)
            │        │ generate_sync_id()     │
            │        │ sort_sheet_by_date()   │
            │        │ get_sheets_service()   │
            │        └────────────────────────┘
            │                      │
            │        ┌─────────────▼──────────┐
            │        │ infer_payments.py      │ ──▶ Writes back to
            │        │                        │     Payments Sheet
            │        │ infer Person/Purpose/  │
            │        │ Amount for empty rows  │
            │        └────────────────────────┘
            │                      │
            │   ┌──────────────────▼──────────┐
            │   │ czech_utils.py              │
            │   │                             │
            │   │ normalize() - strip         │
            │   │               diacritics    │
            │   │ parse_month_references()    │
            │   │ CZECH_MONTHS dict           │
            │   └─────────────────────────────┘
            │                      │
─ ─ ─ ─ ─ ─ ┼ ─ RECONCILIATION ─ ─ ┼ ─ ─ ─ ─ ─ ─ ─
            │                      │
  ┌─────────▼──────────────────────▼────────────┐
  │ match_payments.py                           │
  │                                             │
  │ _build_name_variants() - name matching      │
  │ match_members()        - fuzzy match        │
  │ infer_transaction_details()                 │
  │ fetch_sheet_data()     - read payments      │
  │ fetch_exceptions()     - fee overrides      │
  │ reconcile()            - CORE ENGINE        │
  │ print_report()         - CLI output         │
  └──────────────────────┬──────────────────────┘
─ ─ ─ ─ ─ ─ PRESENTATION ┼ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
  ┌──────────────────────▼──────────────────────┐
  │ app.py (Flask)                              │
  │                                             │
  │ GET /          → redirect to /fees          │
  │ GET /fees      → fees.html                  │
  │ GET /reconcile → reconcile.html             │
  │ GET /payments  → payments.html              │
  │ GET /qr        → PNG QR code (SPD format)   │
  └─────────────────────────────────────────────┘
```
## Module Dependency Graph
```
app.py
├── attendance.py
│   └── (stdlib: csv, urllib, datetime)
└── match_payments.py
    ├── attendance.py
    ├── czech_utils.py
    │   └── (stdlib: re, unicodedata)
    └── sync_fio_to_sheets.py  (for get_sheets_service, DEFAULT_SPREADSHEET_ID)
        └── fio_utils.py
            └── (stdlib: json, urllib, html.parser, datetime)

infer_payments.py
├── sync_fio_to_sheets.py
├── match_payments.py
└── attendance.py

calculate_fees.py
└── attendance.py
```
### Import Relationships
| Module | Imports from |
|--------|-------------|
| `app.py` | `attendance` (`get_members_with_fees`, `SHEET_ID`), `match_payments` (`reconcile`, `fetch_sheet_data`, `fetch_exceptions`, `normalize`, `DEFAULT_SPREADSHEET_ID`) |
| `match_payments.py` | `attendance` (`get_members_with_fees`), `czech_utils` (`normalize`, `parse_month_references`), `sync_fio_to_sheets` (`get_sheets_service`, `DEFAULT_SPREADSHEET_ID`) |
| `infer_payments.py` | `sync_fio_to_sheets` (`get_sheets_service`, `DEFAULT_SPREADSHEET_ID`), `match_payments` (`infer_transaction_details`), `attendance` (`get_members_with_fees`) |
| `sync_fio_to_sheets.py` | `fio_utils` (`fetch_transactions`) |
| `calculate_fees.py` | `attendance` (`get_members_with_fees`) |
## Data Flow Patterns
### Pattern 1: Sync & Enrich (Batch Pipeline)
This is the primary workflow for keeping the payments ledger up to date:
```
      1. make sync                       2. make infer
┌──────┐    ┌──────────┐          ┌──────────┐    ┌──────────┐
│ Fio  │───▶│ Payments │          │ Payments │───▶│ Payments │
│ Bank │    │ Sheet    │          │ Sheet    │    │ Sheet    │
└──────┘    │ (append) │          │ (read)   │    │ (update) │
            └──────────┘          └──────────┘    └──────────┘
- Fetches last 30 days            - Reads empty Person/Purpose rows
- SHA-256 dedup prevents          - Uses name matching + Czech month
  duplicate entries                 parsing to auto-fill
                                  - Marks uncertain matches with [?]
```
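The SHA-256 dedup step in `make sync` can be sketched as follows. This is a minimal illustration, not the project's code: the document only names `generate_sync_id()`, so the signature, the hashed fields, and the `filter_new()` helper are assumptions made here for the example.

```python
import hashlib


def generate_sync_id(date: str, amount: str, sender: str, message: str) -> str:
    """Hypothetical signature: hash the fields that identify a transaction."""
    # Joining with a separator keeps field boundaries distinct, so
    # ("ab", "c") and ("a", "bc") do not produce the same input string.
    raw = "|".join([date, amount, sender, message])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def filter_new(transactions: list[dict], existing_ids: set[str]) -> list[dict]:
    """Hypothetical helper: keep transactions whose sync ID is not yet known."""
    fresh = []
    for tx in transactions:
        sync_id = generate_sync_id(tx["date"], tx["amount"], tx["sender"], tx["message"])
        if sync_id not in existing_ids:
            existing_ids.add(sync_id)  # also removes duplicates within one batch
            fresh.append({**tx, "sync_id": sync_id})
    return fresh
```

Because the ID depends only on the transaction's own fields, re-running the sync over an overlapping 30-day window yields the same IDs and appends nothing twice.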
### Pattern 2: Real-Time Rendering (Web Dashboard)
Every web request triggers a fresh data fetch — no caching layer exists:
```
Browser Request → Flask Route → Fetch (Google Sheets API/CSV) → Process → Render HTML
                                │                               │
                                │ attendance.py                 │ reconcile()
                                │ fetch_sheet_data()            │ or direct
                                │ fetch_exceptions()            │ formatting
                                ▼                               ▼
                                ~1-3 seconds                    Template with
                                (network I/O)                   inline CSS + JS
```
### Pattern 3: QR Code Generation (On-Demand)
```
Browser clicks "Pay" → GET /qr?account=...&amount=...&message=... → SPD QR PNG
                       qrcode lib
                       generates
                       in-memory PNG
```
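The payload encoded in the QR image is a Short Payment Descriptor (SPD) string. The sketch below shows how such a string might be assembled from the three query parameters. `build_spd_payload()` is a hypothetical helper, and it assumes the account is passed as an IBAN, the currency is CZK, and the message is truncated to 60 characters; none of these details are confirmed by the document.

```python
def build_spd_payload(account_iban: str, amount: float, message: str,
                      currency: str = "CZK") -> str:
    """Hypothetical helper: build an SPD 1.0 string for a payment QR code."""
    # '*' separates SPD fields, so it must not appear inside a value.
    safe_message = message.replace("*", "").strip()[:60]
    fields = [
        "SPD",
        "1.0",
        f"ACC:{account_iban}",
        f"AM:{amount:.2f}",  # two decimal places, dot as decimal separator
        f"CC:{currency}",
        f"MSG:{safe_message}",
    ]
    return "*".join(fields)
```

The resulting string is what a QR library would encode; rendering it to an in-memory PNG is left out here so the sketch has no third-party dependency.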
## Key Design Patterns
### 1. Google Sheets as Database
Instead of a traditional database, the system uses two Google Sheets:
| Sheet | Purpose | Access Method |
|-------|---------|---------------|
| Attendance Sheet (`1E2e_gT...`) | Member names, tiers, practice dates, attendance marks | Public CSV export (no auth needed) |
| Payments Sheet (`1Om0YPo...`) | Bank transactions with Person/Purpose annotations | Google Sheets API (service account auth) |
**Trade-offs**:
- ✅ Non-technical users can view and edit data directly
- ✅ No database setup or maintenance
- ✅ Built-in audit trail (Google Sheets version history)
- ❌ Every page load incurs 1-3s of API latency
- ❌ No complex queries or indexing
- ❌ Rate limits on Google Sheets API
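Reading the Attendance Sheet through its public CSV export needs only the standard library. The following is a rough sketch under stated assumptions: the helper names (`export_url`, `parse_csv`, `fetch_rows`) are made up for illustration, the URL pattern is the generic Google Sheets CSV export form, and the project's own `fetch_csv()` may be written differently.

```python
import csv
import io
from urllib.request import urlopen

# Generic CSV export URL for a publicly readable spreadsheet.
EXPORT_URL = "https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv"


def export_url(sheet_id: str) -> str:
    """Build the export URL for a given spreadsheet ID."""
    return EXPORT_URL.format(sheet_id=sheet_id)


def parse_csv(text: str) -> list[list[str]]:
    """Turn CSV text into rows; quoted cells may contain commas."""
    return list(csv.reader(io.StringIO(text)))


def fetch_rows(sheet_id: str, timeout: float = 10.0) -> list[list[str]]:
    """Download and parse the sheet. One network round trip per call."""
    with urlopen(export_url(sheet_id), timeout=timeout) as response:
        return parse_csv(response.read().decode("utf-8"))
```

Each call to `fetch_rows()` is a full download, which is where the per-page latency listed in the trade-offs comes from.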
### 2. Dual-Mode Bank Access
`fio_utils.py` falls back automatically from the REST API to scraping the public transparent page when no API token is configured:
```python
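import os

def fetch_transactions(date_from, date_to):
    # Prefer the token-authenticated REST API when a token is configured.
    token = os.environ.get("FIO_API_TOKEN", "").strip()
    if token:
        return fetch_transactions_api(token, date_from, date_to)  # Structured JSON
    # No token: fall back to the public transparent page.
    return fetch_transactions_transparent(...)  # HTML scraping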
```
The API provides richer data (sender account numbers, stable bank IDs) but requires a token. The transparent page is always available but lacks some fields.
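Because the fallback path scrapes HTML, amounts and dates arrive as Czech-formatted text rather than typed JSON values. Here is a minimal sketch of the two parsing helpers named in the component diagram, assuming inputs such as `1 234,56 CZK` and `11.03.2026`. The real `parse_czech_amount()` and `parse_czech_date()` may accept other formats or return different types.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_czech_amount(text: str) -> Decimal:
    """Parse amounts written with space thousands separators and a decimal comma."""
    # Non-breaking spaces are common in scraped HTML; treat them like spaces.
    cleaned = text.replace("\xa0", " ").replace("CZK", "").replace("Kč", "")
    cleaned = cleaned.replace(" ", "").replace(",", ".")
    return Decimal(cleaned)


def parse_czech_date(text: str) -> date:
    """Parse day.month.year dates such as '11.03.2026'."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date()
```

`Decimal` is used instead of `float` so that currency values keep their exact two-decimal representation.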
### 3. Name Matching with Confidence Levels
The reconciliation engine uses a multi-tier matching strategy:
| Priority | Method | Confidence | Example |
|----------|--------|-----------|---------|
| 1 | Full name match | `auto` | "František Vrbík" in message |
| 2 | Both first + last name (any order) | `auto` | "Vrbík František" |
| 3 | Nickname match | `auto` | "(Štrúdl)" from member list |
| 4 | Last name only (≥4 chars, not common) | `review` | "Vrbík" alone |
| 5 | First name only (≥3 chars) | `review` | "František" alone |
When both `auto` and `review` matches exist, `review` matches are discarded. This prevents false positives from generic first names.
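The tiering and the discard rule can be sketched in a few lines. This is a simplified illustration, not the project's `match_members()`: it assumes members are plain "First Last" strings, and it omits the nickname tier and the common-surname filter from the table above.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase and strip diacritics, e.g. 'Vrbík' -> 'vrbik'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def match_members(message: str, members: list[str]) -> list[tuple[str, str]]:
    """Return (member, confidence) pairs for 'First Last' member names."""
    words = set(re.findall(r"[a-z0-9]+", normalize(message)))
    matches = []
    for member in members:
        parts = normalize(member).split()
        first, last = parts[0], parts[-1]
        if first in words and last in words:
            matches.append((member, "auto"))  # both names present, any order
        elif len(last) >= 4 and last in words:
            matches.append((member, "review"))  # last name only
        elif len(first) >= 3 and first in words:
            matches.append((member, "review"))  # first name only
    # A confident match makes the uncertain ones redundant, so drop them.
    if any(confidence == "auto" for _, confidence in matches):
        matches = [m for m in matches if m[1] == "auto"]
    return matches
```

With two members who share a first name, a message containing "Vrbik Frantisek" returns only the full-name match, while a message containing just "Frantisek" returns both members flagged for review.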
### 4. Exception System
Fee overrides are managed through an `exceptions` sheet tab in the Payments Google Sheet:
| Column | Content |
|--------|---------|
| Name | Member name |
| Period | Month (YYYY-MM) |
| Amount | Overridden fee in CZK |
| Note | Reason for the exception |
Exceptions are applied during reconciliation, replacing the attendance-calculated fee with the manually specified amount.
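One way to picture the override step is as a dictionary lookup keyed by member and period. The sketch below assumes exception rows arrive as dictionaries with the four column names above and that fees are whole CZK amounts. `apply_exceptions()` and the sample fee of 750 are hypothetical and do not come from the codebase.

```python
def apply_exceptions(
    calculated: dict[tuple[str, str], int],
    exception_rows: list[dict[str, str]],
) -> dict[tuple[str, str], int]:
    """Replace attendance-calculated fees with manual overrides where present."""
    # Index overrides by (lowercased name, YYYY-MM period) for direct lookup.
    overrides = {
        (row["Name"].strip().lower(), row["Period"].strip()): int(row["Amount"])
        for row in exception_rows
    }
    return {
        (name, period): overrides.get((name.strip().lower(), period), fee)
        for (name, period), fee in calculated.items()
    }


# Usage: a waived fee for one month leaves other months untouched.
fees = {("František Vrbík", "2026-02"): 750, ("František Vrbík", "2026-03"): 750}
rows = [{"Name": "František Vrbík", "Period": "2026-03", "Amount": "0", "Note": "injury"}]
final_fees = apply_exceptions(fees, rows)
```

An override of `0` is kept as a real value, since `dict.get` only falls back to the calculated fee when no exception row exists for that member and month.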
### 5. Render-Time Performance Tracking
Every page includes a performance breakdown:
```python
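import time
from flask import g

@app.before_request  # `app` is the Flask application object in app.py
def start_timer():
    g.start_time = time.perf_counter()  # reference point for the whole request
    g.steps = []                        # (step name, timestamp) pairs

def record_step(name):
    g.steps.append((name, time.perf_counter()))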
```
The footer displays total render time and, on click, reveals a detailed breakdown (e.g., `fetch_members:0.892s | fetch_payments:1.205s | reconcile:0.003s | render:0.015s`).
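Turning the recorded timestamps into that breakdown string is a matter of subtracting consecutive values. This is a hedged sketch: `format_breakdown()` is a hypothetical helper, and it assumes each recorded timestamp marks the moment a step finished, with the remaining time up to `end` attributed to rendering.

```python
def format_breakdown(start: float, steps: list[tuple[str, float]], end: float) -> str:
    """Convert (name, finished_at) pairs into 'name:0.123s | ...' text."""
    parts = []
    previous = start
    for name, finished_at in steps:
        # Duration of a step is the gap since the previous checkpoint.
        parts.append(f"{name}:{finished_at - previous:.3f}s")
        previous = finished_at
    # Whatever remains after the last recorded step is template rendering.
    parts.append(f"render:{end - previous:.3f}s")
    return " | ".join(parts)
```

In the Flask app this would be called with `g.start_time`, `g.steps`, and a final `time.perf_counter()` reading taken just before the response is returned.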
## Security Considerations
| Concern | Mitigation |
|---------|-----------|
| PII in git | `.secret/` is gitignored; all data fetched at runtime |
| Google API credentials | Service account JSON stored in `.secret/`, mounted as Docker secret |
| Bank API token | Passed via `FIO_API_TOKEN` environment variable, never committed |
| Web app authentication | **None currently** — the app has no auth layer |
| CSRF protection | **None currently**; Flask provides none by default, and exposure is limited because no POST routes exist |
## Scalability Notes
This system is purpose-built for a small club (~20-40 members). It makes deliberate trade-offs favoring simplicity over scale:
- **No caching**: Every page load fetches live data from Google Sheets (1-3s latency). For a single-user admin dashboard, this is acceptable.
- **No background workers**: Sync and inference are manual `make` commands, not scheduled jobs.
- **No database**: Google Sheets comfortably handles tens of members and hundreds of transactions.
- **Single-process Flask**: The built-in development server runs directly in production (via Docker). For this use case, this is intentional — it's a personal tool, not a public service.
---
*Architecture documentation generated from comprehensive code analysis on 2026-03-03.*