docs: experiment with generated documentation, let's keep it in git for now
docs/by-claude-opus/architecture.md

# System Architecture

## Overview

FUJ Management follows a **pipeline architecture**: data flows from external sources (Google Sheets, Fio Bank) through processing scripts into a web dashboard. There is no central database. Google Sheets serves as the persistent data store, and the Flask app renders each view by fetching and processing the data on every request.

## Component Architecture

```
┌─────────────────────────────────────────────────┐
│              EXTERNAL DATA SOURCES              │
│                                                 │
│  ┌──────────────────┐  ┌──────────────────────┐ │
│  │ Attendance Sheet │  │  Fio Bank Account    │ │
│  │ (Google Sheets)  │  │                      │ │
│  │                  │  │  ┌────────────────┐  │ │
│  │ ID: 1E2e_gT...   │  │  │ REST API       │  │ │
│  │                  │  │  │ (JSON, w/token)│  │ │
│  │ CSV export (pub) │  │  ├────────────────┤  │ │
│  │                  │  │  │ Transparent    │  │ │
│  └────────┬─────────┘  │  │ page (HTML)    │  │ │
│           │            │  └───────┬────────┘  │ │
│           │            └──────────┼───────────┘ │
└───────────┼───────────────────────┼─────────────┘
            │                       │
─ ─ ─ ─ ─ ─ ┼  ─ ─ DATA INGESTION ─ ┼ ─ ─ ─ ─ ─ ─
            │                       │
┌───────────▼──────┐    ┌───────────▼──────────┐
│ attendance.py    │    │ fio_utils.py         │
│                  │    │                      │
│ fetch_csv()      │    │ fetch_transactions() │
│ parse_dates()    │    │ FioTableParser       │
│ group_by_month() │    │ parse_czech_amount() │
│ calculate_fee()  │    │ parse_czech_date()   │
│ get_members()    │    │                      │
│ get_members_     │    │ API + HTML fallback  │
│   with_fees()    │    │                      │
└───────────┬──────┘    └───────────┬──────────┘
            │                       │
─ ─ ─ ─ ─ ─ ┼  ─ ─ PROCESSING ─ ─ ─ ┼ ─ ─ ─ ─ ─ ─
            │                       │
            │         ┌─────────────▼──────────┐
            │         │ sync_fio_to_sheets.py  │ ──▶ Payments Sheet
            │         │                        │     (Google Sheets)
            │         │ generate_sync_id()     │
            │         │ sort_sheet_by_date()   │
            │         │ get_sheets_service()   │
            │         └─────────────┬──────────┘
            │                       │
            │         ┌─────────────▼──────────┐
            │         │ infer_payments.py      │ ──▶ Writes back to
            │         │                        │     Payments Sheet
            │         │ infer Person/Purpose/  │
            │         │ Amount for empty rows  │
            │         └─────────────┬──────────┘
            │                       │
            │    ┌──────────────────▼──────────┐
            │    │ czech_utils.py              │
            │    │                             │
            │    │ normalize() - strip         │
            │    │               diacritics    │
            │    │ parse_month_references()    │
            │    │ CZECH_MONTHS dict           │
            │    └──────────────────┬──────────┘
            │                       │
─ ─ ─ ─ ─ ─ ┼  ─ ─ RECONCILIATION ─ ┼ ─ ─ ─ ─ ─ ─
            │                       │
  ┌─────────▼───────────────────────▼───────────┐
  │              match_payments.py              │
  │                                             │
  │  _build_name_variants() - name matching     │
  │  match_members()        - fuzzy match       │
  │  infer_transaction_details()                │
  │  fetch_sheet_data()     - read payments     │
  │  fetch_exceptions()     - fee overrides     │
  │  reconcile()            - CORE ENGINE       │
  │  print_report()         - CLI output        │
  └──────────────────────┬──────────────────────┘
                         │
─ ─ ─ ─ ─ ─ PRESENTATION ┼ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
                         │
  ┌──────────────────────▼──────────────────────┐
  │               app.py (Flask)                │
  │                                             │
  │  GET /          → redirect to /fees         │
  │  GET /fees      → fees.html                 │
  │  GET /reconcile → reconcile.html            │
  │  GET /payments  → payments.html             │
  │  GET /qr        → PNG QR code (SPD format)  │
  └─────────────────────────────────────────────┘
```

## Module Dependency Graph

```
app.py
├── attendance.py
│   └── (stdlib: csv, urllib, datetime)
└── match_payments.py
    ├── attendance.py
    ├── czech_utils.py
    │   └── (stdlib: re, unicodedata)
    └── sync_fio_to_sheets.py  (for get_sheets_service, DEFAULT_SPREADSHEET_ID)
        └── fio_utils.py
            └── (stdlib: json, urllib, html.parser, datetime)

infer_payments.py
├── sync_fio_to_sheets.py
├── match_payments.py
└── attendance.py

calculate_fees.py
└── attendance.py
```

### Import Relationships

| Module | Imports from |
|--------|-------------|
| `app.py` | `attendance` (`get_members_with_fees`, `SHEET_ID`), `match_payments` (`reconcile`, `fetch_sheet_data`, `fetch_exceptions`, `normalize`, `DEFAULT_SPREADSHEET_ID`) |
| `match_payments.py` | `attendance` (`get_members_with_fees`), `czech_utils` (`normalize`, `parse_month_references`), `sync_fio_to_sheets` (`get_sheets_service`, `DEFAULT_SPREADSHEET_ID`) |
| `infer_payments.py` | `sync_fio_to_sheets` (`get_sheets_service`, `DEFAULT_SPREADSHEET_ID`), `match_payments` (`infer_transaction_details`), `attendance` (`get_members_with_fees`) |
| `sync_fio_to_sheets.py` | `fio_utils` (`fetch_transactions`) |
| `calculate_fees.py` | `attendance` (`get_members_with_fees`) |

## Data Flow Patterns

### Pattern 1: Sync & Enrich (Batch Pipeline)

This is the primary workflow for keeping the payments ledger up to date:

```
1. make sync                      2. make infer
┌──────┐    ┌──────────┐          ┌──────────┐    ┌──────────┐
│ Fio  │───▶│ Payments │          │ Payments │───▶│ Payments │
│ Bank │    │ Sheet    │          │ Sheet    │    │ Sheet    │
└──────┘    │ (append) │          │ (read)   │    │ (update) │
            └──────────┘          └──────────┘    └──────────┘

- Fetches last 30 days            - Reads empty Person/Purpose rows
- SHA-256 dedup prevents          - Uses name matching + Czech month
  duplicate entries                 parsing to auto-fill
                                  - Marks uncertain matches with [?]
```

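The SHA-256 deduplication in the sync step can be sketched as follows. This is a minimal illustration, not the project's `generate_sync_id()`: the function names `make_sync_id` and `new_rows`, the choice of hashed fields, and the dictionary keys are all assumptions made for the example.

```python
import hashlib


def make_sync_id(date: str, amount: str, sender: str, message: str) -> str:
    """Hypothetical stand-in for generate_sync_id(): hash the identifying fields."""
    key = "|".join([date, amount, sender, message])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_rows(fetched: list, existing_ids: set) -> list:
    """Return only the transactions whose sync ID is not already in the sheet."""
    seen = set(existing_ids)
    fresh = []
    for tx in fetched:
        sync_id = make_sync_id(tx["date"], tx["amount"], tx["sender"], tx["message"])
        if sync_id not in seen:
            # Remember the ID so duplicates within the same batch are skipped too.
            seen.add(sync_id)
            fresh.append({**tx, "sync_id": sync_id})
    return fresh
```

Because the ID is derived from the transaction content, re-running the sync over an overlapping 30-day window appends nothing for rows that are already present.
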
### Pattern 2: Real-Time Rendering (Web Dashboard)

Every web request triggers a fresh data fetch; there is no caching layer:

```
Browser Request → Flask Route → Fetch (Google Sheets API/CSV) → Process → Render HTML
                                  │                               │
                                  │ attendance.py                 │ reconcile()
                                  │ fetch_sheet_data()            │ or direct
                                  │ fetch_exceptions()            │ formatting
                                  ▼                               ▼
                            ~1-3 seconds                    Template with
                            (network I/O)                  inline CSS + JS
```

### Pattern 3: QR Code Generation (On-Demand)

```
Browser clicks "Pay" → GET /qr?account=...&amount=...&message=... → SPD QR PNG
                                            │
                                       qrcode lib
                                        generates
                                      in-memory PNG
```

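The text encoded in the QR image follows the Czech SPD (Short Payment Descriptor) layout, where fields are joined with `*`. The sketch below only builds that string. It assumes the account is already available as an IBAN, and `build_spd_payload` is a hypothetical helper, not a function from `app.py`.

```python
def build_spd_payload(iban: str, amount: float, message: str, currency: str = "CZK") -> str:
    """Build an SPD 1.0 payment string (illustrative helper, not the app's code)."""
    # '*' separates SPD fields, so a literal asterisk in a value is percent-encoded.
    safe_message = message.replace("*", "%2A")
    fields = [
        "SPD",
        "1.0",
        f"ACC:{iban.replace(' ', '')}",  # IBAN without grouping spaces
        f"AM:{amount:.2f}",              # amount with two decimal places
        f"CC:{currency}",
        f"MSG:{safe_message}",
    ]
    return "*".join(fields)
```

The resulting string is what a QR encoder would receive as its data; a banking app that scans the code can then prefill the account, amount, and message.
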
## Key Design Patterns

### 1. Google Sheets as Database

Instead of a traditional database, the system uses two Google Sheets:

| Sheet | Purpose | Access Method |
|-------|---------|---------------|
| Attendance Sheet (`1E2e_gT...`) | Member names, tiers, practice dates, attendance marks | Public CSV export (no auth needed) |
| Payments Sheet (`1Om0YPo...`) | Bank transactions with Person/Purpose annotations | Google Sheets API (service account auth) |

**Trade-offs**:
- ✅ Non-technical users can view and edit data directly
- ✅ No database setup or maintenance
- ✅ Built-in audit trail (Google Sheets version history)
- ❌ Every page load incurs 1-3 s of API latency
- ❌ No complex queries or indexing
- ❌ Rate limits on Google Sheets API

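For the attendance sheet, the "public CSV export" access method amounts to reading one URL and parsing the text. The sketch below shows the URL shape Google Sheets uses for CSV export plus stdlib parsing; the helper names and the `gid` default are assumptions, and the real `fetch_csv()` in `attendance.py` may differ.

```python
import csv
import io


def csv_export_url(sheet_id: str, gid: int = 0) -> str:
    """Build the CSV export URL for a publicly readable Google Sheet."""
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv&gid={gid}"


def parse_rows(csv_text: str) -> list:
    """Parse downloaded CSV text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(csv_text)))
```

The download itself would be a plain HTTP GET on that URL (for example with `urllib.request.urlopen`), which is why no credentials are needed for this sheet.
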
### 2. Dual-Mode Bank Access

`fio_utils.py` implements a transparent fallback pattern:

```python
import os

def fetch_transactions(date_from, date_to):
    # Prefer the token-authenticated JSON API when a token is configured.
    token = os.environ.get("FIO_API_TOKEN", "").strip()
    if token:
        return fetch_transactions_api(token, date_from, date_to)  # Structured JSON
    # Otherwise fall back to the public transparent-account page.
    return fetch_transactions_transparent(...)  # HTML scraping
```

The API provides richer data (sender account numbers, stable bank IDs) but requires a token. The transparent page is always available but lacks some fields.

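Scraping the transparent page means converting Czech-formatted strings into typed values. The following is a rough sketch of that conversion under stated assumptions: amounts use a comma as the decimal separator with spaces (or non-breaking spaces) as thousands separators, and dates are written day.month.year. The `_sketch` functions are illustrations and not the actual `parse_czech_amount()` and `parse_czech_date()` from `fio_utils.py`.

```python
import re
from datetime import date, datetime
from decimal import Decimal


def parse_czech_amount_sketch(text: str) -> Decimal:
    """Convert e.g. '1 500,00 CZK' to Decimal('1500.00') (assumed input format)."""
    # Normalise the Unicode minus sign, then drop everything except digits, separators, sign.
    cleaned = re.sub(r"[^\d,.\-]", "", text.replace("\u2212", "-"))
    # Assumption: any '.' is a thousands separator and ',' is the decimal separator.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


def parse_czech_date_sketch(text: str) -> date:
    """Convert e.g. '03.03.2026' or '3. 3. 2026' to a date object."""
    compact = text.replace("\u00a0", "").replace(" ", "")
    return datetime.strptime(compact, "%d.%m.%Y").date()
```

Using `Decimal` rather than `float` keeps currency amounts exact, which matters when sums are later compared against expected fees.
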
### 3. Name Matching with Confidence Levels

The reconciliation engine uses a multi-tier matching strategy:

| Priority | Method | Confidence | Example |
|----------|--------|-----------|---------|
| 1 | Full name match | `auto` | "František Vrbík" in message |
| 2 | Both first + last name (any order) | `auto` | "Vrbík František" |
| 3 | Nickname match | `auto` | "(Štrúdl)" from member list |
| 4 | Last name only (≥4 chars, not common) | `review` | "Vrbík" alone |
| 5 | First name only (≥3 chars) | `review` | "František" alone |

When both `auto` and `review` matches exist, `review` matches are discarded. This prevents false positives from generic first names.

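The tiers and the discard rule can be illustrated with a small sketch. It is an approximation under assumptions: members are given as `(full_name, nickname)` pairs, diacritics are stripped with `unicodedata`, and the "not common" surname check from tier 4 is omitted. `strip_diacritics` and `match_members_sketch` are hypothetical names, not the functions in `match_payments.py`.

```python
import re
import unicodedata
from typing import Optional


def strip_diacritics(text: str) -> str:
    """Lowercase and remove combining marks, e.g. 'Vrbík' -> 'vrbik'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def match_members_sketch(message: str, members: list[tuple[str, Optional[str]]]) -> list[tuple[str, str]]:
    """Return (member, confidence) pairs for one payment message."""
    text = strip_diacritics(message)
    words = set(re.findall(r"\w+", text))
    matches = []
    for full_name, nickname in members:
        name = strip_diacritics(full_name)
        first, last = name.split()[0], name.split()[-1]
        if name in text or {first, last} <= words:          # tiers 1 and 2
            matches.append((full_name, "auto"))
        elif nickname and strip_diacritics(nickname) in words:  # tier 3
            matches.append((full_name, "auto"))
        elif len(last) >= 4 and last in words:               # tier 4 (no common-name filter here)
            matches.append((full_name, "review"))
        elif len(first) >= 3 and first in words:             # tier 5
            matches.append((full_name, "review"))
    # If anything matched with high confidence, drop the weaker candidates.
    if any(conf == "auto" for _, conf in matches):
        matches = [m for m in matches if m[1] == "auto"]
    return matches
```

With two members who share the first name "František", a message containing only "frantisek" yields two `review` candidates, while "Frantisek Vrbik" yields a single `auto` match and suppresses the other.
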
### 4. Exception System

Fee overrides are managed through an `exceptions` tab in the Payments Google Sheet:

| Column | Content |
|--------|---------|
| Name | Member name |
| Period | Month (YYYY-MM) |
| Amount | Overridden fee in CZK |
| Note | Reason for the exception |

Exceptions are applied during reconciliation, replacing the attendance-calculated fee with the manually specified amount.

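As a rough sketch of that replacement step, assume the calculated fees are keyed by `(name, period)` and each exception row is a dictionary using the column names from the table above. `apply_exceptions` is a hypothetical helper; the real logic lives inside `reconcile()` and may normalize names before comparing them.

```python
def apply_exceptions(calculated: dict, exceptions: list) -> dict:
    """Replace attendance-based fees with manual overrides where one exists."""
    # Index the override rows by (member name, YYYY-MM period).
    overrides = {
        (row["Name"], row["Period"]): int(row["Amount"])
        for row in exceptions
    }
    # Keep the calculated fee unless an override exists for the same key.
    return {key: overrides.get(key, fee) for key, fee in calculated.items()}
```

An override for a member and month that has no calculated fee is ignored in this sketch; whether the real engine treats that case differently is not stated here.
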
### 5. Render-Time Performance Tracking

Every page includes a performance breakdown:

```python
import time
from flask import Flask, g

app = Flask(__name__)

@app.before_request
def start_timer():
    g.start_time = time.perf_counter()  # request start
    g.steps = []                        # (name, timestamp) checkpoints

def record_step(name):
    g.steps.append((name, time.perf_counter()))
```

The footer displays total render time and, on click, reveals a detailed breakdown (e.g., `fetch_members:0.892s | fetch_payments:1.205s | reconcile:0.003s | render:0.015s`).

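Since each recorded step stores an absolute timestamp, the per-step durations in the footer have to be derived from consecutive differences. A minimal sketch of that formatting, assuming the `(name, timestamp)` tuples shown above (`format_breakdown` is a hypothetical helper name):

```python
def format_breakdown(start_time: float, steps: list) -> str:
    """Turn absolute checkpoints into 'name:0.123s | ...' per-step durations."""
    parts = []
    previous = start_time
    for name, timestamp in steps:
        # Duration of a step = its checkpoint minus the previous checkpoint.
        parts.append(f"{name}:{timestamp - previous:.3f}s")
        previous = timestamp
    return " | ".join(parts)
```

For example, `format_breakdown(0.0, [("fetch", 0.5), ("render", 0.75)])` gives `fetch:0.500s | render:0.250s`.
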
## Security Considerations

| Concern | Mitigation |
|---------|-----------|
| PII in git | `.secret/` is gitignored; all data fetched at runtime |
| Google API credentials | Service account JSON stored in `.secret/`, mounted as Docker secret |
| Bank API token | Passed via `FIO_API_TOKEN` environment variable, never committed |
| Web app authentication | **None currently**: the app has no auth layer |
| CSRF protection | **None currently**: Flask default (no POST routes exist) |

## Scalability Notes

This system is purpose-built for a small club (~20-40 members). It makes deliberate trade-offs favoring simplicity over scale:

- **No caching**: Every page load fetches live data from Google Sheets (1-3 s latency). For a single-user admin dashboard, this is acceptable.
- **No background workers**: Sync and inference are manual `make` commands, not scheduled jobs.
- **No database**: Google Sheets handles tens of members and hundreds of transactions with ease.
- **Single-process Flask**: The built-in development server runs directly in production (via Docker). For this use case, this is intentional: it is a personal tool, not a public service.

---

*Architecture documentation generated from comprehensive code analysis on 2026-03-03.*