fuj-management/docs/by-claude-opus/data-model.md
Jan Novak 9b99f6d33b
docs: experiment with generated documentation, let's keep it in git for
now
2026-03-11 11:57:30 +01:00

# Data Model
## Overview
FUJ Management operates on two Google Sheets and an external bank account. There is no local database — all persistent data lives in Google Sheets, and all member data is fetched at runtime (never committed to git).
## External Data Sources
### 1. Attendance Google Sheet
| Property | Value |
|----------|-------|
| **Sheet ID** | `1E2e_gT_K5AwSRCDLDTa2UetZTkHmBOcz0kFbBUNUNBA` |
| **Access** | Public CSV export (no authentication required) |
| **Purpose** | Member roster, weekly practice attendance marks |
| **Scope** | Tuesday practices (20:30-22:00) |
#### Schema
```
Row 1: [Title] [blank] [blank] [10/1/2025] [10/8/2025] [10/15/2025] ...
Row 2: Venue per date (ignored by the system)
Row 3: Subtotals per date (ignored by the system)
Row 4+: [Name] [Tier] [Total] [TRUE/FALSE] [TRUE/FALSE] ...
...
Row N: # last line (sentinel — stops parsing)
```
| Column | Index | Content | Example |
|--------|-------|---------|---------|
| A | 0 | Member name | `Jan Novák` |
| B | 1 | Tier code | `A`, `J`, or `X` |
| C | 2 | Total attendance (auto-calculated, ignored by the system) | `12` |
| D+ | 3+ | Attendance per date | `TRUE` or `FALSE` |
#### Tier Codes
| Code | Meaning | Pays fees? |
|------|---------|-----------|
| `A` | Adult | Yes — calculated from this sheet |
| `J` | Junior | No — managed via a separate sheet |
| `X` | Exempt | No |
#### Sentinel Row
The system stops parsing member rows at the first row whose first column contains `# last line` (case-insensitive). Any other row whose first column starts with `#` is treated as a comment and skipped.
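
The layout and sentinel rule above can be sketched as a small parser. This is a minimal illustration, not the project's actual code: the function name `parse_attendance` and the shape of the returned dictionaries are assumptions made for the example.

```python
import csv
import io
from datetime import datetime

SENTINEL = "# last line"


def parse_attendance(csv_text):
    """Parse the attendance CSV layout described above (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))

    # Row 1: date headers live in column D onwards (index 3+).
    date_cols = [
        (i, datetime.strptime(cell.strip(), "%m/%d/%Y").strftime("%Y-%m-%d"))
        for i, cell in enumerate(rows[0])
        if i >= 3 and cell.strip()
    ]

    members = []
    # Rows 2 and 3 (venue, subtotals) are ignored; members start at row 4.
    for row in rows[3:]:
        name = row[0].strip() if row else ""
        if SENTINEL in name.lower():
            break  # sentinel row: stop parsing
        if not name or name.startswith("#"):
            continue  # blank row or comment row
        tier = row[1].strip().upper() if len(row) > 1 else ""
        attended = [
            date
            for i, date in date_cols
            if i < len(row) and row[i].strip().upper() == "TRUE"
        ]
        members.append({"name": name, "tier": tier, "attended": attended})

    return [date for _, date in date_cols], members
```

Column C (the auto-calculated total) is deliberately not read; the attendance count is derived from the per-date marks instead.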
### 2. Payments Google Sheet
| Property | Value |
|----------|-------|
| **Sheet ID** | `1Om0YPoDVCH5cV8BrNz5LG5eR5MMU05ypQC7UMN1xn_Y` |
| **Access** | Google Sheets API (service account authentication) |
| **Purpose** | Intermediary ledger for bank transactions + manual annotations |
| **Managed by** | `sync_fio_to_sheets.py` (append), `infer_payments.py` (update) |
#### Main Sheet Schema (Columns A-K)
| Column | Label | Populated by | Description |
|--------|-------|-------------|-------------|
| A | Date | `sync` | Transaction date (`YYYY-MM-DD`) |
| B | Amount | `sync` | Bank transaction amount in CZK |
| C | manual fix | Human | If non-empty, `infer` will skip this row |
| D | Person | `infer` or human | Member name(s), comma-separated for multi-person payments |
| E | Purpose | `infer` or human | Month(s) covered, e.g. `2026-01` or `2026-01, 2026-02` |
| F | Inferred Amount | `infer` or human | Amount to use for reconciliation (may differ from bank amount) |
| G | Sender | `sync` | Bank sender name/account |
| H | VS | `sync` | Variable symbol |
| I | Message | `sync` | Payment message for recipient |
| J | Bank ID | `sync` | Fio transaction ID (API only) |
| K | Sync ID | `sync` | SHA-256 deduplication hash |
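
To illustrate how the annotation columns (C-F) might be read, here is a minimal sketch that splits the comma-separated `Person` and `Purpose` cells. The helper names and the dict-per-row input are hypothetical, and the fallback from an empty `Inferred Amount` to the bank `Amount` is an assumption, not something the schema states.

```python
def split_cell(value):
    """Split a comma-separated cell into trimmed, non-empty parts."""
    return [part.strip() for part in value.split(",") if part.strip()]


def parse_ledger_row(row):
    """Interpret one ledger row given as a dict keyed by column label.

    Hypothetical helper; keys mirror the labels in the table above.
    """
    inferred = row.get("Inferred Amount", "").strip()
    # Assumption: fall back to the bank amount when column F is empty.
    amount = float(inferred) if inferred else float(row["Amount"])
    return {
        "people": split_cell(row.get("Person", "")),
        "months": split_cell(row.get("Purpose", "")),
        "amount": amount,
        # A non-empty "manual fix" cell means inference should skip the row.
        "locked": bool(row.get("manual fix", "").strip()),
    }
```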
#### Exceptions Sheet Tab
A separate tab named `exceptions` in the same spreadsheet, used for manual fee overrides:
| Column | Label | Content |
|--------|-------|---------|
| A | Name | Member name (plain text) |
| B | Period | Month (`YYYY-MM`) |
| C | Amount | Overridden fee in CZK |
| D | Note | Reason for override (optional) |
The first row is assumed to be a header and is skipped. Name and period values are normalized (diacritics stripped, lowercased) for matching.
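
The normalization described above can be sketched with the standard library. The names `normalize` and `load_exceptions` are illustrative only, and storing the amount as a float is an assumption about how sheet values arrive.

```python
import unicodedata


def normalize(text):
    """Strip diacritics and lowercase, e.g. 'Jan Novák' -> 'jan novak'."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.strip().lower()


def load_exceptions(rows):
    """Build a {(name, period): override} lookup from the tab's rows.

    `rows` is a list of lists as read from the sheet, header included.
    """
    overrides = {}
    for row in rows[1:]:  # first row is the header
        if len(row) < 3 or not row[0].strip():
            continue  # incomplete row
        key = (normalize(row[0]), normalize(row[1]))
        note = row[3] if len(row) > 3 else ""
        overrides[key] = {"amount": float(row[2]), "note": note}
    return overrides
```

A lookup then normalizes the member name and month the same way before indexing into the result.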
### 3. Fio Bank Account
| Property | Value |
|----------|-------|
| **Account number** | `2800359168/2010` |
| **IBAN** | `CZ8520100000002800359168` |
| **Type** | Transparent account |
| **Owner** | Nathan Heilmann |
| **Public URL** | `https://ib.fio.cz/ib/transparent?a=2800359168` |
#### Access Methods
| Method | Trigger | Data richness |
|--------|---------|--------------|
| REST API | `FIO_API_TOKEN` env var set | Full data: sender account, bank ID, user identification, currency |
| HTML scraping | `FIO_API_TOKEN` not set | Partial: date, amount, sender name, message, VS/KS/SS |
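
The trigger column above amounts to a single environment check. A minimal sketch, assuming an empty `FIO_API_TOKEN` counts as "not set"; the function name and the `"api"`/`"html"` labels are hypothetical.

```python
import os


def pick_fio_source(env=None):
    """Return 'api' when FIO_API_TOKEN is set, otherwise 'html'."""
    env = os.environ if env is None else env
    # Assumption: an empty token is treated the same as a missing one.
    return "api" if env.get("FIO_API_TOKEN") else "html"
```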
#### API Rate Limit
The Fio REST API allows 1 request per 30 seconds per token.
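
One way to respect this limit on the client side is a small throttle that sleeps before a request if the previous one was too recent. This is a sketch under stated assumptions, not the project's implementation: `FioThrottle` is a hypothetical class, and the clock and sleep functions are injectable only to make it testable.

```python
import time


class FioThrottle:
    """Space out calls so at most one goes out per `interval` seconds."""

    def __init__(self, interval=30.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until a new request is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each API request keeps consecutive requests at least 30 seconds apart within one process.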
## Fee Calculation Rules
Fees apply only to **tier A (Adult)** members. They are calculated per calendar month based on Tuesday practice attendance:
| Practices attended | Monthly fee |
|-------------------|-------------|
| 0 | 0 CZK |
| 1 | 200 CZK |
| 2+ | 750 CZK |
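
The fee table translates directly into a small function. The sketch below assumes attendance is available as a list of `YYYY-MM-DD` strings per member; `monthly_fee` and `fees_by_month` are illustrative names.

```python
from collections import Counter


def monthly_fee(tier, practices_attended):
    """Fee in CZK for one calendar month, per the table above."""
    if tier != "A":
        return 0  # only adults pay attendance-based fees
    if practices_attended <= 0:
        return 0
    if practices_attended == 1:
        return 200
    return 750


def fees_by_month(tier, attended_dates):
    """Group attended dates (YYYY-MM-DD) by month and price each month."""
    counts = Counter(date[:7] for date in attended_dates)
    return {month: monthly_fee(tier, n) for month, n in sorted(counts.items())}
```

For example, two January practices and one February practice yield 750 CZK for `2026-01` and 200 CZK for `2026-02`.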
### Exception Overrides
The fee can be manually overridden per member per month via the `exceptions` tab. When an exception exists:
- The `expected` amount in reconciliation uses the exception amount
- The `original_expected` amount preserves the attendance-based calculation
- The override is displayed in amber/orange in the web UI
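
The relationship between `expected` and `original_expected` can be sketched as follows. `apply_exception` is a hypothetical helper; the field names match the reconciliation structure documented in this file.

```python
def apply_exception(attendance_fee, exception=None):
    """Combine the attendance-based fee with an optional override.

    `exception` is None or a dict such as {"amount": 400, "note": "..."}.
    """
    expected = exception["amount"] if exception is not None else attendance_fee
    return {
        "expected": expected,                 # what reconciliation compares against
        "original_expected": attendance_fee,  # preserved for display
        "exception": exception,
    }
```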
### Advance Payments
If a payment references a month not yet covered by attendance data:
- It is tracked as **credit** on the member's account
- Credits are added to the total balance
- When attendance data becomes available for that month, the credit effectively offsets the expected fee
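
The credit behavior above can be illustrated with a small balance calculation. This is a sketch only: the function name and the two input dictionaries are assumptions chosen for the example, not the project's internal representation.

```python
def balance_with_credits(expected_by_month, paid_by_month):
    """Return (total_balance, credit) for one member.

    expected_by_month: {"YYYY-MM": fee} for months that have attendance data.
    paid_by_month:     {"YYYY-MM": amount paid toward that month}.
    """
    # In-window months: what was paid minus what was expected.
    in_window = sum(
        paid_by_month.get(month, 0) - expected
        for month, expected in expected_by_month.items()
    )
    # Payments for months without attendance data count as credit.
    credit = sum(
        amount
        for month, amount in paid_by_month.items()
        if month not in expected_by_month
    )
    return in_window + credit, credit
```

Paying 750 CZK for February before February attendance exists yields a credit of 750; once February appears with an expected fee of 750, the same payment is matched in-window and the balance returns to 0.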
## Reconciliation Data Model
The `reconcile()` function returns this structure:
```python
{
    "members": {
        "Jan Novák": {
            "tier": "A",
            "months": {
                "2026-01": {
                    "expected": 750,            # Fee after exception application
                    "original_expected": 750,   # Attendance-based fee
                    "attendance_count": 4,      # How many times they came
                    "exception": None,          # or {"amount": 400, "note": "..."}
                    "paid": 750.0,              # Total matched payments
                    "transactions": [           # Individual payment records
                        {
                            "amount": 750.0,
                            "date": "2026-01-15",
                            "sender": "Jan Novák",
                            "message": "leden",
                            "confidence": "auto"
                        }
                    ]
                }
            },
            "total_balance": 0  # sum(paid - expected) across all months + off-window credits
        }
    },
    "unmatched": [  # Transactions that couldn't be assigned
        {
            "date": "2026-01-20",
            "amount": 500,
            "sender": "Unknown",
            "message": "dar"
        }
    ],
    "credits": {  # Alias for positive total_balance entries
        "Jan Novák": 200
    }
}
```
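
As a usage illustration, the structure above can be traversed to list underpaid months and to rebuild the `credits` view. Both helpers are hypothetical and rely only on the keys shown in the example.

```python
def outstanding_months(result):
    """List (member, month, amount_due) where paid < expected."""
    due = []
    for name, member in result["members"].items():
        for month, data in sorted(member["months"].items()):
            shortfall = data["expected"] - data["paid"]
            if shortfall > 0:
                due.append((name, month, shortfall))
    return due


def positive_balances(result):
    """Members whose total_balance is positive, mirroring `credits`."""
    return {
        name: member["total_balance"]
        for name, member in result["members"].items()
        if member["total_balance"] > 0
    }
```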
## Sync ID Generation
The deduplication key for bank transactions is a SHA-256 hash of:
```
sha256("date|amount|currency|sender|vs|message|bank_id")
```
All values are lowercased before hashing. The resulting key has these properties:
- Same transaction fetched twice produces the same ID
- Two payments on the same day with different amounts/senders produce different IDs
- The hash is stable across API and HTML scraping modes (shared fields)
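
A minimal sketch of the hashing scheme, using `hashlib` from the standard library. The function name, the treatment of `None` as an empty string, and the use of `str()` to render each field are assumptions; in particular, how the amount is formatted (for example `750` versus `750.0`) changes the hash, so it must be consistent between runs.

```python
import hashlib


def make_sync_id(date, amount, currency, sender, vs, message, bank_id):
    """SHA-256 over the pipe-joined, lowercased transaction fields."""
    parts = [date, amount, currency, sender, vs, message, bank_id]
    # Assumption: missing fields are hashed as empty strings.
    raw = "|".join("" if p is None else str(p).lower() for p in parts)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```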
## Date Handling
| Source | Format | Normalization |
|--------|--------|--------------|
| Attendance Sheet header | `M/D/YYYY` (US format) | `datetime.strptime(raw, "%m/%d/%Y")` |
| Fio API | `YYYY-MM-DD+HHMM` | Take first 10 characters |
| Fio transparent page | `DD.MM.YYYY` | `datetime.strptime(raw, "%d.%m.%Y")` |
| Google Sheets (unformatted) | Serial number (days since 1899-12-30) | `datetime(1899, 12, 30) + timedelta(days=val)` |
All internal date representation uses `YYYY-MM-DD` format. Month keys use `YYYY-MM`.
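
The four normalizations in the table can be sketched as one small helper per source. The function names are illustrative; the format strings and the 1899-12-30 epoch are taken from the table.

```python
from datetime import datetime, timedelta


def from_attendance_header(raw):
    """'10/1/2025' (M/D/YYYY) -> '2025-10-01'."""
    return datetime.strptime(raw.strip(), "%m/%d/%Y").strftime("%Y-%m-%d")


def from_fio_api(raw):
    """'2026-01-15+0100' -> '2026-01-15' (first 10 characters)."""
    return raw[:10]


def from_fio_page(raw):
    """'15.01.2026' (DD.MM.YYYY) -> '2026-01-15'."""
    return datetime.strptime(raw.strip(), "%d.%m.%Y").strftime("%Y-%m-%d")


def from_sheets_serial(value):
    """Google Sheets serial number (days since 1899-12-30) -> 'YYYY-MM-DD'."""
    moment = datetime(1899, 12, 30) + timedelta(days=float(value))
    return moment.strftime("%Y-%m-%d")


def month_key(iso_date):
    """'2026-01-15' -> '2026-01'."""
    return iso_date[:7]
```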
---
*Data model documentation generated from comprehensive code analysis on 2026-03-03.*