Data Model

Overview

FUJ Management operates on two Google Sheets and an external bank account. There is no local database: all persistent data lives in Google Sheets, and all member data is fetched at runtime and never committed to git.

External Data Sources

1. Attendance Google Sheet

| Property | Value |
|----------|-------|
| Sheet ID | 1E2e_gT_K5AwSRCDLDTa2UetZTkHmBOcz0kFbBUNUNBA |
| Access | Public CSV export (no authentication required) |
| Purpose | Member roster, weekly practice attendance marks |
| Scope | Tuesday practices (20:30–22:00) |

Schema

Row 1:  [Title]  [blank]  [blank]  [10/1/2025]  [10/8/2025]  [10/15/2025]  ...
Row 2:  Venue per date (ignored by the system)
Row 3:  Subtotals per date (ignored by the system)
Row 4+: [Name]   [Tier]   [Total]  [TRUE/FALSE]  [TRUE/FALSE] ...
...
Row N:  # last line  (sentinel — stops parsing)

| Column | Index | Content | Example |
|--------|-------|---------|---------|
| A | 0 | Member name | Jan Novák |
| B | 1 | Tier code | A, J, or X |
| C | 2 | Total attendance (auto-calculated, ignored by the system) | 12 |
| D+ | 3+ | Attendance per date | TRUE or FALSE |

Tier Codes

| Code | Meaning | Pays fees? |
|------|---------|------------|
| A | Adult | Yes, calculated from this sheet |
| J | Junior | No, managed via a separate sheet |
| X | Exempt | No |

Sentinel Row

The system stops parsing member rows when it encounters a row whose first column contains # last line (case-insensitive). Other rows whose first column starts with # are skipped as comments.
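
The layout and sentinel rules above can be sketched as a small parser. This is a minimal illustration, not the project's actual code: the function name `parse_attendance` and the returned dictionary shape are assumptions, and it expects the CSV text to have been downloaded already.

```python
import csv
import io
from datetime import datetime


def parse_attendance(csv_text):
    """Parse the attendance CSV export into (dates, members).

    Hypothetical helper; the return shape is an assumption made for this sketch.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))

    # Row 1: date headers start at column D (index 3), in M/D/YYYY format.
    date_cols = []
    for idx, cell in enumerate(rows[0][3:], start=3):
        if cell.strip():
            parsed = datetime.strptime(cell.strip(), "%m/%d/%Y")
            date_cols.append((idx, parsed.strftime("%Y-%m-%d")))

    members = []
    # Rows 2 and 3 (venue, subtotals) are ignored; member rows start at row 4.
    for row in rows[3:]:
        name = row[0].strip() if row else ""
        if "# last line" in name.lower():
            break  # sentinel row: stop parsing
        if not name or name.startswith("#"):
            continue  # blank line or comment row
        tier = row[1].strip() if len(row) > 1 else ""
        attended = [
            date
            for idx, date in date_cols
            if idx < len(row) and row[idx].strip().upper() == "TRUE"
        ]
        members.append({"name": name, "tier": tier, "attended": attended})

    return [date for _, date in date_cols], members
```

Column C (the auto-calculated total) is deliberately never read, matching the schema note that the system ignores it.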

2. Payments Google Sheet

| Property | Value |
|----------|-------|
| Sheet ID | 1Om0YPoDVCH5cV8BrNz5LG5eR5MMU05ypQC7UMN1xn_Y |
| Access | Google Sheets API (service account authentication) |
| Purpose | Intermediary ledger for bank transactions + manual annotations |
| Managed by | sync_fio_to_sheets.py (append), infer_payments.py (update) |

Main Sheet Schema (Columns A–K)

| Column | Label | Populated by | Description |
|--------|-------|--------------|-------------|
| A | Date | sync | Transaction date (YYYY-MM-DD) |
| B | Amount | sync | Bank transaction amount in CZK |
| C | manual fix | Human | If non-empty, infer will skip this row |
| D | Person | infer or human | Member name(s), comma-separated for multi-person payments |
| E | Purpose | infer or human | Month(s) covered, e.g. 2026-01 or 2026-01, 2026-02 |
| F | Inferred Amount | infer or human | Amount to use for reconciliation (may differ from bank amount) |
| G | Sender | sync | Bank sender name/account |
| H | VS | sync | Variable symbol |
| I | Message | sync | Payment message for recipient |
| J | Bank ID | sync | Fio transaction ID (API only) |
| K | Sync ID | sync | SHA-256 deduplication hash |
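
As a rough illustration of how the multi-value Person and Purpose cells and the manual fix flag might be read, here is a small sketch. The helper names are hypothetical, and pairing every listed person with every listed month is an assumption; the document does not say how a multi-person payment is split.

```python
def split_multi(cell):
    """Split a comma-separated cell into trimmed, non-empty values."""
    return [part.strip() for part in str(cell).split(",") if part.strip()]


def person_month_pairs(person_cell, purpose_cell):
    """Pair each listed person with each listed month (assumed semantics)."""
    return [
        (person, month)
        for person in split_multi(person_cell)
        for month in split_multi(purpose_cell)
    ]


def infer_may_update(row):
    """Return True when column C (manual fix, index 2) is empty or missing."""
    return len(row) <= 2 or not str(row[2]).strip()
```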

Exceptions Sheet Tab

A separate tab named exceptions in the same spreadsheet, used for manual fee overrides:

| Column | Label | Content |
|--------|-------|---------|
| A | Name | Member name (plain text) |
| B | Period | Month (YYYY-MM) |
| C | Amount | Overridden fee in CZK |
| D | Note | Reason for override (optional) |

The first row is assumed to be a header and is skipped. Name and period values are normalized (diacritics stripped, lowercased) for matching.
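
The normalization described above (strip diacritics, lowercase) can be sketched with the standard library. The function names and the dictionary keyed by `(name, period)` are assumptions for illustration, as is treating the amount as a whole number of CZK.

```python
import unicodedata


def normalize(text):
    """Strip diacritics and lowercase, e.g. for matching member names."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.strip().lower()


def load_exceptions(rows):
    """Build a lookup from the exceptions tab rows (first row is the header)."""
    exceptions = {}
    for row in rows[1:]:
        if len(row) < 3 or not row[0].strip():
            continue  # skip incomplete or blank rows
        key = (normalize(row[0]), normalize(row[1]))
        note = row[3] if len(row) > 3 else ""  # Note column is optional
        exceptions[key] = {"amount": int(row[2]), "note": note}
    return exceptions
```

Looking up an override then only requires normalizing the member name and month the same way before indexing the dictionary.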

3. Fio Bank Account

| Property | Value |
|----------|-------|
| Account number | 2800359168/2010 |
| IBAN | CZ8520100000002800359168 |
| Type | Transparent account |
| Owner | Nathan Heilmann |
| Public URL | https://ib.fio.cz/ib/transparent?a=2800359168 |

Access Methods

| Method | Trigger | Data richness |
|--------|---------|---------------|
| REST API | FIO_API_TOKEN env var set | Full data: sender account, bank ID, user identification, currency |
| HTML scraping | FIO_API_TOKEN not set | Partial: date, amount, sender name, message, VS/KS/SS |

API Rate Limit

The Fio REST API allows 1 request per 30 seconds per token.
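
One way a client could respect this limit is to enforce a minimum interval between calls. The class below is a hypothetical sketch, not something taken from the project; the injectable `clock` and `sleep` parameters exist only so the behaviour can be exercised without actually waiting.

```python
import time


class MinIntervalThrottle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval=30.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each API request keeps a single-process client within the limit. It does not coordinate between separate processes sharing the same token.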

Fee Calculation Rules

Fees apply only to tier A (Adult) members. They are calculated per calendar month based on Tuesday practice attendance:

| Practices attended | Monthly fee |
|--------------------|-------------|
| 0 | 0 CZK |
| 1 | 200 CZK |
| 2+ | 750 CZK |
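
The fee table translates directly into a small function. The sketch below uses hypothetical names and assumes attendance dates are already normalized to YYYY-MM-DD, so that the first seven characters give the month key.

```python
from collections import Counter


def monthly_fee(practices_attended):
    """Return the fee in CZK for one calendar month."""
    if practices_attended <= 0:
        return 0
    if practices_attended == 1:
        return 200
    return 750


def fees_by_month(tier, attended_dates):
    """Map YYYY-MM to the fee owed; only tier A members pay."""
    if tier != "A":
        return {}
    counts = Counter(date[:7] for date in attended_dates)
    return {month: monthly_fee(count) for month, count in sorted(counts.items())}
```

Months with no attendance do not appear in the result of `fees_by_month`, which is equivalent to a fee of 0 CZK.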

Exception Overrides

The fee can be manually overridden per member per month via the exceptions tab. When an exception exists:

  • The expected amount in reconciliation uses the exception amount
  • The original_expected amount preserves the attendance-based calculation
  • The override is displayed in amber/orange in the web UI
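
A minimal sketch of how an override could be applied while preserving the attendance-based figure. The function name is hypothetical; the returned keys mirror the field names used in the reconciliation structure.

```python
def expected_for_month(attendance_count, exception=None):
    """Build the fee fields for one month, applying an optional override.

    `exception` is either None or a dict such as {"amount": 400, "note": "..."}.
    """
    # Attendance-based fee: 0 practices -> 0, 1 -> 200, 2 or more -> 750 CZK.
    if attendance_count <= 0:
        original = 0
    elif attendance_count == 1:
        original = 200
    else:
        original = 750

    # Compare against None so that an override of 0 CZK is still honoured.
    expected = exception["amount"] if exception is not None else original

    return {
        "expected": expected,
        "original_expected": original,
        "attendance_count": attendance_count,
        "exception": exception,
    }
```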

Advance Payments

If a payment references a month not yet covered by attendance data:

  • It is tracked as credit on the member's account
  • Credits are added to the total balance
  • When attendance data becomes available for that month, the credit effectively offsets the expected fee
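
The credit behaviour can be illustrated with a small calculation. This is a sketch under assumptions: both arguments are hypothetical dictionaries keyed by YYYY-MM, and a month counts as covered by attendance data exactly when it appears in `expected_by_month`.

```python
def balance_with_credits(expected_by_month, paid_by_month):
    """Return the total balance and the part of it that is advance credit."""
    # Months with attendance data: paid minus expected.
    in_window = sum(
        paid_by_month.get(month, 0) - expected
        for month, expected in expected_by_month.items()
    )
    # Payments for months without attendance data yet count fully as credit.
    credit = sum(
        paid
        for month, paid in paid_by_month.items()
        if month not in expected_by_month
    )
    return {"total_balance": in_window + credit, "advance_credit": credit}
```

For example, a member who has paid 750 CZK for each of January and February while only January attendance exists shows a balance of +750. Once February attendance produces an expected fee of 750, the balance returns to 0.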

Reconciliation Data Model

The reconcile() function returns this structure:

{
    "members": {
        "Jan Novák": {
            "tier": "A",
            "months": {
                "2026-01": {
                    "expected": 750,           # Fee after exception application
                    "original_expected": 750,  # Attendance-based fee
                    "attendance_count": 4,     # How many times they came
                    "exception": None,         # or {"amount": 400, "note": "..."}
                    "paid": 750.0,             # Total matched payments
                    "transactions": [          # Individual payment records
                        {
                            "amount": 750.0,
                            "date": "2026-01-15",
                            "sender": "Jan Novák",
                            "message": "leden",
                            "confidence": "auto"
                        }
                    ]
                }
            },
            "total_balance": 0  # sum(paid - expected) across all months + off-window credits
        }
    },
    "unmatched": [              # Transactions that couldn't be assigned
        {
            "date": "2026-01-20",
            "amount": 500,
            "sender": "Unknown",
            "message": "dar"
        }
    ],
    "credits": {                # Alias for positive total_balance entries
        "Jan Novák": 200
    }
}
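
A short sketch of how a caller might consume this structure, for example to list who owes money. The helper names are hypothetical; only keys shown in the structure above are accessed.

```python
def members_in_debt(result):
    """Return (name, total_balance) pairs for negative balances, largest debt first."""
    debts = [
        (name, data["total_balance"])
        for name, data in result["members"].items()
        if data["total_balance"] < 0
    ]
    return sorted(debts, key=lambda item: item[1])


def unpaid_months(member):
    """Map each underpaid month to the amount still owed for it."""
    return {
        month: entry["expected"] - entry["paid"]
        for month, entry in member["months"].items()
        if entry["paid"] < entry["expected"]
    }
```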

Sync ID Generation

The deduplication key for bank transactions is a SHA-256 hash of:

sha256("date|amount|currency|sender|vs|message|bank_id")

All values are lowercased before hashing. This ensures:

  • Same transaction fetched twice produces the same ID
  • Two payments on the same day with different amounts/senders produce different IDs
  • The hash is stable across API and HTML scraping modes (shared fields)
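
The hashing scheme can be sketched with `hashlib`. The function name and signature are assumptions, as is the choice to stringify every value with `str()` and encode as UTF-8. In particular, the exact formatting of the amount and how fields missing in scraping mode are represented are not specified here.

```python
import hashlib


def make_sync_id(date, amount, currency, sender, vs, message, bank_id):
    """Return a hex SHA-256 digest of the pipe-joined, lowercased fields."""
    parts = [date, amount, currency, sender, vs, message, bank_id]
    raw = "|".join(str(part).lower() for part in parts)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Because the fields are joined as strings, the same transaction only yields the same ID if each value is formatted identically on every fetch, for instance `750` versus `750.0` would produce different hashes.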

Date Handling

| Source | Format | Normalization |
|--------|--------|---------------|
| Attendance Sheet header | M/D/YYYY (US format) | datetime.strptime(raw, "%m/%d/%Y") |
| Fio API | YYYY-MM-DD+HHMM | Take first 10 characters |
| Fio transparent page | DD.MM.YYYY | datetime.strptime(raw, "%d.%m.%Y") |
| Google Sheets (unformatted) | Serial number (days since 1899-12-30) | datetime(1899, 12, 30) + timedelta(days=val) |

All internal date representation uses YYYY-MM-DD format. Month keys use YYYY-MM.
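
The four normalizations can be collected into small helpers that all return YYYY-MM-DD strings. The function names are hypothetical; the parsing calls are the ones listed in the table.

```python
from datetime import datetime, timedelta


def from_attendance_header(raw):
    """M/D/YYYY (US format) -> YYYY-MM-DD."""
    return datetime.strptime(raw.strip(), "%m/%d/%Y").strftime("%Y-%m-%d")


def from_fio_api(raw):
    """YYYY-MM-DD+HHMM -> YYYY-MM-DD (drop the UTC offset)."""
    return raw[:10]


def from_fio_page(raw):
    """DD.MM.YYYY -> YYYY-MM-DD."""
    return datetime.strptime(raw.strip(), "%d.%m.%Y").strftime("%Y-%m-%d")


def from_sheets_serial(value):
    """Google Sheets serial number (days since 1899-12-30) -> YYYY-MM-DD."""
    moment = datetime(1899, 12, 30) + timedelta(days=float(value))
    return moment.strftime("%Y-%m-%d")


def month_key(iso_date):
    """YYYY-MM-DD -> YYYY-MM."""
    return iso_date[:7]
```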


Data model documentation generated from comprehensive code analysis on 2026-03-03.