fuj-management/docs/plans/2026-05-06-2341-go-m4-io-layer.md
Jan Novak 6465e2a221
feat(go): IO layer behind interfaces (M4)
- io/attendance: CSV-over-public-URL client + Fake for adult/junior tabs
- io/drive: Drive v3 modifiedTime client + Fake
- io/sheets: Sheets v4 client (GetValues/AppendValues/BatchUpdateValues/
  WriteHeader/SortByDateColumn) + Fake with call-capture
- io/cache: Drive-modifiedTime-gated FileCache; two TTL knobs; atomic
  writes; generic Get[T]; Python-compatible JSON format; Flush()
- io/fio: Client interface backed by Fio REST API (apiClient) and HTML
  scraper (transparentClient); Fake; testdata fixtures
- membership/sources: NewSources wires attendance CSV + Sheets + cache
  into LoadAdults/LoadJuniors/LoadTransactions/LoadExceptions; Czech
  month parsing + merged-month maps
- banksync: SyncToSheets (SHA-256 dedup, optional sort) and
  InferPayments ([?] review prefix, dry-run) — tested with fakes
- cmd/fuj: sync and infer subcommands wired; fees and reconcile use
  real NewSources; go.mod gains google.golang.org/api + x/net
- gofumpt extra-rules applied across all packages; lint clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-07 01:05:59 +02:00


Plan: Go rewrite — M4 IO layer behind interfaces

Companion to 2026-05-03-2349-go-backend-rewrite.md and 2026-05-03-2349-go-backend-rewrite-progress.md.

Context

M1-M3 are merged: skeleton + tooling, every pure-domain function ported and parity-tested against PII-scrubbed fixtures, and the fuj fees / fuj reconcile subcommands wired but stubbed (membership.NewStubSources() returns ErrIOPending for every loader). M4's job is to replace that stub with real IO: read attendance CSVs, read the payments sheet + exceptions tab, fetch Drive modifiedTime for cache gating, fetch Fio bank transactions, and append/update rows on the payments sheet, all behind narrow Go interfaces that have in-memory fakes for tests.

Once M4 lands, fuj fees, fuj reconcile, fuj sync, and fuj infer all work end-to-end against the real Google Sheets and the real Fio account, and M5 can start porting the JSON API on top of that IO.

User-confirmed scope choices for this milestone:

  • No live integration tests. Fakes-only at unit level; live verification deferred to manual smoke during M7.
  • Three PRs (sheets/drive/cache → fio/sync → infer), one per major area, each independently reviewable.
  • Attendance stays on CSV-via-public-URL — matches Python, no extra service-account grant needed.

Approach

Layering

internal/io/         ← raw, narrow clients (one per external system)
  sheets/            ← typed wrapper around google.golang.org/api/sheets/v4
  drive/             ← Drive v3, only ModifiedTime
  attendance/        ← CSV-via-public-URL fetcher (no auth, no Sheets API)
  fio/               ← FioClient interface + apiClient + transparentClient
  cache/             ← FileCache: modifiedTime gate + two-TTL fallback + atomic write

internal/services/membership/   ← already exists; M4 adds adapters that satisfy
                                  AttendanceLoader / TransactionLoader / ExceptionLoader
                                  by composing io/sheets + io/drive + io/cache + io/attendance.

internal/services/banksync/     ← new: SyncToSheets (M4.7) + InferPayments (M4.8)
                                  composing fio + sheets + attendance loaders.

The existing interfaces in go/internal/services/membership/loader.go (AttendanceLoader, TransactionLoader, ExceptionLoader, Sources) are the seam — M4 adds a NewSources(cfg config.Config) (Sources, error) constructor next to NewStubSources(), and cmd/fuj/main.go swaps the stub for it.

Auth — service-account only

Drop the OAuth+token.pickle path entirely (production already uses a service account; the fallback only existed because the original Python script ran from a developer laptop). Sheets and Drive both authenticate via option.WithCredentialsFile(cfg.CredentialsPath) plus option.WithScopes(...). Each backend gets a single shared *http.Client with a 10s timeout (matches DRIVE_TIMEOUT).

Cache shape

Match Python's wire format so the tmp/*_cache.json directory is shared safely while both backends run side-by-side:

{ "modifiedTime": "<RFC3339>", "data": <list|object>, "cachedAt": "<RFC3339>" }

Improvements over Python:

  • Atomic write: marshal → os.WriteFile(path+".tmp", ..., 0o600) → os.Rename. Python's plain truncate-write stays as-is until M8.
  • The two TTLs (CacheTTL and CacheAPICheckTTL) live in config.Config already; only the CacheDir field is new.
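The envelope and the atomic-write step above can be sketched as follows. This is a minimal illustration, not the FileCache implementation: the envelope type, the writeAtomic and readEnvelope helpers, and the file name used in main are hypothetical; only the three JSON field names and the write-to-.tmp-then-rename sequence come from the plan.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// envelope mirrors the shared wire format: modifiedTime, data, cachedAt.
// Data stays raw so a list or an object round-trips unchanged.
type envelope struct {
	ModifiedTime string          `json:"modifiedTime"`
	Data         json.RawMessage `json:"data"`
	CachedAt     string          `json:"cachedAt"`
}

// writeAtomic marshals env, writes it to path+".tmp" with mode 0o600,
// then renames the temp file over path so readers never see a partial file.
func writeAtomic(path string, env envelope) error {
	b, err := json.Marshal(env)
	if err != nil {
		return fmt.Errorf("marshal cache envelope: %w", err)
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, b, 0o600); err != nil {
		return fmt.Errorf("write temp cache file: %w", err)
	}
	if err := os.Rename(tmp, path); err != nil {
		_ = os.Remove(tmp) // best-effort cleanup of the orphaned temp file
		return fmt.Errorf("rename temp cache file: %w", err)
	}
	return nil
}

// readEnvelope loads and decodes a cache file written by either backend.
func readEnvelope(path string) (envelope, error) {
	var env envelope
	b, err := os.ReadFile(path)
	if err != nil {
		return env, err
	}
	err = json.Unmarshal(b, &env)
	return env, err
}

func main() {
	dir, err := os.MkdirTemp("", "fuj-cache")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Hypothetical file name following the tmp/*_cache.json pattern.
	path := filepath.Join(dir, "attendance_regular_cache.json")
	now := time.Now().UTC().Format(time.RFC3339)
	env := envelope{ModifiedTime: now, Data: json.RawMessage(`[["a","b"]]`), CachedAt: now}
	if err := writeAtomic(path, env); err != nil {
		panic(err)
	}
	got, err := readEnvelope(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(got.ModifiedTime == now, string(got.Data))
}
```

The rename only replaces the file atomically when the temp file and the target live on the same filesystem, which holds here because both sit in the same directory.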

The four cache keys mirror Python's CACHE_SHEET_MAP: attendance_regular, attendance_juniors, exceptions_dict, payments_transactions → maps to either AttendanceSheetID or PaymentsSheetID.

When Drive fails, fall back to a synthetic key fmt.Sprintf("ttl-5m-%d", time.Now().Unix()/300) so cache still keys deterministically per 5-min bucket (same as Python).
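A small sketch of that fallback key, assuming nothing beyond the format string quoted above. The helper names bucketKey and fallbackKey are hypothetical. Integer division by 300 means every timestamp within the same five-minute window yields the same key.

```go
package main

import (
	"fmt"
	"time"
)

// bucketKey maps Unix seconds to a key that changes once every 300 seconds.
func bucketKey(unixSeconds int64) string {
	return fmt.Sprintf("ttl-5m-%d", unixSeconds/300)
}

// fallbackKey is the synthetic stand-in for a Drive modifiedTime.
func fallbackKey(now time.Time) string {
	return bucketKey(now.Unix())
}

func main() {
	a := time.Unix(1_778_000_100, 0) // sits exactly on a bucket boundary
	b := a.Add(60 * time.Second)     // same bucket
	c := a.Add(5 * time.Minute)      // next bucket
	fmt.Println(fallbackKey(a), fallbackKey(b) == fallbackKey(a), fallbackKey(c) == fallbackKey(a))
}
```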

Fio: two impls behind one interface

type Client interface {
    FetchTransactions(ctx context.Context, from, to time.Time) ([]Transaction, error)
}

apiClient (when cfg.FioAPIToken != "") hits https://fioapi.fio.cz/v1/rest/periods/{token}/{from}/{to}/transactions.json, unmarshals via a typed struct, and maps column0..column22 to fields per scripts/fio_utils.py. Negative-amount rows dropped (matches Python).
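The decoding described above might look roughly like the sketch below. It is illustrative only: the type and function names are hypothetical, the YYYY-MM-DD date layout in the URL and the choice of column1 as the amount are assumptions (the real column mapping lives in scripts/fio_utils.py), and float64 is used for brevity rather than as a recommendation for money.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// column is one Fio cell; cells sent as JSON null decode to a nil pointer.
type column struct {
	Value json.RawMessage `json:"value"`
}

// statement covers only accountStatement.transactionList.transaction[].
type statement struct {
	AccountStatement struct {
		TransactionList struct {
			Transaction []map[string]*column `json:"transaction"`
		} `json:"transactionList"`
	} `json:"accountStatement"`
}

// periodsURL builds the endpoint. ASSUMPTION: dates are formatted YYYY-MM-DD.
func periodsURL(token string, from, to time.Time) string {
	const layout = "2006-01-02"
	return fmt.Sprintf("https://fioapi.fio.cz/v1/rest/periods/%s/%s/%s/transactions.json",
		token, from.Format(layout), to.Format(layout))
}

// incomingAmounts decodes a response and drops negative-amount rows.
// ASSUMPTION: column1 carries the amount.
func incomingAmounts(body []byte) ([]float64, error) {
	var st statement
	if err := json.Unmarshal(body, &st); err != nil {
		return nil, fmt.Errorf("decode fio response: %w", err)
	}
	var out []float64
	for _, tx := range st.AccountStatement.TransactionList.Transaction {
		col := tx["column1"]
		if col == nil {
			continue // row without an amount cell
		}
		var amount float64
		if err := json.Unmarshal(col.Value, &amount); err != nil {
			return nil, fmt.Errorf("decode amount: %w", err)
		}
		if amount < 0 {
			continue // outgoing payment, dropped
		}
		out = append(out, amount)
	}
	return out, nil
}

// sample is a hand-written stand-in for a response, not captured data.
const sample = `{"accountStatement":{"transactionList":{"transaction":[
 {"column1":{"value":500.0},"column5":null},
 {"column1":{"value":-120.5}},
 {"column1":{"value":250}}
]}}}`

func main() {
	amounts, err := incomingAmounts([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(amounts)
	day := time.Date(2026, time.May, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(periodsURL("TOKEN", day, day.AddDate(0, 0, 6)))
}
```

Decoding each transaction into a map of pointer cells keeps the struct small while still tolerating null columns; a fully typed struct with column0 through column22 fields would work equally well.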

transparentClient (fallback) GETs https://ib.fio.cz/ib/transparent?a={accountNum}&f={DD.MM.YYYY}&t={DD.MM.YYYY} and walks the response with a golang.org/x/net/html token visitor, counting <table class="table"> tags and grabbing rows from the second one (skipping <thead>). bank_id, currency, user_id, sender_account are left empty (matches Python; known limitation).
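Cells scraped from the transparent page arrive in Czech formatting, which is why the task list names parseCzechAmount and parseCzechDate. A possible shape for those two helpers is sketched here, assuming only the rules stated in the plan (strip NBSP and spaces, comma to dot; accept DD.MM.YYYY and DD/MM/YYYY). Returning float64 is a simplification for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// amountCleaner removes NBSP and plain spaces and turns the decimal comma into a dot.
var amountCleaner = strings.NewReplacer("\u00a0", "", " ", "", ",", ".")

// parseCzechAmount parses strings such as "1 234,56".
func parseCzechAmount(s string) (float64, error) {
	v, err := strconv.ParseFloat(amountCleaner.Replace(s), 64)
	if err != nil {
		return 0, fmt.Errorf("parse czech amount %q: %w", s, err)
	}
	return v, nil
}

// czechDateLayouts lists the two accepted layouts in Go reference-time form.
var czechDateLayouts = []string{"02.01.2006", "02/01/2006"}

// parseCzechDate tries each layout in turn and returns the first match.
func parseCzechDate(s string) (time.Time, error) {
	s = strings.TrimSpace(s)
	for _, layout := range czechDateLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("parse czech date %q: unsupported format", s)
}

func main() {
	amount, err := parseCzechAmount("1\u00a0234,56")
	if err != nil {
		panic(err)
	}
	date, err := parseCzechDate("06.05.2026")
	if err != nil {
		panic(err)
	}
	fmt.Println(amount, date.Format("2006-01-02"))
}
```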

accountNum is derived from cfg.BankAccount by stripping the IBAN prefix (CZ85 2010 0000 0028 0035 9168 → 2800359168); add a small helper in config for this since both the API URL and the transparent URL need it.
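One way to write that helper is sketched below. It assumes the standard Czech IBAN layout (CZ, two check digits, four-digit bank code, six-digit account prefix, ten-digit account number) and simply returns the last ten digits; how a non-zero account prefix should be handled is not covered by the plan and is left out here.

```go
package main

import (
	"fmt"
	"strings"
)

// accountNum extracts the ten-digit account number from a Czech IBAN.
// ASSUMPTION: the six-digit account prefix is ignored.
func accountNum(iban string) (string, error) {
	compact := strings.ReplaceAll(iban, " ", "")
	if len(compact) != 24 || !strings.HasPrefix(compact, "CZ") {
		return "", fmt.Errorf("not a Czech IBAN: %q", iban)
	}
	return compact[len(compact)-10:], nil
}

func main() {
	num, err := accountNum("CZ85 2010 0000 0028 0035 9168")
	if err != nil {
		panic(err)
	}
	fmt.Println(num)
}
```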

Fakes

In-memory fakes live next to each real impl: sheets/fake.go, drive/fake.go, fio/fake.go, attendance/fake.go, cache/fake.go (a passthrough). All exported as Fake so tests do sheets.NewFake(rows) and inject. The membership-adapter tests use these fakes plus a couple of new raw-bytes fixtures under go/internal/io/<pkg>/testdata/:

  • sheets/testdata/payments_minimal.json — 2D-string array shaped like values.get would return.
  • sheets/testdata/exceptions_minimal.json — same, for the exceptions tab.
  • attendance/testdata/adults_minimal.csv — small adult attendance CSV.
  • attendance/testdata/juniors_minimal.csv — small junior CSV.
  • fio/testdata/api_response.json — captured Fio API JSON shape.
  • fio/testdata/transparent.html — captured transparent-page HTML.

Existing M3 domain fixtures under go/tests/fixtures/ stay where they are and continue to drive parity tests; they aren't reused for IO-layer tests because they're at the wrong layer (post-parse domain types).
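The fake pattern described in this section could look like the following sketch for the Sheets read path. The ValuesGetter interface, the choice to key Values by A1 range, the Calls slice, and countDataRows are all hypothetical illustrations; the plan only fixes the name Fake, a NewFake constructor, and a seedable Values map[string][][]any.

```go
package main

import (
	"context"
	"fmt"
)

// ValuesGetter is a hypothetical narrow interface that a consumer depends on.
type ValuesGetter interface {
	GetValues(ctx context.Context, spreadsheetID, a1Range string) ([][]any, error)
}

// Fake is an in-memory stand-in that also records every call it receives.
type Fake struct {
	Values map[string][][]any // ASSUMPTION: keyed by A1 range
	Calls  []string
}

// NewFake seeds the fake with canned values.
func NewFake(values map[string][][]any) *Fake {
	return &Fake{Values: values}
}

// GetValues returns the seeded rows for a1Range or an error if none exist.
func (f *Fake) GetValues(_ context.Context, spreadsheetID, a1Range string) ([][]any, error) {
	f.Calls = append(f.Calls, "GetValues "+spreadsheetID+" "+a1Range)
	rows, ok := f.Values[a1Range]
	if !ok {
		return nil, fmt.Errorf("fake sheets: no values seeded for %q", a1Range)
	}
	return rows, nil
}

// countDataRows is example consumer code: it counts rows below the header.
func countDataRows(ctx context.Context, g ValuesGetter, spreadsheetID, a1Range string) (int, error) {
	rows, err := g.GetValues(ctx, spreadsheetID, a1Range)
	if err != nil {
		return 0, err
	}
	if len(rows) == 0 {
		return 0, nil
	}
	return len(rows) - 1, nil
}

func main() {
	fake := NewFake(map[string][][]any{
		"A1:K": {{"Date", "Amount"}, {"2026-05-01", 500}, {"2026-05-02", 300}},
	})
	n, err := countDataRows(context.Background(), fake, "sheet-id", "A1:K")
	if err != nil {
		panic(err)
	}
	fmt.Println(n, fake.Calls)
}
```

Because the consumer accepts the interface rather than the concrete client, a test can inject the fake and afterwards inspect Calls to assert which ranges were read.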

Tasks (mapped to tracker)

Same 8 sub-milestones as the tracker, grouped into 3 PRs.

PR 1 — sheets / drive / cache + membership wiring (M4.1, M4.2, M4.3, M4.6)

  1. Add deps in go/go.mod: google.golang.org/api/{sheets/v4,drive/v3,option}, golang.org/x/oauth2/google (transitively pulled), golang.org/x/net/html.
  2. internal/io/sheets/:
    • client.go: Client struct holding *sheets.Service; methods GetValues(ctx, spreadsheetID, a1Range string) ([][]any, error), AppendValues(ctx, spreadsheetID, a1Range string, rows [][]any) error, BatchUpdateValues(ctx, spreadsheetID, updates []ValueRange) error, SortByColumn(ctx, spreadsheetID, sheetGID int64, columnIndex int) error.
    • fake.go — exported Fake with seedable Values map[string][][]any.
  3. internal/io/drive/:
    • client.go: Client.ModifiedTime(ctx, fileID string) (string, error) using drive.New(...).Files.Get(fileID).Fields("modifiedTime").SupportsAllDrives(true).
    • fake.go with seedable Times map[string]string.
  4. internal/io/attendance/ (new — public-URL CSV):
    • client.go: Client.FetchAdults(ctx) ([][]string, error) and FetchJuniors(ctx) ([][]string, error) using http.Get on https://docs.google.com/spreadsheets/d/{ID}/export?format=csv&gid={GID}, decoded via encoding/csv.
    • Add AttendanceAdultSheetGID = "0" constant in internal/config.
  5. internal/io/cache/:
    • filecache.go: FileCache with Get(ctx, key string, fetch func(ctx) (any, error)) (any, error) wired through Drive.ModifiedTime and the two TTL knobs. Atomic write via tmp-file + rename.
    • Cache key → sheet ID map mirrors Python's CACHE_SHEET_MAP.
  6. internal/services/membership/sources.go (new file in existing package):
    • realSources struct { sheets *sheets.Client; drive *drive.Client; attendance *attendance.Client; cache *cache.FileCache }.
    • Constructor NewSources(ctx, cfg) (Sources, error) builds all clients.
    • LoadAdults reads cached attendance CSV, runs through domain/fees.CalculateFee + merged-month logic (port of scripts/attendance.py get_members_with_fees), returns []reconcile.Member.
    • LoadTransactions reads payments sheet rows via cache, parses to []reconcile.Transaction (port of match_payments.py:208 fetch_sheet_data).
    • LoadExceptions reads 'exceptions'!A2:D via cache, builds map[ExceptionKey]Exception (port of match_payments.py:266).
  7. Add LoadJuniors to the AttendanceLoader interface (Python infer pulls both adult + junior member lists; needed for M4.8).
  8. Wire into cmd/fuj/main.go: replace membership.NewStubSources() in feesCmd and reconcileCmd with membership.NewSources(ctx, cfg).
  9. Tests (default tag, no live IO):
    • sheets/client_test.go, drive/client_test.go, cache/filecache_test.go — exercise fakes + parsing logic with testdata fixtures.
    • membership/sources_test.go — adapter tests with sheets/drive/cache fakes verify CSV→Member, rows→Transaction, exceptions tab → map.
  10. Config additions: CacheDir (default tmp relative to $PWD, overridable via CACHE_DIR env), DriveTimeout (default 10s).
  11. Manual verification: make go-build && go run ./cmd/fuj fees and ... reconcile print real reports against the live sheet (with valid .secret/...credentials.json).
  12. CHANGELOG entry; tick M4.1, M4.2, M4.3, M4.6 in the progress tracker.
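The attendance fetch from step 4 can be sketched with the standard library alone. This is an illustration under stated assumptions: exportURL, fetchCSV, and fetchFromStub are hypothetical names, the base URL is a parameter so a local stub server can stand in for docs.google.com, and the sketch uses http.NewRequestWithContext instead of the plain http.Get named in the plan so that the context is honored.

```go
package main

import (
	"context"
	"encoding/csv"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// exportURL builds the CSV export address for one tab of a spreadsheet.
func exportURL(base, sheetID, gid string) string {
	return fmt.Sprintf("%s/spreadsheets/d/%s/export?format=csv&gid=%s",
		base, url.PathEscape(sheetID), url.QueryEscape(gid))
}

// fetchCSV downloads rawURL and decodes the body into rows of strings.
func fetchCSV(ctx context.Context, client *http.Client, rawURL string) ([][]string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("fetch csv: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetch csv: unexpected status %s", resp.Status)
	}
	r := csv.NewReader(resp.Body)
	r.FieldsPerRecord = -1 // tolerate rows with differing column counts
	return r.ReadAll()
}

// fetchFromStub serves body from a local test server and fetches it back.
func fetchFromStub(body string) ([][]string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/csv")
		fmt.Fprint(w, body)
	}))
	defer srv.Close()
	client := &http.Client{Timeout: 10 * time.Second}
	return fetchCSV(context.Background(), client, exportURL(srv.URL, "SHEET_ID", "0"))
}

func main() {
	rows, err := fetchFromStub("Name,2026-05-01\n\"Doe, Jane\",TRUE\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rows), rows[1][0])
}
```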

PR 2 — fio + bank sync (M4.4, M4.5, M4.7)

  1. internal/io/fio/:
    • client.go: Client interface, Transaction struct.
    • api.go: apiClient impl + URL builder + JSON struct definitions for accountStatement.transactionList.transaction[].column{N}.value.
    • transparent.go: transparentClient impl using golang.org/x/net/html token visitor; helper functions parseCzechAmount (NBSP/space strip + comma→dot) and parseCzechDate (DD.MM.YYYY / DD/MM/YYYY).
    • fake.go.
    • New(cfg) Client chooses impl based on cfg.FioAPIToken.
    • accountNum(iban) helper in internal/config strips IBAN prefix.
  2. internal/services/banksync/sync.go (new package):
    • SyncToSheets(ctx, cfg, fio Client, sheets *sheets.Client, opts SyncOpts) (added int, err error).
    • Reads existing rows via sheets.GetValues(... "A1:K"), validates header against COLUMN_LABELS, writes header if missing, builds existingIDs from column K (Sync ID).
    • Computes date window: explicit from/to or now - days*24h (default 30d).
    • For each fetched tx, computes domain/synch.GenerateSyncID, skips if present, otherwise builds row in COLUMN_LABELS order with empty manual/person/purpose/inferred slots.
    • sheets.AppendValues(... "A2", rows).
    • Optional sort: sheets.SortByColumn(... gid, 0) — sheet GID resolved once via spreadsheets.Get.
  3. Wire fuj sync subcommand in cmd/fuj/main.go:
    • Flags: --days N (default 30), --from YYYY-MM-DD, --to YYYY-MM-DD, --sort (default true matching make sync-2026).
    • Replace the M4-stub error path.
  4. Tests (default tag): banksync/sync_test.go with fakes — verify header insertion, dedup against existing sync IDs, multi-row append, sort call.
  5. Manual verification: dry-run sync against the real Fio account in a throwaway test sheet; or visually verify --from --to window in stdout with a no-write flag (only if cheap to add — otherwise skip per the "no live integration tests" decision).
  6. CHANGELOG entry; tick M4.4, M4.5, M4.7.
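The dedup step in SyncToSheets can be illustrated with the sketch below. It is not the real algorithm: the tx type, the fields fed into the hash, the separator, and the positions of the first five columns are all assumptions standing in for domain/synch.GenerateSyncID and COLUMN_LABELS. Only the use of SHA-256 and the Sync ID living in column K come from the plan. Rows are [][]string here for brevity, whereas the Sheets client works with [][]any.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// tx is a hypothetical, string-only view of a fetched bank transaction.
type tx struct {
	Date, Amount, Sender, VS, Message string
}

// syncID is a stand-in for GenerateSyncID. ASSUMPTION: field set and separator.
func syncID(t tx) string {
	joined := strings.Join([]string{t.Date, t.Amount, t.Sender, t.VS, t.Message}, "|")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:])
}

const syncCol = 10 // column K, zero-based

// newRows builds sheet rows for transactions whose sync ID is not yet present.
func newRows(existing [][]string, fetched []tx) [][]string {
	seen := make(map[string]bool)
	for _, row := range existing {
		if len(row) > syncCol && row[syncCol] != "" {
			seen[row[syncCol]] = true
		}
	}
	var out [][]string
	for _, t := range fetched {
		id := syncID(t)
		if seen[id] {
			continue // already on the sheet or repeated within this batch
		}
		seen[id] = true
		row := make([]string, syncCol+1) // manual/person/purpose/inferred slots stay empty
		row[0], row[1], row[2], row[3], row[4] = t.Date, t.Amount, t.Sender, t.VS, t.Message
		row[syncCol] = id
		out = append(out, row)
	}
	return out
}

func main() {
	a := tx{Date: "2026-05-01", Amount: "500", Sender: "Member A", VS: "1", Message: "fee"}
	b := tx{Date: "2026-05-02", Amount: "300", Sender: "Member B", VS: "2", Message: "fee"}
	sheet := newRows(nil, []tx{a})           // first sync appends one row
	added := newRows(sheet, []tx{a, b})      // second sync sees a again
	fmt.Println(len(sheet), len(added), added[0][0])
}
```

Running the same fetch twice should add nothing the second time, which is the property the Verification section checks with fuj sync.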

PR 3 — infer (M4.8)

  1. internal/services/banksync/infer.go:
    • InferPayments(ctx, cfg, sheets *sheets.Client, attendanceLoader, juniorLoader, opts InferOpts) (updated int, err error).
    • Reads payments sheet A1:Z with case-insensitive header lookup.
    • Required columns: Person, Purpose, Inferred Amount. Optional input: Date, Amount, Sender, Message, VS, manual fix.
    • Skip rule (matches scripts/infer_payments.py:127): non-empty manual fix OR Person OR Purpose → leave row alone.
    • Member list = union of LoadAdults + LoadJuniors deduped via domain/matching.CanonicalKey (already exists from M2).
    • For each empty row: build tx dict, call domain/matching.InferTransactionDetails, prefix [?] if confidence == "review", emit a ValueRange update with R1C1 range R{i}C{personCol+1}:R{i}C{amountCol+1}.
    • Single sheets.BatchUpdateValues call for all updates.
  2. Wire fuj infer subcommand: flags --dry-run (prints planned updates, no API write).
  3. Tests (default tag): banksync/infer_test.go — fixture rows, verify skip rule, verify [?] prefix on review matches, verify batchUpdate payload shape, verify --dry-run is no-op.
  4. CHANGELOG entry; tick M4.8 → milestone gate.
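The header lookup, skip rule, review prefix, and R1C1 range from step 1 can be sketched together. The function names, the sample column order, the whitespace trimming in the skip rule, and the choice to put the [?] prefix on the person value are assumptions made for illustration; the range format and the skip condition itself follow the plan.

```go
package main

import (
	"fmt"
	"strings"
)

// headerIndex maps lower-cased header names to zero-based column indexes,
// which gives the case-insensitive lookup.
func headerIndex(header []string) map[string]int {
	idx := make(map[string]int, len(header))
	for i, name := range header {
		idx[strings.ToLower(strings.TrimSpace(name))] = i
	}
	return idx
}

// shouldSkip reports whether a row was already handled by a human or a prior run.
// ASSUMPTION: whitespace-only cells count as empty.
func shouldSkip(manualFix, person, purpose string) bool {
	return strings.TrimSpace(manualFix) != "" ||
		strings.TrimSpace(person) != "" ||
		strings.TrimSpace(purpose) != ""
}

// markReview adds the [?] prefix when the match needs a human look.
func markReview(person, confidence string) string {
	if confidence == "review" {
		return "[?] " + person
	}
	return person
}

// updateRange builds the R1C1 range for one row; sheetRow is 1-based,
// personCol and amountCol are zero-based column indexes.
func updateRange(sheetRow, personCol, amountCol int) string {
	return fmt.Sprintf("R%dC%d:R%dC%d", sheetRow, personCol+1, sheetRow, amountCol+1)
}

func main() {
	// Hypothetical column order; the real sheet may differ.
	header := []string{"Date", "Amount", "manual fix", "Person", "Purpose", "Inferred Amount"}
	idx := headerIndex(header)
	rows := [][]string{
		{"2026-05-01", "500", "", "Member A", "2026-04", "500"}, // already filled
		{"2026-05-02", "300", "", "", "", ""},                   // needs inference
	}
	for i, row := range rows {
		if shouldSkip(row[idx["manual fix"]], row[idx["person"]], row[idx["purpose"]]) {
			continue
		}
		sheetRow := i + 2 // data starts on sheet row 2, below the header
		fmt.Println(updateRange(sheetRow, idx["person"], idx["inferred amount"]),
			markReview("Member B", "review"))
	}
}
```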

Critical files

To modify:

To create:

  • go/internal/io/{sheets,drive,attendance,fio,cache}/{client,fake,*_test}.go
  • go/internal/io/{sheets,attendance,fio}/testdata/*
  • go/internal/services/membership/sources.go (+ sources_test.go)
  • go/internal/services/banksync/{sync,infer}.go (+ tests)

Reused existing helpers

  • domain/fees.CalculateFee / CalculateJuniorFee — fee math (M2.3, M2.4).
  • domain/matching.{BuildNameVariants,MatchMembers,InferTransactionDetails,FormatDate,CanonicalKey}: match logic (M2.7-M2.9).
  • domain/synch.GenerateSyncID — dedup hash (M2.6).
  • domain/reconcile.{Member,Transaction,Exception,ExceptionKey} — domain types.
  • domain/czech.{Normalize,ParseMonthReferences} — used inside the attendance/exceptions parsers.
  • domain/money.ParseCZK — for parsing transparent-scrape amounts.

Verification

End-to-end checks once all three PRs land:

  1. make go-build && make go-lint && make go-test — clean.
  2. make go-parity — M3 fixtures still pass (no domain regressions).
  3. ./bin/fuj fees — prints adult fee report matching Python make fees (visual diff acceptable for now; byte-equality enforced in M5).
  4. ./bin/fuj reconcile — prints balance report comparable to scripts/match_payments.py print_balance_report.
  5. ./bin/fuj sync --days 7 — appends new Fio rows to the payments sheet (run with a real but recent date window; verify by counting added rows and confirming no duplicates on a second run).
  6. ./bin/fuj infer --dry-run — prints planned Person/Purpose/Inferred Amount updates without modifying the sheet. Then ./bin/fuj infer applies them; second run is a no-op (skip rule).
  7. Cache check: delete tmp/*_cache.json, run fuj fees, verify file appears with modifiedTime matching Drive. Re-run within 5 min; verify no Drive call (debug log).
  8. Cross-process cache safety: while make web-py is running, run fuj reconcile; verify Python's cache file isn't corrupted and Go reads the same data.

Gate (per tracker):

go test -tags=integration ./internal/io/... round-trips against test sheet; default-tag tests run on fakes.

Per the user's scope decision, the integration-test gate is downgraded to "default-tag tests on fakes" only. Live verification is deferred to manual smoke during M7's parallel-run watch period. The progress tracker's M4 gate line will be amended in PR 1.