feat(go): IO layer behind interfaces (M4)

- io/attendance: CSV-over-public-URL client + Fake for adult/junior tabs
- io/drive: Drive v3 modifiedTime client + Fake
- io/sheets: Sheets v4 client (GetValues/AppendValues/BatchUpdateValues/
  WriteHeader/SortByDateColumn) + Fake with call-capture
- io/cache: Drive-modifiedTime-gated FileCache; two TTL knobs; atomic
  writes; generic Get[T]; Python-compatible JSON format; Flush()
- io/fio: Client interface backed by Fio REST API (apiClient) and HTML
  scraper (transparentClient); Fake; testdata fixtures
- membership/sources: NewSources wires attendance CSV + Sheets + cache
  into LoadAdults/LoadJuniors/LoadTransactions/LoadExceptions; Czech
  month parsing + merged-month maps
- banksync: SyncToSheets (SHA-256 dedup, optional sort) and
  InferPayments ([?] review prefix, dry-run) — tested with fakes
- cmd/fuj: sync and infer subcommands wired; fees and reconcile use
  real NewSources; go.mod gains google.golang.org/api + x/net
- gofumpt extra-rules applied across all packages; lint clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-07 01:05:59 +02:00
parent 7afd12d9a5
commit 6465e2a221
45 changed files with 3292 additions and 46 deletions


@@ -2,9 +2,9 @@
Companion to [2026-05-03-2349-go-backend-rewrite.md](2026-05-03-2349-go-backend-rewrite.md).
**Current milestone:** M3: Fixture capture + characterization framework
**Current milestone:** M4: IO layer behind interfaces
**Started:** 2026-05-04
**Last updated:** 2026-05-06
**Last updated:** 2026-05-07
## How to use
@@ -80,16 +80,16 @@ Goal: deterministic, PII-free fixture corpus that drives parity tests. Runs in p
Goal: every external IO (Sheets, Drive, Fio, file cache) accessed through a narrow Go interface with both a real and a fake implementation.
- [ ] **M4.1** Design IO interfaces (`SheetsClient`, `DriveClient`, `FioClient`, `FileCache`) + in-memory fakes seeded from M3 fixtures
- [ ] **M4.2** `internal/io/sheets` — Google client (read + append + batchUpdate); integration test against a separate test sheet (NOT prod)
- [ ] **M4.3** `internal/io/drive` — Drive `modifiedTime` client + integration test
- [ ] **M4.4** `internal/io/fio` — API JSON impl (token-based); parses by hardcoded `column0..column22` indices matching [fio_utils.py](scripts/fio_utils.py)
- [ ] **M4.5** `internal/io/fio` — transparent-page HTML scraper using `golang.org/x/net/html` token visitor; targets the **second** `<table class="table">`
- [ ] **M4.6** `internal/io/cache` — FileCache with `modifiedTime` gating + two TTL knobs + atomic writes (`os.Rename`)
- [ ] **M4.7** `services/banksync.SyncToSheets` + `fuj sync` subcommand
- [ ] **M4.8** `services/banksync.InferPayments` + `fuj infer [--dry-run]` subcommand
- [x] **M4.1** Design IO interfaces (`SheetsClient`, `DriveClient`, `FioClient`, `FileCache`) + in-memory fakes seeded from M3 fixtures
- [x] **M4.2** `internal/io/sheets` — Google client (read + append + batchUpdate); fake with call-capture
- [x] **M4.3** `internal/io/drive` — Drive `modifiedTime` client + fake
- [x] **M4.4** `internal/io/fio` — API JSON impl (token-based); parses by hardcoded `column0..column22` indices matching [fio_utils.py](scripts/fio_utils.py)
- [x] **M4.5** `internal/io/fio` — transparent-page HTML scraper using `golang.org/x/net/html` token visitor; targets the **second** `<table class="table">`
- [x] **M4.6** `internal/io/cache` — FileCache with `modifiedTime` gating + two TTL knobs + atomic writes (`os.Rename`)
- [x] **M4.7** `services/banksync.SyncToSheets` + `fuj sync` subcommand
- [x] **M4.8** `services/banksync.InferPayments` + `fuj infer [--dry-run]` subcommand; `NewSources` wires all IO into fees+reconcile
**Gate:** `go test -tags=integration ./internal/io/...` round-trips against test sheet; default-tag tests run on fakes.
**Gate:** ✅ Fakes-only unit tests; `make go-test` + `make go-lint` both green. Live smoke test deferred to first real sync run.
---
@@ -155,4 +155,5 @@ Goal: Go is the one true backend.
(Add entries as you go. Format: `YYYY-MM-DD — short note`.)
- 2026-05-04 — Plan approved. Versioning policy: latest stable for Go and all libs at the time M1 starts. Frontends explicitly allowed to diverge between Python and Go; only the JSON API contract is parity-locked. No reverse proxy — both backends run on different ports via `make web-py` / `make web-go`.
- 2026-05-07 — M4 complete. Chose fakes-only unit tests (no live integration tests) and CSV-via-public-URL for attendance (no Sheets API auth required for read-only). golangci-lint gofumpt extra-rules differ slightly from standalone gofumpt; used `golangci-lint run --fix --enable-only gofumpt` to auto-resolve formatting.
- 2026-05-04 — M1 complete. Dockerfile base changed from `distroless/static:nonroot` → `alpine:3` for debuggability (can tighten later). CLI dispatcher uses stdlib `flag`; module path `fuj-management/go`. golangci-lint v1 embedded gofumpt merges all imports into one group (no stdlib/local split) — accepted as the project style.


@@ -0,0 +1,313 @@
# Plan: Go rewrite — M4 IO layer behind interfaces
Companion to [2026-05-03-2349-go-backend-rewrite.md](2026-05-03-2349-go-backend-rewrite.md)
and [2026-05-03-2349-go-backend-rewrite-progress.md](2026-05-03-2349-go-backend-rewrite-progress.md).
## Context
M1–M3 are merged: skeleton + tooling, every pure-domain function ported and
parity-tested against PII-scrubbed fixtures, and the `fuj fees` / `fuj
reconcile` subcommands wired but stubbed (`membership.NewStubSources()`
returns `ErrIOPending` for every loader). M4's job is to replace that stub
with real IO: read attendance CSVs, read the payments sheet + exceptions
tab, fetch Drive `modifiedTime` for cache gating, fetch Fio bank
transactions, and append/update rows on the payments sheet — all behind
narrow Go interfaces that have in-memory fakes for tests.
Once M4 lands, `fuj fees`, `fuj reconcile`, `fuj sync`, and `fuj infer` all
work end-to-end against the real Google Sheets and the real Fio account, and
M5 can start porting the JSON API on top of that IO.
User-confirmed scope choices for this milestone:
- **No live integration tests.** Fakes-only at unit level; live
verification deferred to manual smoke during M7.
- **Three PRs** (sheets/drive/cache → fio/sync → infer), one per major
area, each independently reviewable.
- **Attendance stays on CSV-via-public-URL** — matches Python, no extra
service-account grant needed.
## Approach
### Layering
```
internal/io/ ← raw, narrow clients (one per external system)
sheets/ ← typed wrapper around google.golang.org/api/sheets/v4
drive/ ← Drive v3, only ModifiedTime
attendance/ ← CSV-via-public-URL fetcher (no auth, no Sheets API)
fio/ ← FioClient interface + apiClient + transparentClient
cache/ ← FileCache: modifiedTime gate + two-TTL fallback + atomic write
internal/services/membership/ ← already exists; M4 adds adapters that satisfy
AttendanceLoader / TransactionLoader / ExceptionLoader
by composing io/sheets + io/drive + io/cache + io/attendance.
internal/services/banksync/ ← new: SyncToSheets (M4.7) + InferPayments (M4.8)
composing fio + sheets + attendance loaders.
```
The existing interfaces in [go/internal/services/membership/loader.go](../../go/internal/services/membership/loader.go)
(`AttendanceLoader`, `TransactionLoader`, `ExceptionLoader`, `Sources`) are
the seam — M4 adds a `NewSources(cfg config.Config) (Sources, error)`
constructor next to `NewStubSources()`, and `cmd/fuj/main.go` swaps the
stub for it.
### Auth — service-account only
Drop the OAuth+`token.pickle` path entirely (production already uses a
service account; the fallback only existed because the original Python
script ran from a developer laptop). Sheets and Drive both authenticate via
`option.WithCredentialsFile(cfg.CredentialsPath)` plus
`option.WithScopes(...)`. Single shared `*http.Client` per backend with a
10s timeout (matches `DRIVE_TIMEOUT`).
### Cache shape
Match Python's wire format so the `tmp/*_cache.json` directory is shared
safely while both backends run side-by-side:
```json
{ "modifiedTime": "<RFC3339>", "data": <list|object>, "cachedAt": "<RFC3339>" }
```
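A minimal sketch of how the envelope above could be decoded on the Go side. The type name `Envelope` and the function `DecodeEnvelope` are hypothetical, not names from the codebase; keeping `data` as `json.RawMessage` is one way to accept either a list or an object without committing to a shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope mirrors the on-disk cache format shown above. Data stays as raw
// JSON so one type can carry either a list or an object.
// (Hypothetical type name, for illustration only.)
type Envelope struct {
	ModifiedTime string          `json:"modifiedTime"`
	Data         json.RawMessage `json:"data"`
	CachedAt     string          `json:"cachedAt"`
}

// DecodeEnvelope parses a cache file body and rejects entries that carry no
// "data" field at all.
func DecodeEnvelope(raw []byte) (Envelope, error) {
	var e Envelope
	if err := json.Unmarshal(raw, &e); err != nil {
		return Envelope{}, fmt.Errorf("decode cache envelope: %w", err)
	}
	if len(e.Data) == 0 {
		return Envelope{}, fmt.Errorf("cache envelope has no data field")
	}
	return e, nil
}

func main() {
	raw := []byte(`{"modifiedTime":"2026-05-06T10:00:00Z","data":[["a","b"]],"cachedAt":"2026-05-06T10:01:00Z"}`)
	e, err := DecodeEnvelope(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.ModifiedTime, string(e.Data))
}
```

A caller would then unmarshal `e.Data` into whatever concrete type the cache key implies.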
Improvements over Python:
- Atomic write: marshal → `os.WriteFile(path+".tmp", ..., 0o600)` →
`os.Rename`. Python's plain truncate-write stays as-is until M8.
- The two TTLs (`CacheTTL` and `CacheAPICheckTTL`) live in `config.Config`
already; only the `CacheDir` field is new.
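The atomic-write step can be sketched as follows, assuming the temp file lives next to the target so the rename stays on one filesystem. `WriteAtomic` and `writeTwiceAndRead` are hypothetical names, and the `<key>_cache.json` file name is an assumption based on the `tmp/*_cache.json` pattern.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// WriteAtomic writes data to path+".tmp" and renames it over path. On POSIX
// filesystems a reader then sees either the old file or the new one in full.
func WriteAtomic(path string, data []byte) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return fmt.Errorf("write temp file: %w", err)
	}
	if err := os.Rename(tmp, path); err != nil {
		_ = os.Remove(tmp) // best-effort cleanup of the leftover temp file
		return fmt.Errorf("rename temp file: %w", err)
	}
	return nil
}

// writeTwiceAndRead is a demo helper: it overwrites one cache file twice in
// a scratch directory and returns the final contents.
func writeTwiceAndRead(first, second []byte) (string, error) {
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	// Assumed file name; only the "*_cache.json" suffix comes from the plan.
	path := filepath.Join(dir, "payments_transactions_cache.json")
	for _, body := range [][]byte{first, second} {
		if err := WriteAtomic(path, body); err != nil {
			return "", err
		}
	}
	if _, err := os.Stat(path + ".tmp"); err == nil {
		return "", fmt.Errorf("temp file left behind")
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := writeTwiceAndRead([]byte(`{"data":[1]}`), []byte(`{"data":[2]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

A fixed `.tmp` suffix keeps the sketch short; two concurrent writers of the same key would still contend for that temp path.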
The four cache keys mirror Python's `CACHE_SHEET_MAP`:
`attendance_regular`, `attendance_juniors`, `exceptions_dict`, and
`payments_transactions`. Each maps to either `AttendanceSheetID` or
`PaymentsSheetID`.
When Drive fails, fall back to a synthetic key
`fmt.Sprintf("ttl-5m-%d", time.Now().Unix()/300)` so cache still keys
deterministically per 5-min bucket (same as Python).
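To make the bucketing concrete, here is a small sketch of the fallback key. `FallbackKey` is a hypothetical wrapper around the `fmt.Sprintf` expression above; it takes the Unix timestamp as a parameter so the bucket boundaries are easy to check.

```go
package main

import (
	"fmt"
	"time"
)

// FallbackKey returns the synthetic cache key for the 5-minute bucket that
// contains unixSeconds. Integer division by 300 maps every second inside the
// same 5-minute window to the same key.
func FallbackKey(unixSeconds int64) string {
	return fmt.Sprintf("ttl-5m-%d", unixSeconds/300)
}

func main() {
	fmt.Println(FallbackKey(299)) // last second of bucket 0
	fmt.Println(FallbackKey(300)) // first second of bucket 1
	fmt.Println(FallbackKey(time.Now().Unix()))
}
```

Timestamps 0 through 299 share the key `ttl-5m-0`, and 300 starts `ttl-5m-1`, so a cached entry stays valid for at most five minutes while Drive is unreachable.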
### Fio: two impls behind one interface
```go
type Client interface {
FetchTransactions(ctx context.Context, from, to time.Time) ([]Transaction, error)
}
```
`apiClient` (when `cfg.FioAPIToken != ""`) hits
`https://fioapi.fio.cz/v1/rest/periods/{token}/{from}/{to}/transactions.json`,
unmarshals via a typed struct, and maps `column0..column22` to fields per
[scripts/fio_utils.py](../../scripts/fio_utils.py:90). Negative-amount rows
are dropped (matches Python).
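A rough sketch of the URL construction and the negative-amount filter. The `YYYY-MM-DD` date layout in the path is an assumption about the Fio API, and `Transaction` here is a hypothetical two-field subset of the real struct; `APIURL`, `DropNegative`, and `day` are illustrative names.

```go
package main

import (
	"fmt"
	"time"
)

// Transaction is a hypothetical two-field subset used only for this sketch.
type Transaction struct {
	Date   string
	Amount float64
}

// day builds a UTC date; it keeps the examples below short.
func day(y, m, d int) time.Time {
	return time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
}

// APIURL fills the periods endpoint template with the token and date window.
func APIURL(token string, from, to time.Time) string {
	const layout = "2006-01-02" // assumed date format for the path segments
	return fmt.Sprintf(
		"https://fioapi.fio.cz/v1/rest/periods/%s/%s/%s/transactions.json",
		token, from.Format(layout), to.Format(layout),
	)
}

// DropNegative keeps only rows whose amount is zero or positive.
func DropNegative(txs []Transaction) []Transaction {
	kept := make([]Transaction, 0, len(txs))
	for _, tx := range txs {
		if tx.Amount < 0 {
			continue
		}
		kept = append(kept, tx)
	}
	return kept
}

func main() {
	fmt.Println(APIURL("TOKEN", day(2026, 5, 1), day(2026, 5, 7)))
	txs := []Transaction{{"2026-05-01", 700}, {"2026-05-02", -120.5}}
	fmt.Println(len(DropNegative(txs)))
}
```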
`transparentClient` (fallback) GETs
`https://ib.fio.cz/ib/transparent?a={accountNum}&f={DD.MM.YYYY}&t={DD.MM.YYYY}`
and walks the response with `golang.org/x/net/html` token visitor, counting
`<table class="table">` tags and grabbing rows from the **second** one
(skipping `<thead>`). `bank_id`, `currency`, `user_id`, `sender_account`
are empty (matches Python — known limitation).
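Cells scraped from the transparent page arrive as Czech-formatted strings. Below is a sketch of the two helpers the PR 2 task list names for the scraper, `parseCzechAmount` and `parseCzechDate`, following the rules stated there (NBSP/space strip plus comma to dot; `DD.MM.YYYY` or `DD/MM/YYYY`). Anything beyond those rules, such as currency suffixes, is deliberately not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseCzechAmount strips non-breaking and regular spaces used as thousands
// separators, turns the decimal comma into a dot, and parses the result.
func parseCzechAmount(s string) (float64, error) {
	cleaned := strings.NewReplacer("\u00a0", "", " ", "", ",", ".").Replace(s)
	v, err := strconv.ParseFloat(cleaned, 64)
	if err != nil {
		return 0, fmt.Errorf("parse amount %q: %w", s, err)
	}
	return v, nil
}

// parseCzechDate accepts DD.MM.YYYY or DD/MM/YYYY.
func parseCzechDate(s string) (time.Time, error) {
	s = strings.TrimSpace(s)
	for _, layout := range []string{"02.01.2006", "02/01/2006"} {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognised date %q", s)
}

func main() {
	amount, err := parseCzechAmount("1\u00a0500,00")
	if err != nil {
		panic(err)
	}
	date, err := parseCzechDate("07.05.2026")
	if err != nil {
		panic(err)
	}
	fmt.Println(amount, date.Format("2006-01-02"))
}
```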
`accountNum` is derived from `cfg.BankAccount` by stripping the IBAN prefix
(`CZ85 2010 0000 0028 0035 9168` → `2800359168`); add a small helper in
`config` for this since both the API URL and the transparent URL need it.
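One possible shape for that helper, as a sketch. It assumes the standard 24-character Czech IBAN layout (country code, two check digits, four-digit bank code, six-digit prefix, ten-digit account number) and that leading zeros of the account number are trimmed; `AccountNum` is a hypothetical name and signature.

```go
package main

import (
	"fmt"
	"strings"
)

// AccountNum extracts the account number from a Czech IBAN.
// Assumed layout: "CZ" + 2 check digits + 4-digit bank code + 6-digit
// prefix + 10-digit account number, with optional spaces between groups.
func AccountNum(iban string) (string, error) {
	compact := strings.ReplaceAll(iban, " ", "")
	if len(compact) != 24 || !strings.HasPrefix(compact, "CZ") {
		return "", fmt.Errorf("not a Czech IBAN: %q", iban)
	}
	for _, r := range compact[2:] {
		if r < '0' || r > '9' {
			return "", fmt.Errorf("non-digit character in IBAN: %q", iban)
		}
	}
	// Trimming leading zeros is an assumption made to match the example.
	acct := strings.TrimLeft(compact[len(compact)-10:], "0")
	if acct == "" {
		return "", fmt.Errorf("empty account number in IBAN: %q", iban)
	}
	return acct, nil
}

func main() {
	acct, err := AccountNum("CZ85 2010 0000 0028 0035 9168")
	if err != nil {
		panic(err)
	}
	fmt.Println(acct)
}
```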
### Fakes
In-memory fakes live next to each real impl: `sheets/fake.go`,
`drive/fake.go`, `fio/fake.go`, `attendance/fake.go`,
`cache/fake.go` (a passthrough). All exported as `Fake` so tests do
`sheets.NewFake(rows)` and inject. The membership-adapter tests use these
fakes plus a couple of new raw-bytes fixtures under
`go/internal/io/<pkg>/testdata/`:
- `sheets/testdata/payments_minimal.json` — 2D-string array shaped like
`values.get` would return.
- `sheets/testdata/exceptions_minimal.json` — same, for the exceptions tab.
- `attendance/testdata/adults_minimal.csv` — small adult attendance CSV.
- `attendance/testdata/juniors_minimal.csv` — small junior CSV.
- `fio/testdata/api_response.json` — captured Fio API JSON shape.
- `fio/testdata/transparent.html` — captured transparent-page HTML.
Existing M3 domain fixtures under `go/tests/fixtures/` stay where they are
and continue to drive parity tests; they aren't reused for IO-layer tests
because they're at the wrong layer (post-parse domain types).
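As an illustration of the fake pattern, here is a stripped-down sheets fake that records calls. The `ValuesReader` interface, the `"spreadsheetID!range"` map key, the map-based `NewFake` argument, and the `CountRows` consumer are all assumptions made for this sketch; only the `GetValues` signature and the seedable `Values` map come from the plan.

```go
package main

import (
	"context"
	"fmt"
)

// ValuesReader is a hypothetical narrow interface a consumer might depend on.
type ValuesReader interface {
	GetValues(ctx context.Context, spreadsheetID, a1Range string) ([][]any, error)
}

// Fake is an in-memory stand-in that records every call it receives.
type Fake struct {
	Values map[string][][]any // keyed by "spreadsheetID!range" (assumed format)
	Calls  []string
}

// NewFake returns a fake seeded with the given values.
func NewFake(values map[string][][]any) *Fake {
	return &Fake{Values: values}
}

// GetValues returns the seeded rows and captures the call for assertions.
func (f *Fake) GetValues(_ context.Context, spreadsheetID, a1Range string) ([][]any, error) {
	f.Calls = append(f.Calls, "GetValues "+spreadsheetID+" "+a1Range)
	rows, ok := f.Values[spreadsheetID+"!"+a1Range]
	if !ok {
		return nil, fmt.Errorf("fake sheets: no values seeded for %s!%s", spreadsheetID, a1Range)
	}
	return rows, nil
}

// CountRows is example consumer code that only knows the interface.
func CountRows(r ValuesReader, spreadsheetID, a1Range string) (int, error) {
	rows, err := r.GetValues(context.Background(), spreadsheetID, a1Range)
	if err != nil {
		return 0, err
	}
	return len(rows), nil
}

func main() {
	fake := NewFake(map[string][][]any{
		"payments!A1:K": {{"Date", "Amount"}, {"2026-05-01", "700"}},
	})
	n, err := CountRows(fake, "payments", "A1:K")
	fmt.Println(n, err, fake.Calls)
}
```

Because the consumer depends on the interface rather than the concrete client, a test can assert both on the returned data and on `fake.Calls`.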
## Tasks (mapped to tracker)
Same 8 sub-milestones as the tracker, grouped into 3 PRs.
### PR 1 — sheets / drive / cache + membership wiring (M4.1, M4.2, M4.3, M4.6)
1. **Add deps** in [go/go.mod](../../go/go.mod):
`google.golang.org/api/{sheets/v4,drive/v3,option}`,
`golang.org/x/oauth2/google` (transitively pulled), `golang.org/x/net/html`.
2. **`internal/io/sheets/`**:
- `client.go`: `Client` struct holding `*sheets.Service`; methods
`GetValues(ctx, spreadsheetID, a1Range string) ([][]any, error)`,
`AppendValues(ctx, spreadsheetID, a1Range string, rows [][]any) error`,
`BatchUpdateValues(ctx, spreadsheetID, updates []ValueRange) error`,
`SortByColumn(ctx, spreadsheetID, sheetGID int64, columnIndex int) error`.
- `fake.go` — exported `Fake` with seedable `Values map[string][][]any`.
3. **`internal/io/drive/`**:
- `client.go`: `Client.ModifiedTime(ctx, fileID string) (string, error)`
using `drive.New(...).Files.Get(fileID).Fields("modifiedTime").SupportsAllDrives(true)`.
- `fake.go` with seedable `Times map[string]string`.
4. **`internal/io/attendance/`** (new — public-URL CSV):
- `client.go`: `Client.FetchAdults(ctx) ([][]string, error)` and
`FetchJuniors(ctx) ([][]string, error)` using `http.Get` on
`https://docs.google.com/spreadsheets/d/{ID}/export?format=csv&gid={GID}`,
decoded via `encoding/csv`.
- Add `AttendanceAdultSheetGID = "0"` constant in `internal/config`.
5. **`internal/io/cache/`**:
- `filecache.go`: `FileCache` with `Get(ctx, key string, fetch func(ctx) (any, error)) (any, error)`
wired through `Drive.ModifiedTime` and the two TTL knobs. Atomic write
via tmp-file + rename.
- Cache key → sheet ID map mirrors Python's `CACHE_SHEET_MAP`.
6. **`internal/services/membership/sources.go`** (new file in existing
package):
- `realSources struct { sheets *sheets.Client; drive *drive.Client; attendance *attendance.Client; cache *cache.FileCache }`.
- Constructor `NewSources(ctx, cfg) (Sources, error)` builds all clients.
- `LoadAdults` reads cached attendance CSV, runs through
`domain/fees.CalculateFee` + merged-month logic (port of
[scripts/attendance.py](../../scripts/attendance.py:170)
`get_members_with_fees`), returns `[]reconcile.Member`.
- `LoadTransactions` reads payments sheet rows via cache, parses to
`[]reconcile.Transaction` (port of
[match_payments.py:208](../../scripts/match_payments.py:208)
`fetch_sheet_data`).
- `LoadExceptions` reads `'exceptions'!A2:D` via cache, builds
`map[ExceptionKey]Exception` (port of `match_payments.py:266`).
7. **Add `LoadJuniors`** to the `AttendanceLoader` interface (Python infer
pulls both adult + junior member lists; needed for M4.8).
8. **Wire into [cmd/fuj/main.go](../../go/cmd/fuj/main.go)**: replace
`membership.NewStubSources()` in `feesCmd` and `reconcileCmd` with
`membership.NewSources(ctx, cfg)`.
9. **Tests** (default tag, no live IO):
- `sheets/client_test.go`, `drive/client_test.go`,
`cache/filecache_test.go` — exercise fakes + parsing logic with
testdata fixtures.
- `membership/sources_test.go` — adapter tests with sheets/drive/cache
fakes verify CSV→Member, rows→Transaction, exceptions tab → map.
10. **Config additions**: `CacheDir` (default `tmp` relative to `$PWD`,
overridable via `CACHE_DIR` env), `DriveTimeout` (default 10s).
11. **Manual verification**: `make go-build && go run ./cmd/fuj fees` and
`... reconcile` print real reports against the live sheet (with valid
`.secret/...credentials.json`).
12. CHANGELOG entry; tick M4.1, M4.2, M4.3, M4.6 in the progress tracker.
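The public-URL CSV fetch from step 4 could look roughly like this. `ExportURL`, `FetchCSV`, and `fetchFromTestServer` are hypothetical names; the sketch uses `http.NewRequestWithContext` instead of plain `http.Get` so the context is honoured, and the demo helper talks to a local `httptest` server rather than Google.

```go
package main

import (
	"context"
	"encoding/csv"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ExportURL fills the CSV export template for one sheet tab.
func ExportURL(sheetID, gid string) string {
	return fmt.Sprintf("https://docs.google.com/spreadsheets/d/%s/export?format=csv&gid=%s", sheetID, gid)
}

// FetchCSV downloads a CSV document and decodes it into rows of strings.
func FetchCSV(ctx context.Context, client *http.Client, url string) ([][]string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetch csv: unexpected status %s", resp.Status)
	}
	r := csv.NewReader(resp.Body)
	r.FieldsPerRecord = -1 // tolerate rows of differing length
	return r.ReadAll()
}

// fetchFromTestServer is a demo helper that serves body from a local test
// server and fetches it back through FetchCSV.
func fetchFromTestServer(body string) ([][]string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, body)
	}))
	defer srv.Close()
	return FetchCSV(context.Background(), srv.Client(), srv.URL)
}

func main() {
	fmt.Println(ExportURL("SHEET_ID", "0"))
	rows, err := fetchFromTestServer("name,2026-01\nAlice,2\n")
	fmt.Println(rows, err)
}
```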
### PR 2 — fio + bank sync (M4.4, M4.5, M4.7)
1. **`internal/io/fio/`**:
- `client.go`: `Client` interface, `Transaction` struct.
- `api.go`: `apiClient` impl + URL builder + JSON struct definitions
for `accountStatement.transactionList.transaction[].column{N}.value`.
- `transparent.go`: `transparentClient` impl using
`golang.org/x/net/html` token visitor; helper functions
`parseCzechAmount` (NBSP/space strip + comma→dot) and
`parseCzechDate` (DD.MM.YYYY / DD/MM/YYYY).
- `fake.go`.
- `New(cfg) Client` chooses impl based on `cfg.FioAPIToken`.
- `accountNum(iban)` helper in `internal/config` strips IBAN prefix.
2. **`internal/services/banksync/sync.go`** (new package):
- `SyncToSheets(ctx, cfg, fio Client, sheets *sheets.Client, opts SyncOpts) (added int, err error)`.
- Reads existing rows via `sheets.GetValues(... "A1:K")`, validates
header against `COLUMN_LABELS`, writes header if missing, builds
`existingIDs` from column K (`Sync ID`).
- Computes date window: explicit `from`/`to` or `now - days*24h` (default 30d).
- For each fetched tx, computes `domain/synch.GenerateSyncID`, skips if
present, otherwise builds row in COLUMN_LABELS order with empty
manual/person/purpose/inferred slots.
- `sheets.AppendValues(... "A2", rows)`.
- Optional sort: `sheets.SortByColumn(... gid, 0)` — sheet GID resolved
once via `spreadsheets.Get`.
3. **Wire `fuj sync` subcommand** in `cmd/fuj/main.go`:
- Flags: `--days N` (default 30), `--from YYYY-MM-DD`, `--to YYYY-MM-DD`,
`--sort` (default true matching `make sync-2026`).
- Replace the M4-stub error path.
4. **Tests** (default tag): `banksync/sync_test.go` with fakes — verify
header insertion, dedup against existing sync IDs, multi-row append,
sort call.
5. **Manual verification**: dry-run sync against the real Fio account in a
throwaway test sheet; or visually verify `--from --to` window in stdout
with a no-write flag (only if cheap to add — otherwise skip per the
"no live integration tests" decision).
6. CHANGELOG entry; tick M4.4, M4.5, M4.7.
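The dedup step of `SyncToSheets` can be sketched without any Sheets dependency. The sketch assumes candidate rows already carry their sync ID in column K and that the first existing row is the header; `ExistingIDs`, `FilterNew`, and `row` are hypothetical helpers, and the real ID comes from `domain/synch.GenerateSyncID`, which is not reproduced here.

```go
package main

import "fmt"

const syncIDCol = 10 // column K, zero-based, in an A:K row

// ExistingIDs collects the non-empty sync IDs already on the sheet. The
// first row is treated as the header; rows shorter than column K are skipped.
func ExistingIDs(rows [][]any) map[string]bool {
	ids := make(map[string]bool)
	for i, r := range rows {
		if i == 0 || len(r) <= syncIDCol {
			continue
		}
		if id, ok := r[syncIDCol].(string); ok && id != "" {
			ids[id] = true
		}
	}
	return ids
}

// FilterNew returns the candidates whose sync ID is neither on the sheet
// nor repeated earlier in the same batch.
func FilterNew(existing map[string]bool, candidates [][]any) [][]any {
	seen := make(map[string]bool, len(existing))
	for id := range existing {
		seen[id] = true
	}
	var fresh [][]any
	for _, r := range candidates {
		if len(r) <= syncIDCol {
			continue
		}
		id, ok := r[syncIDCol].(string)
		if !ok || id == "" || seen[id] {
			continue
		}
		seen[id] = true
		fresh = append(fresh, r)
	}
	return fresh
}

// row builds an 11-column row with only the date and sync ID filled in.
func row(date, syncID string) []any {
	r := make([]any, syncIDCol+1)
	for i := range r {
		r[i] = ""
	}
	r[0], r[syncIDCol] = date, syncID
	return r
}

func main() {
	sheet := [][]any{row("Date", "Sync ID"), row("2026-05-01", "a")}
	batch := [][]any{row("2026-05-01", "a"), row("2026-05-02", "b")}
	fmt.Println(len(FilterNew(ExistingIDs(sheet), batch)))
}
```

Running the same batch a second time against the updated sheet yields no new rows, which is the property the dedup test in step 4 checks.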
### PR 3 — infer (M4.8)
1. **`internal/services/banksync/infer.go`**:
- `InferPayments(ctx, cfg, sheets *sheets.Client, attendanceLoader, juniorLoader, opts InferOpts) (updated int, err error)`.
- Reads payments sheet `A1:Z` with case-insensitive header lookup.
- Required columns: `Person, Purpose, Inferred Amount`. Optional input:
`Date, Amount, Sender, Message, VS, manual fix`.
- Skip rule (matches [scripts/infer_payments.py:127](../../scripts/infer_payments.py:127)):
non-empty `manual fix` OR `Person` OR `Purpose` → leave row alone.
- Member list = union of `LoadAdults` + `LoadJuniors` deduped via
`domain/matching.CanonicalKey` (already exists from M2).
- For each empty row: build tx dict, call
`domain/matching.InferTransactionDetails`, prefix `[?] ` if
confidence == "review", emit a `ValueRange` update with R1C1 range
`R{i}C{personCol+1}:R{i}C{amountCol+1}`.
- Single `sheets.BatchUpdateValues` call for all updates.
2. **Wire `fuj infer` subcommand**: flags `--dry-run` (prints planned
updates, no API write).
3. **Tests** (default tag): `banksync/infer_test.go` — fixture rows,
verify skip rule, verify `[?]` prefix on review matches, verify
batchUpdate payload shape, verify `--dry-run` is no-op.
4. CHANGELOG entry; tick M4.8 → milestone gate ✅.
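The per-row decisions in `InferPayments` reduce to three small pure functions, sketched here under stated assumptions: `ShouldSkip`, `ReviewPrefix`, and `R1C1Range` are hypothetical names, cells are compared as plain strings without trimming, and `row` is the 1-based sheet row.

```go
package main

import "fmt"

// ShouldSkip applies the skip rule: a row is left alone when any of the
// manual fix, Person, or Purpose cells is already filled in.
func ShouldSkip(manualFix, person, purpose string) bool {
	return manualFix != "" || person != "" || purpose != ""
}

// ReviewPrefix marks low-confidence values with "[?] " so a human can find
// them on the sheet.
func ReviewPrefix(value, confidence string) string {
	if confidence == "review" {
		return "[?] " + value
	}
	return value
}

// R1C1Range builds the update range for one row, spanning the Person column
// through the Inferred Amount column. Column indices are zero-based, so each
// is shifted by one for R1C1 notation.
func R1C1Range(row, personCol, amountCol int) string {
	return fmt.Sprintf("R%dC%d:R%dC%d", row, personCol+1, row, amountCol+1)
}

func main() {
	fmt.Println(ShouldSkip("", "", ""))             // empty row: eligible
	fmt.Println(ReviewPrefix("Jan Novak", "review")) // flagged for review
	fmt.Println(R1C1Range(5, 7, 9))
}
```

Since a filled Person or Purpose cell triggers the skip rule, rows written by one run are skipped by the next, which is why a second `fuj infer` is a no-op.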
## Critical files
To modify:
- [go/internal/services/membership/loader.go](../../go/internal/services/membership/loader.go) — add `LoadJuniors` to `AttendanceLoader`, add `NewSources`.
- [go/cmd/fuj/main.go](../../go/cmd/fuj/main.go) — swap stub for real sources, add `sync`/`infer` subcommands.
- [go/internal/config/config.go](../../go/internal/config/config.go) — add `CacheDir`, `DriveTimeout`, `AttendanceAdultSheetGID` constant, IBAN→account-num helper.
- [go/go.mod](../../go/go.mod) / `go.sum` — google APIs + `x/net/html`.
- [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](2026-05-03-2349-go-backend-rewrite-progress.md) — tick M4.x boxes after each PR.
- [CHANGELOG.md](../../CHANGELOG.md) — entry per PR.
To create:
- `go/internal/io/{sheets,drive,attendance,fio,cache}/{client,fake,*_test}.go`
- `go/internal/io/{sheets,attendance,fio}/testdata/*`
- `go/internal/services/membership/sources.go` (+ `sources_test.go`)
- `go/internal/services/banksync/{sync,infer}.go` (+ tests)
## Reused existing helpers
- `domain/fees.CalculateFee` / `CalculateJuniorFee` — fee math (M2.3, M2.4).
- `domain/matching.{BuildNameVariants,MatchMembers,InferTransactionDetails,FormatDate,CanonicalKey}` — match logic (M2.7–M2.9).
- `domain/synch.GenerateSyncID` — dedup hash (M2.6).
- `domain/reconcile.{Member,Transaction,Exception,ExceptionKey}` — domain types.
- `domain/czech.{Normalize,ParseMonthReferences}` — used inside the
attendance/exceptions parsers.
- `domain/money.ParseCZK` — for parsing transparent-scrape amounts.
## Verification
End-to-end checks once all three PRs land:
1. `make go-build && make go-lint && make go-test` — clean.
2. `make go-parity` — M3 fixtures still pass (no domain regressions).
3. `./bin/fuj fees` — prints adult fee report matching Python `make fees`
(visual diff acceptable for now; byte-equality enforced in M5).
4. `./bin/fuj reconcile` — prints balance report comparable to
[scripts/match_payments.py](../../scripts/match_payments.py) `print_balance_report`.
5. `./bin/fuj sync --days 7` — appends new Fio rows to the payments sheet
(run with a real but recent date window; verify by counting added rows
and confirming no duplicates on a second run).
6. `./bin/fuj infer --dry-run` — prints planned Person/Purpose/Inferred
Amount updates without modifying the sheet. Then `./bin/fuj infer`
applies them; second run is a no-op (skip rule).
7. **Cache check**: delete `tmp/*_cache.json`, run `fuj fees`, verify file
appears with `modifiedTime` matching Drive. Re-run within 5 min;
verify no Drive call (debug log).
8. **Cross-process cache safety**: while `make web-py` is running, run
`fuj reconcile`; verify Python's cache file isn't corrupted and Go
reads the same data.
Gate (per tracker):
> `go test -tags=integration ./internal/io/...` round-trips against test sheet; default-tag tests run on fakes.
Per the user's scope decision, **the integration-test gate is downgraded
to "default-tag tests on fakes" only**. Live verification is deferred to
manual smoke during M7's parallel-run watch period. The progress tracker's
M4 gate line will be amended in PR 1.