feat: Go rewrite M1 — skeleton, tooling, and hello server

Stand up the Go project alongside the Python backend so both run
independently during migration. `make web-go` builds and serves on :8080;
`make web-py` (alias: `make web`) keeps the Python side on :5001.

- go/: new module `fuj-management/go` (Go 1.26)
  - cmd/fuj: stdlib-flag dispatcher; `server` + `version` work,
    fees/reconcile/sync/infer stubbed for M2/M4
  - internal/config: env loader mirroring scripts/config.py
  - internal/logging: slog setup, level taken from config
  - internal/web: net/http ServeMux + request-timer middleware
  - build/Dockerfile: golang:1.26 → alpine:3 multi-stage image
  - .golangci.yml: govet, staticcheck, errcheck, gofumpt, unused
- Makefile: web→web-py alias; go-build/go-test/go-run/go-lint/web-go
- CI: parallel build-go job in .gitea/workflows/build.yaml (<tag>-go image)
- docs/plans/: M1 kickoff plan + progress tracker (M1 complete)
- .claude/settings.json: gofumpt + golangci-lint permissions

Gate: make go-build ✓  make go-lint ✓  make go-test ✓  curl :8080 ✓

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
2026-05-04 12:05:46 +02:00
parent 5a41cdae83
commit cf0f176d3f
18 changed files with 1247 additions and 10 deletions

View File

@@ -0,0 +1,424 @@
# Plan: Full Go rewrite of the Python/Flask backend
## Context
The current Flask app ([app.py](app.py) + [scripts/](scripts/), ~2400 LOC of
Python) handles attendance-based fee calculation, Fio bank sync, payment
reconciliation, and a server-rendered dashboard. The user wants a full
rewrite in Go with two goals:
1. **Quality Go code** as the primary outcome — idiomatic stdlib-first
design, strong typing, proper layering. The Python codebase grew
organically and mixes domain logic, IO, and HTTP concerns.
2. **Feature-parity certainty** — no behavioural drift between the Python
and Go versions on anything that touches money. Reconciliation is real
money; silent divergence is unacceptable.
**Switchable runtime**: both backends run on different TCP ports, started
independently via Makefile targets (`make web-py` on :5001, `make web-go` on
:8080). The user opens whichever they want in a browser. No reverse proxy,
no traffic-splitting, no shared frontend constraint — just two services
that read the same Google Sheets and the same `tmp/` cache.
**Frontends are allowed to diverge.** The Go web layer is designed cleanly
in its own right rather than as a byte-compatible Jinja port. Both backends
expose a JSON API (`/api/...`) with an identical contract — that's what
parity testing locks down. Rendered HTML and inline JS can be different.
## Versioning policy
- **Go**: latest stable release at project start. Pin in `go.mod` via the
`go` directive (e.g. `go 1.X`) and use the matching `golang:1.X` builder
image. Bump on each new minor as it lands stable.
- **Go libraries**: latest stable for every dependency in `go.mod`; run
`go get -u ./... && go mod tidy` at the start and quarterly thereafter.
- **Python deps** (during the parallel-run period): keep
[pyproject.toml](pyproject.toml) on its current versions to avoid
destabilizing the parity baseline; bump only after Python retires.
- **Base images**: `golang:latest-stable` builder → `gcr.io/distroless/static:latest`
runtime, both pinned by digest in CI for reproducibility.
- **CI runners**: latest stable Linux image on Gitea Actions.
The plan does not hardcode specific version numbers below — implementation
picks current-stable at the time M1 starts.
## Approach summary
- **Three-layer Go architecture**: pure domain (no IO) → IO clients (behind
interfaces, easily faked) → HTTP/services (composition).
- **Capture-then-port**: dump current Python outputs as JSON fixtures, port
Go function-by-function, assert byte-equality with `cmp.Diff`.
- **JSON contract is the spec, not the templates.** Each Python route gets
an `/api/X` shadow that returns the dict already passed to the template.
Go defines typed structs matching that shape; both sides validate against
generated JSON Schema.
- **Money is integer CZK**: existing fees are integer CZK (750/200/500);
keep it that way to avoid float drift in reconcile allocation. Where
Sheets returns floats, parse and round at the boundary.
- **Frontend rewrite, not port**: Go uses `html/template` with cleanly
organized templates and JS extracted into static files served via
`embed.FS`. Same UX (filterable table, member-detail modal, QR launcher)
but designed natively, no Jinja-port baggage.
## Go project layout
`go/` lives at the repo root alongside `scripts/` and `templates/` so both
backends share the same git history during migration.
```
go/
cmd/
fuj/main.go # single binary, subcommands: server | fees | sync | infer | reconcile
parity/main.go # diff tool: hits both backends' /api/X, prints JSON diff
internal/
domain/ # pure, no IO, no net/*
czech/ # normalize, parse_month_references
fees/ # calculate_fee, calculate_junior_fee, "?" sentinel type
money/ # parse_czk_amount, format helpers
reconcile/ # reconcile() + Ledger, MemberResult types
matching/ # _build_name_variants, match_members, infer_transaction_details
synch/ # generate_sync_id (pure hash)
io/ # IO behind interfaces, all impls have an in-memory fake
sheets/ # SheetsClient + Google impl + fake
drive/ # DriveClient for modifiedTime
fio/ # FioClient: API JSON impl + transparent-page HTML scraper
cache/ # FileCache with modifiedTime gating + two TTL knobs
services/ # composition layer; pure + IO, no HTTP
attendance/ # GetMembersWithFees, GetJuniorMembersWithFees
payments/ # FetchTransactions, FetchExceptions, BuildView
banksync/ # SyncToSheets, InferPayments (write ops)
web/
handlers/ # one file per route family
view/ # HTML view-model structs (per route)
api/ # JSON view-model structs (the parity-locked contract)
templates/ # *.tmpl, embed.FS — designed natively, not a Jinja port
static/ # js/*.js, css/*.css served via embed.FS
middleware/ # request timer, recovery, slog
config/ # mirrors scripts/config.py (env loading)
qr/ # SPD string builder + PNG via go-qrcode
tests/
fixtures/ # JSON fixtures captured from Python (PII-scrubbed)
parity/ # Go-side characterization tests (replay fixtures)
build/Dockerfile # multi-stage: latest-stable golang builder → distroless static
go.mod
```
## Library choices
All on latest stable as per the versioning policy above.
| Concern | Pick | Rationale |
|---|---|---|
| HTTP routing | `net/http` ServeMux | 8 static routes; no need for chi/gin given modern stdlib pattern matching |
| Templates | `html/template` | Auto-escaping; native Go feel |
| Static assets | `embed.FS` | Single binary, no loose files |
| Sheets/Drive | `google.golang.org/api/{sheets/v4,drive/v3}` + `option` | Official client; service-account auth via `option.WithCredentialsFile` |
| OAuth | `golang.org/x/oauth2/google` (token only; drop installed-app flow + pickle) | Production already uses service accounts |
| QR PNG | `github.com/skip2/go-qrcode` | Mature, byte-stable PNG output |
| NFKD | `golang.org/x/text/unicode/norm` + `unicode.IsMark` | Direct equivalent of `unicodedata.normalize("NFKD", ...)` |
| HTML scrape | `golang.org/x/net/html` token visitor | Counts `<table class="table">` to target the second one |
| CSV | `encoding/csv` (stdlib) | Match for Python `csv.reader` |
| Logging | `log/slog` (stdlib) | Honors `LOG_LEVEL` env |
| Diff/testing | `testing` + `github.com/google/go-cmp/cmp` | Readable `cmp.Diff` for parity assertions |
| Lint | `golangci-lint` (govet, staticcheck, errcheck, gofumpt, unused) | Standard quality gate |
## Migration sequencing — eight milestones with hard gates
**M1 — Skeleton + tooling.** Create `go/` tree, `go.mod` (latest stable
Go), Makefile targets (`go-build`, `go-test`, `go-run`, `web-go`),
`golangci-lint` config. `cmd/fuj server` prints a hello + version and
listens on :8080.
*Gate:* `make go-build` succeeds; `make web-go` serves a "hello" page on
:8080 in parallel with `make web` on :5001; lint clean.
**M2 — Pure-domain helpers, port leaf-first.** Order:
[czech_utils.py](scripts/czech_utils.py) `normalize``parse_month_references`
[attendance.py](scripts/attendance.py) `calculate_fee`/`calculate_junior_fee`
[infer_payments.py](scripts/infer_payments.py) `parse_czk_amount`
[sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) `generate_sync_id`
[match_payments.py](scripts/match_payments.py) helpers (`_build_name_variants`,
`match_members`, `infer_transaction_details`, `format_date`) → `reconcile`.
Each gets a Go unit test plus a parity test driven by JSON fixtures from M3.
Also: `fuj fees` and `fuj reconcile` subcommands wired up (pure-domain CLIs).
*Gate:* All ported helpers pass parity tests.
**M3 — Fixture capture + characterization framework.** Build
`scripts/capture_fixtures.py` (Python helper that prints function results as
JSON to stdout — user pipes to disk) and `scripts/scrub_fixtures.py`
(replaces member names with deterministic pseudonyms `Member_<8hex>`,
scrambles sender/account/VS/bank_id while preserving structural
relationships, dates, amounts, exception keys). Capture ~10 reconcile
fixtures spanning every code path: greedy, proportional with float
remainder, even-split fallback, out-of-window credit, exception override,
`other:` purpose, junior `"?"`, comma-separated multi-person, multi-month
range, unmatched.
*Gate:* `tests/fixtures/` populated and committed; M2 parity tests green.
**M4 — IO layer behind interfaces.** Implement Sheets/Drive/Fio clients
matching Python return shapes. Drop the OAuth+pickle path entirely (service
account only). All clients have in-memory fakes for tests. Wire `fuj sync`
and `fuj infer` subcommands.
*Gate:* `go test -tags=integration ./internal/io/...` round-trips against a
test sheet (separate from prod); default-tag tests use fakes.
**M5 — JSON-only `/api/...` routes.** Add 8 Go route handlers that return
JSON. Add symmetric `/api/X` shadow endpoints in [app.py](app.py) that
`jsonify` the existing view-model dict (no transformation).
*Gate:* For each route, `cmd/parity` asserts
`cmp.Diff(python.json, go.json) == ""` modulo allowlist
(`render_time.total`, `build_meta`).
**M6 — Go-native HTML frontend.** Design Go templates cleanly (not a Jinja
port). Extract JS from inline into `internal/web/static/js/*.js` served via
`embed.FS`. Vanilla JS, no framework — same UX as Python (sortable table,
member-detail modal, name filter, month range filter, QR launcher) but
organized as proper modules. Templates render the JSON API response into
HTML; frontend JS fetches additional data from `/api/X` for the modal
rather than embedding `member_data` in `<script>`.
*Gate:* Browser smoke test of all routes on :8080 covers: name filter,
month filter, modal opens with correct months/transactions/exceptions, QR
modal renders, navigation between adults/juniors/payments works.
**M7 — Parallel-running watch period.** Both `make web-py` and `make web-go`
running locally (and in production via two containers on different ports).
Daily/manual `cmd/parity` runs catch any JSON drift. The user verifies the
Go UI matches what they expect feature-by-feature against the Python UI.
Run 12 weeks.
*Gate:* Zero non-allowlisted JSON diffs over 7 consecutive days, including
a sync-bank execution, a flush, and an attendance update. User sign-off
that the Go UI is feature-complete.
**M8 — Cutover + Python retirement.** Switch the bookmarked URL / docs to
the Go port. Keep Python container running but unrouted (or stopped) for
1 week as rollback. Then delete [app.py](app.py), [scripts/](scripts/),
the Python `Dockerfile`, and the Python tests. Update
[CLAUDE.md](CLAUDE.md) to reflect the Go-only state.
*Gate:* Two consecutive months of Go-only operation including end-of-month
settlement.
## CLI port (decided: port as Go subcommands)
Single Go binary `fuj` with subcommands replacing the existing Makefile
targets. Each reuses the domain layer directly:
| Old | New | Backed by | Milestone |
|---|---|---|---|
| `make fees` | `fuj fees` | `domain/fees` + `services/attendance` | M2 |
| `make reconcile` | `fuj reconcile` | `domain/reconcile` | M2 |
| `make sync-2026` | `fuj sync --year=2026` | `services/banksync.SyncToSheets` | M4 |
| `make infer` | `fuj infer [--dry-run]` | `services/banksync.InferPayments` | M4 |
| `make web` (py) | stays as Python `make web-py` until M8 | — | — |
| `make web-go` | `fuj server` | `web/handlers` | M1 |
Makefile targets get rewritten to invoke `./bin/fuj <subcommand>` once each
is ported. The Python `make` targets for already-ported commands stay as
`make X-py` aliases until M8, so you can run either side for cross-checks.
## JSON API contract strategy
**Go-defines, Python-conforms** with a 1-step bootstrap:
1. Run Python locally and dump `result["members"]`, `formatted_results`,
`monthly_totals`, etc., to JSON. This is the spec.
2. Hand-author Go structs with explicit `json:` tags matching exact Python
keys (`total_balance`, `original_expected`, `attendance_count` — no
reliance on default lowercasing).
3. Generate `tests/fixtures/api-schema/*.schema.json` from the Go structs
using `github.com/invopop/jsonschema`. Commit them.
4. Add a Python-side schema validator running in CI against the new
`/api/X` responses.
**Two known-tricky shapes:**
- Junior `expected: int | "?"`
```go
type Expected struct{ Value int; Unknown bool }
// MarshalJSON emits 42 or "?"
```
Same for `original_expected`.
- Tuple dict keys `(normalize(name), normalize(period))` for exceptions —
internal only, never crosses JSON. Use
`map[ExceptionKey]Exception` with `ExceptionKey struct{ Name, Period string }`.
## Characterization test harness — two tiers
(HTML rendering parity dropped: frontends are intentionally different.)
**Tier 1 — Pure-function parity** (fast, every commit). Fixtures at
`tests/fixtures/pure/<func>/<case>.json` containing `{input, output}`,
captured once via `scripts/capture_fixtures.py`. Go test reads each, calls
the ported function, asserts deep equality with `cmp.Diff`. Functions in
scope: `normalize`, `parse_month_references`, `parse_czk_amount`,
`parse_czech_amount`, `parse_czech_date`, `format_date`,
`_build_name_variants`, `match_members`, `infer_transaction_details`,
`generate_sync_id`, `calculate_fee`, `calculate_junior_fee`, `reconcile`.
**Tier 2 — JSON API parity** (medium, on PR + nightly). `cmd/parity/main.go`
hits both `:5001/api/X` and `:8080/api/X` with a fixture-seeded `tmp/`
cache, normalizes volatile fields (`render_time`, build metadata), asserts
byte-equality. Cache freezing: pre-populate `tmp/*_cache.json` from
scrubbed snapshots so both backends read identical data.
**PII scrubbing** is mandatory ([CLAUDE.md](CLAUDE.md): "Member data must
never be committed"). `scripts/scrub_fixtures.py` produces deterministic
pseudonyms preserving uniqueness and structural relationships. Only
scrubbed fixtures land in `tests/fixtures/`; raw `tmp/*.json` stays
gitignored.
## Side-by-side runtime
Two services on different ports, started independently. No reverse proxy.
```
make web-py # Python on :5001 (existing target, perhaps renamed from `make web`)
make web-go # Go on :8080
```
Both read the same Google Sheets and write to the same `tmp/` cache
directory. The user opens `localhost:5001` or `localhost:8080` directly to
A/B compare.
**Cache directory coordination**: both backends use `tmp/`. Go writes via
`os.WriteFile` to `tmp/<key>_cache.json.tmp` then `os.Rename` (atomic on
Linux). Python's writes are pre-existing-non-atomic; accept until Python
retires.
**Sync coordination**: `/sync-bank` is non-idempotent under concurrency.
Both backends `flock` on `tmp/sync.lock`; Go uses `syscall.Flock`. (In
practice the user is unlikely to trigger sync from both UIs at once, but
the lock is cheap insurance.)
**Production deployment**: keep the existing Python container; add a Go
container in `docker-compose.yml` exposed on a different port. After M8,
remove the Python service.
## CI/CD
Currently zero test CI ([.gitea/workflows/build.yaml](.gitea/workflows/build.yaml)
only does `docker build`/`push`). Add `/.gitea/workflows/test.yml`:
```yaml
jobs:
python-tests: # fix M3 broken-test references first
- uv sync && pytest tests/
go-tests:
- cd go && go test -race ./...
- cd go && golangci-lint run
parity-pure: # Tier 1
- cd go && go test -tags=parity ./tests/parity/...
```
Branch protection: `python-tests`, `go-tests`, `parity-pure` block merge.
Tier-2 parity runs nightly via `parity-nightly.yml` (boots both servers
via docker-compose with seeded caches, replays a fixed transaction script,
fails on any non-allowlisted diff).
A new Go `build/Dockerfile` (multi-stage: latest-stable `golang` builder →
`gcr.io/distroless/static:latest`, both pinned by digest) mirrors the
existing Python build job and produces a single static binary image.
## Risk register (top 4)
(Template auto-escape divergence dropped: irrelevant when frontends differ.)
1. **Sync ID hash drift** — HIGH/HIGH. Python builds the SHA-256 input by
`str()`-ing each field then `.lower()`-ing the joined string;
`str(750.0) == "750.0"`, `str(750) == "750"`. If Sheets API returns
floats in Python but Go unmarshals as int, `750` vs `750.0` → different
hash → duplicate rows. *Mitigation:* dedicated parity test with ~50
real-row fixtures; if Go can't reproduce Python's float string format,
normalize at the boundary (round to 2 decimals, format with explicit
precision).
2. **Float allocation in `reconcile()` proportional phase** — HIGH/MEDIUM.
Python's "last month absorbs remainder" depends on dict iteration order;
Go map iteration is randomized. *Mitigation:* always iterate
`sorted_months` explicitly in Go, never the map. Lock the distribution
with a parity test on (300, 300, 150) months × 751-CZK payment.
3. **NFKD edge cases** — MEDIUM/MEDIUM. Python `unicodedata` and Go
`golang.org/x/text` use the same algorithm but can differ on niche
compatibility decompositions if `x/text` is older than CPython's tables.
*Mitigation:* parity test with every distinct character ever observed in
member names; pin `x/text` version explicitly.
4. **Czech month parser semantics** — MEDIUM/MEDIUM. Wrap-around year
inference (`if start_m > end_m and m >= start_m: year = default_year - 1`)
plus the "month >= 10 → previous year" heuristic are easy to mis-port.
*Mitigation:* port table and algorithm verbatim line-for-line; parity
test with ~30 real `message`-field fixture strings.
## Cutover plan
Simpler without a proxy in the middle:
1. After M7's 7-day clean window + user sign-off, treat Go as primary.
Update bookmarks, docs, `make web` to point at Go.
2. Keep `make web-py` available for 1-week rollback. Run both containers
in production but only point users at the Go one.
3. Watch 2 weeks including a month-end settlement on Go-only.
4. Decommission Python: remove from `docker-compose.yml`, delete
[app.py](app.py) and [scripts/](scripts/), update
[CLAUDE.md](CLAUDE.md). Keep image tagged `python-final` in registry as
a 6-month rollback option.
**Retirement criteria:** zero parity-diff incidents in last 30 days, zero
rollbacks, two month-end settlements completed Go-only, manual
reconciliation review against `python-final` signed off.
## Critical files
- [scripts/match_payments.py](scripts/match_payments.py) — `reconcile()` is
the single most load-bearing function (~200 lines of allocation logic)
that must port byte-equivalently.
- [scripts/czech_utils.py](scripts/czech_utils.py) — `normalize` and
`parse_month_references` underpin every member/month match across the
system. 45 Czech month declensions, range wrap-around, year inference.
- [app.py](app.py) — defines the 8-route HTTP surface and view-model
shapes. The spec for the Go web layer's JSON API.
- [scripts/sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) —
`generate_sync_id` defines the dedup contract against existing rows in
the live sheet. Any drift creates duplicates.
- [scripts/attendance.py](scripts/attendance.py) — fee math + merged-month
logic + junior `"?"` sentinel.
- [scripts/cache_utils.py](scripts/cache_utils.py) — Drive `modifiedTime`
gating + two-TTL fallback that must be reproduced for shared-cache
safety.
- [templates/adults.html](templates/adults.html) — read for the JSON shape
the existing inline JS consumes (`member_data`); the Go frontend doesn't
have to mirror the template, but the JSON contract derived from this
page's data injection is the parity spec.
## Verification
End-to-end checks per milestone:
- **M1**: `make go-build && ./bin/fuj server --help` prints subcommand
list. `make web-go` serves :8080 in parallel with `make web-py` on :5001.
- **M2-M3**: `cd go && go test -tags=parity ./tests/parity/pure/...` green.
Spot-check: feed a known Czech-message string through both
`parse_month_references` implementations, diff outputs.
- **M4**: `go test -tags=integration ./internal/io/sheets/...` round-trips
against a test sheet (separate from prod).
- **M5**: `curl localhost:5001/api/adults | jq -S . > py.json && curl
localhost:8080/api/adults | jq -S . > go.json && diff py.json go.json` —
empty diff modulo allowlist.
- **M6**: Browser open `localhost:8080/adults`, click a member row, modal
opens with all months / transactions / exceptions correctly populated.
Same on `/juniors`. Click a Pay button → QR loads. Name filter and month
range filter work.
- **M7**: Run `cd go && ./bin/parity --base http://localhost:5001
--candidate http://localhost:8080 --routes adults,juniors,payments`
daily for 7 days, zero non-allowlisted diffs. User confirms Go UI is
feature-complete vs Python UI side-by-side.
- **M8**: `make web-py` removed from Makefile; `make web` points at Go;
manual end-of-month settlement on Go matches the prior month's
Python-produced report.
## Open questions / forks the user can override at review
- **Frontend JS organization in M6**: default is vanilla JS in separate
files via `embed.FS`. If the user wants HTMX, Alpine.js, or a small
framework, raise it before M6.
- **CI host**: Gitea Actions assumed (matches existing
[.gitea/workflows/build.yaml](.gitea/workflows/build.yaml)).
- **Test sheet for M4 integration tests**: would need provisioning.
Confirm whether to use a copy of the production sheet (PII!) or a
synthetic one seeded by the fixture-capture process.