fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite.md
Jan Novak cf0f176d3f feat: Go rewrite M1 — skeleton, tooling, and hello server
Stand up the Go project alongside the Python backend so both run
independently during migration. `make web-go` builds and serves on :8080;
`make web-py` (alias: `make web`) keeps the Python side on :5001.

- go/: new module `fuj-management/go` (Go 1.26)
  - cmd/fuj: stdlib-flag dispatcher; `server` + `version` work,
    fees/reconcile/sync/infer stubbed for M2/M4
  - internal/config: env loader mirroring scripts/config.py
  - internal/logging: slog setup, level taken from config
  - internal/web: net/http ServeMux + request-timer middleware
  - build/Dockerfile: golang:1.26 → alpine:3 multi-stage image
  - .golangci.yml: govet, staticcheck, errcheck, gofumpt, unused
- Makefile: web→web-py alias; go-build/go-test/go-run/go-lint/web-go
- CI: parallel build-go job in .gitea/workflows/build.yaml (<tag>-go image)
- docs/plans/: M1 kickoff plan + progress tracker (M1 complete)
- .claude/settings.json: gofumpt + golangci-lint permissions

Gate: make go-build ✓  make go-lint ✓  make go-test ✓  curl :8080 ✓

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 12:05:46 +02:00

# Plan: Full Go rewrite of the Python/Flask backend
## Context
The current Flask app ([app.py](app.py) + [scripts/](scripts/), ~2400 LOC of
Python) handles attendance-based fee calculation, Fio bank sync, payment
reconciliation, and a server-rendered dashboard. The user wants a full
rewrite in Go with two goals:
1. **Quality Go code** as the primary outcome — idiomatic stdlib-first
design, strong typing, proper layering. The Python codebase grew
organically and mixes domain logic, IO, and HTTP concerns.
2. **Feature-parity certainty** — no behavioural drift between the Python
and Go versions on anything that touches money. Reconciliation is real
money; silent divergence is unacceptable.
**Switchable runtime**: both backends run on different TCP ports, started
independently via Makefile targets (`make web-py` on :5001, `make web-go` on
:8080). The user opens whichever they want in a browser. No reverse proxy,
no traffic-splitting, no shared frontend constraint — just two services
that read the same Google Sheets and the same `tmp/` cache.
**Frontends are allowed to diverge.** The Go web layer is designed cleanly
in its own right rather than as a byte-compatible Jinja port. Both backends
expose a JSON API (`/api/...`) with an identical contract — that's what
parity testing locks down. Rendered HTML and inline JS can be different.
## Versioning policy
- **Go**: latest stable release at project start. Pin in `go.mod` via the
`go` directive (e.g. `go 1.X`) and use the matching `golang:1.X` builder
image. Bump on each new minor as it lands stable.
- **Go libraries**: latest stable for every dependency in `go.mod`; run
`go get -u ./... && go mod tidy` at the start and quarterly thereafter.
- **Python deps** (during the parallel-run period): keep
[pyproject.toml](pyproject.toml) on its current versions to avoid
destabilizing the parity baseline; bump only after Python retires.
- **Base images**: latest-stable `golang` builder → `gcr.io/distroless/static:latest`
runtime, both pinned by digest in CI for reproducibility.
- **CI runners**: latest stable Linux image on Gitea Actions.
The plan does not hardcode specific version numbers below — implementation
picks current-stable at the time M1 starts.
## Approach summary
- **Three-layer Go architecture**: pure domain (no IO) → IO clients (behind
interfaces, easily faked) → HTTP/services (composition).
- **Capture-then-port**: dump current Python outputs as JSON fixtures, port
to Go function by function, and assert deep equality with `cmp.Diff`.
- **JSON contract is the spec, not the templates.** Each Python route gets
an `/api/X` shadow that returns the dict already passed to the template.
Go defines typed structs matching that shape; both sides validate against
generated JSON Schema.
- **Money is integer CZK**: existing fees are integer CZK (750/200/500);
keep it that way to avoid float drift in reconcile allocation. Where
Sheets returns floats, parse and round at the boundary.
- **Frontend rewrite, not port**: Go uses `html/template` with cleanly
organized templates and JS extracted into static files served via
`embed.FS`. Same UX (filterable table, member-detail modal, QR launcher)
but designed natively, no Jinja-port baggage.
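
To make the "parse and round at the boundary" point concrete, here is a minimal sketch of converting a decoded Sheets cell into whole crowns. The `CZK` type and `FromSheetValue` name are hypothetical, and rounding half away from zero is an assumption; the real rule has to match whatever the Python side does (note that Python's built-in `round` rounds half to even).

```go
package main

import (
	"fmt"
	"math"
)

// CZK is a whole-crown amount; domain arithmetic stays in integers.
type CZK int64

// FromSheetValue converts a raw cell value (as decoded from JSON) to CZK.
// Assumption: floats are rounded half away from zero (math.Round).
func FromSheetValue(v any) (CZK, error) {
	switch x := v.(type) {
	case float64:
		if math.IsNaN(x) || math.IsInf(x, 0) {
			return 0, fmt.Errorf("non-finite amount %v", x)
		}
		return CZK(math.Round(x)), nil
	case int:
		return CZK(x), nil
	default:
		return 0, fmt.Errorf("unsupported amount type %T", v)
	}
}

func main() {
	for _, v := range []any{750.0, 199.6, 500, "abc"} {
		czk, err := FromSheetValue(v)
		fmt.Println(czk, err)
	}
}
```

Once a value has passed through this boundary, nothing downstream needs to know that Sheets ever produced a float.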
## Go project layout
`go/` lives at the repo root alongside `scripts/` and `templates/` so both
backends share the same git history during migration.
```
go/
  cmd/
    fuj/main.go       # single binary, subcommands: server | fees | sync | infer | reconcile
    parity/main.go    # diff tool: hits both backends' /api/X, prints JSON diff
  internal/
    domain/           # pure, no IO, no net/*
      czech/          # normalize, parse_month_references
      fees/           # calculate_fee, calculate_junior_fee, "?" sentinel type
      money/          # parse_czk_amount, format helpers
      reconcile/      # reconcile() + Ledger, MemberResult types
      matching/       # _build_name_variants, match_members, infer_transaction_details
      synch/          # generate_sync_id (pure hash)
    io/               # IO behind interfaces, all impls have an in-memory fake
      sheets/         # SheetsClient + Google impl + fake
      drive/          # DriveClient for modifiedTime
      fio/            # FioClient: API JSON impl + transparent-page HTML scraper
      cache/          # FileCache with modifiedTime gating + two TTL knobs
    services/         # composition layer; pure + IO, no HTTP
      attendance/     # GetMembersWithFees, GetJuniorMembersWithFees
      payments/       # FetchTransactions, FetchExceptions, BuildView
      banksync/       # SyncToSheets, InferPayments (write ops)
    web/
      handlers/       # one file per route family
      view/           # HTML view-model structs (per route)
      api/            # JSON view-model structs (the parity-locked contract)
      templates/      # *.tmpl, embed.FS — designed natively, not a Jinja port
      static/         # js/*.js, css/*.css served via embed.FS
      middleware/     # request timer, recovery, slog
    config/           # mirrors scripts/config.py (env loading)
    qr/               # SPD string builder + PNG via go-qrcode
  tests/
    fixtures/         # JSON fixtures captured from Python (PII-scrubbed)
    parity/           # Go-side characterization tests (replay fixtures)
  build/Dockerfile    # multi-stage: latest-stable golang builder → distroless static
  go.mod
```
## Library choices
All on latest stable as per the versioning policy above.
| Concern | Pick | Rationale |
|---|---|---|
| HTTP routing | `net/http` ServeMux | 8 static routes; no need for chi/gin given modern stdlib pattern matching |
| Templates | `html/template` | Auto-escaping; native Go feel |
| Static assets | `embed.FS` | Single binary, no loose files |
| Sheets/Drive | `google.golang.org/api/{sheets/v4,drive/v3}` + `option` | Official client; service-account auth via `option.WithCredentialsFile` |
| OAuth | `golang.org/x/oauth2/google` (token only; drop installed-app flow + pickle) | Production already uses service accounts |
| QR PNG | `github.com/skip2/go-qrcode` | Mature, byte-stable PNG output |
| NFKD | `golang.org/x/text/unicode/norm` + `unicode.IsMark` | Direct equivalent of `unicodedata.normalize("NFKD", ...)` |
| HTML scrape | `golang.org/x/net/html` token visitor | Counts `<table class="table">` to target the second one |
| CSV | `encoding/csv` (stdlib) | Match for Python `csv.reader` |
| Logging | `log/slog` (stdlib) | Honors `LOG_LEVEL` env |
| Diff/testing | `testing` + `github.com/google/go-cmp/cmp` | Readable `cmp.Diff` for parity assertions |
| Lint | `golangci-lint` (govet, staticcheck, errcheck, gofumpt, unused) | Standard quality gate |
## Migration sequencing — eight milestones with hard gates
**M1 — Skeleton + tooling.** Create `go/` tree, `go.mod` (latest stable
Go), Makefile targets (`go-build`, `go-test`, `go-run`, `web-go`),
`golangci-lint` config. `cmd/fuj server` prints a hello + version and
listens on :8080.
*Gate:* `make go-build` succeeds; `make web-go` serves a "hello" page on
:8080 in parallel with `make web` on :5001; lint clean.
**M2 — Pure-domain helpers, port leaf-first.** Order:
[czech_utils.py](scripts/czech_utils.py) `normalize` → `parse_month_references` →
[attendance.py](scripts/attendance.py) `calculate_fee`/`calculate_junior_fee` →
[infer_payments.py](scripts/infer_payments.py) `parse_czk_amount` →
[sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) `generate_sync_id` →
[match_payments.py](scripts/match_payments.py) helpers (`_build_name_variants`,
`match_members`, `infer_transaction_details`, `format_date`) → `reconcile`.
Each gets a Go unit test plus a parity test driven by JSON fixtures from M3.
Also: `fuj fees` and `fuj reconcile` subcommands wired up (pure-domain CLIs).
*Gate:* All ported helpers pass parity tests.
**M3 — Fixture capture + characterization framework.** Build
`scripts/capture_fixtures.py` (Python helper that prints function results as
JSON to stdout — user pipes to disk) and `scripts/scrub_fixtures.py`
(replaces member names with deterministic pseudonyms `Member_<8hex>`,
scrambles sender/account/VS/bank_id while preserving structural
relationships, dates, amounts, exception keys). Capture ~10 reconcile
fixtures spanning every code path: greedy, proportional with float
remainder, even-split fallback, out-of-window credit, exception override,
`other:` purpose, junior `"?"`, comma-separated multi-person, multi-month
range, unmatched.
*Gate:* `tests/fixtures/` populated and committed; M2 parity tests green.
**M4 — IO layer behind interfaces.** Implement Sheets/Drive/Fio clients
matching Python return shapes. Drop the OAuth+pickle path entirely (service
account only). All clients have in-memory fakes for tests. Wire `fuj sync`
and `fuj infer` subcommands.
*Gate:* `go test -tags=integration ./internal/io/...` round-trips against a
test sheet (separate from prod); default-tag tests use fakes.
**M5 — JSON-only `/api/...` routes.** Add 8 Go route handlers that return
JSON. Add symmetric `/api/X` shadow endpoints in [app.py](app.py) that
`jsonify` the existing view-model dict (no transformation).
*Gate:* For each route, `cmd/parity` asserts
`cmp.Diff(python.json, go.json) == ""` modulo allowlist
(`render_time.total`, `build_meta`).
**M6 — Go-native HTML frontend.** Design Go templates cleanly (not a Jinja
port). Extract JS from inline into `internal/web/static/js/*.js` served via
`embed.FS`. Vanilla JS, no framework — same UX as Python (sortable table,
member-detail modal, name filter, month range filter, QR launcher) but
organized as proper modules. Templates render the JSON API response into
HTML; frontend JS fetches additional data from `/api/X` for the modal
rather than embedding `member_data` in `<script>`.
*Gate:* Browser smoke test of all routes on :8080 covers: name filter,
month filter, modal opens with correct months/transactions/exceptions, QR
modal renders, navigation between adults/juniors/payments works.
**M7 — Parallel-running watch period.** Both `make web-py` and `make web-go`
running locally (and in production via two containers on different ports).
Daily/manual `cmd/parity` runs catch any JSON drift. The user verifies the
Go UI matches what they expect feature-by-feature against the Python UI.
Run for 1-2 weeks.
*Gate:* Zero non-allowlisted JSON diffs over 7 consecutive days, including
a sync-bank execution, a flush, and an attendance update. User sign-off
that the Go UI is feature-complete.
**M8 — Cutover + Python retirement.** Switch the bookmarked URL / docs to
the Go port. Keep Python container running but unrouted (or stopped) for
1 week as rollback. Then delete [app.py](app.py), [scripts/](scripts/),
the Python `Dockerfile`, and the Python tests. Update
[CLAUDE.md](CLAUDE.md) to reflect the Go-only state.
*Gate:* Two consecutive months of Go-only operation including end-of-month
settlement.
## CLI port (decided: port as Go subcommands)
Single Go binary `fuj` with subcommands replacing the existing Makefile
targets. Each reuses the domain layer directly:
| Old | New | Backed by | Milestone |
|---|---|---|---|
| `make fees` | `fuj fees` | `domain/fees` + `services/attendance` | M2 |
| `make reconcile` | `fuj reconcile` | `domain/reconcile` | M2 |
| `make sync-2026` | `fuj sync --year=2026` | `services/banksync.SyncToSheets` | M4 |
| `make infer` | `fuj infer [--dry-run]` | `services/banksync.InferPayments` | M4 |
| `make web` (py) | stays as Python `make web-py` until M8 | — | — |
| `make web-go` | `fuj server` | `web/handlers` | M1 |
Makefile targets get rewritten to invoke `./bin/fuj <subcommand>` once each
is ported. The Python `make` targets for already-ported commands stay as
`make X-py` aliases until M8, so you can run either side for cross-checks.
## JSON API contract strategy
**Go-defines, Python-conforms** with a 1-step bootstrap:
1. Run Python locally and dump `result["members"]`, `formatted_results`,
`monthly_totals`, etc., to JSON. This is the spec.
2. Hand-author Go structs with explicit `json:` tags matching exact Python
keys (`total_balance`, `original_expected`, `attendance_count` — no
reliance on default lowercasing).
3. Generate `tests/fixtures/api-schema/*.schema.json` from the Go structs
using `github.com/invopop/jsonschema`. Commit them.
4. Add a Python-side schema validator running in CI against the new
`/api/X` responses.
**Two known-tricky shapes:**
- Junior `expected: int | "?"`
```go
type Expected struct{ Value int; Unknown bool }
// MarshalJSON emits 42 or "?"
```
Same for `original_expected`.
- Tuple dict keys `(normalize(name), normalize(period))` for exceptions —
internal only, never crosses JSON. Use
`map[ExceptionKey]Exception` with `ExceptionKey struct{ Name, Period string }`.
## Characterization test harness — two tiers
(HTML rendering parity dropped: frontends are intentionally different.)
**Tier 1 — Pure-function parity** (fast, every commit). Fixtures at
`tests/fixtures/pure/<func>/<case>.json` containing `{input, output}`,
captured once via `scripts/capture_fixtures.py`. Go test reads each, calls
the ported function, asserts deep equality with `cmp.Diff`. Functions in
scope: `normalize`, `parse_month_references`, `parse_czk_amount`,
`parse_czech_amount`, `parse_czech_date`, `format_date`,
`_build_name_variants`, `match_members`, `infer_transaction_details`,
`generate_sync_id`, `calculate_fee`, `calculate_junior_fee`, `reconcile`.
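
A Tier 1 replay helper could be as small as the sketch below. It uses `reflect.DeepEqual` as a stdlib stand-in for `cmp.Diff`, and `lowerTrim` is a placeholder for a ported function, not the real `normalize`; `checkFixture` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// fixture mirrors the {input, output} layout; both sides stay raw until
// the type parameters of checkFixture decide how to decode them.
type fixture struct {
	Input  json.RawMessage `json:"input"`
	Output json.RawMessage `json:"output"`
}

// checkFixture decodes one fixture, runs fn on the input, and compares the
// result with the recorded output decoded into the same Go type.
func checkFixture[In, Out any](raw []byte, fn func(In) Out) error {
	var f fixture
	if err := json.Unmarshal(raw, &f); err != nil {
		return fmt.Errorf("decode fixture: %w", err)
	}
	var in In
	if err := json.Unmarshal(f.Input, &in); err != nil {
		return fmt.Errorf("decode input: %w", err)
	}
	var want Out
	if err := json.Unmarshal(f.Output, &want); err != nil {
		return fmt.Errorf("decode output: %w", err)
	}
	if got := fn(in); !reflect.DeepEqual(got, want) {
		return fmt.Errorf("mismatch: got %v, want %v", got, want)
	}
	return nil
}

// lowerTrim is a placeholder for a ported function; it is NOT the real normalize.
func lowerTrim(s string) string { return strings.ToLower(strings.TrimSpace(s)) }

func main() {
	raw := []byte(`{"input": "  Leden ", "output": "leden"}`)
	fmt.Println(checkFixture(raw, lowerTrim))
}
```

In the real harness the same helper would be called from a table-driven `go test` that walks `tests/fixtures/pure/<func>/`.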
**Tier 2 — JSON API parity** (medium, on PR + nightly). `cmd/parity/main.go`
hits both `:5001/api/X` and `:8080/api/X` with a fixture-seeded `tmp/`
cache, normalizes volatile fields (`render_time`, build metadata), asserts
byte-equality. Cache freezing: pre-populate `tmp/*_cache.json` from
scrubbed snapshots so both backends read identical data.
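
The "normalize volatile fields, then compare" step might be implemented along these lines. `dropPath` and `equalModulo` are hypothetical helpers, and the sketch handles only dotted paths through nested objects. It decodes with `UseNumber` so that `750` and `750.0` stay distinct, which is closer to the byte-equality the plan asks for than a plain decode into `float64`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// decode parses a JSON object, keeping numbers as their original text.
func decode(b []byte) (map[string]any, error) {
	dec := json.NewDecoder(bytes.NewReader(b))
	dec.UseNumber() // 750 and 750.0 remain different values
	var doc map[string]any
	if err := dec.Decode(&doc); err != nil {
		return nil, err
	}
	return doc, nil
}

// dropPath deletes a dotted path such as "render_time.total" from doc.
func dropPath(doc map[string]any, path string) {
	keys := strings.Split(path, ".")
	cur := doc
	for _, k := range keys[:len(keys)-1] {
		next, ok := cur[k].(map[string]any)
		if !ok {
			return // path absent; nothing to drop
		}
		cur = next
	}
	delete(cur, keys[len(keys)-1])
}

// equalModulo reports whether a and b match once allowlisted paths are removed.
func equalModulo(a, b []byte, allow []string) (bool, error) {
	da, err := decode(a)
	if err != nil {
		return false, fmt.Errorf("left: %w", err)
	}
	db, err := decode(b)
	if err != nil {
		return false, fmt.Errorf("right: %w", err)
	}
	for _, p := range allow {
		dropPath(da, p)
		dropPath(db, p)
	}
	return reflect.DeepEqual(da, db), nil
}

func main() {
	py := []byte(`{"members":[{"total_balance":750}],"render_time":{"total":0.41},"build_meta":"py"}`)
	gojson := []byte(`{"members":[{"total_balance":750}],"render_time":{"total":0.02},"build_meta":"go"}`)
	ok, err := equalModulo(py, gojson, []string{"render_time.total", "build_meta"})
	fmt.Println(ok, err)
}
```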
**PII scrubbing** is mandatory ([CLAUDE.md](CLAUDE.md): "Member data must
never be committed"). `scripts/scrub_fixtures.py` produces deterministic
pseudonyms preserving uniqueness and structural relationships. Only
scrubbed fixtures land in `tests/fixtures/`; raw `tmp/*.json` stays
gitignored.
## Side-by-side runtime
Two services on different ports, started independently. No reverse proxy.
```
make web-py # Python on :5001 (existing target, perhaps renamed from `make web`)
make web-go # Go on :8080
```
Both read the same Google Sheets and write to the same `tmp/` cache
directory. The user opens `localhost:5001` or `localhost:8080` directly to
A/B compare.
**Cache directory coordination**: both backends use `tmp/`. Go writes via
`os.WriteFile` to `tmp/<key>_cache.json.tmp` then `os.Rename` (atomic on
Linux when both paths are on the same filesystem, as they are here).
Python's existing writes are not atomic; accept this until Python retires.
**Sync coordination**: `/sync-bank` is non-idempotent under concurrency.
Both backends `flock` on `tmp/sync.lock`; Go uses `syscall.Flock`. (In
practice the user is unlikely to trigger sync from both UIs at once, but
the lock is cheap insurance.)
**Production deployment**: keep the existing Python container; add a Go
container in `docker-compose.yml` exposed on a different port. After M8,
remove the Python service.
## CI/CD
Currently zero test CI ([.gitea/workflows/build.yaml](.gitea/workflows/build.yaml)
only does `docker build`/`push`). Add `/.gitea/workflows/test.yml`:
```yaml
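# Sketch only: the runner label is an assumption, and setup steps
# (checkout, uv, Go toolchain, golangci-lint) are omitted.
jobs:
  python-tests:          # fix M3 broken-test references first
    runs-on: ubuntu-latest
    steps:
      - run: uv sync && pytest tests/
  go-tests:
    runs-on: ubuntu-latest
    steps:
      - run: cd go && go test -race ./...
      - run: cd go && golangci-lint run
  parity-pure:           # Tier 1
    runs-on: ubuntu-latest
    steps:
      - run: cd go && go test -tags=parity ./tests/parity/...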
Branch protection: `python-tests`, `go-tests`, `parity-pure` block merge.
Tier-2 parity runs nightly via `parity-nightly.yml` (boots both servers
via docker-compose with seeded caches, replays a fixed transaction script,
fails on any non-allowlisted diff).
A new Go `build/Dockerfile` (multi-stage: latest-stable `golang` builder →
`gcr.io/distroless/static:latest`, both pinned by digest) mirrors the
existing Python build job and produces a single static binary image.
## Risk register (top 4)
(Template auto-escape divergence dropped: irrelevant when frontends differ.)
1. **Sync ID hash drift** — HIGH/HIGH. Python builds the SHA-256 input by
`str()`-ing each field then `.lower()`-ing the joined string;
`str(750.0) == "750.0"`, `str(750) == "750"`. If Sheets API returns
floats in Python but Go unmarshals as int, `750` vs `750.0` → different
hash → duplicate rows. *Mitigation:* dedicated parity test with ~50
real-row fixtures; if Go can't reproduce Python's float string format,
normalize at the boundary (round to 2 decimals, format with explicit
precision).
2. **Float allocation in `reconcile()` proportional phase** — HIGH/MEDIUM.
Python's "last month absorbs remainder" depends on dict iteration order;
Go map iteration is randomized. *Mitigation:* always iterate
`sorted_months` explicitly in Go, never the map. Lock the distribution
with a parity test on (300, 300, 150) months × 751-CZK payment.
3. **NFKD edge cases** — MEDIUM/MEDIUM. Python `unicodedata` and Go
`golang.org/x/text` use the same algorithm but can differ on niche
compatibility decompositions if `x/text` is older than CPython's tables.
*Mitigation:* parity test with every distinct character ever observed in
member names; pin `x/text` version explicitly.
4. **Czech month parser semantics** — MEDIUM/MEDIUM. Wrap-around year
inference (`if start_m > end_m and m >= start_m: year = default_year - 1`)
plus the "month >= 10 → previous year" heuristic are easy to mis-port.
*Mitigation:* port table and algorithm verbatim line-for-line; parity
test with ~30 real `message`-field fixture strings.
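
For risk 2, the mitigation of iterating sorted months rather than the map can be sketched as follows. This is not the actual `reconcile()` allocation: integer floor division and the `YYYY-MM` month keys are assumptions made for illustration. It only shows how a deterministic order pins down which month absorbs the remainder.

```go
package main

import (
	"fmt"
	"sort"
)

// allocate splits payment across months in proportion to what each month
// expects, walking months in sorted order so the last one deterministically
// absorbs the remainder. Assumption: shares use integer floor division.
func allocate(expected map[string]int, payment int) map[string]int {
	months := make([]string, 0, len(expected))
	total := 0
	for m, e := range expected {
		months = append(months, m)
		total += e
	}
	sort.Strings(months) // never rely on map iteration order

	out := make(map[string]int, len(months))
	if total == 0 {
		return out
	}
	remaining := payment
	for i, m := range months {
		share := payment * expected[m] / total
		if i == len(months)-1 {
			share = remaining // last month takes whatever is left
		}
		out[m] = share
		remaining -= share
	}
	return out
}

func main() {
	expected := map[string]int{"2026-01": 300, "2026-02": 300, "2026-03": 150}
	fmt.Println(allocate(expected, 751)) // map[2026-01:300 2026-02:300 2026-03:151]
}
```

With the (300, 300, 150) case and a 751 CZK payment, the first two months receive 300 each and the last receives 151, regardless of how the map happens to iterate.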
## Cutover plan
Simpler without a proxy in the middle:
1. After M7's 7-day clean window + user sign-off, treat Go as primary.
Update bookmarks, docs, `make web` to point at Go.
2. Keep `make web-py` available for 1-week rollback. Run both containers
in production but only point users at the Go one.
3. Watch 2 weeks including a month-end settlement on Go-only.
4. Decommission Python: remove from `docker-compose.yml`, delete
[app.py](app.py) and [scripts/](scripts/), update
[CLAUDE.md](CLAUDE.md). Keep image tagged `python-final` in registry as
a 6-month rollback option.
**Retirement criteria:** zero parity-diff incidents in last 30 days, zero
rollbacks, two month-end settlements completed Go-only, manual
reconciliation review against `python-final` signed off.
## Critical files
- [scripts/match_payments.py](scripts/match_payments.py) — `reconcile()` is
the single most load-bearing function (~200 lines of allocation logic)
that must port byte-equivalently.
- [scripts/czech_utils.py](scripts/czech_utils.py) — `normalize` and
`parse_month_references` underpin every member/month match across the
system. 45 Czech month declensions, range wrap-around, year inference.
- [app.py](app.py) — defines the 8-route HTTP surface and view-model
shapes. The spec for the Go web layer's JSON API.
- [scripts/sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) —
`generate_sync_id` defines the dedup contract against existing rows in
the live sheet. Any drift creates duplicates.
- [scripts/attendance.py](scripts/attendance.py) — fee math + merged-month
logic + junior `"?"` sentinel.
- [scripts/cache_utils.py](scripts/cache_utils.py) — Drive `modifiedTime`
gating + two-TTL fallback that must be reproduced for shared-cache
safety.
- [templates/adults.html](templates/adults.html) — read for the JSON shape
the existing inline JS consumes (`member_data`); the Go frontend doesn't
have to mirror the template, but the JSON contract derived from this
page's data injection is the parity spec.
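
The `generate_sync_id` dedup contract hinges on reproducing Python's `str()` output for each field. The sketch below illustrates the `750` vs `750.0` problem only: `pyStr` and `syncID` are hypothetical, the `|` separator and field order are placeholders for whatever the Python function actually uses, and floats that Python would print in exponent notation (or NaN/Inf) are not handled.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// pyStr renders a value roughly as Python's str() would, for ordinary amounts.
// Not handled: exponent notation, NaN, Inf.
func pyStr(v any) string {
	switch x := v.(type) {
	case string:
		return x
	case int:
		return strconv.Itoa(x)
	case float64:
		s := strconv.FormatFloat(x, 'f', -1, 64)
		if !strings.Contains(s, ".") {
			s += ".0" // Python prints 750.0 where Go's shortest form is 750
		}
		return s
	default:
		return fmt.Sprint(x)
	}
}

// syncID hashes the lowercased, joined field strings.
// Placeholder: the "|" separator and the field order are assumptions.
func syncID(fields ...any) string {
	parts := make([]string, len(fields))
	for i, f := range fields {
		parts[i] = pyStr(f)
	}
	sum := sha256.Sum256([]byte(strings.ToLower(strings.Join(parts, "|"))))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(pyStr(750), pyStr(750.0), pyStr(199.5)) // 750 750.0 199.5
	// The drift in question: same row, different numeric type, different ID.
	fmt.Println(syncID("2026-01-15", 750) == syncID("2026-01-15", 750.0)) // false
}
```

Whichever representation the live sheet's existing IDs were computed from is the one the Go port has to reproduce; the parity fixtures are what settle that question.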
## Verification
End-to-end checks per milestone:
- **M1**: `make go-build && ./bin/fuj server --help` prints subcommand
list. `make web-go` serves :8080 in parallel with `make web-py` on :5001.
- **M2-M3**: `cd go && go test -tags=parity ./tests/parity/pure/...` green.
Spot-check: feed a known Czech-message string through both
`parse_month_references` implementations, diff outputs.
- **M4**: `go test -tags=integration ./internal/io/sheets/...` round-trips
against a test sheet (separate from prod).
- **M5**: `curl localhost:5001/api/adults | jq -S . > py.json && curl
localhost:8080/api/adults | jq -S . > go.json && diff py.json go.json` —
empty diff modulo allowlist.
- **M6**: Browser open `localhost:8080/adults`, click a member row, modal
opens with all months / transactions / exceptions correctly populated.
Same on `/juniors`. Click a Pay button → QR loads. Name filter and month
range filter work.
- **M7**: Run `cd go && ./bin/parity --base http://localhost:5001
--candidate http://localhost:8080 --routes adults,juniors,payments`
daily for 7 days, zero non-allowlisted diffs. User confirms Go UI is
feature-complete vs Python UI side-by-side.
- **M8**: `make web-py` removed from Makefile; `make web` points at Go;
manual end-of-month settlement on Go matches the prior month's
Python-produced report.
## Open questions / forks the user can override at review
- **Frontend JS organization in M6**: default is vanilla JS in separate
files via `embed.FS`. If the user wants HTMX, Alpine.js, or a small
framework, raise it before M6.
- **CI host**: Gitea Actions assumed (matches existing
[.gitea/workflows/build.yaml](.gitea/workflows/build.yaml)).
- **Test sheet for M4 integration tests**: would need provisioning.
Confirm whether to use a copy of the production sheet (PII!) or a
synthetic one seeded by the fixture-capture process.