Files
fuj-management/docs/plans/2026-05-06-2111-go-m3-fixture-capture.md
Jan Novak 67d2f11d7c
All checks were successful
Deploy to K8s / deploy (push) Successful in 7s
feat(go): fixture capture + characterization framework (M3)
Closes M3.1–M3.6.  Parity safety net proving Go output matches Python
for every ported pure-domain function (M2.1–M2.9) and reconcile (M2.10).

Capture pipeline:
- scripts/capture_fixtures.py: calls each Python function with seeded
  inputs, emits JSON fixtures to stdout (never writes files directly).
- scripts/scrub_fixtures.py: deterministic PII scrubber — SHA-256
  pseudonyms for member names, digit-preserving hashes for VS/account/
  bank_id, name-sweep in message text.  Idempotent; no salt.
- scripts/_fixture_seeds.py: handcrafted seeds for all 11 functions;
  synthetic names throughout (no real roster members).
- scripts/capture_all_fixtures.sh: convenience wrapper for full corpus
  regeneration outside of make.

Fixture corpus (98 files, all PII-free):
- go/tests/fixtures/pure/<func>/<case>.json — 10 function directories.
- go/tests/fixtures/reconcile/<NN>_<case>.json — 10 branch-coverage
  cases: greedy, overpayment credit, proportional remainder, even-split,
  out-of-window, exception override, other: purpose, junior ?, multi-
  person+month fan-out, unmatched.

Go parity tests (//go:build parity):
- go/tests/parity/parityio.go: generic LoadDir/RunAll helpers + typed
  In/Out struct pairs for all 10 pure functions; Envelope decoder for
  int/float/none disambiguation.
- 10 pure-function test packages + bespoke reconcile test with per-cell
  float tolerance (math.Abs <= 0.01 for `paid` values).

Makefile: go-parity, go-test-all, capture-fixtures targets.
go/tests/fixtures/README.md: refresh workflow + PII audit guide.

Gate: make go-test green, make go-parity green (11/11 packages),
      make go-lint clean (parity tag), make go-build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 23:26:24 +02:00

262 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# M3 — Fixture capture + characterization framework
> On approval: copy this plan to `docs/plans/2026-05-06-2111-go-m3-fixture-capture.md` per [CLAUDE.md](../../srv/personal/fuj-management/CLAUDE.md) plan-location convention.
## Context
The Go rewrite (tracked in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md)) finished M2.1M2.12 — every pure-domain helper is ported and the `fuj fees` / `fuj reconcile` CLIs are wired. M3 closes the loop: it builds the **parity safety net** that proves Go output matches Python output for every ported function. Without it, M2 is "trust me", and the rewrite has no defensible cutover criterion.
M3 has three deliverables:
1. **A capture pipeline** (`scripts/capture_fixtures.py` + `scripts/scrub_fixtures.py`) that produces deterministic, PII-free JSON fixtures from the live Python implementations.
2. **A fixture corpus** at [go/tests/fixtures/](../../srv/personal/fuj-management/go/tests/fixtures/) covering the 10 pure functions of M2 (M2.1M2.9) plus 10 reconcile cases spanning every code path of `reconcile()` (M2.10).
3. **A parity test runner** in [go/tests/parity/](../../srv/personal/fuj-management/go/tests/parity/) under `//go:build parity` that replays each fixture and asserts byte/value equality against the Go port.
User-confirmed scope decisions:
- **Single MR** for all six sub-tasks (M3.1M3.6) — they're tightly coupled; no half-state is committable.
- **Type envelope only where it matters** — four fields (`generate_sync_id.tx.amount`, `parse_czk_amount.val`, `format_date.val`, `infer_transaction_details.tx.date`) use `{"type":..., "value":...}` to disambiguate int/float/none. Everything else uses raw JSON.
- **Real seeds for `parse_month_references` and `match_members` only** — read curated message strings from `tmp/payments_transactions_cache.json`, scrub, ship. Other functions stay on handcrafted seeds.
- **Plan committed at `docs/plans/2026-05-06-2111-go-m3-fixture-capture.md`** — same convention as every M-series predecessor.
## Branch + landing
- Branch: `feat/m3-fixture-capture`. Single MR via `tea pr create`. Tick M3.1M3.6 on merge with the SHA.
- No edits to existing Python or Go production code. M3 is purely additive: new scripts, new fixtures, new test files, new Makefile targets, README, CHANGELOG entry, plan archive, progress tracker tick.
## File layout
**Python (capture pipeline):**
- [scripts/capture_fixtures.py](../../srv/personal/fuj-management/scripts/capture_fixtures.py) — dispatcher CLI; one entry per function via `--func`.
- [scripts/scrub_fixtures.py](../../srv/personal/fuj-management/scripts/scrub_fixtures.py) — stdin→stdout deterministic bijection scrubber.
- [scripts/_fixture_seeds.py](../../srv/personal/fuj-management/scripts/_fixture_seeds.py) — internal: handcrafted seeds keyed by `(func, case_id)`, plus the curated real-message extractor.
**Fixture corpus** (committed, PII-free):
- [go/tests/fixtures/README.md](../../srv/personal/fuj-management/go/tests/fixtures/README.md) — refresh workflow + scrubbing audit guide.
- `go/tests/fixtures/pure/<func>/<case>.json` — one directory per function (10 functions: `normalize`, `parse_month_references`, `calculate_fee`, `calculate_junior_fee`, `parse_czk_amount`, `generate_sync_id`, `build_name_variants`, `match_members`, `infer_transaction_details`, `format_date`).
- `go/tests/fixtures/reconcile/<NN>_<case>.json` — 10 numbered reconcile cases.
**Go parity tests** (all under `//go:build parity`):
- [go/tests/parity/parityio.go](../../srv/personal/fuj-management/go/tests/parity/parityio.go) — shared loader with generic `Case[I,O]` walker, type envelopes mirrored from §3, float tolerance helper.
- [go/tests/parity/pure/<func>/<func>_parity_test.go](../../srv/personal/fuj-management/go/tests/parity/pure/) — one file per function, ~30 lines each.
- [go/tests/parity/reconcile/reconcile_parity_test.go](../../srv/personal/fuj-management/go/tests/parity/reconcile/) — bespoke comparator using `math.Abs(got-want) <= 0.01` for `paid` floats, exact equality on int balances.
**Modified:**
- [Makefile](../../srv/personal/fuj-management/Makefile) — append `go-parity`, `go-test-all`, `capture-fixtures` targets.
- [CHANGELOG.md](../../srv/personal/fuj-management/CHANGELOG.md) — single entry at top.
- [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md) — tick M3.1M3.6 with SHA.
## Capture invocation interface
Two-stage pipeline (capture | scrub) so each stage is independently debuggable:
```bash
python scripts/capture_fixtures.py --func <name> --case <id> --input-seed <id> \
| python scripts/scrub_fixtures.py \
> go/tests/fixtures/pure/<func>/<id>.json
```
Capture flags:
- `--func` — target function (`normalize`, `reconcile`, etc.).
- `--case` — human-authored case ID, becomes the file stem. Never auto-generated (auto-IDs cause git churn).
- `--input-seed <id>` — pull from `_fixture_seeds.py` registry (the default mode for handcrafted cases).
- `--input-stdin` — read a single JSON `{"args":[...], "kwargs":{...}}` doc from stdin (used by the real-message extractor for `parse_month_references` / `match_members`).
- `--all` — iterate every seed for one function, emit newline-delimited JSON to stdout. Used by the `make capture-fixtures` recipe.
Capture **never writes files**. Output goes to stdout; the caller redirects. The scrubber is always stdin→stdout. Both are pure transforms.
The `make capture-fixtures` target codifies the full refresh workflow. Humans read the target before they read the README.
## Fixture JSON shape (normative)
One JSON object per case:
```json
{
"case": "range_wrap_nov_to_jan",
"func": "scripts.czech_utils.parse_month_references",
"captured_at": "2026-05-06",
"input": { ... },
"output": { ... }
}
```
`captured_at` is date-only — same-day re-runs produce byte-identical files. No git SHA, no hostname, no time component.
### Per-function input/output schemas
The schema is the **stable contract** between Python capture and Go consumption. Where Python returns heterogeneous types, the capture step pre-translates to the typed shape Go expects.
| Function | Input | Output |
|---|---|---|
| `normalize` | `{"text":"…"}` | `{"text":"…"}` |
| `parse_month_references` | `{"text":"…","default_year":2026}` | `{"months":["2026-01",…]}` |
| `calculate_fee` | `{"attendance_count":3,"month_key":"2026-02"}` | `{"fee":750}` |
| `calculate_junior_fee` | `{"attendance_count":1,"month_key":"2026-02"}` | `{"value":0,"unknown":true}` (mirrors `fees.Expected{Value, Unknown}`) |
| `parse_czk_amount` | `{"val":<envelope>}` | `{"amount":1500.0}` |
| `generate_sync_id` | `{"tx":{"date":"…","amount":<envelope>,"currency":"CZK","sender":"…","vs":"…","message":"…","bank_id":"…"}}` | `{"sync_id":"<sha256-hex>"}` |
| `_build_name_variants` | `{"name":"…"}` | `{"variants":["…"]}` |
| `match_members` | `{"text":"…","member_names":["…"]}` | `{"matches":[{"name":"…","confidence":"auto"}]}` |
| `infer_transaction_details` | `{"tx":{"sender":"…","message":"…","user_id":"…","date":<envelope>},"member_names":[…],"default_year":2026}` | `{"members":[…],"months":[…],"search_text":"…"}` |
| `format_date` | `{"val":<envelope>}` | `{"date":"…"}` |
**Type envelope** (used in 4 fields above):
```json
{"type":"int","value":750} // distinguishes 750 from 750.0
{"type":"float","value":750.0}
{"type":"string","value":"…"}
{"type":"none"}
```
The envelope is the answer to the `generate_sync_id` parity risk: Python's `str(750.0) == "750.0"` vs `str(750) == "750"` produces different SHA-256 inputs. JSON natively conflates these; the envelope round-trips them. Go's loader switches on `type` and constructs the matching native value before calling the port.
**`reconcile`** uses raw JSON for everything (its inputs are typed maps/slices already), with one nuance: the `Member.fees[month]` value can be an `int` or a `(fee, count)` tuple per [match_payments.py:339-340](../../srv/personal/fuj-management/scripts/match_payments.py#L339). Capture normalises both to `{"fee":int,"count":int}` so Go side has one shape.
## Scrubber strategy
`scrub_fixtures.py`: stdin → stdout, no state, no salt, no random. Deterministic plain SHA-256. Re-runs are idempotent. Trade-off acknowledged: an attacker with the script can mathematically reverse the mapping. That's fine — the scrubber's job is to keep PII out of git diffs and Claude transcripts, not to defend against an adversary with the source tree.
### Scramble whitelist (only these field keys are scrambled)
`name`, `member_names[]`, `person`, `sender`, `sender_account`, `account`, `vs`, `bank_id`, `user_id`, `note`. Plus a per-document name-substring sweep over `message` strings — applied **before** the field-key walk, because real names show up embedded in message text.
Everything else (dates, amounts, currency, `month_key`, `attendance_count`, `purpose`, `confidence`, `expected`, `paid`, `total_balance`, `fee`, all `YYYY-MM` keys, `match`/`matches` structure) is preserved verbatim. **Whitelist-of-scramble** (not blacklist-of-preserve): when a new field appears, it stays raw until someone explicitly adds it to the list. Fails safe.
### Scrambling functions
- **Names**: `Member_<8hex>` where `<8hex> = sha256(name).hexdigest()[:8]`. Same name → same pseudonym across the whole document and across all fixtures. Stable diffs.
- **Account numbers** (`[0-9]+/[0-9]{4}`): scramble prefix and bank-suffix separately, preserving length and format.
- **VS / bank_id / user_id**: digit-string-preserving hash to a same-length numeric token. Non-numeric input → `id_<8hex>`.
- **Note**: replaced verbatim with `"<scrubbed>"`. Notes are never load-bearing for any test.
- **Message** (free text): name-sweep applied; rest preserved. Corpus author spot-checks before commit. README §5 documents the audit grep.
## Reconcile fixtures (10 handcrafted cases)
All seeds live in `_fixture_seeds.py` as triples `(members, sorted_months, transactions, exceptions, default_year)`. Capture runs the live Python `reconcile()` and emits canonical JSON; scrubber is a no-op for handcrafted synthetic names but runs anyway for uniformity.
| File | Branch exercised |
|---|---|
| `01_greedy_exact.json` | Greedy: amount == sum(expected); zero credit. |
| `02_greedy_overpayment_credit.json` | Greedy with overflow → credit. |
| `03_proportional_remainder.json` | Underpayment across 3 months with non-integer split (last month absorbs float remainder per [match_payments.py:421+](../../srv/personal/fuj-management/scripts/match_payments.py#L421)). |
| `04_even_split_prepayment.json` | All `expected == 0` → even-split fallback. |
| `05_out_of_window_credit.json` | Month outside `sorted_months` → that share goes to credits, in-window proportional for the rest. |
| `06_exception_override.json` | Exception entry overrides expected. |
| `07_other_purpose_split.json` | `purpose="other:tournament"` with two members. |
| `08_junior_question_mark.json` | Junior with attendance count 1 → `Expected{Unknown:true}`; reconcile reads it as 0 expected. |
| `09_multiperson_multimonth.json` | `person="Alice, Bob", purpose="2026-01, 2026-02"` → 2x2 fan-out: even-split-by-people then proportional-by-month. |
| `10_unmatched.json` | Empty `person`, garbage message → goes to `unmatched`. |
The seed registry is the **single source of truth** for these inputs. If Python behaviour drifts intentionally, fixtures regenerate cleanly via `make capture-fixtures`.
## Real-data seeds (for `parse_month_references` and `match_members` only)
`_fixture_seeds.py` reads `tmp/payments_transactions_cache.json` (already gitignored) and selects:
- **`parse_month_references`**: ~15 distinct messages exercising the 45 Czech month declensions, range wraps (`"prosinec-leden"`), year inference, and the `m >= 10 → previous year` heuristic. Selection done once interactively, the chosen indices hardcoded into `_fixture_seeds.py` so re-runs are deterministic. Messages flow through capture (which calls `parse_month_references(msg, default_year=2026)`) then scrubber (name-sweep against the live member roster).
- **`match_members`**: ~10 distinct `(message, member_names)` pairs exercising auto vs review confidence, common-surname filter, exact-short-circuit. Same pipeline.
**Out of scope for real seeds**: `normalize`, `_build_name_variants`, `reconcile`. These either don't benefit from real data (synthetic exhaustively covers `normalize`, `_build_name_variants`) or have surgical-input requirements that real data can't reliably hit (`reconcile`'s 10 branches).
## Go parity-test layout
One file per function, one Go package per function, mirroring the fixture tree. Each file is short (~30 lines):
```go
//go:build parity
package normalize_parity_test
import (
"fuj-management/go/internal/domain/czech"
"fuj-management/go/tests/parity"
"testing"
)
func TestNormalizeParity(t *testing.T) {
t.Parallel()
parity.RunAll(t, "../../../fixtures/pure/normalize",
func(in parity.NormalizeIn) parity.NormalizeOut {
return parity.NormalizeOut{Text: czech.Normalize(in.Text)}
})
}
```
The shared [go/tests/parity/parityio.go](../../srv/personal/fuj-management/go/tests/parity/parityio.go) (also `//go:build parity`) provides:
- `Case[I, O any]` generic loader: walks a fixture directory, decodes each `.json`, returns `(name, input, want)` triples.
- `RunAll[I, O any](t, dir, fn func(I) O)`: invokes `fn`, compares against `want` with `reflect.DeepEqual` (sorted-slice normalisation for the few sets-cast-to-lists Python returns); for floats uses `math.Abs(got-want) <= 0.01`.
- One typed `<Func>In` / `<Func>Out` struct pair per function (10 pairs), mirroring §3's JSON shape exactly. Envelope decoder helpers (`AmountEnvelope`, `ValueEnvelope`) live here.
**Reconcile is bespoke**`reconcile/reconcile_parity_test.go` doesn't use `RunAll` because it needs cell-by-cell tolerant float compare across nested maps. It walks the fixture dir directly.
**Why one-file-per-function** (instead of an umbrella runner): each function lives in a different domain package, so tests must `import` a different package; an umbrella would obscure which package is being checked. Split also enables `go test -tags=parity ./tests/parity/pure/normalize/` to iterate on a single port.
**Why a separate test tree** (instead of co-located parity tests): the M2 unit tests are co-located by convention (e.g. [go/internal/domain/czech/normalize_test.go](../../srv/personal/fuj-management/go/internal/domain/czech/normalize_test.go)). The progress tracker explicitly says fixtures live at `go/tests/fixtures/` and the gate is `go test -tags=parity ./tests/parity/pure/...`. Co-location would scatter fixtures across packages — messy. Separate tree wins.
## Build tag + Makefile
Every parity test file starts with `//go:build parity`. Default `make go-test` excludes them; `make go-parity` runs them:
```makefile
go-parity:
cd $(GO_SRC) && go test -tags=parity ./tests/parity/...
go-test-all: go-test go-parity
capture-fixtures:
@bash scripts/capture_all_fixtures.sh # invokes capture | scrub for every seed
```
Parity is **not** folded into default `go-test`: keeps the M2 unit-test loop fast, and a missing-fixture failure shouldn't block routine work. CI runs both targets independently so a parity break is a distinct red signal from a unit-test break.
## README content (`go/tests/fixtures/README.md`)
Six sections, ~120 lines:
1. **What's in this tree** — directory map; one line per fixture function explaining what it validates.
2. **Fixture format** — link to schemas in §3; worked example for `parse_month_references` and one for `reconcile`.
3. **Refresh workflow**`make capture-fixtures` regenerates everything; single-file recipe for incremental updates. Always diff before committing.
4. **When to refresh** — bullet list (schema change, new Czech declension, new fee tier, new reconcile branch). **Do not refresh to "fix" a parity failure** without first proving the Python behaviour is the intended one.
5. **Verifying scrubbing**`git diff` should show only `Member_<hex>`-shaped names, `<scrubbed>` notes, structurally-preserved account/VS digits. Audit grep: `git ls-files go/tests/fixtures | xargs grep -l '<your real name>'` should return zero before commit.
6. **Adding a new fixture** — three steps (add to `_fixture_seeds.py`, run capture, add `In/Out` Go struct fields if needed).
## Parity concerns
- **Float arithmetic in reconcile proportional phase**: ordering-sensitive, may diverge between Python and Go due to FMA. Tolerance `0.01` already in [go/internal/domain/reconcile/reconcile_test.go](../../srv/personal/fuj-management/go/internal/domain/reconcile/reconcile_test.go); parity uses the same tolerance.
- **Sync-ID float-vs-int stringification**: handled by the envelope (§3). Capture two paired cases per amount value (`amount_750_int.json`, `amount_750_float.json`) so any Go-side conflation surfaces immediately.
- **NFKD edge cases**: capture set must include rare characters from real names. The handcrafted `normalize` seeds enumerate every distinct character observed in the live member roster (extracted once from `tmp/attendance_regular_cache.json`, hardcoded into `_fixture_seeds.py` as a single-character-per-case sweep).
- **Czech month declensions**: the real-message seeds for `parse_month_references` cover the wild; handcrafted seeds cover the corner cases (`prosinec-leden` wrap, `m >= 10` heuristic).
- **Insertion-order determinism in `reconcile`**: Python 3.7+ dict iteration is insertion-ordered; the seed registry preserves order. Go side iterates `sortedMonths` slice explicitly; the parity test verifies this.
- **`infer_transaction_details` default_year**: Python signature defaults to 2026; capture passes `default_year` as an explicit input. Go side reads it from the fixture.
## Out of scope (explicitly DO NOT touch)
- Real Google Sheets / Drive / Fio loader implementations — M4.1M4.6.
- Web routes / handlers — M5.
- `fuj sync` and `fuj infer` subcommands — M4.7/M4.8.
- Tier-2 JSON-API parity (`cmd/parity/main.go`) — M5.4.
- Any change to existing Python code (capture is read-only against the production scripts).
- Any change to existing Go production code under `go/internal/`.
## Verification
1. `make go-build` — clean build (parity tests excluded by default tag).
2. `make go-test` — all M2 unit tests still green; no parity test runs.
3. `make go-parity` — every fixture in `go/tests/fixtures/pure/` and `go/tests/fixtures/reconcile/` deserialises and passes its parity assertion.
4. `make go-lint` — clean (parity test files lint-clean under `-tags=parity` since `golangci-lint` honours build tags via `.golangci.yml`).
5. **Capture round-trip**: pick one fixture (e.g. `parse_month_references/range_wrap_nov_to_jan.json`), regenerate via `python scripts/capture_fixtures.py --func parse_month_references --case range_wrap_nov_to_jan --input-seed range_wrap_nov_to_jan | python scripts/scrub_fixtures.py`, confirm byte-identical to the committed file.
6. **Scrubbing audit**: run the README §5 grep against any name from the live roster — zero hits.
7. **Reconcile branch coverage**: read each of the 10 reconcile fixture files, confirm the `output` field shows the expected branch (e.g. `02_greedy_overpayment_credit.json` has a non-zero `credits` entry; `04_even_split_prepayment.json` has equal `paid` across all months).
8. Append CHANGELOG entry per [CLAUDE.md](../../srv/personal/fuj-management/CLAUDE.md) (timestamp via `date "+%Y-%m-%d %H:%M %Z"`).
9. Tick M3.1M3.6 in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md) with the merge SHA. Update the M3 milestone summary line if M3 is now fully closed.
10. Push branch, open MR via `tea pr create --title "feat(go): fixture capture + characterization framework (M3)" --base main --head feat/m3-fixture-capture`, print URL, leave merge to user.
## Critical files
- **Read for parity** — [scripts/czech_utils.py:22](../../srv/personal/fuj-management/scripts/czech_utils.py#L22), [scripts/czech_utils.py:28](../../srv/personal/fuj-management/scripts/czech_utils.py#L28), [scripts/attendance.py:91](../../srv/personal/fuj-management/scripts/attendance.py#L91), [scripts/attendance.py:100](../../srv/personal/fuj-management/scripts/attendance.py#L100), [scripts/infer_payments.py:17](../../srv/personal/fuj-management/scripts/infer_payments.py#L17), [scripts/sync_fio_to_sheets.py:62](../../srv/personal/fuj-management/scripts/sync_fio_to_sheets.py#L62), [scripts/match_payments.py:33](../../srv/personal/fuj-management/scripts/match_payments.py#L33), [scripts/match_payments.py:65](../../srv/personal/fuj-management/scripts/match_payments.py#L65), [scripts/match_payments.py:144](../../srv/personal/fuj-management/scripts/match_payments.py#L144), [scripts/match_payments.py:187](../../srv/personal/fuj-management/scripts/match_payments.py#L187), [scripts/match_payments.py:304](../../srv/personal/fuj-management/scripts/match_payments.py#L304).
- **Reuse** — `domain/czech.{Normalize, ParseMonthReferences}`, `domain/fees.{CalculateFee, CalculateJuniorFee, Expected}`, `domain/money.ParseCZK`, `domain/synch.GenerateSyncID`, `domain/matching.{BuildNameVariants, MatchMembers, InferTransactionDetails, FormatDate}`, `domain/reconcile.{Member, Transaction, ExceptionKey, Exception, Result, Reconcile}`.
- **Mirror conventions** — package layout from [go/internal/domain/matching/](../../srv/personal/fuj-management/go/internal/domain/matching/) (one symbol per file, top-of-test provenance comments, `t.Parallel()`, `// [Go]` markers for Go-only cases).
- **New** — `scripts/{capture_fixtures,scrub_fixtures,_fixture_seeds}.py`; `go/tests/fixtures/README.md` + the corpus; `go/tests/parity/parityio.go` + 10 parity test files + 1 reconcile parity test file.
- **Modify** — `Makefile` (3 new targets), `CHANGELOG.md` (1 entry), `docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md` (tick M3.1M3.6).