feat(go): fixture capture + characterization framework (M3)

Closes M3.1–M3.6.  Parity safety net proving Go output matches Python
for every ported pure-domain function (M2.1–M2.9) and reconcile (M2.10).

Capture pipeline:
- scripts/capture_fixtures.py: calls each Python function with seeded
  inputs, emits JSON fixtures to stdout (never writes files directly).
- scripts/scrub_fixtures.py: deterministic PII scrubber — SHA-256
  pseudonyms for member names, digit-preserving hashes for VS/account/
  bank_id, name-sweep in message text.  Idempotent; no salt.
- scripts/_fixture_seeds.py: handcrafted seeds for all 11 functions;
  synthetic names throughout (no real roster members).
- scripts/capture_all_fixtures.sh: convenience wrapper for full corpus
  regeneration outside of make.

Fixture corpus (98 files, all PII-free):
- go/tests/fixtures/pure/<func>/<case>.json — 10 function directories.
- go/tests/fixtures/reconcile/<NN>_<case>.json — 10 branch-coverage
  cases: greedy, overpayment credit, proportional remainder, even-split,
  out-of-window, exception override, other: purpose, junior ?, multi-
  person+month fan-out, unmatched.

Go parity tests (//go:build parity):
- go/tests/parity/parityio.go: generic LoadDir/RunAll helpers + typed
  In/Out struct pairs for all 10 pure functions; Envelope decoder for
  int/float/none disambiguation.
- 10 pure-function test packages + bespoke reconcile test with per-cell
  float tolerance (math.Abs <= 0.01 for `paid` values).

Makefile: go-parity, go-test-all, capture-fixtures targets.
go/tests/fixtures/README.md: refresh workflow + PII audit guide.

Gate: make go-test green, make go-parity green (11/11 packages),
      make go-lint clean (parity tag), make go-build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 23:26:24 +02:00
parent 28f0e468f7
commit 67d2f11d7c
119 changed files with 4931 additions and 10 deletions

go/tests/fixtures/README.md

@@ -0,0 +1,128 @@
# Parity Fixtures
Captured outputs from the live Python implementation used as ground truth for
the Go parity test suite. All 98 files are committed and PII-free.
## Directory layout
```
fixtures/
  pure/
    normalize/                  # scripts.czech_utils.normalize
    parse_month_references/     # scripts.czech_utils.parse_month_references
    calculate_fee/              # scripts.attendance.calculate_fee
    calculate_junior_fee/       # scripts.attendance.calculate_junior_fee
    parse_czk_amount/           # scripts.infer_payments.parse_czk_amount
    generate_sync_id/           # scripts.sync_fio_to_sheets.generate_sync_id
    build_name_variants/        # scripts.match_payments._build_name_variants
    match_members/              # scripts.match_payments.match_members
    infer_transaction_details/  # scripts.match_payments.infer_transaction_details
    format_date/                # scripts.match_payments.format_date
  reconcile/                    # scripts.match_payments.reconcile (10 branch-coverage cases)
```
## Fixture format
One JSON object per file:
```json
{
  "case": "range_wrap_nov_to_jan",
  "func": "scripts.czech_utils.parse_month_references",
  "captured_at": "2026-05-06",
  "input": { "text": "...", "default_year": 2026 },
  "output": { "months": ["2025-11", "2025-12", "2026-01"] }
}
```
`captured_at` is date-only so same-day re-runs produce byte-identical files.
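The date-only stamp can be illustrated with a minimal sketch. `render_fixture` is a hypothetical helper, not the actual code in `scripts/capture_fixtures.py`, and the serialization settings (`indent=2`, `ensure_ascii=False`, trailing newline) are assumptions made for the example:

```python
import json
from datetime import date


def render_fixture(case: str, func: str, inp: dict, out: dict, today: date) -> bytes:
    """Serialize one fixture; hypothetical stand-in for the real capture script."""
    fixture = {
        "case": case,
        "func": func,
        # date.isoformat() yields YYYY-MM-DD with no time component,
        # so the stamp does not change between runs on the same day.
        "captured_at": today.isoformat(),
        "input": inp,
        "output": out,
    }
    # Assumption: fixed formatting options, so equal data gives equal bytes.
    text = json.dumps(fixture, indent=2, ensure_ascii=False) + "\n"
    return text.encode("utf-8")


if __name__ == "__main__":
    day = date(2026, 5, 6)
    first = render_fixture("demo", "pkg.func", {"x": 1}, {"y": 2}, day)
    second = render_fixture("demo", "pkg.func", {"x": 1}, {"y": 2}, day)
    print(first == second)
```

A full timestamp in `captured_at` would change on every run and make `git diff` report every file as modified even when no behaviour changed.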
### Amount type envelope
Four fields carry a type envelope that records whether the Python value was an `int`, `float`, `str`, or `None`:
```json
{"type": "int", "value": 750}
{"type": "float", "value": 750.0}
{"type": "string", "value": "..."}
{"type": "none"}
```
Fields that use envelopes: `generate_sync_id.tx.amount`, `parse_czk_amount.val`,
`format_date.val`, `infer_transaction_details.tx.date`.
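The envelope matters because a bare JSON number does not tell the reader whether Python produced `750` or `750.0`, and JSON `null` cannot be told apart from a missing key. The sketch below shows one way the capture side could wrap and unwrap such values. `wrap` and `unwrap` are hypothetical names for illustration, not the actual helpers in `scripts/capture_fixtures.py`:

```python
import json
from typing import Any


def wrap(value: Any) -> dict:
    """Wrap a Python value in a type envelope (hypothetical helper)."""
    if value is None:
        return {"type": "none"}
    # bool is a subclass of int in Python; reject it so True is never stored as 1.
    if isinstance(value, bool):
        raise TypeError("bool is not an envelope type")
    if isinstance(value, int):
        return {"type": "int", "value": value}
    if isinstance(value, float):
        return {"type": "float", "value": value}
    if isinstance(value, str):
        return {"type": "string", "value": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def unwrap(envelope: dict) -> Any:
    """Restore the original Python value, including its exact type."""
    kind = envelope["type"]
    if kind == "none":
        return None
    casts = {"int": int, "float": float, "string": str}
    return casts[kind](envelope["value"])


if __name__ == "__main__":
    for original in (750, 750.0, "2026-05-06", None):
        restored = unwrap(json.loads(json.dumps(wrap(original))))
        print(type(restored).__name__, restored)
```

On the Go side, the `Envelope` decoder in `parityio.go` plays the role of `unwrap`: it switches on `type` before interpreting `value`.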
### Reconcile member format
Each reconcile input member is a dict with an explicit `name` key, so the scrubber can locate and pseudonymize the name consistently:
```json
{"name": "Member_d035d9f9", "tier": "A", "fees": {"2026-01": [750, 3]}}
```
## Running the parity tests
```bash
make go-parity # run all parity tests
make go-test-all # unit tests + parity tests
```
Or directly:
```bash
cd go && go test -tags=parity ./tests/parity/...
cd go && go test -tags=parity -v -run TestReconcileParity ./tests/parity/reconcile/
```
## Refresh workflow
Regenerate the entire corpus from the live Python implementation:
```bash
make capture-fixtures
git diff go/tests/fixtures/ # review changes before committing
```
To refresh a single function:
```bash
PYTHONPATH=scripts:. python3 scripts/capture_fixtures.py --func normalize --all \
  | while IFS= read -r line; do
      # Each stdout line is one JSON fixture; name the file after its case id.
      id=$(printf '%s\n' "$line" | python3 -c "import sys,json; print(json.load(sys.stdin)['case'])")
      printf '%s\n' "$line" | python3 scripts/scrub_fixtures.py \
        > "go/tests/fixtures/pure/normalize/${id}.json"
    done
```
## When to refresh
- A ported function is intentionally changed to match updated Python behaviour.
- A new Czech declension or fee tier is added to the Python implementation.
- A new reconcile code path needs fixture coverage.
**Do not refresh to silence a failing parity test** without first confirming that
the Python behaviour is the correct reference. A parity failure means either the
Go port diverges or the Python implementation changed — diagnose before regenerating.
## PII scrubbing audit
No real member names should appear in committed fixtures. Before committing any
regenerated fixtures, verify with:
```bash
# Replace with names from the real roster to check:
git ls-files go/tests/fixtures | xargs grep -l "Real Name Here" | head
```
The scrubber applies deterministic SHA-256 pseudonyms (`Member_<8hex>`) to all
PII fields. `match_members` and `infer_transaction_details` fixtures use a
synthetic roster of fictional names and are exempt from field-key scrubbing;
verify that no real roster names appear in their `member_names` arrays.
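As a rough illustration of the `Member_<8hex>` scheme, the sketch below derives a pseudonym from an unsalted SHA-256 digest. It assumes the 8 hex characters are the leading characters of the digest of the UTF-8 encoded name, and it assumes a pass-through guard for values that already look like pseudonyms. The real `scripts/scrub_fixtures.py` may normalize names or handle re-scrubbing differently; `pseudonym` and `scrub_name` are hypothetical names:

```python
import hashlib
import re

# Values that already look like a pseudonym are left alone (assumed guard).
_PSEUDONYM = re.compile(r"Member_[0-9a-f]{8}")


def pseudonym(name: str) -> str:
    """Map a name to Member_<8hex> using an unsalted SHA-256 digest."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # Assumption: the pseudonym uses the first 8 hex characters of the digest.
    return f"Member_{digest[:8]}"


def scrub_name(name: str) -> str:
    """Scrub a name once; a second pass returns the same value."""
    if _PSEUDONYM.fullmatch(name):
        return name
    return pseudonym(name)


if __name__ == "__main__":
    once = scrub_name("Example Person")
    twice = scrub_name(once)
    print(once, once == twice)
```

Because no salt is involved, the same input name maps to the same pseudonym in every fixture and on every run, which keeps cross-file references consistent and diffs stable.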
## Adding a new fixture
1. Add a seed to `scripts/_fixture_seeds.py` under `SEEDS[("func_name", "case_id")]`.
2. Add `In`/`Out` struct fields to `go/tests/parity/parityio.go` if the function
is new.
3. Run the single-function capture recipe above and review the diff.
4. The parity test picks up new fixtures automatically — no test code changes needed
(unless the function itself is new).