Jan Novak 67d2f11d7c
feat(go): fixture capture + characterization framework (M3)
Closes M3.1–M3.6.  Parity safety net proving Go output matches Python
for every ported pure-domain function (M2.1–M2.9) and reconcile (M2.10).

Capture pipeline:
- scripts/capture_fixtures.py: calls each Python function with seeded
  inputs, emits JSON fixtures to stdout (never writes files directly).
- scripts/scrub_fixtures.py: deterministic PII scrubber — SHA-256
  pseudonyms for member names, digit-preserving hashes for VS/account/
  bank_id, name-sweep in message text.  Idempotent; no salt.
- scripts/_fixture_seeds.py: handcrafted seeds for all 11 functions;
  synthetic names throughout (no real roster members).
- scripts/capture_all_fixtures.sh: convenience wrapper for full corpus
  regeneration outside of make.

Fixture corpus (98 files, all PII-free):
- go/tests/fixtures/pure/<func>/<case>.json — 10 function directories.
- go/tests/fixtures/reconcile/<NN>_<case>.json — 10 branch-coverage
  cases: greedy, overpayment credit, proportional remainder, even-split,
  out-of-window, exception override, other: purpose, junior ?, multi-
  person+month fan-out, unmatched.

Go parity tests (//go:build parity):
- go/tests/parity/parityio.go: generic LoadDir/RunAll helpers + typed
  In/Out struct pairs for all 10 pure functions; Envelope decoder for
  int/float/none disambiguation.
- 10 pure-function test packages + bespoke reconcile test with per-cell
  float tolerance (math.Abs <= 0.01 for `paid` values).

Makefile: go-parity, go-test-all, capture-fixtures targets.
go/tests/fixtures/README.md: refresh workflow + PII audit guide.

Gate: make go-test green, make go-parity green (11/11 packages),
      make go-lint clean (parity tag), make go-build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 23:26:24 +02:00

Parity Fixtures

These fixtures are outputs captured from the live Python implementation; the Go parity test suite treats them as ground truth. All 98 files are committed and PII-free.

Directory layout

fixtures/
  pure/
    normalize/                 # scripts.czech_utils.normalize
    parse_month_references/    # scripts.czech_utils.parse_month_references
    calculate_fee/             # scripts.attendance.calculate_fee
    calculate_junior_fee/      # scripts.attendance.calculate_junior_fee
    parse_czk_amount/          # scripts.infer_payments.parse_czk_amount
    generate_sync_id/          # scripts.sync_fio_to_sheets.generate_sync_id
    build_name_variants/       # scripts.match_payments._build_name_variants
    match_members/             # scripts.match_payments.match_members
    infer_transaction_details/ # scripts.match_payments.infer_transaction_details
    format_date/               # scripts.match_payments.format_date
  reconcile/                   # scripts.match_payments.reconcile (10 branch-coverage cases)

Fixture format

One JSON object per file:

{
  "case":        "range_wrap_nov_to_jan",
  "func":        "scripts.czech_utils.parse_month_references",
  "captured_at": "2026-05-06",
  "input":       { "text": "...", "default_year": 2026 },
  "output":      { "months": ["2025-11", "2025-12", "2026-01"] }
}

captured_at is date-only so same-day re-runs produce byte-identical files.
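
As an illustration of why a date-only stamp keeps re-runs stable, here is a minimal sketch of serializing one record. The helper name `build_fixture` and the `indent=2` layout are assumptions for this example, not taken from `scripts/capture_fixtures.py`; the `"..."` input text is the placeholder from the example above.

```python
import json
from datetime import date


def build_fixture(case, func, inputs, output, today):
    """Serialize one fixture record with a date-only captured_at stamp."""
    record = {
        "case": case,
        "func": func,
        "captured_at": today.isoformat(),  # e.g. "2026-05-06", no time component
        "input": inputs,
        "output": output,
    }
    # dicts keep insertion order, so the keys come out in the order shown above.
    return json.dumps(record, ensure_ascii=False, indent=2) + "\n"


args = (
    "range_wrap_nov_to_jan",
    "scripts.czech_utils.parse_month_references",
    {"text": "...", "default_year": 2026},  # "..." is a placeholder, not real input
    {"months": ["2025-11", "2025-12", "2026-01"]},
)
first = build_fixture(*args, today=date(2026, 5, 6))
second = build_fixture(*args, today=date(2026, 5, 6))
```

Two captures on the same day serialize to the same string, so `git diff` stays empty unless the function's behaviour changed. A full timestamp would make every re-run look like a change.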

Amount type envelope

Four fields carry a type envelope to distinguish Python int / float / str / None:

{"type": "int",    "value": 750}
{"type": "float",  "value": 750.0}
{"type": "string", "value": "..."}
{"type": "none"}

Fields that use envelopes: generate_sync_id.tx.amount, parse_czk_amount.val, format_date.val, infer_transaction_details.tx.date.
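
The envelope matters because a bare JSON number does not reliably survive a round trip through a consumer that decodes every number into one floating-point type (Go's `encoding/json` does this when decoding into an interface value). The sketch below shows one way the capture side could wrap and unwrap values; `wrap` and `unwrap` are hypothetical names, and rejecting `bool` is a choice made for this example.

```python
import json


def wrap(value):
    """Wrap a Python value in a type envelope (hypothetical helper)."""
    if value is None:
        return {"type": "none"}
    if isinstance(value, bool):
        # bool is a subclass of int; reject it rather than mislabel it as "int".
        raise TypeError("bool has no envelope type in this sketch")
    if isinstance(value, int):
        return {"type": "int", "value": value}
    if isinstance(value, float):
        return {"type": "float", "value": value}
    if isinstance(value, str):
        return {"type": "string", "value": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def unwrap(envelope):
    """Recover the original Python value from an envelope."""
    kind = envelope["type"]
    if kind == "none":
        return None
    if kind == "int":
        return int(envelope["value"])
    if kind == "float":
        return float(envelope["value"])
    if kind == "string":
        return str(envelope["value"])
    raise ValueError(f"unknown envelope type: {kind}")


# Round trip through JSON text: the declared type survives even if the
# consumer parses every number as a float.
restored = unwrap(json.loads(json.dumps(wrap(750))))
```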

Reconcile member format

Reconcile input members use a named dict to allow consistent PII scrubbing:

{"name": "Member_d035d9f9", "tier": "A", "fees": {"2026-01": [750, 3]}}

Running the parity tests

make go-parity          # run all parity tests
make go-test-all        # unit tests + parity tests

Or directly:

cd go && go test -tags=parity ./tests/parity/...
cd go && go test -tags=parity -v -run TestReconcileParity ./tests/parity/reconcile/

Refresh workflow

Regenerate the entire corpus from the live Python implementation:

make capture-fixtures
git diff go/tests/fixtures/   # review changes before committing

To refresh a single function:

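PYTHONPATH=scripts:. python3 scripts/capture_fixtures.py --func normalize --all \
  | while IFS= read -r line; do
      # Each stdout line is one fixture; its "case" field names the output file.
      # printf avoids echo's shell-dependent handling of backslash escapes in JSON.
      id=$(printf '%s\n' "$line" | python3 -c "import sys,json; print(json.load(sys.stdin)['case'])")
      printf '%s\n' "$line" | python3 scripts/scrub_fixtures.py \
        > "go/tests/fixtures/pure/normalize/${id}.json"
    done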

When to refresh

  • A ported function is intentionally changed to match updated Python behaviour.
  • A new Czech declension or fee tier is added to the Python implementation.
  • A new reconcile code path needs fixture coverage.

Do not refresh to silence a failing parity test without first confirming that the Python behaviour is the correct reference. A parity failure means either that the Go port diverges or that the Python implementation changed; diagnose which before regenerating.

PII scrubbing audit

No real member names should appear in committed fixtures. Before committing any regenerated fixtures, verify with:

# Replace with names from the real roster to check:
git ls-files go/tests/fixtures | xargs grep -l "Real Name Here" | head
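
A plain `grep` checks one name at a time and can miss a name with diacritics if the fixture stores it as `\uXXXX` escapes. The following sketch checks a whole list of names after decoding each file; `find_leaks` is a hypothetical helper, not part of the repository, and it assumes every fixture is valid JSON.

```python
import json
from pathlib import Path


def find_leaks(fixture_root, real_names):
    """Return (relative_path, name) pairs where a roster name appears in a fixture."""
    root = Path(fixture_root)
    leaks = []
    for path in sorted(root.rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Re-serialize without ASCII escaping so "\u00e1" is matched as "á".
        text = json.dumps(data, ensure_ascii=False).casefold()
        for name in real_names:
            if name.casefold() in text:
                leaks.append((path.relative_to(root).as_posix(), name))
    return leaks
```

Called as `find_leaks("go/tests/fixtures", names)` with `names` read from the real roster, an empty result means no listed name was found. It is a substring check, so it does not catch misspelled or declined forms of a name.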

The scrubber applies deterministic SHA-256 pseudonyms (Member_<8hex>) to all PII fields. match_members and infer_transaction_details fixtures use a synthetic roster of fictional names and are exempt from field-key scrubbing; verify that no real roster names appear in their member_names arrays.
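
The `Member_<8hex>` scheme can be sketched as below. This is a minimal illustration, not the code in `scripts/scrub_fixtures.py`: it assumes the pseudonym is the first 8 hex digits of an unsalted SHA-256 of the UTF-8 name, and that idempotence comes from leaving already-scrubbed values untouched. The real scrubber may normalize names or detect prior scrubbing differently.

```python
import hashlib
import re

# Shape of an already-scrubbed value: "Member_" plus 8 lowercase hex digits.
PSEUDONYM_RE = re.compile(r"Member_[0-9a-f]{8}")


def pseudonym(name):
    """Map a name to a deterministic Member_<8hex> pseudonym (illustrative)."""
    if PSEUDONYM_RE.fullmatch(name):
        return name  # already scrubbed; a second pass changes nothing
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return "Member_" + digest[:8]


# "Example Person" is a made-up name used only for this demonstration.
once = pseudonym("Example Person")
twice = pseudonym(once)
```

Because no salt is involved, the same name always maps to the same pseudonym, which keeps regenerated fixtures diff-stable. The trade-off is that anyone who holds the roster can recompute the mapping, so the pseudonyms obscure names but do not anonymize them against that reader.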

Adding a new fixture

  1. Add a seed to scripts/_fixture_seeds.py under SEEDS[("func_name", "case_id")].
  2. Add In/Out struct fields to go/tests/parity/parityio.go if the function is new.
  3. Run the single-function refresh recipe above and review the diff.
  4. The parity test picks up new fixtures automatically — no test code changes needed (unless the function itself is new).
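
Before committing a new fixture, a quick structural check can catch a malformed file earlier than the parity run does. The sketch below assumes every fixture follows the five-key layout shown under "Fixture format"; `check_fixture` is a hypothetical helper, not part of the repository.

```python
import json
import re

REQUIRED_KEYS = ("case", "func", "captured_at", "input", "output")
DATE_ONLY = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_fixture(text):
    """Return a list of problems found in one fixture file's contents."""
    record = json.loads(text)
    if not isinstance(record, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in record]
    if "captured_at" in record:
        stamp = record["captured_at"]
        # A full timestamp would break byte-identical same-day re-runs.
        if not (isinstance(stamp, str) and DATE_ONLY.fullmatch(stamp)):
            problems.append("captured_at is not date-only (YYYY-MM-DD)")
    return problems
```

An empty list means the file has the expected shape. It says nothing about whether the captured output is correct, which remains the job of the diff review in step 3.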