Compare commits

..

1 Commits

Author SHA1 Message Date
6f36225187 feat(go): M5.4 — parity diff binary + make parity
All checks were successful
Deploy to K8s / deploy (push) Successful in 10s
Adds cmd/parity/main.go: a standalone Go binary that GETs
/api/version, /api/adults, /api/juniors, /api/payments from both
the Python (:5001) and Go (:8080) backends, scrubs an allowlist
(render_time.total, build_meta), and prints cmp.Diff for any
remaining differences.  Exits 0 on full match, 1 on diffs, 2 on
fetch/parse errors — CI-friendly for M7.2.

- go/cmd/parity/main.go: flags (-py, -go, -route, -timeout), fetch
  helper, allowlist scrubber (dotted-path aware), exit-code logic.
- go/cmd/parity/scrub_test.go: 4 unit tests for the scrubber.
- go/go.mod: promote github.com/google/go-cmp to direct dep.
- Makefile: parity target + help entry.
- Progress tracker: M5.4 ticked; milestone updated to M5 complete.
- Plan archived to docs/plans/2026-05-07-2254-m5-4-parity-binary.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-07 22:57:30 +02:00
9 changed files with 15 additions and 113 deletions

View File

@@ -1,31 +1,8 @@
# Changelog
## 2026-05-07 23:51 CEST — feat(py): M5.4 fix #2 — add vs and sync_id to payments tx projection
- `scripts/match_payments.py`: `fetch_sheet_data` now reads `VS` and `Sync ID` columns and includes `vs`/`sync_id` keys in every tx dict. Previously only 9 columns were projected, causing `make parity` to report extra `vs`/`sync_id` fields on every raw payment row emitted by the Go backend. Values flow through `group_payments_by_person``_unwrap_view_model_for_api` to `raw_payments` (adults/juniors) and `grouped_payments` (payments) automatically.
- `tests/test_app.py`: updated `/api/*` mock fixtures to include `vs`/`sync_id` keys for realism.
- **Cache note**: after deploying, hit `POST /flush-cache` once so the in-process cache is cleared and the next request picks up the new column lookups.
## 2026-05-07 23:37 CEST — fix(go): accept single-digit day/month in attendance date headers
- `go/internal/services/membership/sources.go`: `parseDates` now uses Go time formats `2.1.2006` and `1/2/2006` (single-digit reference forms, which accept both padded and unpadded inputs) instead of `02.01.2006` and `01/02/2006`. The Czech attendance sheet headers contain dates like `1.6.2026`, `23.3.2026`, `6.4.2026` — Go silently dropped those columns under the strict zero-padded format, while Python's `strptime("%d.%m.%Y")` accepted them. Effect was a missing `2026-06` month entirely on `/api/juniors` plus undercounted attendance for any month with single-digit columns; both surfaced as diffs in `make parity`.
- `sources_test.go::TestParseDates_SingleDigitDayMonth` added as a regression guard covering both Czech and US format flavours with and without leading zeros.
## 2026-05-07 23:17 CEST — fix(go): pass raw value to FormatDate so numeric serial-day dates format
- `go/internal/services/membership/sources.go`: transaction-row parser now passes `row[idxDate]` directly to `matching.FormatDate` (via a new `getRaw` helper) instead of stringifying first via `getVal`. The Sheets API returns numeric serial-day values as `float64` for date-formatted cells; pre-stringifying them defeated `FormatDate`'s `case float64:` dispatch, causing all numeric dates to leak through as `"46147"` style strings instead of `"2026-05-05"`.
- Surfaced by `make parity` (M5.4): every `transactions[].date` field on `/api/adults` and `/api/juniors` differed between Python and Go.
- `sources_test.go::TestLoadTransactions` extended with a numeric-serial-day row covering the regression.
## 2026-05-07 23:05 CEST — fix(go): default CacheDir to `tmp/go` to avoid Python collision
- `go/internal/config/config.go`: `CacheDir` default changed from `tmp` to `tmp/go`. Override via `CACHE_DIR` env var still works.
- Why: both backends used `tmp/<key>_cache.json` with the same keys (`attendance_regular`, `attendance_juniors`, `payments_transactions`, `exceptions_dict`) but different shapes — Python caches post-processed view-model tuples, Go caches raw rows. Whichever wrote last poisoned the cache; running both in parallel produced `ValueError: too many values to unpack (expected 2, got 68)` on Python's `/adults` after the Go server populated `attendance_regular_cache.json` with raw CSV rows.
- After upgrading: stop the Go server, hit `/flush-cache` on the Python side once (rewrites `tmp/*.json` with correct shapes), then restart `make web-go` — it will use `tmp/go/` going forward. Required for the M5.4 `make parity` workflow which assumes both backends run side-by-side.
## 2026-05-07 22:55 CEST — feat(go): M5.4 — parity diff binary + `make parity`
- `go/cmd/parity/main.go`: new standalone binary that GETs `/api/adults`, `/api/juniors`, `/api/payments` from both Python (:5001) and Go (:8080) backends, scrubs an allowlist (`render_time.total`), and prints `cmp.Diff` for any remaining differences. Exits 0 on full match, 1 on diffs, 2 on fetch/parse errors — CI-friendly for M7.2. `/api/version` is excluded by design (returns binary identity — tag/commit/build_date — which differs between independently built backends); still accessible via `make parity ARGS="-route /api/version"`.
- `go/cmd/parity/main.go`: new standalone binary that GETs `/api/version`, `/api/adults`, `/api/juniors`, `/api/payments` from both Python (:5001) and Go (:8080) backends, scrubs an allowlist (`render_time.total`, `build_meta`), and prints `cmp.Diff` for any remaining differences. Exits 0 on full match, 1 on diffs, 2 on fetch/parse errors — CI-friendly for M7.2.
- `go/cmd/parity/scrub_test.go`: 4 unit tests covering top-level delete, nested delete, missing path, and non-map parent.
- `go/go.mod`: `github.com/google/go-cmp` promoted to direct dependency.
- `Makefile`: `parity` target added (`.PHONY`, help, `cd go && go run ./cmd/parity`).

View File

@@ -154,7 +154,6 @@ Goal: Go is the one true backend.
(Add entries as you go. Format: `YYYY-MM-DD — short note`.)
- 2026-05-07 — `/api/version` excluded from parity diff by design. Each binary's tag/commit/build_date is identity, not a data contract — diffing it would always flag a diff between independently built backends. Route remains reachable via `make parity ARGS="-route /api/version"` for manual inspection.
- 2026-05-04 — Plan approved. Versioning policy: latest stable for Go and all libs at the time M1 starts. Frontends explicitly allowed to diverge between Python and Go; only the JSON API contract is parity-locked. No reverse proxy — both backends run on different ports via `make web-py` / `make web-go`.
- 2026-05-07 — M4 complete. Chose fakes-only unit tests (no live integration tests) and CSV-via-public-URL for attendance (no Sheets API auth required for read-only). golangci-lint gofumpt extra-rules differ slightly from standalone gofumpt; used `golangci-lint run --fix --enable-only gofumpt` to auto-resolve formatting.
- 2026-05-04 — M1 complete. Dockerfile base changed from `distroless/static:nonroot` → `alpine:3` for debuggability (can tighten later). CLI dispatcher uses stdlib `flag`; module path `fuj-management/go`. golangci-lint v1 embedded gofumpt merges all imports into one group (no stdlib/local split) — accepted as the project style.

View File

@@ -14,17 +14,13 @@ import (
"github.com/google/go-cmp/cmp/cmpopts"
)
// /api/version is intentionally excluded — it returns each binary's own build
// identity (tag/commit/build_date), which differs by design between independently
// built backends. Pass -route /api/version to inspect it manually.
var allRoutes = []string{"/api/adults", "/api/juniors", "/api/payments"}
var allRoutes = []string{"/api/version", "/api/adults", "/api/juniors", "/api/payments"}
// defaultAllowlist holds dotted key paths to strip before diffing.
// render_time.total is forward-compatible insurance: today it lives in the Jinja
// template context only (app.py inject_render_time) and is logged via
// middleware/timer.go on the Go side, so it isn't in any JSON response. If either
// side ever surfaces it under a render_time envelope, the scrubber handles it.
var defaultAllowlist = []string{"render_time.total"}
// These fields are expected to differ between backends (e.g. build tags, timing)
// or may appear on one side only. Today both are absent from the JSON — this is
// forward-compatible insurance for if either is added later.
var defaultAllowlist = []string{"render_time.total", "build_meta"}
func main() {
pyURL := flag.String("py", "http://localhost:5001", "Python backend base URL")

View File

@@ -50,7 +50,7 @@ func Load() Config {
return Config{
CredentialsPath: env("CREDENTIALS_PATH", ".secret/fuj-management-bot-credentials.json"),
BankAccount: env("BANK_ACCOUNT", "CZ8520100000002800359168"),
CacheDir: env("CACHE_DIR", "tmp/go"),
CacheDir: env("CACHE_DIR", "tmp"),
CacheTTL: envDuration("CACHE_TTL_SECONDS", 300),
CacheAPICheckTTL: envDuration("CACHE_API_CHECK_TTL_SECONDS", 300),
DriveTimeout: envDuration("DRIVE_TIMEOUT_SECONDS", 10),

View File

@@ -142,13 +142,7 @@ func parseDates(header []string) []struct {
}
var dt time.Time
var err error
// Use the unpadded reference forms ("2.1" and "1/2"): Go's time.Parse
// accepts both single-digit and zero-padded inputs against them, so
// "1.6.2026", "01.06.2026", "23.3.2026" all parse. Czech sheet authors
// drop the leading zero on dates ≤ 9 — Python's strptime is lenient
// the same way; the previous "02.01.2006" form silently dropped those
// columns and undercounted attendance.
for _, fmt_ := range []string{"2.1.2006", "1/2/2006"} {
for _, fmt_ := range []string{"02.01.2006", "01/02/2006"} {
dt, err = time.Parse(fmt_, raw)
if err == nil {
break
@@ -400,19 +394,9 @@ func parseTransactionRows(rows [][]any) ([]reconcile.Transaction, error) {
return fmt.Sprint(row[i])
}
// getRaw returns row[i] without stringifying — needed for FormatDate to
// dispatch on the underlying numeric type (Sheets returns serial-day
// numbers as float64). Stringifying first defeats that dispatch.
getRaw := func(row []any, i int) any {
if i < 0 || i >= len(row) {
return nil
}
return row[i]
}
var txns []reconcile.Transaction
for _, row := range rows[1:] {
dateStr := matching.FormatDate(getRaw(row, idxDate))
dateStr := matching.FormatDate(getVal(row, idxDate))
amountRaw := row[idxAmount]
if idxAmount < 0 || idxAmount >= len(row) {
amountRaw = ""

View File

@@ -114,15 +114,12 @@ func TestLoadJuniors(t *testing.T) {
func TestLoadTransactions(t *testing.T) {
// Sheets fake keyed by "<spreadsheetID>/<range>" — use the real constant.
// Row 1 uses a pre-formatted date string; row 2 uses the numeric Sheets
// serial-day form (float64) — the API returns either depending on cell
// formatting, and FormatDate must handle both.
paymentsKey := config.PaymentsSheetID + "/A1:Z"
sh := &sheets.Fake{Values: map[string][][]any{
paymentsKey: {
{"Date", "Amount", "manual fix", "Person", "Purpose", "Inferred Amount", "Sender", "VS", "Message", "Bank ID", "Sync ID"},
{"2026-04-01", 700.0, "", "Alice", "2026-04", "", "Alice Bank", "", "fee", "", "abc"},
{46147.0, 500.0, "", "", "", "", "Bob Bank", "", "platba", "", "def"}, // 46147 serial-day = 2026-05-05
{"2026-05-01", 500.0, "", "", "", "", "Bob Bank", "", "platba", "", "def"},
},
}}
s := buildSources(t, &attendance.Fake{}, sh)
@@ -140,12 +137,6 @@ func TestLoadTransactions(t *testing.T) {
if txns[0].Amount != 700 {
t.Errorf("txn[0].Amount: want 700, got %v", txns[0].Amount)
}
if txns[0].Date != "2026-04-01" {
t.Errorf("txn[0].Date: want 2026-04-01, got %q", txns[0].Date)
}
if txns[1].Date != "2026-05-05" {
t.Errorf("txn[1].Date (numeric serial-day): want 2026-05-05, got %q", txns[1].Date)
}
}
func TestLoadExceptions(t *testing.T) {
@@ -174,28 +165,6 @@ func TestLoadExceptions(t *testing.T) {
}
}
// TestParseDates_SingleDigitDayMonth covers the regression where Go's strict
// "02.01.2006" format dropped header cells written without leading zeros
// (e.g. "1.6.2026", "23.3.2026"), causing attendance undercounts and missing
// months on the /api/juniors response. Czech sheet authors drop the zero
// pad freely; Python's strptime tolerates it, so the parsers must match.
func TestParseDates_SingleDigitDayMonth(t *testing.T) {
// Czech form ("DD.MM.YYYY", with leading zeros optional) is the primary
// path. The "M/D/YYYY" fallback mirrors Python's %m/%d/%Y secondary
// strptime branch — month-first, day-second.
header := []string{"Jméno", "Tier", "", "01.06.2026", "1.6.2026", "23.3.2026", "6.4.2026", "01/02/2026", "1/2/2026"}
got := parseDates(header)
want := []string{"2026-06", "2026-06", "2026-03", "2026-04", "2026-01", "2026-01"}
if len(got) != len(want) {
t.Fatalf("parseDates: got %d entries, want %d (%v)", len(got), len(want), got)
}
for i, e := range got {
if e.month != want[i] {
t.Errorf("parseDates[%d].month = %q, want %q (raw=%q)", i, e.month, want[i], header[e.col])
}
}
}
// TTL smoke test: second call within TTL must not call fetch again.
func TestLoadAdults_CacheHit(t *testing.T) {
dir := t.TempDir()

View File

@@ -236,8 +236,6 @@ def fetch_sheet_data(spreadsheet_id: str, credentials_path: str) -> list[dict]:
idx_sender = get_col_index("Sender")
idx_message = get_col_index("Message")
idx_bank_id = get_col_index("Bank ID")
idx_vs = get_col_index("VS")
idx_sync_id = get_col_index("Sync ID")
required = {"Date": idx_date, "Amount": idx_amount, "Person": idx_person, "Purpose": idx_purpose}
missing = [name for name, idx in required.items() if idx == -1]
@@ -249,33 +247,16 @@ def fetch_sheet_data(spreadsheet_id: str, credentials_path: str) -> list[dict]:
def get_val(idx):
return row[idx] if idx != -1 and idx < len(row) else ""
def get_str(idx):
v = get_val(idx)
if isinstance(v, float) and v.is_integer():
return str(int(v))
return str(v)
def get_float(idx):
v = get_val(idx)
if isinstance(v, (int, float)):
return float(v)
try:
return float(str(v).strip())
except (ValueError, TypeError):
return 0.0
tx = {
"date": format_date(get_val(idx_date)),
"amount": get_float(idx_amount),
"amount": get_val(idx_amount),
"manual_fix": get_val(idx_manual),
"person": get_val(idx_person),
"purpose": get_val(idx_purpose),
"inferred_amount": get_val(idx_inferred_amount),
"sender": get_val(idx_sender),
"vs": get_str(idx_vs),
"message": get_str(idx_message),
"message": get_val(idx_message),
"bank_id": get_val(idx_bank_id),
"sync_id": get_val(idx_sync_id),
}
transactions.append(tx)

View File

@@ -310,9 +310,8 @@ def build_juniors_view_model(
cell_text = "-"
amount_to_pay = 0
is_unknown = original_expected == "?"
if is_unknown or (isinstance(expected, int) and expected > 0):
if is_unknown:
if expected == "?" or (isinstance(expected, int) and expected > 0):
if expected == "?":
status = "empty"
cell_text = f"?{count_str}"
elif paid >= expected:
@@ -340,7 +339,7 @@ def build_juniors_view_model(
status = "surplus"
cell_text = f"PAID {paid}"
if (not is_unknown and isinstance(expected, int) and expected > 0) or paid > 0:
if (isinstance(expected, int) and expected > 0) or paid > 0:
tooltip = f"Received: {paid}, Expected: {expected}"
else:
tooltip = ""

View File

@@ -129,7 +129,6 @@ class TestWebApp(unittest.TestCase):
'date': '2026-01-01', 'amount': 750, 'person': 'Test Member',
'purpose': '2026-01', 'message': 'test payment',
'sender': 'External Bank User', 'inferred_amount': 750,
'vs': '', 'sync_id': 'abc123',
}]
response = self.client.get('/api/adults')
self.assertEqual(response.status_code, 200)
@@ -156,7 +155,6 @@ class TestWebApp(unittest.TestCase):
mock_fetch_sheet.return_value = [{
'date': '2026-01-15', 'amount': 500, 'person': 'Junior One',
'purpose': '2026-01', 'message': '', 'sender': 'Parent', 'inferred_amount': 500,
'vs': '', 'sync_id': 'def456',
}]
response = self.client.get('/api/juniors')
self.assertEqual(response.status_code, 200)
@@ -174,7 +172,6 @@ class TestWebApp(unittest.TestCase):
mock_fetch_sheet.return_value = [{
'date': '2026-01-01', 'amount': 750, 'person': 'Test Member',
'purpose': '2026-01', 'message': 'test', 'sender': 'Someone',
'vs': '', 'sync_id': 'ghi789',
}]
response = self.client.get('/api/payments')
self.assertEqual(response.status_code, 200)