# M2.7 + M2.8 + M2.9: Port `matching` package to Go

> On approval: copy this plan to `docs/plans/2026-05-06-1305-go-m2-7-2-9-matching.md` per [CLAUDE.md](../../srv/personal/fuj-management/CLAUDE.md) plan-location convention.

## Context

The Go rewrite (tracked in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md)) is in milestone M2: porting pure-domain helpers leaf-first from Python to Go. M2.1 through M2.6 are complete (`czech.Normalize`, `czech.ParseMonthReferences`, `fees.CalculateFee`, `fees.CalculateJuniorFee`, `money.ParseCZK`, `synch.GenerateSyncID`).

M2.7, M2.8, and M2.9 cover three helpers from [scripts/match_payments.py](../../srv/personal/fuj-management/scripts/match_payments.py) that form a tight chain: `InferTransactionDetails` calls `MatchMembers`, which in turn calls `BuildNameVariants`; `InferTransactionDetails` also relies on the same Sheets-serial date logic that `FormatDate` uses. The user requested they be done together because the dependency graph makes per-milestone commits awkward: `MatchMembers` would either reference an unexported helper not yet committed or commit dead code.

This unblocks M2.10 (`reconcile`, the load-bearing function) and M5 parity tests, since reconciliation consumes `InferTransactionDetails` output.

## Approach

**One commit, one branch, one MR.** Branch: `feat/m2-7-2-9-matching-package`. The three milestone checkboxes get ticked together on merge.
### Package layout

New package `go/internal/domain/matching/` mirroring the existing `go/internal/domain/{czech,fees,money,synch}` convention (one file per public symbol, tests alongside as `*_test.go`):

| File | Contents |
|---|---|
| `doc.go` | `// Package matching ports name/member matching from scripts/match_payments.py.` |
| `name_variants.go` | `BuildNameVariants` + unexported `wordIn` helper (mirrors Python's `_word_in` co-location at [match_payments.py:60-62](../../srv/personal/fuj-management/scripts/match_payments.py#L60)) |
| `match_members.go` | `Confidence` typed string + constants, `Match` struct, `MatchMembers` |
| `infer.go` | `Transaction`, `InferredDetails`, `InferTransactionDetails` |
| `format_date.go` | `FormatDate` |
| `name_variants_test.go`, `match_members_test.go`, `infer_test.go`, `format_date_test.go` | table-driven tests, each with a top-of-file comment quoting the live Python one-liner used to verify expected values (mirrors [synch_test.go:7-20](../../srv/personal/fuj-management/go/internal/domain/synch/synch_test.go#L7)) |

### Public API

```go
type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

func BuildNameVariants(name string) []string
func MatchMembers(text string, memberNames []string) []Match

type Transaction struct {
	Sender  string
	Message string
	UserID  string
	Date    any // string | int | float64; see "Parity concerns"
}

type InferredDetails struct {
	Members    []Match
	Months     []string
	SearchText string // matches Python's "search_text" key, not the misleading "matched_text" docstring
}

func InferTransactionDetails(tx Transaction, memberNames []string, defaultYear int) InferredDetails
func FormatDate(val any) string
```

### Algorithms (port verbatim; these are the load-bearing details)

**`BuildNameVariants`** ([match_payments.py:33-57](../../srv/personal/fuj-management/scripts/match_payments.py#L33)): extract the `(nickname)` with a regex, strip the parens to get `base`, normalize via `czech.Normalize`, append last + first when there are ≥2 parts, and **filter out variants shorter than 3 chars**. `variants[0]` must always be the full normalized base, because `MatchMembers` relies on this.

**`MatchMembers`** ([match_payments.py:65-137](../../srv/personal/fuj-management/scripts/match_payments.py#L65)):

1. **Exact short-circuit** ([:77-84](../../srv/personal/fuj-management/scripts/match_payments.py#L77)): if any member's `variants[0]` whole-word matches in `Normalize(text)`, return ONLY those `(name, auto)`. Prevents nickname `tov` matching inside `ottova`.
2. Otherwise per-member first-match-wins: full-name substring → `\b first \b` AND `\b last \b` (any order) → `\b nickname \b`. Each yields `auto` and continues to the next member.
3. **Review tier** ([:113-129](../../srv/personal/fuj-management/scripts/match_payments.py#L113)): ≥2-part names → last name `len ≥ 4` AND not in `{"novak","novakova","prach"}` → review; else first name `len ≥ 3` → review. 1-part names → `len ≥ 4` → review.
4. **Final filter** ([:131-137](../../srv/personal/fuj-management/scripts/match_payments.py#L131)): if ANY auto exists, drop ALL review. This is a two-pass step; don't try to fuse it with the loop.

**`InferTransactionDetails`** ([match_payments.py:144-184](../../srv/personal/fuj-management/scripts/match_payments.py#L144)): `search_text = sender + " " + message + " " + user_id`; month parse uses `message + " " + user_id` (excludes sender); fallback 1 retries members on sender alone; fallback 2 derives months from `tx.Date` (Sheets serial or `YYYY-MM-DD`).

**`FormatDate`** ([match_payments.py:187-206](../../srv/personal/fuj-management/scripts/match_payments.py#L187)): nil/empty → `""`; int/float → Sheets serial since 1899-12-30 formatted `YYYY-MM-DD`; pre-formatted `YYYY-MM-DD` (length 10, dashes at idx 4/7) → as-is; else `strings.TrimSpace(fmt.Sprint(val))`. **No raise on bad input**: this is the parity contract.
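The `FormatDate` rules above can be sketched as follows. This is a minimal illustration rather than the final port: the lowercase names `formatDate`, `serialToDate`, and `sheetsEpoch` are hypothetical, and the sketch assumes numeric input arrives only as `int` or `float64`.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sheetsEpoch is day zero for Sheets serial dates (1899-12-30, not 1900-01-01).
var sheetsEpoch = time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)

// serialToDate converts a (possibly fractional) serial to YYYY-MM-DD.
// The fractional part is added as hours and then discarded by the layout.
func serialToDate(serial float64) string {
	offset := time.Duration(serial * 24 * float64(time.Hour))
	return sheetsEpoch.Add(offset).Format("2006-01-02")
}

// formatDate is a hypothetical stand-in for FormatDate.
// It never panics and never returns an error, whatever the input.
func formatDate(val any) string {
	switch v := val.(type) {
	case nil:
		return ""
	case int:
		return serialToDate(float64(v))
	case float64:
		return serialToDate(v)
	case string:
		// Pre-formatted YYYY-MM-DD passes through untouched.
		if len(v) == 10 && v[4] == '-' && v[7] == '-' {
			return v
		}
		return strings.TrimSpace(v)
	default:
		// Assumption: other types (int64, json.Number, ...) are just stringified.
		return strings.TrimSpace(fmt.Sprint(v))
	}
}

func main() {
	inputs := []any{nil, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "}
	for _, v := range inputs {
		fmt.Printf("%q\n", formatDate(v))
	}
}
```

One limit worth keeping in mind: `time.Duration` covers roughly 290 years, so a serial far beyond the present day would overflow the conversion in `serialToDate`. Whether the port needs to guard against that depends on what the Python side does with such values.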
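Two of the `MatchMembers` building blocks, the whole-word check and the two-pass final filter, are small enough to sketch in isolation. The code below is an illustration under assumptions: `wordIn` is assumed to be a `\b`-delimited regex match on already-normalized text (the Python `_word_in` body is not reproduced here), and `dropReviewIfAnyAuto` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

// wordIn reports whether word occurs in text as a whole word.
// Both arguments are assumed to be normalized already (ASCII, lowercase),
// which is what makes RE2's ASCII-only \b safe to use.
func wordIn(word, text string) bool {
	re := regexp.MustCompile(`\b` + regexp.QuoteMeta(word) + `\b`)
	return re.MatchString(text)
}

// dropReviewIfAnyAuto is the final filter: the first pass looks for any auto
// match, the second pass drops every review match if one was found.
// It always returns a non-nil slice.
func dropReviewIfAnyAuto(matches []Match) []Match {
	hasAuto := false
	for _, m := range matches {
		if m.Confidence == ConfidenceAuto {
			hasAuto = true
			break
		}
	}
	out := []Match{}
	for _, m := range matches {
		if hasAuto && m.Confidence == ConfidenceReview {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	fmt.Println(wordIn("tov", "platba ottova"))
	fmt.Println(wordIn("tov", "platba tov duben"))

	mixed := []Match{
		{Name: "Member A", Confidence: ConfidenceReview},
		{Name: "Member B", Confidence: ConfidenceAuto},
	}
	fmt.Println(dropReviewIfAnyAuto(mixed))
}
```

Keeping the filter as a separate pass over the collected matches means the per-member loop never needs to know whether a later member will produce an `auto` result.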
## Parity concerns

- **RE2 `\b`**: Equivalent to Python `\b` on ASCII-folded input (`Normalize` strips diacritics + lowercases). Use `regexp.QuoteMeta` for `re.escape`.
- **Sheets epoch**: 1899-12-30 (NOT 1900-01-01). `time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)`.
- **Fractional serials**: Python `timedelta(days=44197.5)` adds 12 hours, then `.strftime("%Y-%m-%d")` discards time. To match exactly use `base.Add(time.Duration(val * 24 * float64(time.Hour)))` then `Format("2006-01-02")`. **Do NOT** use `base.AddDate(0, 0, int(val))`; that silently drops fractional days from real Sheets exports of timestamped cells.
- **`Transaction.Date any`**: Python `tx["date"]` accepts int/float/string transparently. Sheets API returns serial dates as `float64` from JSON; FIO scraper returns `string`. `any` is the faithful port; use a type switch inside `FormatDate` and in the date fallback in `InferTransactionDetails`.
- **`SearchText` vs `MatchedText`**: Python docstring says `matched_text`, code returns `"search_text"`. Port the code, not the docstring.
- **Default year plumbing**: Go's `czech.ParseMonthReferences(text, defaultYear)` requires explicit year. Python defaults to 2026. Plumb `defaultYear` as the third arg to `InferTransactionDetails`.
- **Empty slices not nil**: Python `match_members` returns `[]` when nothing matches; ensure Go returns `[]Match{}` not `nil` so consumers don't have to nil-check (matches `synch` package style).

## Tests

Port all 6 cases from [tests/test_match_members.py](../../srv/personal/fuj-management/tests/test_match_members.py) verbatim into `match_members_test.go` as one table-driven `TestMatchMembers`. Each row: `name`, `text`, `wantContains []string`, `wantExcludes []string`, `wantAllAuto bool`.
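The row shape listed above could be checked along these lines. This is only a sketch: `matchRow` and `checkRow` are hypothetical names, the member names are placeholders rather than cases from the Python test file, and the real test would report through `t.Errorf` instead of returning strings.

```go
package main

import "fmt"

type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

// matchRow mirrors the row fields proposed for TestMatchMembers.
type matchRow struct {
	name         string
	text         string
	wantContains []string
	wantExcludes []string
	wantAllAuto  bool
}

// checkRow compares one row's expectations against MatchMembers output
// and returns one message per violated expectation.
func checkRow(row matchRow, got []Match) []string {
	seen := map[string]bool{}
	for _, m := range got {
		seen[m.Name] = true
	}
	var failures []string
	for _, n := range row.wantContains {
		if !seen[n] {
			failures = append(failures, fmt.Sprintf("%s: missing %q", row.name, n))
		}
	}
	for _, n := range row.wantExcludes {
		if seen[n] {
			failures = append(failures, fmt.Sprintf("%s: unexpected %q", row.name, n))
		}
	}
	if row.wantAllAuto {
		for _, m := range got {
			if m.Confidence != ConfidenceAuto {
				failures = append(failures, fmt.Sprintf("%s: %q is %s, want auto", row.name, m.Name, m.Confidence))
			}
		}
	}
	return failures
}

func main() {
	// Placeholder row and hand-written output standing in for MatchMembers.
	row := matchRow{
		name:         "placeholder row",
		text:         "placeholder payment text",
		wantContains: []string{"Member A"},
		wantExcludes: []string{"Member B"},
		wantAllAuto:  true,
	}
	got := []Match{{Name: "Member A", Confidence: ConfidenceAuto}}
	fmt.Println(checkRow(row, got))
}
```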
Add table cases for:

- `BuildNameVariants`: docstring example `František Vrbík (Štrúdl)` → 4 variants; nickname filtered (len<3); single-part name; whitespace inside parens
- `FormatDate`: `nil` → `""`, `""` → `""`, `int(44197)` → `"2021-01-01"`, `float64(44197.5)` → `"2021-01-01"`, `"2026-04-15"` → `"2026-04-15"`, `"garbage"` → `"garbage"`, `" 2026-04-15 "` → `"2026-04-15"` (serial 44197 is 44197 days after 1899-12-30, i.e. 2021-01-01)
- `InferTransactionDetails`: members from search_text, members from sender fallback, months from date-string fallback, months from serial-date fallback, both-paths-fail returns empty slices

Verify expectations against live Python and quote the one-liner in a top-of-file comment, e.g.:

```
PYTHONPATH=scripts:. python -c '
from match_payments import format_date
for v in [None, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "]:
    print(repr(format_date(v)))
'
```

## Critical files

- **Read for parity**: [scripts/match_payments.py:33-206](../../srv/personal/fuj-management/scripts/match_payments.py#L33), [tests/test_match_members.py](../../srv/personal/fuj-management/tests/test_match_members.py)
- **Reuse**: `czech.Normalize` ([go/internal/domain/czech/normalize.go](../../srv/personal/fuj-management/go/internal/domain/czech/normalize.go#L15)), `czech.ParseMonthReferences` ([parse_month_references.go:61](../../srv/personal/fuj-management/go/internal/domain/czech/parse_month_references.go#L61))
- **Mirror conventions**: [go/internal/domain/synch/synch.go](../../srv/personal/fuj-management/go/internal/domain/synch/synch.go), [go/internal/domain/synch/synch_test.go](../../srv/personal/fuj-management/go/internal/domain/synch/synch_test.go)
- **New**: `go/internal/domain/matching/{doc,name_variants,match_members,infer,format_date}.go` + `*_test.go`

## Out of scope (M2.10 / M4 territory: DO NOT touch)

- `canonical_member_key` ([match_payments.py:20](../../srv/personal/fuj-management/scripts/match_payments.py#L20))
- `reconcile`, `fetch_sheet_data`, `fetch_exceptions`: M2.10 / M4
- Sheets/Drive/FIO I/O glue
- Fixture capture (`tests/fixtures/pure/`): M3.3 separately

## Verification

1. `cd go && make go-build`: clean build.
2. `cd go && make go-test ./internal/domain/matching/...`: all table tests green.
3. `cd go && make go-lint`: clean (govet, staticcheck, errcheck, gofumpt, unused).
4. Spot-check: pick 2–3 random non-trivial cases (e.g. `MatchMembers` with mixed auto/review, `FormatDate(44197.5)`) and run the live Python one-liner from each test's comment block to confirm bytes match.
5. Append CHANGELOG entry per [CLAUDE.md](../../srv/personal/fuj-management/CLAUDE.md) (timestamp via `date "+%Y-%m-%d %H:%M %Z"`).
6. Tick M2.7, M2.8, M2.9 in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md) with the merge SHA.
7. Push branch, open MR via `tea pr create --title "feat(go): port matching helpers (M2.7-2.9)" --base main --head feat/m2-7-2-9-matching-package`, print URL, leave merge to user.