New go/internal/domain/matching package porting three helpers from scripts/match_payments.py:

- BuildNameVariants: normalized ASCII variants from a member name (nickname in parens, last/first split, len<3 filtered); variants[0] is always the full base name, and MatchMembers relies on this invariant.
- MatchMembers: auto/review confidence matching with an exact-name short-circuit pass that prevents nickname substrings (tov) from firing inside longer surnames (ottova); common-surname filter for review tier.
- FormatDate: nil/empty/""/serial int/float64 (since 1899-12-30, fractional days supported)/YYYY-MM-DD passthrough/garbage → never errors.
- InferTransactionDetails: composes BuildNameVariants+MatchMembers+ParseMonthReferences; falls back to sender-only member match and date-derived month when text carries no signal.

21 table-driven tests; all expected values verified against live Python on 2026-05-06.
go-build, go-test, go-lint all clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
M2.7 + M2.8 + M2.9 — Port matching package to Go
On approval: copy this plan to docs/plans/2026-05-06-1305-go-m2-7-2-9-matching.md per CLAUDE.md plan-location convention.
Context
The Go rewrite (tracked in docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md) is in milestone M2 — porting pure-domain helpers leaf-first from Python to Go. M2.1 through M2.6 are complete (czech.Normalize, czech.ParseMonthReferences, fees.CalculateFee, fees.CalculateJuniorFee, money.ParseCZK, synch.GenerateSyncID).
M2.7, M2.8, and M2.9 cover three helpers from scripts/match_payments.py that form a tight chain: InferTransactionDetails calls MatchMembers, which calls BuildNameVariants, and its date fallback shares the Sheets-serial date logic that FormatDate uses. The user requested they be done together because the dependency graph makes per-milestone commits awkward: MatchMembers would either reference an unexported helper not yet committed or commit dead code.
This unblocks M2.10 (reconcile, the load-bearing function) and M5 parity tests, since reconciliation consumes InferTransactionDetails output.
Approach
One commit, one branch, one MR. Branch: feat/m2-7-2-9-matching-package. The three milestone checkboxes get ticked together on merge.
Package layout
New package go/internal/domain/matching/ mirroring the existing go/internal/domain/{czech,fees,money,synch} convention (one file per public symbol, tests alongside as *_test.go):
| File | Contents |
|---|---|
| `doc.go` | `// Package matching ports name/member matching from scripts/match_payments.py.` |
| `name_variants.go` | `BuildNameVariants` + unexported `wordIn` helper (mirrors Python's `_word_in` co-location at match_payments.py:60-62) |
| `match_members.go` | `Confidence` typed string + constants, `Match` struct, `MatchMembers` |
| `infer.go` | `Transaction`, `InferredDetails`, `InferTransactionDetails` |
| `format_date.go` | `FormatDate` |
| `name_variants_test.go`, `match_members_test.go`, `infer_test.go`, `format_date_test.go` | table-driven tests, each with a top-of-file comment quoting the live Python one-liner used to verify expected values (mirrors synch_test.go:7-20) |
Public API

    type Confidence string

    const (
        ConfidenceAuto   Confidence = "auto"
        ConfidenceReview Confidence = "review"
    )

    type Match struct {
        Name       string
        Confidence Confidence
    }

    func BuildNameVariants(name string) []string
    func MatchMembers(text string, memberNames []string) []Match

    type Transaction struct {
        Sender  string
        Message string
        UserID  string
        Date    any // string | int | float64; see "Parity concerns"
    }

    type InferredDetails struct {
        Members    []Match
        Months     []string
        SearchText string // matches Python's "search_text" key, not the misleading "matched_text" docstring
    }

    func InferTransactionDetails(tx Transaction, memberNames []string, defaultYear int) InferredDetails
    func FormatDate(val any) string

Algorithms (port verbatim — these are the load-bearing details)
BuildNameVariants (match_payments.py:33-57): extract (nickname) regex, strip parens for base, normalize via czech.Normalize, append last + first when ≥2 parts, filter <3 chars. variants[0] must always be the full normalized base — MatchMembers relies on this.
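The steps above can be sketched roughly as follows. This is a minimal illustration, not the ported implementation: `normalize` and `foldCzech` are hypothetical stand-ins for `czech.Normalize`, and two details are assumptions because the plan does not state them: the order of the variants after index 0, and that the base name is exempt from the length filter.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nicknameRe captures the text inside the first pair of parentheses.
var nicknameRe = regexp.MustCompile(`\(([^)]*)\)`)

// foldCzech is a hypothetical stand-in for the diacritic folding done by
// czech.Normalize; it only covers lowercase Czech letters.
var foldCzech = strings.NewReplacer(
	"á", "a", "č", "c", "ď", "d", "é", "e", "ě", "e",
	"í", "i", "ň", "n", "ó", "o", "ř", "r", "š", "s",
	"ť", "t", "ú", "u", "ů", "u", "ý", "y", "ž", "z",
)

// normalize lowercases, folds diacritics, and collapses whitespace.
func normalize(s string) string {
	return strings.Join(strings.Fields(foldCzech.Replace(strings.ToLower(s))), " ")
}

// BuildNameVariants returns the normalized base name at index 0, followed by
// the nickname, last name, and first name when each has at least 3 characters.
// The order after index 0 is an assumption of this sketch.
func BuildNameVariants(name string) []string {
	nickname := ""
	if m := nicknameRe.FindStringSubmatch(name); m != nil {
		nickname = normalize(m[1])
	}
	base := normalize(nicknameRe.ReplaceAllString(name, " "))

	variants := []string{base} // invariant: variants[0] is the full base name
	candidates := []string{nickname}
	if parts := strings.Fields(base); len(parts) >= 2 {
		candidates = append(candidates, parts[len(parts)-1], parts[0])
	}
	for _, c := range candidates {
		if len(c) >= 3 { // drop empty and too-short variants
			variants = append(variants, c)
		}
	}
	return variants
}

func main() {
	fmt.Printf("%q\n", BuildNameVariants("František Vrbík (Štrúdl)"))
	// ["frantisek vrbik" "strudl" "vrbik" "frantisek"]
}
```

Keeping the base at index 0 unconditionally is what lets a caller read `variants[0]` without checking how many variants survived the filter.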
MatchMembers (match_payments.py:65-137):
- Exact short-circuit (:77-84): if any member's `variants[0]` whole-word matches in `Normalize(text)`, return ONLY those `(name, auto)`. Prevents nickname `tov` matching inside `ottova`.
- Otherwise per-member first-match-wins: full-name substring → `\b first \b` AND `\b last \b` (any order) → `\b nickname \b`; each yields `auto` and continues.
- Review tier (:113-129): ≥2-part names → last name `len ≥ 4` AND not in `{"novak","novakova","prach"}` → review; else first name `len ≥ 3` → review. 1-part names → `len ≥ 4` → review.
- Final filter (:131-137): if ANY auto exists, drop ALL review. Two-pass; don't try to fuse with the loop.
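The exact short-circuit and the final filter are the two passes that bracket the per-member loop, so a small sketch of just those two may help. The helper names `exactMatches` and `dropReviewIfAuto`, the precomputed `variants` map, and the member names in `main` are all hypothetical; the per-member and review tiers are deliberately left out.

```go
package main

import (
	"fmt"
	"regexp"
)

type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

// wordIn reports whether word occurs in text as a whole word.
// Compiling per call keeps the sketch short; real code may want to cache.
func wordIn(word, text string) bool {
	return regexp.MustCompile(`\b` + regexp.QuoteMeta(word) + `\b`).MatchString(text)
}

// exactMatches is the short-circuit pass: every member whose variants[0]
// appears as a whole word in the normalized text is returned as auto.
func exactMatches(normText string, names []string, variants map[string][]string) []Match {
	out := []Match{}
	for _, n := range names {
		if v := variants[n]; len(v) > 0 && wordIn(v[0], normText) {
			out = append(out, Match{Name: n, Confidence: ConfidenceAuto})
		}
	}
	return out
}

// dropReviewIfAuto is the final filter, kept as a separate pass over the
// collected matches: if any match is auto, all review matches are dropped.
func dropReviewIfAuto(ms []Match) []Match {
	hasAuto := false
	for _, m := range ms {
		if m.Confidence == ConfidenceAuto {
			hasAuto = true
			break
		}
	}
	if !hasAuto {
		return ms
	}
	out := []Match{}
	for _, m := range ms {
		if m.Confidence == ConfidenceAuto {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	names := []string{"Jana Ottová", "Petr Dvořák (Tov)"}
	variants := map[string][]string{
		"Jana Ottová":       {"jana ottova", "ottova", "jana"},
		"Petr Dvořák (Tov)": {"petr dvorak", "tov", "dvorak", "petr"},
	}
	fmt.Println(exactMatches("platba jana ottova duben", names, variants))
	// [{Jana Ottová auto}]
}
```

In a full `MatchMembers`, a non-empty result from the first pass would be returned immediately, and the second pass would run only after the per-member and review tiers have collected their matches.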
InferTransactionDetails (match_payments.py:144-184): search_text = sender + " " + message + " " + user_id; month parse uses message + " " + user_id (excludes sender); fallback 1 retries members on sender alone; fallback 2 derives months from tx.Date (Sheets serial or YYYY-MM-DD).
FormatDate (match_payments.py:187-206): nil/empty → ""; int/float → Sheets serial since 1899-12-30 formatted YYYY-MM-DD; pre-formatted YYYY-MM-DD (length 10, dashes at idx 4/7) → as-is; else strings.TrimSpace(fmt.Sprint(v)). No raise on bad input — parity contract.
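A minimal sketch of that contract, assuming only the rules listed above. The helper `serialToDate` and the `sheetsEpoch` variable are names invented for this sketch, and integer types other than `int` are left to the `fmt.Sprint` fallback, which may or may not match what the port needs.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sheetsEpoch is day 0 of the Sheets serial-date scale.
var sheetsEpoch = time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)

// serialToDate converts a (possibly fractional) day count since the epoch
// into YYYY-MM-DD. The fractional part moves the clock time, and formatting
// then discards it.
func serialToDate(days float64) string {
	d := time.Duration(days * 24 * float64(time.Hour))
	return sheetsEpoch.Add(d).Format("2006-01-02")
}

// FormatDate never returns an error: unknown input is stringified and trimmed.
func FormatDate(val any) string {
	switch v := val.(type) {
	case nil:
		return ""
	case int:
		return serialToDate(float64(v))
	case float64:
		return serialToDate(v)
	case string:
		// Pre-formatted YYYY-MM-DD passes through unchanged.
		if len(v) == 10 && v[4] == '-' && v[7] == '-' {
			return v
		}
		return strings.TrimSpace(v)
	}
	return strings.TrimSpace(fmt.Sprint(val))
}

func main() {
	for _, v := range []any{nil, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "} {
		fmt.Printf("%q\n", FormatDate(v))
	}
}
```

The type switch keeps the `any` parameter faithful to the Python signature while still giving each input kind its own branch.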
Parity concerns
- RE2 `\b`: Equivalent to Python `\b` on ASCII-folded input (`Normalize` strips diacritics + lowercases). Use `regexp.QuoteMeta` for `re.escape`.
- Sheets epoch: 1899-12-30 (NOT 1900-01-01). `time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)`.
- Fractional serials: Python `timedelta(days=44197.5)` adds 12 hours, then `.strftime("%Y-%m-%d")` discards time. To match exactly use `base.Add(time.Duration(val * 24 * float64(time.Hour)))` then `Format("2006-01-02")`. Do NOT use `base.AddDate(0, 0, int(val))`; that silently drops fractional days from real Sheets exports of timestamped cells.
- `Transaction.Date any`: Python `tx["date"]` accepts int/float/string transparently. Sheets API returns serial dates as `float64` from JSON; FIO scraper returns `string`. `any` is the faithful port; type-switch inside `FormatDate` and the date fallback in `InferTransactionDetails`.
- `SearchText` vs `MatchedText`: Python docstring says `matched_text`, code returns `"search_text"`. Port the code, not the docstring.
- Default year plumbing: Go's `czech.ParseMonthReferences(text, defaultYear)` requires explicit year. Python defaults to 2026. Plumb `defaultYear` as the third arg to `InferTransactionDetails`.
- Empty slices not nil: Python `match_members` returns `[]` when nothing matches; ensure Go returns `[]Match{}` not `nil` so consumers don't have to nil-check (matches `synch` package style).
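To illustrate the first concern: Go's RE2 `\b` is an ASCII word boundary, so a letter with a diacritic counts as a non-word character. The sketch below, with a `wordIn` helper written only for this demonstration, shows why whole-word matching should run on normalized text.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordIn reports whether word occurs in text as a whole word.
// regexp.QuoteMeta plays the role of Python's re.escape.
func wordIn(word, text string) bool {
	re := regexp.MustCompile(`\b` + regexp.QuoteMeta(word) + `\b`)
	return re.MatchString(text)
}

func main() {
	fmt.Println(wordIn("tov", "ottova")) // false: no boundary inside the surname
	fmt.Println(wordIn("tov", "štov"))   // true: RE2 sees a boundary after "š"
	fmt.Println(wordIn("tov", "stov"))   // false: ASCII-folded text behaves as expected
	fmt.Println(wordIn("a.b", "axb"))    // false: QuoteMeta makes the dot literal
}
```

The second line is the case where unnormalized input would diverge from Python, whose `\b` on `str` patterns treats `š` as a word character.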
Tests
Port all 6 cases from tests/test_match_members.py verbatim into match_members_test.go as one table-driven TestMatchMembers. Each row: name, text, wantContains []string, wantExcludes []string, wantAllAuto bool.
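One possible shape for that row and its assertions is sketched below. The `matchRow` type, the `checkRow` helper, and the sample row in `main` are hypothetical (the sample is not one of the six ported cases); in the real test file the same checks would presumably sit inside `t.Run` and report through `t.Errorf`.

```go
package main

import "fmt"

type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

// matchRow mirrors the row shape described for TestMatchMembers.
type matchRow struct {
	name         string
	text         string
	wantContains []string
	wantExcludes []string
	wantAllAuto  bool
}

// checkRow compares a MatchMembers result against one row and returns a
// list of human-readable problems; an empty list means the row passed.
func checkRow(row matchRow, got []Match) []string {
	seen := map[string]bool{}
	for _, m := range got {
		seen[m.Name] = true
	}
	var problems []string
	for _, n := range row.wantContains {
		if !seen[n] {
			problems = append(problems, fmt.Sprintf("%s: missing %q", row.name, n))
		}
	}
	for _, n := range row.wantExcludes {
		if seen[n] {
			problems = append(problems, fmt.Sprintf("%s: unexpected %q", row.name, n))
		}
	}
	if row.wantAllAuto {
		for _, m := range got {
			if m.Confidence != ConfidenceAuto {
				problems = append(problems, fmt.Sprintf("%s: %q is %s, want auto", row.name, m.Name, m.Confidence))
			}
		}
	}
	return problems
}

func main() {
	// Hypothetical row and result for illustration only.
	row := matchRow{
		name:         "exact full name wins",
		text:         "platba jana ottova",
		wantContains: []string{"Jana Ottová"},
		wantExcludes: []string{"Petr Tovar"},
		wantAllAuto:  true,
	}
	got := []Match{{Name: "Jana Ottová", Confidence: ConfidenceAuto}}
	fmt.Println(len(checkRow(row, got))) // 0
}
```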
Add table cases for:
- `BuildNameVariants`: docstring example `František Vrbík (Štrúdl)` → 4 variants; nickname filtered (len<3); single-part name; whitespace inside parens
- `FormatDate`: `nil`→`""`, `""`→`""`, `int(44197)`→`"2021-01-01"`, `float64(44197.5)`→`"2021-01-01"`, `"2026-04-15"`→`"2026-04-15"`, `"garbage"`→`"garbage"`, `" 2026-04-15 "`→`"2026-04-15"`
- `InferTransactionDetails`: members from search_text, members from sender fallback, months from date-string fallback, months from serial-date fallback, both-paths-fail returns empty slices
Verify expectations against live Python and quote the one-liner in a top-of-file comment, e.g.:

    PYTHONPATH=scripts:. python -c '
    from match_payments import format_date
    for v in [None, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "]: print(repr(format_date(v)))
    '

Critical files
- Read for parity: scripts/match_payments.py:33-206, tests/test_match_members.py
- Reuse: `czech.Normalize` (go/internal/domain/czech/normalize.go), `czech.ParseMonthReferences` (parse_month_references.go:61)
- Mirror conventions: go/internal/domain/synch/synch.go, go/internal/domain/synch/synch_test.go
- New: `go/internal/domain/matching/{doc,name_variants,match_members,infer,format_date}.go` + `*_test.go`
Out of scope (M2.10 / M4 territory — DO NOT touch)
- `canonical_member_key` (match_payments.py:20)
- `reconcile`, `fetch_sheet_data`, `fetch_exceptions`: M2.10 / M4
- Sheets/Drive/FIO I/O glue
- Fixture capture (`tests/fixtures/pure/`): M3.3 separately
Verification
- `cd go && make go-build`: clean build.
- `cd go && make go-test ./internal/domain/matching/...`: all table tests green.
- `cd go && make go-lint`: clean (govet, staticcheck, errcheck, gofumpt, unused).
- Spot-check: pick 2–3 random non-trivial cases (e.g. `MatchMembers` with mixed auto/review, `FormatDate(44197.5)`) and run the live Python one-liner from each test's comment block to confirm bytes match.
- Append CHANGELOG entry per CLAUDE.md (timestamp via `date "+%Y-%m-%d %H:%M %Z"`).
- Tick M2.7, M2.8, M2.9 in docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md with the merge SHA.
- Push branch, open MR via `tea pr create --title "feat(go): port matching helpers (M2.7-2.9)" --base main --head feat/m2-7-2-9-matching-package`, print URL, leave merge to user.