Compare commits

7 Commits

Author SHA1 Message Date
c5a8a4e7b1 fix: include juniors in payment-inference roster
Some checks failed
Deploy to K8s / deploy (push) Successful in 10s
Build and Push / build (push) Successful in 6s
Build and Push / build-go (push) Failing after 12m23s
infer_payments was building member_names from get_members_with_fees()
(adults sheet only). Junior-only members were invisible to the matcher,
so a payment message containing an exact junior name would produce a
fuzzy review match against a different adult sharing the same first name.

Fix: union the adult and junior rosters (deduped via canonical_member_key)
so all members are candidates. The existing exact-name short-circuit in
match_members then handles precedence correctly.

Two regression tests added for the Jáchym Kubík / Jáchym Hrušák case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 16:38:21 +02:00
3e597242eb Merge pull request 'feat(go): port matching helpers (M2.7-2.9)' (#9) from feat/m2-7-2-9-matching-package into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 6s
Reviewed-on: #9
2026-05-06 13:58:26 +00:00
7232697e9c chore: tick M2.7-2.9 in progress tracker + CHANGELOG entry
All checks were successful
Deploy to K8s / deploy (push) Successful in 7s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 13:19:58 +02:00
e596f0000e feat(go/M2.7-2.9): port domain/matching package
New go/internal/domain/matching package porting four helpers from
scripts/match_payments.py:

- BuildNameVariants: normalized ASCII variants from a member name (nickname
  in parens, last/first split, len<3 filtered); variants[0] is always the
  full base name — MatchMembers relies on this invariant.
- MatchMembers: auto/review confidence matching with an exact-name
  short-circuit pass that prevents nickname substrings (tov) from firing
  inside longer surnames (ottova); common-surname filter for review tier.
- FormatDate: nil/empty/""/serial int/float64 (since 1899-12-30, fractional
  days supported)/YYYY-MM-DD passthrough/garbage → never errors.
- InferTransactionDetails: composes BuildNameVariants+MatchMembers+
  ParseMonthReferences; falls back to sender-only member match and
  date-derived month when text carries no signal.

21 table-driven tests; all expected values verified against live Python
on 2026-05-06. go-build, go-test, go-lint all clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 13:19:42 +02:00
c2bffed1b8 Merge pull request 'feat(go/M2.6): port domain/synch.GenerateSyncID' (#8) from feat/m2-6-synch-generate-sync-id into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 8s
Reviewed-on: #8
2026-05-06 11:01:43 +00:00
54a783ea00 feat(go/M2.6): port domain/synch.GenerateSyncID
All checks were successful
Deploy to K8s / deploy (push) Successful in 6s
SHA-256 dedup hash from sync_fio_to_sheets.py generate_sync_id.
Key subtlety: Python str(float) emits "500.0" for whole-valued floats
and switches to scientific notation at |f|>=1e16 or |f|<1e-4 —
replicated via formatAmount using 'f'/'e' format selection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 12:43:41 +02:00
84a5d177e9 Merge pull request 'feat(go/M2.5): port domain/money.ParseCZK' (#7) from feat/m2-5-money-parse-czk into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 6s
Reviewed-on: #7
2026-05-06 07:39:42 +00:00
18 changed files with 1460 additions and 9 deletions


@@ -1,5 +1,27 @@
# Changelog
## 2026-05-06 16:38 CEST — fix: include juniors in payment-inference roster
- `scripts/infer_payments.py`: union adults + junior rosters so junior-only members are visible to the matcher.
- Root cause: `get_members_with_fees()` reads only the adults sheet; junior-only kids like Jáchym Kubík were absent from `member_names`, causing the exact-match short-circuit to never fire and a different adult sharing the first name to win via fuzzy review.
- Two regression tests added to `tests/test_match_members.py`.
## 2026-05-06 13:18 CEST — feat(go/M2.7-2.9): port domain/matching package
- New `go/internal/domain/matching` package porting four helpers from `scripts/match_payments.py`.
- `BuildNameVariants` — extracts normalized ASCII search variants from a member name, including nickname (from parens) and separate first/last; filters variants shorter than 3 chars; `variants[0]` is always the full normalized base name.
- `MatchMembers` — finds members in free text with `"auto"` or `"review"` confidence; exact-name short-circuit prevents nickname substrings (e.g. `tov`) from matching inside surnames (e.g. `ottova`).
- `FormatDate` — normalizes Google Sheets date values: handles nil, empty, int/float64 serial-days since 1899-12-30 (supports fractional serials), pre-formatted `YYYY-MM-DD` strings, and garbage input — never errors.
- `InferTransactionDetails` — composes name + month matching over sender/message/user_id; falls back to sender-only member match and date-derived month when text gives no signal.
- 21 table-driven tests; all expected values verified against live Python on 2026-05-06.
## 2026-05-06 12:43 CEST — feat(go/M2.6): port domain/synch.GenerateSyncID
- New `go/internal/domain/synch` package with `GenerateSyncID(Transaction) string` ported from `scripts/sync_fio_to_sheets.py` `generate_sync_id`.
- Byte-stable SHA-256 hash over `date|amount|currency|sender|vs|message|bank_id` (lowercased, UTF-8); `Currency: ""` defaults to `"CZK"` matching the Python missing-key fallback.
- Key subtlety: Python's `str(float)` emits `"500.0"` for whole-valued floats and switches to scientific notation at `|f| >= 1e16` or `|f| < 1e-4` — replicated in `formatAmount` using `'f'`/`'e'` format selection.
- 6 table-driven hash tests + 9 `formatAmount` tests; all expected values verified against live Python on 2026-05-06.
## 2026-05-06 09:38 CEST — feat(go/M2.5): port domain/money.ParseCZK
- New `go/internal/domain/money` package with `ParseCZK(string) (float64, error)` ported from `scripts/infer_payments.py` `parse_czk_amount`.


@@ -49,10 +49,10 @@ Each task: port the function, write Go unit tests for fresh cases, hook into the
- [x] **M2.3** `domain/fees.CalculateFee` — port [attendance.py](scripts/attendance.py) `calculate_fee` (constants table) — `0fc3b6d`
- [x] **M2.4** `domain/fees.CalculateJuniorFee` — port `calculate_junior_fee` with `Expected{Value int; Unknown bool}` for the `"?"` sentinel — `0fc3b6d`
- [x] **M2.5** `domain/money.ParseCZK` — port [infer_payments.py](scripts/infer_payments.py) `parse_czk_amount` (Czech locale: comma decimal, dot/space thousand separators) — `d24d205`
- [ ] **M2.6** `domain/synch.GenerateSyncID` — port [sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) `generate_sync_id` (SHA-256, byte-stable hash; verify float string format against real sheet rows)
- [ ] **M2.7** `domain/matching.BuildNameVariants` + `MatchMembers` — port `_build_name_variants` and `match_members` from [match_payments.py](scripts/match_payments.py) (auto vs review confidence, common-surname filter)
- [ ] **M2.8** `domain/matching.InferTransactionDetails` — port `infer_transaction_details` (composes name + month parsing)
- [ ] **M2.9** `domain/matching.FormatDate` — port `format_date` (handles Google Sheets serial-day numbers since 1899-12-30)
- [x] **M2.6** `domain/synch.GenerateSyncID` — port [sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) `generate_sync_id` (SHA-256, byte-stable hash; verify float string format against real sheet rows)
- [x] **M2.7** `domain/matching.BuildNameVariants` + `MatchMembers` — port `_build_name_variants` and `match_members` from [match_payments.py](scripts/match_payments.py) (auto vs review confidence, common-surname filter) — `e596f00`
- [x] **M2.8** `domain/matching.InferTransactionDetails` — port `infer_transaction_details` (composes name + month parsing) — `e596f00`
- [x] **M2.9** `domain/matching.FormatDate` — port `format_date` (handles Google Sheets serial-day numbers since 1899-12-30) — `e596f00`
- [ ] **M2.10** `domain/reconcile.Reconcile` — port `reconcile` (three-phase allocation: greedy / proportional with float-remainder absorption / even-split fallback). The single most load-bearing function; budget extra time.
- [ ] **M2.11** `fuj fees` subcommand wired up via `domain/fees` + (M4-stub) attendance loader — fail gracefully on missing IO until M4 lands
- [ ] **M2.12** `fuj reconcile` subcommand similarly stubbed


@@ -0,0 +1,265 @@
## Context
Continuing the Go backend rewrite tracked in
[2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md).
M2.1 through M2.5 have landed. The next leaf-level pure function is `generate_sync_id`
from [scripts/sync_fio_to_sheets.py:62-77](../../srv/personal/fuj-management/scripts/sync_fio_to_sheets.py#L62-L77).
It computes a SHA-256 hash over a fixed seven-field projection of a Fio
transaction (`date|amount|currency|sender|vs|message|bank_id`) and is
the deduplication key written into column K (`Sync ID`) of the payments
sheet. The Go port must produce a **byte-identical** digest for the same
transaction; otherwise the Go-side sync (M4.7) would re-append rows
already written by the Python sync, double-counting payments.
The non-trivial part is the `amount` field's string serialisation:
upstream `fio_utils.py` always supplies `amount` as a Python `float`
(API path: `float(val(1) or 0)`; HTML path: `parse_czech_amount(...)`
which returns `float`). Python's `str(float)` produces `"500.0"` for
whole-valued floats; Go's `strconv.FormatFloat(f, 'g', -1, 64)` produces
`"500"`. This is the gotcha called out in the M2.6 line of the progress
tracker.
## Python behaviour (the spec)
```py
def generate_sync_id(tx: dict) -> str:
    components = [
        str(tx.get("date", "")),
        str(tx.get("amount", "")),
        str(tx.get("currency", "CZK")),
        str(tx.get("sender", "")),
        str(tx.get("vs", "")),
        str(tx.get("message", "")),
        str(tx.get("bank_id", "")),
    ]
    raw_str = "|".join(components).lower()
    return hashlib.sha256(raw_str.encode("utf-8")).hexdigest()
```
Behavioural notes for the Go port:
1. **Field order is load-bearing.** `date|amount|currency|sender|vs|message|bank_id` exactly.
2. **Separator is `"|"`.**
3. **Whole string is `.lower()`-ed before hashing** (so e.g. "ABC" sender vs "abc" hash identically). Unicode lower; in practice Fio data is ASCII + Czech diacritics.
4. **`currency` defaults to `"CZK"`** when missing from the dict (HTML scraper path never sets it). Other fields default to `""`.
5. **`amount` is a `float`.** Always. Real Fio data is `500.0`, `1234.56`, etc. — no NaN/Inf, but parity test must pin the format.
6. **Output is `hashlib.sha256(...).hexdigest()`** — 64-char lowercase hex.
7. **Encoding is UTF-8.**
### `str(float)` cases observed in real Fio amounts
| float64 | Python `str(f)` | Go `strconv.FormatFloat(f,'g',-1,64)` | Need |
|---|---|---|---|
| `500.0` | `"500.0"` | `"500"` | append `.0` |
| `1234.56` | `"1234.56"` | `"1234.56"` | matches |
| `0.0` | `"0.0"` | `"0"` | append `.0` |
| `-500.0` | `"-500.0"` | `"-500"` | append `.0` |
| `0.1` | `"0.1"` | `"0.1"` | matches |
| `99999.99` | `"99999.99"` | `"99999.99"` | matches |
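For whole or two-decimal amounts below 1,000,000 in magnitude, the rule
"`'g'` with prec -1, then append `.0` if result has no `.` and no
`e`/`E`" matches Python. It is not exact across the whole Fio amount
domain, though: in shortest mode Go's `'g'` switches to exponent form
once the decimal exponent reaches 6 (`1500000.0` formats as
`"1.5e+06"`), while Python's `str(float)` stays in fixed notation until
`1e16`. The implementation should therefore select `'f'` or `'e'`
explicitly at Python's crossover points (`|f| >= 1e16` or
`0 < |f| < 1e-4`) rather than rely on `'g'`.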
## Approach
Create new package `internal/domain/synch` mirroring the layout of
`internal/domain/money` (single-file module + test file alongside).
### Package + signature
```go
// Package synch ports the bank-sync deduplication helper from
// scripts/sync_fio_to_sheets.py.
package synch
// Transaction is the projection of a Fio transaction that participates
// in the Sync ID hash. Other fields (ks, ss, sender_account, …) are
// intentionally excluded — they are not part of the Python hash.
//
// Currency: leave "" to inherit the Python default of "CZK" (matches
// the HTML scraper path which omits the key entirely).
type Transaction struct {
	Date     string
	Amount   float64
	Currency string
	Sender   string
	VS       string
	Message  string
	BankID   string
}

// GenerateSyncID returns the lowercase SHA-256 hex digest of
// "date|amount|currency|sender|vs|message|bank_id" (lower-cased), used
// as the dedup key in column K of the payments sheet.
//
// Byte-stable with scripts/sync_fio_to_sheets.py generate_sync_id.
func GenerateSyncID(tx Transaction) string
```
### `Currency` default
In Go every struct field is always present, so we lose Python's
"missing key vs empty string" distinction. Real-world data either sets
`currency = "CZK"` (API path) or omits the key (HTML path → `"CZK"`
default). Empty string never occurs in practice. The Go port collapses
the two by treating `Currency == ""` as "use `CZK`":
```go
currency := tx.Currency
if currency == "" {
	currency = "CZK"
}
```
This is byte-equal to Python for every input we will ever see in
production, and avoids forcing callers to pass a `*string`.
### Float formatter
Internal helper, unexported:
```go
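// formatAmount mimics Python's str(float) for the float values that
// appear in Fio transactions. Python keeps fixed notation except when
// |f| >= 1e16 or 0 < |f| < 1e-4, where it switches to exponent form.
// Go's 'g' switches already at 1e6 in shortest mode, so the format is
// chosen explicitly: 'e' past the crossover, otherwise 'f' plus ".0"
// for whole values. Needs "math", "strconv", and "strings".
func formatAmount(f float64) string {
	if abs := math.Abs(f); abs != 0 && (abs >= 1e16 || abs < 1e-4) {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	s := strconv.FormatFloat(f, 'f', -1, 64)
	if !strings.Contains(s, ".") {
		s += ".0"
	}
	return s
}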
```
Tested explicitly (see Tests below) so the edge cases (`0`, whole
numbers, negatives, large/small with exponent) stay locked.
### Hash composition
```go
func GenerateSyncID(tx Transaction) string {
	currency := tx.Currency
	if currency == "" {
		currency = "CZK"
	}
	raw := strings.ToLower(strings.Join([]string{
		tx.Date,
		formatAmount(tx.Amount),
		currency,
		tx.Sender,
		tx.VS,
		tx.Message,
		tx.BankID,
	}, "|"))
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}
```
(`crypto/sha256` + `encoding/hex` — both stdlib, no `go.mod` change.)
## Tests
`synch_test.go` mirrors `money_test.go`'s table-driven style with the
verification snippet at the top of the function. Two test functions:
### 1. `TestGenerateSyncID`
Each row's expected digest is computed from the Python source:
```sh
PYTHONPATH=scripts:. python -c '
from sync_fio_to_sheets import generate_sync_id
cases = [
{"date":"2026-01-15","amount":500.0,"currency":"CZK","sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"},
{"date":"2026-01-15","amount":500.0,"sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"}, # currency missing → CZK
{"date":"2026-02-10","amount":1234.56,"currency":"CZK","sender":"ABC SRO","vs":"","message":"FAKTURA 42","bank_id":"xyz"}, # mixed case → lowercased
{"date":"2026-03-01","amount":-500.0,"currency":"CZK","sender":"refund","vs":"","message":"","bank_id":""}, # negative
{"date":"2026-04-01","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""}, # zero amount
{}, # empty dict — every field falls back to default
]
for c in cases:
    print(repr(c), "->", generate_sync_id(c))
'
```
Cases (one row per dict above), each asserting the exact 64-char hex
digest the snippet prints. Cover:
- Happy path with all fields set.
- `Currency: ""` → `"CZK"` default (parity with missing key).
- Mixed-case sender/message → lowercased before hashing.
- Negative amount.
- Zero amount.
- Zero-value `Transaction{}` — every field at Go zero, currency defaults
to `"CZK"`, hash matches Python `generate_sync_id({})`.
### 2. `TestFormatAmount`
Pin the float formatter against Python's `str(float)`:
```sh
PYTHONPATH=scripts:. python -c '
for v in [0.0, 500.0, -500.0, 0.1, 1234.56, 99999.99, 1500000.0, 1e16, 1e-5]:
    print(repr(v), "->", repr(str(v)))
'
```
Table of `(float64, expected string)` pairs. Whole numbers must end in
`.0`; existing decimal representations pass through unchanged;
exponent-form floats (`1e16`, `1e-5`) keep their format.
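As a rough illustration of what such a table could look like, the sketch below pairs each value from the snippet above with the string Python's `str(float)` prints for it. The committed `formatAmount` is not shown in this plan, so the helper here is an assumed stand-in that picks `'f'` or `'e'` explicitly; the `package main` wrapper and the `cases` name are illustrative only.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// formatAmount is an assumed stand-in for the unexported helper: it uses
// exponent form where Python's str(float) does (|f| >= 1e16 or
// 0 < |f| < 1e-4) and fixed notation with a trailing ".0" otherwise.
func formatAmount(f float64) string {
	if abs := math.Abs(f); abs != 0 && (abs >= 1e16 || abs < 1e-4) {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	s := strconv.FormatFloat(f, 'f', -1, 64)
	if !strings.Contains(s, ".") {
		s += ".0"
	}
	return s
}

func main() {
	// Each want value is what Python's str(float) prints for that input.
	cases := []struct {
		in   float64
		want string
	}{
		{0.0, "0.0"},
		{500.0, "500.0"},
		{-500.0, "-500.0"},
		{0.1, "0.1"},
		{1234.56, "1234.56"},
		{99999.99, "99999.99"},
		{1500000.0, "1500000.0"},
		{1e16, "1e+16"},
		{1e-5, "1e-05"},
	}
	for _, c := range cases {
		got := formatAmount(c.in)
		fmt.Printf("%q want %q match=%t\n", got, c.want, got == c.want)
	}
}
```

The `1500000.0` row is the one worth keeping: it separates an explicit `'f'`/`'e'` choice from a plain `'g'` format, which already uses exponent form at that magnitude.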
## Files to create
- `go/internal/domain/synch/synch.go` — package, `Transaction`,
`GenerateSyncID`, internal `formatAmount`.
- `go/internal/domain/synch/synch_test.go`: `TestGenerateSyncID` +
`TestFormatAmount`.
No existing Go files need editing.
## Verification
```sh
cd go && go test ./internal/domain/synch/...
make go-lint
make go-build # sanity: nothing else broke
```
Plus run the two Python snippets in the Tests section and diff their
output against the test tables to confirm parity.
## Out of scope (explicit non-goals)
- **Hooking into the Tier-1 parity runner.** That comes with M3.5
(`-tags=parity` build constraint and `tests/fixtures/pure/`). M2.6
ships with hand-written, Python-verified test tables — same approach
used by M2.1 through M2.5.
- **A richer `Transaction` struct** covering ks/ss/note/sender_account.
Those fields aren't part of the hash. M4.4 (Fio IO adapter) will
decide whether to reuse `synch.Transaction` or define its own struct
and convert at the boundary.
- **Polymorphic input** (e.g. accepting a `map[string]any`). Python's
duck-typing is a non-goal in Go.
- **Any Python callsite migration.** `sync_fio_to_sheets.py` keeps using
its own `generate_sync_id` until M4.7 ports the sync service.
## Progress tracker + changelog
After the commit lands:
- Tick `M2.6` in
[docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md)
with the commit SHA, mirroring the M2.5 entry style.
- Add a `CHANGELOG.md` entry at top:
`## YYYY-MM-DD HH:MM TZ — feat(go/M2.6): port domain/synch.GenerateSyncID`.
Branch: `feat/m2-6-synch-generate-sync-id` (per CLAUDE.md
branch-per-feature workflow). Push, open MR via `tea pr create`, leave
merge to the user.


@@ -0,0 +1,126 @@
# M2.7 + M2.8 + M2.9 — Port `matching` package to Go
> On approval: copy this plan to `docs/plans/2026-05-06-1305-go-m2-7-2-9-matching.md` per [CLAUDE.md](../../srv/personal/fuj-management/CLAUDE.md) plan-location convention.
## Context
The Go rewrite (tracked in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md)) is in milestone M2 — porting pure-domain helpers leaf-first from Python to Go. M2.1 through M2.6 are complete (`czech.Normalize`, `czech.ParseMonthReferences`, `fees.CalculateFee`, `fees.CalculateJuniorFee`, `money.ParseCZK`, `synch.GenerateSyncID`).
M2.7, M2.8, and M2.9 cover three helpers from [scripts/match_payments.py](../../srv/personal/fuj-management/scripts/match_payments.py) that form a tight chain: `InferTransactionDetails` calls `MatchMembers` which calls `BuildNameVariants` and the same Sheets-serial date logic that `FormatDate` uses. The user requested they be done together because the dependency graph makes per-milestone commits awkward — `MatchMembers` would either reference an unexported helper not yet committed or commit dead code.
This unblocks M2.10 (`reconcile`, the load-bearing function) and M5 parity tests, since reconciliation consumes `InferTransactionDetails` output.
## Approach
**One commit, one branch, one MR.** Branch: `feat/m2-7-2-9-matching-package`. The three milestone checkboxes get ticked together on merge.
### Package layout
New package `go/internal/domain/matching/` mirroring the existing `go/internal/domain/{czech,fees,money,synch}` convention (one file per public symbol, tests alongside as `*_test.go`):
| File | Contents |
|---|---|
| `doc.go` | `// Package matching ports name/member matching from scripts/match_payments.py.` |
| `name_variants.go` | `BuildNameVariants` + unexported `wordIn` helper (mirrors Python's `_word_in` co-location at [match_payments.py:60-62](../../srv/personal/fuj-management/scripts/match_payments.py#L60)) |
| `match_members.go` | `Confidence` typed string + constants, `Match` struct, `MatchMembers` |
| `infer.go` | `Transaction`, `InferredDetails`, `InferTransactionDetails` |
| `format_date.go` | `FormatDate` |
| `name_variants_test.go`, `match_members_test.go`, `infer_test.go`, `format_date_test.go` | table-driven tests, each with a top-of-file comment quoting the live Python one-liner used to verify expected values (mirrors [synch_test.go:7-20](../../srv/personal/fuj-management/go/internal/domain/synch/synch_test.go#L7)) |
### Public API
```go
type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

func BuildNameVariants(name string) []string
func MatchMembers(text string, memberNames []string) []Match

type Transaction struct {
	Sender  string
	Message string
	UserID  string
	Date    any // string | int | float64 — see "Parity concerns"
}

type InferredDetails struct {
	Members    []Match
	Months     []string
	SearchText string // matches Python's "search_text" key, not the misleading "matched_text" docstring
}

func InferTransactionDetails(tx Transaction, memberNames []string, defaultYear int) InferredDetails
func FormatDate(val any) string
```
### Algorithms (port verbatim — these are the load-bearing details)
**`BuildNameVariants`** ([match_payments.py:33-57](../../srv/personal/fuj-management/scripts/match_payments.py#L33)): extract `(nickname)` regex, strip parens for `base`, normalize via `czech.Normalize`, append last + first when ≥2 parts, **filter <3 chars**. `variants[0]` must always be the full normalized base — `MatchMembers` relies on this.
**`MatchMembers`** ([match_payments.py:65-137](../../srv/personal/fuj-management/scripts/match_payments.py#L65)):
1. **Exact short-circuit** ([:77-84](../../srv/personal/fuj-management/scripts/match_payments.py#L77)): if any member's `variants[0]` whole-word matches in `Normalize(text)`, return ONLY those `(name, auto)`. Prevents nickname `tov` matching inside `ottova`.
2. Otherwise per-member first-match-wins: full-name substring → `\b first \b` AND `\b last \b` (any order) → `\b nickname \b` — each yields `auto` and continues.
3. **Review tier** ([:113-129](../../srv/personal/fuj-management/scripts/match_payments.py#L113)): ≥2-part names → last name `len ≥ 4` AND not in `{"novak","novakova","prach"}` → review; else first name `len ≥ 3` → review. 1-part names → `len ≥ 4` → review.
4. **Final filter** ([:131-137](../../srv/personal/fuj-management/scripts/match_payments.py#L131)): if ANY auto exists, drop ALL review. Two-pass — don't try to fuse with the loop.
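The final filter in step 4 can be sketched as a separate pass over the collected matches. This is a minimal illustration, not the ported code: `dropReviewIfAuto` is a hypothetical name, and the `Confidence`/`Match` types are repeated from the Public API section only so the snippet compiles on its own.

```go
package main

import "fmt"

type Confidence string

const (
	ConfidenceAuto   Confidence = "auto"
	ConfidenceReview Confidence = "review"
)

type Match struct {
	Name       string
	Confidence Confidence
}

// dropReviewIfAuto (hypothetical helper) runs as a second pass: the
// first loop only detects whether any auto match exists, the second
// drops every review match when one does.
func dropReviewIfAuto(matches []Match) []Match {
	hasAuto := false
	for _, m := range matches {
		if m.Confidence == ConfidenceAuto {
			hasAuto = true
			break
		}
	}
	// Allocate a non-nil slice so callers get [] rather than nil.
	out := make([]Match, 0, len(matches))
	for _, m := range matches {
		if hasAuto && m.Confidence == ConfidenceReview {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	mixed := []Match{
		{Name: "Member A", Confidence: ConfidenceReview},
		{Name: "Member B", Confidence: ConfidenceAuto},
	}
	fmt.Println(dropReviewIfAuto(mixed))
	fmt.Println(dropReviewIfAuto(mixed[:1]))
}
```

Keeping detection and filtering in separate loops means a review match found before the first auto match is still dropped, which is why the step should not be fused into the per-member loop.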
**`InferTransactionDetails`** ([match_payments.py:144-184](../../srv/personal/fuj-management/scripts/match_payments.py#L144)): `search_text = sender + " " + message + " " + user_id`; month parse uses `message + " " + user_id` (excludes sender); fallback 1 retries members on sender alone; fallback 2 derives months from `tx.Date` (Sheets serial or `YYYY-MM-DD`).
**`FormatDate`** ([match_payments.py:187-206](../../srv/personal/fuj-management/scripts/match_payments.py#L187)): nil/empty → `""`; int/float → Sheets serial since 1899-12-30 formatted `YYYY-MM-DD`; pre-formatted `YYYY-MM-DD` (length 10, dashes at idx 4/7) → as-is; else `strings.TrimSpace(fmt.Sprint(v))`. **No raise on bad input** — parity contract.
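The `FormatDate` rules above could look roughly like the sketch below. It is an assumption-laden illustration rather than the ported function: `formatDate` and `serialToDate` are hypothetical names, and only `int`, `float64`, and `string` inputs get dedicated branches.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sheetsEpoch is day zero of the Google Sheets serial-date system.
var sheetsEpoch = time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)

// serialToDate converts a possibly fractional serial-day count to
// YYYY-MM-DD. The fraction is added as a duration and then discarded by
// the date-only layout. Caveat of this sketch: time.Duration overflows
// for serials beyond roughly 106,000 days.
func serialToDate(days float64) string {
	d := time.Duration(days * 24 * float64(time.Hour))
	return sheetsEpoch.Add(d).Format("2006-01-02")
}

// formatDate (hypothetical stand-in for FormatDate) never returns an error.
func formatDate(v any) string {
	switch x := v.(type) {
	case nil:
		return ""
	case int:
		return serialToDate(float64(x))
	case float64:
		return serialToDate(x)
	case string:
		// Pre-formatted YYYY-MM-DD passes through unchanged.
		if len(x) == 10 && x[4] == '-' && x[7] == '-' {
			return x
		}
		return strings.TrimSpace(x)
	default:
		return strings.TrimSpace(fmt.Sprint(v))
	}
}

func main() {
	for _, v := range []any{nil, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "} {
		fmt.Printf("%#v -> %q\n", v, formatDate(v))
	}
}
```

The empty string needs no branch of its own here because trimming it already yields `""`.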
## Parity concerns
- **RE2 `\b`**: Equivalent to Python `\b` on ASCII-folded input (`Normalize` strips diacritics + lowercases). Use `regexp.QuoteMeta` for `re.escape`.
- **Sheets epoch**: 1899-12-30 (NOT 1900-01-01). `time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)`.
- **Fractional serials**: Python `timedelta(days=44197.5)` adds 12 hours, then `.strftime("%Y-%m-%d")` discards time. To match exactly use `base.Add(time.Duration(val * 24 * float64(time.Hour)))` then `Format("2006-01-02")`. **Do NOT** use `base.AddDate(0, 0, int(val))` — that silently drops fractional days from real Sheets exports of timestamped cells.
- **`Transaction.Date any`**: Python `tx["date"]` accepts int/float/string transparently. Sheets API returns serial dates as `float64` from JSON; FIO scraper returns `string`. `any` is the faithful port; type-switch inside `FormatDate` and the date fallback in `InferTransactionDetails`.
- **`SearchText` vs `MatchedText`**: Python docstring says `matched_text`, code returns `"search_text"`. Port the code, not the docstring.
- **Default year plumbing**: Go's `czech.ParseMonthReferences(text, defaultYear)` requires explicit year. Python defaults to 2026. Plumb `defaultYear` as the third arg to `InferTransactionDetails`.
- **Empty slices not nil**: Python `match_members` returns `[]` when nothing matches; ensure Go returns `[]Match{}` not `nil` so consumers don't have to nil-check (matches `synch` package style).
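For the RE2 `\b` point, a whole-word check might be sketched as follows. The body of the real `wordIn` helper is not shown in this plan, so this version is an assumption; it also expects both arguments to have already gone through `czech.Normalize`.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordIn (assumed shape of the unexported helper) reports whether word
// occurs in text bounded by word boundaries. regexp.QuoteMeta plays the
// role of Python's re.escape.
func wordIn(word, text string) bool {
	re := regexp.MustCompile(`\b` + regexp.QuoteMeta(word) + `\b`)
	return re.MatchString(text)
}

func main() {
	// "tov" sits inside "ottova" with word characters on both sides,
	// so there is no boundary and no match.
	fmt.Println(wordIn("tov", "platba ottova"))
	// Here "tov" stands alone between spaces.
	fmt.Println(wordIn("tov", "platba tov 1/2026"))
}
```

Compiling a pattern on every call keeps the sketch short; a real port could cache compiled patterns per variant if matching turns out to be hot.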
## Tests
Port all 6 cases from [tests/test_match_members.py](../../srv/personal/fuj-management/tests/test_match_members.py) verbatim into `match_members_test.go` as one table-driven `TestMatchMembers`. Each row: `name`, `text`, `wantContains []string`, `wantExcludes []string`, `wantAllAuto bool`.
Add table cases for:
- `BuildNameVariants` — docstring example `František Vrbík (Štrúdl)` → 4 variants; nickname filtered (len<3); single-part name; whitespace inside parens
- `FormatDate`: `nil` → `""`, `""` → `""`, `int(44197)` → `"2021-01-01"`, `float64(44197.5)` → `"2021-01-01"`, `"2026-04-15"` → `"2026-04-15"`, `"garbage"` → `"garbage"`, `" 2026-04-15 "` → `"2026-04-15"`
- `InferTransactionDetails` — members from search_text, members from sender fallback, months from date-string fallback, months from serial-date fallback, both-paths-fail returns empty slices
Verify expectations against live Python and quote the one-liner in a top-of-file comment, e.g.:
```
PYTHONPATH=scripts:. python -c '
from match_payments import format_date
for v in [None, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "]: print(repr(format_date(v)))
'
```
## Critical files
- **Read for parity** — [scripts/match_payments.py:33-206](../../srv/personal/fuj-management/scripts/match_payments.py#L33), [tests/test_match_members.py](../../srv/personal/fuj-management/tests/test_match_members.py)
- **Reuse** — `czech.Normalize` ([go/internal/domain/czech/normalize.go](../../srv/personal/fuj-management/go/internal/domain/czech/normalize.go#L15)), `czech.ParseMonthReferences` ([parse_month_references.go:61](../../srv/personal/fuj-management/go/internal/domain/czech/parse_month_references.go#L61))
- **Mirror conventions** — [go/internal/domain/synch/synch.go](../../srv/personal/fuj-management/go/internal/domain/synch/synch.go), [go/internal/domain/synch/synch_test.go](../../srv/personal/fuj-management/go/internal/domain/synch/synch_test.go)
- **New** — `go/internal/domain/matching/{doc,name_variants,match_members,infer,format_date}.go` + `*_test.go`
## Out of scope (M2.10 / M4 territory — DO NOT touch)
- `canonical_member_key` ([match_payments.py:20](../../srv/personal/fuj-management/scripts/match_payments.py#L20))
- `reconcile`, `fetch_sheet_data`, `fetch_exceptions` — M2.10 / M4
- Sheets/Drive/FIO I/O glue
- Fixture capture (`tests/fixtures/pure/`) — M3.3 separately
## Verification
1. `cd go && make go-build` — clean build.
2. `cd go && make go-test ./internal/domain/matching/...` — all table tests green.
3. `cd go && make go-lint` — clean (govet, staticcheck, errcheck, gofumpt, unused).
4. Spot-check: pick 2-3 random non-trivial cases (e.g. `MatchMembers` with mixed auto/review, `FormatDate(44197.5)`) and run the live Python one-liner from each test's comment block to confirm bytes match.
5. Append CHANGELOG entry per [CLAUDE.md](../../srv/personal/fuj-management/CLAUDE.md) (timestamp via `date "+%Y-%m-%d %H:%M %Z"`).
6. Tick M2.7, M2.8, M2.9 in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md) with the merge SHA.
7. Push branch, open MR via `tea pr create --title "feat(go): port matching helpers (M2.7-2.9)" --base main --head feat/m2-7-2-9-matching-package`, print URL, leave merge to user.


@@ -0,0 +1,129 @@
# Include junior members in payment inference roster
## Context
A bank payment from sender `JIŘÍ KUBÍK` with the message
`Jáchym Kubík: 01/2026+03/2026+04/2026` is being inferred as
`[?] Jáchym Hrušák (G)` instead of the obvious `Jáchym Kubík`, even though
the message contains his exact full name.
**Root cause** (confirmed with the user): `Jáchym Kubík` is in the **junior**
attendance sheet only — he does not appear on the main/adults sheet. But
[scripts/infer_payments.py:101-102](scripts/infer_payments.py#L101-L102)
builds `member_names` by calling `get_members_with_fees()`
([scripts/attendance.py:170](scripts/attendance.py#L170)), which reads only
`EXPORT_URL` (the adults sheet). Junior-only members are therefore invisible
to the matcher.
With Kubík absent from `member_names`, the matcher in
[scripts/match_payments.py:65](scripts/match_payments.py#L65) processes the
combined text `jiri kubik jachym kubik: 01/2026+03/2026+04/2026` against an
adults-only roster:
- The exact-full-name short-circuit (`match_payments.py:75-84`) finds nothing —
no adult's full name is in the text.
- Hrušák `(G)` is the only adult with first name `Jáchym`. He fails the
auto-rules (his surname isn't in the text) but hits the partial-first-name
review rule (`match_payments.py:123-125`) → returned as `("Jáchym Hrušák (G)",
"review")`, rendered as `[?] Jáchym Hrušák (G)`.
The user's original framing — "exact match in message should win over
everything" — is already implemented for any candidate that **is** in the
roster (the May-04 short-circuit). The bug is upstream: the right candidate
was never even considered.
**Goal:** make `infer_payments` consider junior members as candidates, so
junior-only names like `Jáchym Kubík` get matched correctly.
## Approach
Single-file change in [scripts/infer_payments.py](scripts/infer_payments.py).
Replace the adults-only roster lookup with a union of the adult and junior
rosters. `attendance.py` already exposes both:
[`get_members_with_fees()`](scripts/attendance.py#L170) for adults (and tier-J
juniors who train with adults) and
[`get_junior_members_with_fees()`](scripts/attendance.py#L208) for everyone in
the junior sheet.
### Edit at [scripts/infer_payments.py:15](scripts/infer_payments.py#L15)
```python
from attendance import get_members_with_fees, get_junior_members_with_fees
```
### Edit at [scripts/infer_payments.py:99-102](scripts/infer_payments.py#L99-L102)
```python
print("Fetching member list for matching...")
adult_members, _ = get_members_with_fees()
junior_members, _ = get_junior_members_with_fees()
# Union rosters, preserving first-seen order, deduping by canonical key
seen: set[str] = set()
member_names: list[str] = []
for m in adult_members + junior_members:
    name = m[0]
    key = canonical_member_key(name)
    if key in seen:
        continue
    seen.add(key)
    member_names.append(name)
```
`canonical_member_key` already lives in
[scripts/match_payments.py:20](scripts/match_payments.py#L20) — import it
alongside `infer_transaction_details`. It normalizes diacritics/case/whitespace,
so `"Maria Maco"` and `"Mária Maco"` collapse to the same key.
### Why downstream reconciliation still works
`reconcile()` is invoked twice per page — once with the adults roster
([app.py:200](app.py#L200)) and once with the juniors roster
([app.py:384](app.py#L384)). Each call resolves the `Person` cell against its
own roster; a junior name resolves cleanly in the juniors call and lands in
"unmatched" in the adults call. That's already the existing behavior for any
junior payment manually entered into the `Person` column, so no further
changes are needed.
### Files to modify
- [scripts/infer_payments.py](scripts/infer_payments.py) — only the
import + roster construction. ~10-line change.
### Files to read for confidence (no edits)
- [scripts/attendance.py:208-289](scripts/attendance.py#L208-L289) —
`get_junior_members_with_fees` returns `(name, tier, …)` tuples just like
the adults version, so `m[0]` works for both.
- [scripts/match_payments.py:65-137](scripts/match_payments.py#L65-L137) —
`match_members` already handles the precedence the user wants (exact full-name
short-circuit), so once Kubík is in `member_names`, the case will be auto-matched
with no `[?]`.
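The precedence can be sketched in a few lines. This is a deliberately reduced model with only two tiers (exact full name, then first name); `match_sketch` and its helpers are hypothetical names, and the real `match_members` has more tiers and filters.

```python
import re
import unicodedata


def normalize(text):
    """Strip diacritics and lowercase (assumed to mirror the real normalizer)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def base_name(member):
    """Drop a parenthesised nickname such as "(G)" and tidy whitespace."""
    return " ".join(re.sub(r"\([^)]*\)", " ", member).split())


def word_in(needle, haystack):
    """True if needle is non-empty and occurs as whole words in haystack."""
    if not needle:
        return False
    return re.search(r"\b" + re.escape(needle) + r"\b", haystack) is not None


def match_sketch(text, roster):
    haystack = normalize(text)
    # Pass 1: an exact full-name hit short-circuits everything else.
    exact = [
        (m, "auto") for m in roster if word_in(normalize(base_name(m)), haystack)
    ]
    if exact:
        return exact
    # Pass 2 (reduced): a first-name-only hit is flagged for review.
    review = []
    for m in roster:
        parts = normalize(base_name(m)).split()
        if parts and len(parts[0]) >= 3 and word_in(parts[0], haystack):
            review.append((m, "review"))
    return review


message = "JIŘÍ KUBÍK Jáchym Kubík: 01/2026+03/2026+04/2026"
print(match_sketch(message, ["Jáchym Hrušák (G)", "Jáchym Kubík"]))
print(match_sketch(message, ["Jáchym Hrušák (G)"]))
```

With Kubík in the roster the exact pass returns him alone; without him the only candidate left is the first-name review match on Hrušák, which is the original bug.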
## Verification
1. **Manual sanity** — re-run inference on the offending row:
- Clear `Person`/`Purpose` for the Kubík row in the payments sheet.
- `make infer`.
- Expect `Person = Jáchym Kubík`, `Purpose = 2026-01, 2026-03, 2026-04`,
no `[?]`.
2. **Unit test** — extend
[tests/test_match_members.py](tests/test_match_members.py) (or add a small
`tests/test_infer_payments.py`) to assert that, given a roster that
includes `Jáchym Hrušák (G)` and `Jáchym Kubík`, the message
`Jáchym Kubík: 01/2026+03/2026+04/2026` resolves to
`[("Jáchym Kubík", "auto")]` only. This is really a regression test for
the May-04 short-circuit — the new behavior under test is just that
`infer_payments` now feeds in juniors.
3. **Run the suite**: `make test`.
4. **Dashboard smoke**: `make web`, open `/payments`, confirm the row now
shows the correct member; open `/juniors`, confirm the payment is
credited to Kubík for the three months listed.
5. **Changelog** — once the user confirms the fix, append an entry to
[CHANGELOG.md](CHANGELOG.md) per [CLAUDE.md](CLAUDE.md):
`## YYYY-MM-DD HH:MM TZ — fix: include juniors in payment-inference roster`.
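For step 1, the expected `Purpose` value follows from the `MM/YYYY` references in the message. A minimal sketch of that mapping, assuming only the `MM/YYYY` form matters here; `months_from_message` is a hypothetical helper, and the project's real month parser handles more formats than this.

```python
import re

# One or two month digits, a slash, then a four-digit year.
MONTH_REF = re.compile(r"\b(\d{1,2})/(\d{4})\b")


def months_from_message(message):
    """Return unique "YYYY-MM" strings in order of first appearance."""
    months = []
    for month, year in MONTH_REF.findall(message):
        if 1 <= int(month) <= 12:
            ym = f"{year}-{int(month):02d}"
            if ym not in months:
                months.append(ym)
    return months


purpose = ", ".join(months_from_message("Jáchym Kubík: 01/2026+03/2026+04/2026"))
print(purpose)
```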

@@ -0,0 +1,2 @@
// Package matching ports name/member matching from scripts/match_payments.py.
package matching

@@ -0,0 +1,41 @@
package matching
import (
"fmt"
"strings"
"time"
)
var sheetsEpoch = time.Date(1899, 12, 30, 0, 0, 0, 0, time.UTC)
// FormatDate normalizes a date value from Google Sheets.
//
// Accepts nil, empty string, int/int64/float64 Sheets serial days since 1899-12-30,
// a pre-formatted "YYYY-MM-DD" string (returned as-is), or any other value
// (returned as strings.TrimSpace(fmt.Sprint(v))). Never returns an error.
//
// Ports scripts/match_payments.py format_date.
func FormatDate(val any) string {
if val == nil {
return ""
}
switch v := val.(type) {
case int:
return sheetsEpoch.Add(time.Duration(float64(v) * 24 * float64(time.Hour))).Format("2006-01-02")
case int64:
return sheetsEpoch.Add(time.Duration(float64(v) * 24 * float64(time.Hour))).Format("2006-01-02")
case float64:
return sheetsEpoch.Add(time.Duration(v * 24 * float64(time.Hour))).Format("2006-01-02")
case string:
s := strings.TrimSpace(v)
if s == "" {
return ""
}
if len(s) == 10 && s[4] == '-' && s[7] == '-' {
return s
}
return s
default:
return strings.TrimSpace(fmt.Sprint(v))
}
}

@@ -0,0 +1,49 @@
package matching
// Expected values verified against scripts/match_payments.py on 2026-05-06:
//
// PYTHONPATH=scripts:. python3 -c '
// from match_payments import format_date
// for v in [None, "", 44197, 44197.5, "2026-04-15", "garbage", " 2026-04-15 "]:
// print(repr(format_date(v)))
// '
//
// Output:
//
// ''
// ''
// '2021-01-01'
// '2021-01-01'
// '2026-04-15'
// 'garbage'
// '2026-04-15'
import "testing"
func TestFormatDate(t *testing.T) {
t.Parallel()
cases := []struct {
name string
input any
want string
}{
{name: "nil", input: nil, want: ""},
{name: "empty string", input: "", want: ""},
{name: "serial int", input: int(44197), want: "2021-01-01"},
{name: "serial float fractional", input: float64(44197.5), want: "2021-01-01"},
{name: "already formatted", input: "2026-04-15", want: "2026-04-15"},
{name: "garbage string", input: "garbage", want: "garbage"},
{name: "padded date string trimmed", input: " 2026-04-15 ", want: "2026-04-15"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := FormatDate(tc.input)
if got != tc.want {
t.Errorf("FormatDate(%v) = %q, want %q", tc.input, got, tc.want)
}
})
}
}

@@ -0,0 +1,89 @@
package matching
import (
"fmt"
"fuj-management/go/internal/domain/czech"
"time"
)
// Transaction is the subset of a payment row used by InferTransactionDetails.
// Date accepts string ("YYYY-MM-DD"), float64 (Sheets serial), or int — matching
// the heterogeneous types returned by the Sheets API and the FIO scraper.
type Transaction struct {
Sender string
Message string
UserID string
Date any
}
// InferredDetails is the result of InferTransactionDetails.
type InferredDetails struct {
Members []Match
Months []string
SearchText string
}
// InferTransactionDetails infers which member(s) and month(s) a transaction belongs to.
//
// Search text for member matching: sender + message + user_id.
// Month search text: message + user_id only (sender excluded, matching Python).
// Fallback 1: if no members found, retry match on sender alone.
// Fallback 2: if no months found, derive from tx.Date (Sheets serial or YYYY-MM-DD).
//
// defaultYear seeds czech.ParseMonthReferences (Python defaulted to the current year;
// callers should pass time.Now().Year() or a fixed year for deterministic tests).
//
// Ports scripts/match_payments.py infer_transaction_details.
func InferTransactionDetails(tx Transaction, memberNames []string, defaultYear int) InferredDetails {
searchText := fmt.Sprintf("%s %s %s", tx.Sender, tx.Message, tx.UserID)
members := MatchMembers(searchText, memberNames)
months := czech.ParseMonthReferences(tx.Message+" "+tx.UserID, defaultYear)
if len(members) == 0 {
members = MatchMembers(tx.Sender, memberNames)
}
if len(months) == 0 && tx.Date != nil && tx.Date != "" {
if ym := inferMonthFromDate(tx.Date); ym != "" {
months = []string{ym}
}
}
if months == nil {
months = []string{}
}
return InferredDetails{
Members: members,
Months: months,
SearchText: searchText,
}
}
// inferMonthFromDate converts a date value to "YYYY-MM" for the month fallback.
// Returns "" on any error, matching Python's bare except pass.
func inferMonthFromDate(val any) string {
switch v := val.(type) {
case int:
dt := sheetsEpoch.Add(time.Duration(float64(v) * 24 * float64(time.Hour)))
return dt.Format("2006-01")
case int64:
dt := sheetsEpoch.Add(time.Duration(float64(v) * 24 * float64(time.Hour)))
return dt.Format("2006-01")
case float64:
dt := sheetsEpoch.Add(time.Duration(v * 24 * float64(time.Hour)))
return dt.Format("2006-01")
case string:
if v == "" {
return ""
}
dt, err := time.Parse("2006-01-02", v)
if err != nil {
return ""
}
return dt.Format("2006-01")
default:
return ""
}
}

@@ -0,0 +1,108 @@
package matching
// Expected values verified against scripts/match_payments.py on 2026-05-06:
//
// PYTHONPATH=scripts:. python3 << 'EOF'
// from match_payments import infer_transaction_details
// MEMBERS = ["Tomáš Němeček (Tov)", "Jana Nováková"]
// cases = [
// ({"sender":"Tomas Nemecek","message":"clenske 04/2026","user_id":"","date":"2026-04-15"}, "full match"),
// ({"sender":"Tomas Nemecek","message":"","user_id":"","date":"2026-04-15"}, "sender fallback month"),
// ({"sender":"Jana Novakova","message":"","user_id":"","date":44197}, "serial int date"),
// ({"sender":"neznamy","message":"","user_id":"","date":""}, "no match"),
// ({"sender":"Tomas Nemecek","message":"","user_id":"","date":44197.5}, "serial float date"),
// ]
// for tx, label in cases:
// r = infer_transaction_details(tx, MEMBERS)
// print(label + ": members=" + repr(r["members"]) + " months=" + repr(r["months"]) + " search_text=" + repr(r["search_text"]))
// EOF
//
// Output:
//
// full match: members=[('Tomáš Němeček (Tov)', 'auto')] months=['2026-04'] search_text='Tomas Nemecek clenske 04/2026 '
// sender fallback month: members=[('Tomáš Němeček (Tov)', 'auto')] months=['2026-04'] search_text='Tomas Nemecek '
// serial int date: members=[('Jana Nováková', 'auto')] months=['2021-01'] search_text='Jana Novakova '
// no match: members=[] months=[] search_text='neznamy '
// serial float date: members=[('Tomáš Němeček (Tov)', 'auto')] months=['2021-01'] search_text='Tomas Nemecek '
import (
"reflect"
"testing"
)
var inferMembers = []string{"Tomáš Němeček (Tov)", "Jana Nováková"}
func TestInferTransactionDetails(t *testing.T) {
t.Parallel()
cases := []struct {
name string
tx Transaction
defaultYear int
wantMembers []Match
wantMonths []string
wantSearchText string
}{
{
name: "full match — members and months from search text",
tx: Transaction{Sender: "Tomas Nemecek", Message: "clenske 04/2026", UserID: "", Date: "2026-04-15"},
defaultYear: 2026,
wantMembers: []Match{{Name: "Tomáš Němeček (Tov)", Confidence: ConfidenceAuto}},
wantMonths: []string{"2026-04"},
// Python: sender + " " + message + " " + user_id (no trim)
wantSearchText: "Tomas Nemecek clenske 04/2026 ",
},
{
// months not in message → fall back to date string
name: "months fall back to date string",
tx: Transaction{Sender: "Tomas Nemecek", Message: "", UserID: "", Date: "2026-04-15"},
defaultYear: 2026,
wantMembers: []Match{{Name: "Tomáš Němeček (Tov)", Confidence: ConfidenceAuto}},
wantMonths: []string{"2026-04"},
wantSearchText: "Tomas Nemecek ",
},
{
// months fall back to Sheets serial int date
name: "months fall back to serial int date",
tx: Transaction{Sender: "Jana Novakova", Message: "", UserID: "", Date: int(44197)},
defaultYear: 2026,
wantMembers: []Match{{Name: "Jana Nováková", Confidence: ConfidenceAuto}},
wantMonths: []string{"2021-01"},
wantSearchText: "Jana Novakova ",
},
{
// months fall back to Sheets serial float64 date
name: "months fall back to serial float date",
tx: Transaction{Sender: "Tomas Nemecek", Message: "", UserID: "", Date: float64(44197.5)},
defaultYear: 2026,
wantMembers: []Match{{Name: "Tomáš Němeček (Tov)", Confidence: ConfidenceAuto}},
wantMonths: []string{"2021-01"},
wantSearchText: "Tomas Nemecek ",
},
{
name: "no match — both slices empty not nil",
tx: Transaction{Sender: "neznamy", Message: "", UserID: "", Date: ""},
defaultYear: 2026,
wantMembers: []Match{},
wantMonths: []string{},
wantSearchText: "neznamy ",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := InferTransactionDetails(tc.tx, inferMembers, tc.defaultYear)
if !reflect.DeepEqual(got.Members, tc.wantMembers) {
t.Errorf("Members\n got %v\n want %v", got.Members, tc.wantMembers)
}
if !reflect.DeepEqual(got.Months, tc.wantMonths) {
t.Errorf("Months\n got %v\n want %v", got.Months, tc.wantMonths)
}
if got.SearchText != tc.wantSearchText {
t.Errorf("SearchText\n got %q\n want %q", got.SearchText, tc.wantSearchText)
}
})
}
}

@@ -0,0 +1,131 @@
package matching
import (
"fuj-management/go/internal/domain/czech"
"strings"
)
// Confidence indicates how certain a member match is.
type Confidence string
const (
ConfidenceAuto Confidence = "auto"
ConfidenceReview Confidence = "review"
)
// Match pairs a canonical member name with the confidence of the match.
type Match struct {
Name string
Confidence Confidence
}
var commonSurnames = map[string]bool{
"novak": true,
"novakova": true,
"prach": true,
}
// MatchMembers finds members mentioned in text and returns them with a
// confidence level of "auto" (reliable) or "review" (needs human verification).
//
// Algorithm (ported verbatim from scripts/match_payments.py match_members):
// 1. Exact short-circuit: if any member's full normalized name appears as whole
// words in normalize(text), return ONLY those matches as auto. This prevents
// nickname "tov" from matching inside surname "ottova".
// 2. Per-member first-match-wins: full-name substring → first+last both present
// (any order) → nickname whole-word. Each yields auto.
// 3. Review tier: last name (len≥4, not a common surname) → first name (len≥3)
// → single-part name (len≥4). Each yields review.
// 4. Final filter: if any auto exists, drop all review.
func MatchMembers(text string, memberNames []string) []Match {
normalizedText := czech.Normalize(text)
// Pass 1: exact short-circuit
var exactMatches []Match
for _, name := range memberNames {
variants := BuildNameVariants(name)
if len(variants) == 0 {
continue
}
fullName := variants[0]
if fullName != "" && wordIn(fullName, normalizedText) {
exactMatches = append(exactMatches, Match{Name: name, Confidence: ConfidenceAuto})
}
}
if len(exactMatches) > 0 {
return exactMatches
}
// Pass 2 + 3: fuzzy matching
var matches []Match
for _, name := range memberNames {
variants := BuildNameVariants(name)
fullName := ""
if len(variants) > 0 {
fullName = variants[0]
}
parts := strings.Fields(fullName)
// Auto tier
if fullName != "" && strings.Contains(normalizedText, fullName) {
matches = append(matches, Match{Name: name, Confidence: ConfidenceAuto})
continue
}
if len(parts) >= 2 {
if wordIn(parts[0], normalizedText) && wordIn(parts[len(parts)-1], normalizedText) {
matches = append(matches, Match{Name: name, Confidence: ConfidenceAuto})
continue
}
}
// Nickname check
if m := nicknameRe.FindStringSubmatch(name); m != nil {
nick := czech.Normalize(m[1])
if nick != "" && wordIn(nick, normalizedText) {
matches = append(matches, Match{Name: name, Confidence: ConfidenceAuto})
continue
}
}
// Review tier
if len(parts) >= 2 {
lastName := parts[len(parts)-1]
firstName := parts[0]
if len(lastName) >= 4 && !commonSurnames[lastName] && wordIn(lastName, normalizedText) {
matches = append(matches, Match{Name: name, Confidence: ConfidenceReview})
continue
}
if len(firstName) >= 3 && wordIn(firstName, normalizedText) {
matches = append(matches, Match{Name: name, Confidence: ConfidenceReview})
continue
}
} else if len(parts) == 1 {
if len(parts[0]) >= 4 && wordIn(parts[0], normalizedText) {
matches = append(matches, Match{Name: name, Confidence: ConfidenceReview})
continue
}
}
}
// Final filter: drop review if any auto exists
hasAuto := false
for _, m := range matches {
if m.Confidence == ConfidenceAuto {
hasAuto = true
break
}
}
if hasAuto {
filtered := matches[:0]
for _, m := range matches {
if m.Confidence == ConfidenceAuto {
filtered = append(filtered, m)
}
}
return filtered
}
if matches == nil {
return []Match{}
}
return matches
}

@@ -0,0 +1,156 @@
package matching
// Expected values verified against scripts/match_payments.py and
// tests/test_match_members.py on 2026-05-06:
//
// PYTHONPATH=scripts:. python3 -c '
// from match_payments import match_members
// MEMBERS = ["Henrietta Ottová", "Tomáš Němeček (Tov)", "František Vrbík (Štrúdl)", "Jana Nováková"]
// cases = [
// ("Henrietta Ottová (Heny): 04/2026", "full name guard"),
// ("platba ottova 04/2026", "ottova surname"),
// ("Henrietta Ottová a Tomáš Němeček 04/2026", "two full names"),
// ("Tov platba 04/2026", "nickname alone"),
// ("Henrietta Ottova 04/2026", "no diacritics"),
// ("Platba od Nemeček Tomas 04/2026", "reversed first+last"),
// ("vrbik clenske", "last name only review"),
// ("jana platba", "first name review"),
// ("neznamy platebce", "no match"),
// ]
// for text, label in cases: print(label + ":", match_members(text, MEMBERS))
// '
//
// Output:
//
// full name guard: [('Henrietta Ottová', 'auto')]
// ottova surname: [('Henrietta Ottová', 'review')]
// two full names: [('Henrietta Ottová', 'auto'), ('Tomáš Němeček (Tov)', 'auto')]
// nickname alone: [('Tomáš Němeček (Tov)', 'auto')]
// no diacritics: [('Henrietta Ottová', 'auto')]
// reversed first+last: [('Tomáš Němeček (Tov)', 'auto')]
// last name only review: [('František Vrbík (Štrúdl)', 'review')]
// first name review: [('Jana Nováková', 'review')]
// no match: []
import (
"testing"
)
var testMembers = []string{
"Henrietta Ottová",
"Tomáš Němeček (Tov)",
"František Vrbík (Štrúdl)",
"Jana Nováková",
}
func TestMatchMembers(t *testing.T) {
t.Parallel()
cases := []struct {
name string
text string
wantContains []string
wantExcludes []string
wantAllAuto bool
}{
{
// Short-circuit: full name matches → "tov" inside "ottova" must NOT fire
name: "full name in message returns only that member",
text: "Henrietta Ottová (Heny): 04/2026",
wantContains: []string{"Henrietta Ottová"},
wantExcludes: []string{"Tomáš Němeček (Tov)"},
wantAllAuto: true,
},
{
// "tov" is a substring of "ottova" — nickname must not match inside a surname
name: "nickname tov not matched inside ottova",
text: "platba ottova 04/2026",
wantExcludes: []string{"Tomáš Němeček (Tov)"},
wantAllAuto: false,
},
{
name: "two full names both auto",
text: "Henrietta Ottová a Tomáš Němeček 04/2026",
wantContains: []string{"Henrietta Ottová", "Tomáš Němeček (Tov)"},
wantAllAuto: true,
},
{
name: "nickname alone matches correctly",
text: "Tov platba 04/2026",
wantContains: []string{"Tomáš Němeček (Tov)"},
wantAllAuto: true,
},
{
name: "full name without diacritics auto",
text: "Henrietta Ottova 04/2026",
wantContains: []string{"Henrietta Ottová"},
wantExcludes: []string{"Tomáš Němeček (Tov)"},
wantAllAuto: true,
},
{
name: "first and last name reversed auto",
text: "Platba od Nemeček Tomas 04/2026",
wantContains: []string{"Tomáš Němeček (Tov)"},
wantAllAuto: true,
},
{
// Last name alone (len≥4, not a common surname) → review confidence
name: "last name only yields review",
text: "vrbik clenske",
wantContains: []string{"František Vrbík (Štrúdl)"},
wantAllAuto: false,
},
{
// First name alone (len≥3) → review confidence
name: "first name only yields review",
text: "jana platba",
wantContains: []string{"Jana Nováková"},
wantAllAuto: false,
},
{
name: "no match returns empty slice",
text: "neznamy platebce",
wantContains: nil,
wantAllAuto: false,
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := MatchMembers(tc.text, testMembers)
// Check required members are present
for _, want := range tc.wantContains {
found := false
for _, m := range got {
if m.Name == want {
found = true
break
}
}
if !found {
t.Errorf("MatchMembers(%q): want %q in result, got %v", tc.text, want, got)
}
}
// Check excluded members are absent
for _, exclude := range tc.wantExcludes {
for _, m := range got {
if m.Name == exclude {
t.Errorf("MatchMembers(%q): %q should not be in result, got %v", tc.text, exclude, got)
}
}
}
// Check all-auto constraint
if tc.wantAllAuto {
for _, m := range got {
if m.Confidence != ConfidenceAuto {
t.Errorf("MatchMembers(%q): expected all auto, got %v", tc.text, got)
}
}
}
})
}
}

@@ -0,0 +1,59 @@
package matching
import (
"fuj-management/go/internal/domain/czech"
"regexp"
"strings"
)
var (
nicknameRe = regexp.MustCompile(`\(([^)]+)\)`)
nicknameStripRe = regexp.MustCompile(`\s*\([^)]*\)\s*`)
)
// BuildNameVariants returns searchable lowercase ASCII variants of a member name.
//
// Example: "František Vrbík (Štrúdl)" → ["frantisek vrbik", "strudl", "vrbik", "frantisek"]
//
// variants[0] is always the full normalized base name (no nickname). MatchMembers relies on
// this invariant for the exact short-circuit pass. Variants shorter than 3 characters are
// dropped.
//
// Ports scripts/match_payments.py _build_name_variants.
func BuildNameVariants(name string) []string {
var nickname string
if m := nicknameRe.FindStringSubmatch(name); m != nil {
nickname = m[1]
}
base := strings.TrimSpace(nicknameStripRe.ReplaceAllString(name, " "))
normalizedBase := czech.Normalize(base)
normalizedNick := czech.Normalize(nickname)
variants := []string{normalizedBase}
if normalizedNick != "" {
variants = append(variants, normalizedNick)
}
parts := strings.Fields(normalizedBase)
if len(parts) >= 2 {
variants = append(variants, parts[len(parts)-1]) // last name
variants = append(variants, parts[0]) // first name
}
filtered := variants[:0]
for _, v := range variants {
if len(v) >= 3 {
filtered = append(filtered, v)
}
}
return filtered
}
// wordIn returns true if needle appears as a whole word in haystack.
// Both needle and haystack must already be ASCII-folded (via czech.Normalize).
func wordIn(needle, haystack string) bool {
pattern := `\b` + regexp.QuoteMeta(needle) + `\b`
matched, _ := regexp.MatchString(pattern, haystack)
return matched
}

@@ -0,0 +1,62 @@
package matching
// Expected values verified against scripts/match_payments.py on 2026-05-06:
//
// PYTHONPATH=scripts:. python3 -c '
// from match_payments import _build_name_variants
// for n in ["František Vrbík (Štrúdl)", "Tov (St)", "Jana", " Petr Novák ( Jenda ) "]:
// print(repr(n), "->", _build_name_variants(n))
// '
//
// Output:
//
// 'František Vrbík (Štrúdl)' -> ['frantisek vrbik', 'strudl', 'vrbik', 'frantisek']
// 'Tov (St)' -> ['tov']
// 'Jana' -> ['jana']
// ' Petr Novák ( Jenda ) ' -> ['petr novak', ' jenda ', 'novak', 'petr']
import (
"reflect"
"testing"
)
func TestBuildNameVariants(t *testing.T) {
t.Parallel()
cases := []struct {
name string
input string
want []string
}{
{
name: "full name with nickname",
input: "František Vrbík (Štrúdl)",
want: []string{"frantisek vrbik", "strudl", "vrbik", "frantisek"},
},
{
name: "nickname too short filtered out",
input: "Tov (St)",
want: []string{"tov"},
},
{
name: "single-part name no nickname",
input: "Jana",
want: []string{"jana"},
},
{
name: "extra whitespace inside parens preserved by normalize",
input: " Petr Novák ( Jenda ) ",
want: []string{"petr novak", " jenda ", "novak", "petr"},
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := BuildNameVariants(tc.input)
if !reflect.DeepEqual(got, tc.want) {
t.Errorf("BuildNameVariants(%q)\n got %q\n want %q", tc.input, got, tc.want)
}
})
}
}

@@ -0,0 +1,65 @@
// Package synch ports the bank-sync deduplication helper from
// scripts/sync_fio_to_sheets.py.
package synch
import (
"crypto/sha256"
"encoding/hex"
"math"
"strconv"
"strings"
)
// Transaction is the projection of a Fio transaction that participates
// in the Sync ID hash. Other fields (ks, ss, sender_account, …) are
// intentionally excluded — they are not part of the Python hash.
//
// Currency: leave "" to inherit the Python default of "CZK" (matches
// the HTML scraper path which omits the key entirely).
type Transaction struct {
Date string
Amount float64
Currency string
Sender string
VS string
Message string
BankID string
}
// GenerateSyncID returns the lowercase SHA-256 hex digest of
// "date|amount|currency|sender|vs|message|bank_id" (lower-cased), used
// as the dedup key in column K of the payments sheet.
//
// Byte-stable with scripts/sync_fio_to_sheets.py generate_sync_id.
func GenerateSyncID(tx Transaction) string {
currency := tx.Currency
if currency == "" {
currency = "CZK"
}
raw := strings.ToLower(strings.Join([]string{
tx.Date,
formatAmount(tx.Amount),
currency,
tx.Sender,
tx.VS,
tx.Message,
tx.BankID,
}, "|"))
sum := sha256.Sum256([]byte(raw))
return hex.EncodeToString(sum[:])
}
// formatAmount mimics Python's str(float) for Fio transaction amounts.
// Python uses decimal notation for abs(f) in [1e-4, 1e16) and scientific
// notation outside that range, always adding ".0" to whole-valued decimals.
func formatAmount(f float64) string {
abs := math.Abs(f)
if abs != 0 && (abs < 1e-4 || abs >= 1e16) {
return strconv.FormatFloat(f, 'e', -1, 64)
}
s := strconv.FormatFloat(f, 'f', -1, 64)
if !strings.ContainsRune(s, '.') {
s += ".0"
}
return s
}

@@ -0,0 +1,119 @@
package synch
import (
"testing"
)
// All expected digests verified against the live Python implementation on 2026-05-06:
//
// PYTHONPATH=scripts:. python -c '
// from sync_fio_to_sheets import generate_sync_id
// cases = [
// {"date":"2026-01-15","amount":500.0,"currency":"CZK","sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"},
// {"date":"2026-01-15","amount":500.0,"sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"},
// {"date":"2026-02-10","amount":1234.56,"currency":"CZK","sender":"ABC SRO","vs":"","message":"FAKTURA 42","bank_id":"xyz"},
// {"date":"2026-03-01","amount":-500.0,"currency":"CZK","sender":"refund","vs":"","message":"","bank_id":""},
// {"date":"2026-04-01","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""},
// {"date":"","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""},
// ]
// for c in cases: print(generate_sync_id(c))
// '
func TestGenerateSyncID(t *testing.T) {
t.Parallel()
cases := []struct {
name string
tx Transaction
want string
}{
{
name: "all fields set",
tx: Transaction{
Date: "2026-01-15", Amount: 500.0, Currency: "CZK",
Sender: "Jan Novak", VS: "123", Message: "clenske 1/2026", BankID: "abc123",
},
want: "4ac26598b6f23965380690172156a438a7e97a97dcedf222e5afe1afbe2c1bc4",
},
{
name: "currency empty defaults to CZK",
tx: Transaction{
Date: "2026-01-15", Amount: 500.0, Currency: "",
Sender: "Jan Novak", VS: "123", Message: "clenske 1/2026", BankID: "abc123",
},
want: "4ac26598b6f23965380690172156a438a7e97a97dcedf222e5afe1afbe2c1bc4",
},
{
name: "mixed-case fields lowercased before hashing",
tx: Transaction{
Date: "2026-02-10", Amount: 1234.56, Currency: "CZK",
Sender: "ABC SRO", VS: "", Message: "FAKTURA 42", BankID: "xyz",
},
want: "d40fa224d4fa572ffcd58e308e5c6508c4d5ca087b24ef6ff9284528fc128250",
},
{
name: "negative amount",
tx: Transaction{
Date: "2026-03-01", Amount: -500.0, Currency: "CZK",
Sender: "refund", VS: "", Message: "", BankID: "",
},
want: "0c630a407160367c396a2beec08efb94c319b4d84a8b90cc2be89e6ea10c391f",
},
{
name: "zero amount",
tx: Transaction{
Date: "2026-04-01", Amount: 0.0, Currency: "CZK",
Sender: "", VS: "", Message: "", BankID: "",
},
want: "6a23ce53717cd539064d550d2c2ec5de2e9bf81016d16852820ca9b8e259331f",
},
{
// Python equivalent: {"date":"","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""}
// Note: Python generate_sync_id({}) hashes "" for missing amount, not "0.0".
name: "zero-value Transaction",
tx: Transaction{},
want: "d33d7e391f5a43f0192bb5a34c0ec15715139125678ecef8e1324af7d943b21d",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := GenerateSyncID(tc.tx)
if got != tc.want {
t.Errorf("GenerateSyncID(%+v) = %q, want %q", tc.tx, got, tc.want)
}
})
}
}
// All expected strings verified against the live Python implementation on 2026-05-06:
//
// PYTHONPATH=scripts:. python -c '
// for v in [0.0, 500.0, -500.0, 0.1, 1234.56, 99999.99, 1500000.0, 1e16, 1e-5]:
// print(repr(v), "->", repr(str(v)))
// '
func TestFormatAmount(t *testing.T) {
t.Parallel()
cases := []struct {
in float64
want string
}{
{0.0, "0.0"},
{500.0, "500.0"},
{-500.0, "-500.0"},
{0.1, "0.1"},
{1234.56, "1234.56"},
{99999.99, "99999.99"},
{1500000.0, "1500000.0"},
{1e16, "1e+16"},
{1e-5, "1e-05"},
}
for _, tc := range cases {
got := formatAmount(tc.in)
if got != tc.want {
t.Errorf("formatAmount(%v) = %q, want %q", tc.in, got, tc.want)
}
}
}

@@ -11,8 +11,8 @@ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
from googleapiclient.discovery import build
from sync_fio_to_sheets import get_sheets_service, DEFAULT_SPREADSHEET_ID
-from match_payments import infer_transaction_details
-from attendance import get_members_with_fees
+from match_payments import infer_transaction_details, canonical_member_key
+from attendance import get_members_with_fees, get_junior_members_with_fees
def parse_czk_amount(val) -> float:
"""Parse Czech currency string or handle raw numeric value."""
@@ -96,10 +96,19 @@ def infer_payments(spreadsheet_id: str, credentials_path: str, dry_run: bool = F
print(f"Current header: {header}")
return
-    # 2. Fetch members for matching
+    # 2. Fetch members for matching — union adults + juniors so junior-only
+    # members (e.g. kids not on the adult sheet) are visible to the matcher.
     print("Fetching member list for matching...")
-    members_data, _ = get_members_with_fees()
-    member_names = [m[0] for m in members_data]
+    adult_members, _ = get_members_with_fees()
+    junior_members, _ = get_junior_members_with_fees()
+    seen: set[str] = set()
+    member_names: list[str] = []
+    for m in adult_members + junior_members:
+        key = canonical_member_key(m[0])
+        if key not in seen:
+            seen.add(key)
+            member_names.append(m[0])

     # 3. Process rows
     print("Inferring details for empty rows...")

@@ -48,6 +48,25 @@ class TestMatchMembersExact(unittest.TestCase):
         names = [r[0] for r in result]
         self.assertIn("Tomáš Němeček (Tov)", names)
+
+    def test_shared_first_name_junior_in_roster_wins_exact(self):
+        # Regression: two members share first name "Jáchym"; message has full name
+        # of the junior-only member → exact match must win, no [?] on the adult.
+        roster = ["Jáchym Hrušák (G)", "Jáchym Kubík"]
+        result = match_members(
+            "JIŘÍ KUBÍK Jáchym Kubík: 01/2026+03/2026+04/2026", roster
+        )
+        self.assertEqual(result, [("Jáchym Kubík", "auto")])
+
+    def test_shared_first_name_without_junior_in_roster_falls_back(self):
+        # Without Kubík in the roster (old behaviour), Hrušák wins via first-name
+        # partial match — confirms the roster-expansion fix is the real solution.
+        roster = ["Jáchym Hrušák (G)"]
+        result = match_members(
+            "JIŘÍ KUBÍK Jáchym Kubík: 01/2026+03/2026+04/2026", roster
+        )
+        names = [r[0] for r in result]
+        self.assertIn("Jáchym Hrušák (G)", names)


 if __name__ == "__main__":
     unittest.main()