Compare commits

..

14 Commits

Author SHA1 Message Date
54a783ea00 feat(go/M2.6): port domain/synch.GenerateSyncID
All checks were successful
Deploy to K8s / deploy (push) Successful in 6s
SHA-256 dedup hash from sync_fio_to_sheets.py generate_sync_id.
Key subtlety: Python str(float) emits "500.0" for whole-valued floats
and switches to scientific notation at |f|>=1e16 or |f|<1e-4 —
replicated via formatAmount using 'f'/'e' format selection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 12:43:41 +02:00
84a5d177e9 Merge pull request 'feat(go/M2.5): port domain/money.ParseCZK' (#7) from feat/m2-5-money-parse-czk into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 6s
Reviewed-on: #7
2026-05-06 07:39:42 +00:00
1a63bfd313 chore: tick M2.5 in progress tracker + CHANGELOG entry
All checks were successful
Deploy to K8s / deploy (push) Successful in 11s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 09:39:01 +02:00
d24d20553a feat(go/M2.5): port domain/money.ParseCZK
Port scripts/infer_payments.py parse_czk_amount to Go as
internal/domain/money.ParseCZK. Preserves the Czech-locale heuristic
(comma = decimal sep; 2+ dots = thousand seps; single dot = decimal)
and returns (float64, error) so callers can opt into Python's
silent-zero contract via v, _ := money.ParseCZK(s).
All expected values verified against live Python on 2026-05-06.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 09:38:28 +02:00
fa853780db chore: tick M2.3 + M2.4 in progress tracker + CHANGELOG entry
All checks were successful
Deploy to K8s / deploy (push) Successful in 8s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 09:25:45 +02:00
0fc3b6dd9a Merge pull request 'feat(go/M2.3+M2.4): port domain/fees.CalculateFee and CalculateJuniorFee' (#6) from feat/m2-3-m2-4-domain-fees into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 10s
Reviewed-on: #6
2026-05-06 07:23:02 +00:00
57ec817044 feat(go/M2.3+M2.4): port domain/fees.CalculateFee and CalculateJuniorFee
All checks were successful
Deploy to K8s / deploy (push) Successful in 6s
Ports calculate_fee and calculate_junior_fee from scripts/attendance.py
into a new go/internal/domain/fees package. Introduces the Expected type
(Value int, Unknown bool) for the junior "?" sentinel, keeping the Go
API strictly typed instead of mirroring Python's str|int return.

All 20 table-driven tests pass with -race; golangci-lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 00:38:09 +02:00
6cf83a01e3 docs(claude): correct stale adult fee defaults
ADULT_FEE_DEFAULT is 700 CZK, not 750. The 750 appears in
ADULT_FEE_MONTHLY_RATE for most current months but is not the fallback.
Rephrase the member-tiers bullet to point at the dict rather than a
number that drifts each season; update the fee-calc bullet to match
the junior line's style (default 700 vs default 500).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 00:29:19 +02:00
98f401c149 chore: tick M2.2 in progress tracker + CHANGELOG entry
All checks were successful
Deploy to K8s / deploy (push) Successful in 9s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 00:10:44 +02:00
0a8017fffa Merge pull request 'feat(go/M2.2): port czech.ParseMonthReferences' (#5) from feat/m2-2-parse-month-references into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 10s
Reviewed-on: #5
2026-05-05 22:07:15 +00:00
6d971b61d4 feat(go/M2.2): port czech.ParseMonthReferences
All checks were successful
Deploy to K8s / deploy (push) Successful in 8s
Three-pass regex parser matching python/czech_utils.py parse_month_references:
1. Numeric slash notation — "11+12/2025", "01/26"; 2-digit year → +2000
2. Dot notation — "12.2025" (4-digit year only)
3. Czech month names — range walk (listopad-leden wrap logic) then
   standalone with m≥10 → defaultYear-1 heuristic; longest-match
   alternation (sorted desc by name length) handles cervenec vs cerven

35 table-driven tests, all expected outputs verified against live Python
on 2026-05-05 before locking. Plan at
docs/plans/2026-05-05-2337-go-rewrite-m2-2-parse-month-references.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 00:05:40 +02:00
3460f57c62 chore: tick M2.1 in progress tracker + CHANGELOG entry
All checks were successful
Deploy to K8s / deploy (push) Successful in 9s
go/internal/domain/czech.Normalize merged as 20ade6d (PR #4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 23:34:00 +02:00
6ca35e2112 docs: Encourage tea CLI for opening MRs
Replaces the "do not use tea/gh/Gitea API" rule with explicit guidance to
run `tea pr create` and print the resulting PR URL. tea is already
authenticated on this machine. Merging stays a manual user action in
Gitea — neither tea nor git CLI may merge or delete branches.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 23:33:31 +02:00
20ade6de3e Merge pull request 'feat(go/M2.1): port czech.Normalize' (#4) from feat/m2-1-czech-normalize into main
All checks were successful
Deploy to K8s / deploy (push) Successful in 10s
Reviewed-on: #4
2026-05-05 21:26:55 +00:00
16 changed files with 1570 additions and 15 deletions

View File

@@ -1,5 +1,37 @@
# Changelog # Changelog
## 2026-05-06 12:43 CEST — feat(go/M2.6): port domain/synch.GenerateSyncID
- New `go/internal/domain/synch` package with `GenerateSyncID(Transaction) string` ported from `scripts/sync_fio_to_sheets.py` `generate_sync_id`.
- Byte-stable SHA-256 hash over `date|amount|currency|sender|vs|message|bank_id` (lowercased, UTF-8); `Currency: ""` defaults to `"CZK"` matching the Python missing-key fallback.
- Key subtlety: Python's `str(float)` emits `"500.0"` for whole-valued floats and switches to scientific notation at `|f| >= 1e16` or `|f| < 1e-4` — replicated in `formatAmount` using `'f'`/`'e'` format selection.
- 6 table-driven hash tests + 9 `formatAmount` tests; all expected values verified against live Python on 2026-05-06.
## 2026-05-06 09:38 CEST — feat(go/M2.5): port domain/money.ParseCZK
- New `go/internal/domain/money` package with `ParseCZK(string) (float64, error)` ported from `scripts/infer_payments.py` `parse_czk_amount`.
- Preserves the Czech-locale heuristic: comma → decimal sep; 2+ dots → thousand seps; single dot → decimal (so `"1.500"``1.5`).
- Returns `(0, ErrInvalidAmount)` on parse failure; callers wanting Python's silent-zero contract use `v, _ := ParseCZK(s)`.
- 15 table-driven tests plus a silent-zero contract test; all expected values verified against live Python on 2026-05-06.
## 2026-05-06 09:24 CEST — feat(go/M2.3+M2.4): port domain/fees.CalculateFee and CalculateJuniorFee
- New `go/internal/domain/fees` package with adult and junior fee calculators ported from `scripts/attendance.py`.
- `CalculateFee(count, monthKey) int``0→0`, `1→200`, `2+→AdultFeeMonthlyRate[month]` (fallback 700 CZK).
- `CalculateJuniorFee(count, monthKey) Expected``0→{0}`, `1→{Unknown:true}` (the `"?"` sentinel, now strictly typed), `2+→JuniorFeeMonthlyRate[month]` (fallback 500 CZK).
- 20 table-driven tests, all verified against live Python; `-race` clean; `golangci-lint` clean.
## 2026-05-06 00:07 CEST — feat(go/M2.2): port czech.ParseMonthReferences
- `internal/domain/czech.ParseMonthReferences`: three-pass regex (numeric slash, dot, Czech month names) with range wrap-around and `m≥10 → previousYear` heuristic, byte-equivalent to Python.
- 35 table-driven tests; all expected outputs verified against live Python before locking (addresses risk #4 from the rewrite plan).
## 2026-05-05 23:33 CEST — feat(go/M2.1): port czech.Normalize
- First M2 pure-domain task: `internal/domain/czech.Normalize` (NFKD + Mn-strip + lowercase), byte-equivalent to Python `czech_utils.normalize`.
- Adds `golang.org/x/text v0.36.0` as first external Go dependency.
- 13-case table-driven test, all spot-checked against Python before locking.
## 2026-05-04 23:08 CEST — fix: payment inference exact-match short-circuit ## 2026-05-04 23:08 CEST — fix: payment inference exact-match short-circuit
- `match_members()` now short-circuits on whole-word full-name hits; nickname/partial checks only run when no full name is present. - `match_members()` now short-circuits on whole-word full-name hits; nickname/partial checks only run when no full name is present.

View File

@@ -64,13 +64,13 @@ Fio Bank API ──► sync_fio_to_sheets.py ──► Google Shee
### Member tiers ### Member tiers
Tiers are set in column B of the attendance sheet: Tiers are set in column B of the attendance sheet:
- `A` — Adult, pays fees (750 CZK/month for 2+ sessions, 200 CZK for exactly 1) - `A` — Adult, pays fees (per-month rate from `ADULT_FEE_MONTHLY_RATE`, fallback 700 CZK for 2+ sessions; 200 CZK for exactly 1)
- `J` — Junior attending adult practices; their attendance is merged with the junior sheet - `J` — Junior attending adult practices; their attendance is merged with the junior sheet
- `X` — Excluded from junior fee calculation (coaches, etc.) - `X` — Excluded from junior fee calculation (coaches, etc.)
### Fee calculation ### Fee calculation
- Adults: 0 sessions → 0, 1 session → 200 CZK, 2+ sessions → monthly rate (default 750 CZK) - Adults: 0 sessions → 0, 1 session → 200 CZK, 2+ sessions → monthly rate (default 700 CZK)
- Juniors: 0 → 0, 1 → `"?"` (manual review required), 2+ → monthly rate (default 500 CZK) - Juniors: 0 → 0, 1 → `"?"` (manual review required), 2+ → monthly rate (default 500 CZK)
- Per-member per-month overrides live in the `exceptions` tab of the payments sheet (columns: Name, Period YYYY-MM, Amount, Note). Exceptions are keyed by `(normalize(name), normalize(period))`. - Per-member per-month overrides live in the `exceptions` tab of the payments sheet (columns: Name, Period YYYY-MM, Amount, Note). Exceptions are keyed by `(normalize(name), normalize(period))`.
@@ -105,12 +105,23 @@ request flow:
2. **Commit on the branch** following the existing commit conventions 2. **Commit on the branch** following the existing commit conventions
(Co-Authored-By trailer, etc.). (Co-Authored-By trailer, etc.).
3. **Push the branch** to `origin` with `-u` so it tracks. 3. **Push the branch** to `origin` with `-u` so it tracks.
4. **Print the Gitea compare URL** so the user can open the MR in the browser: 4. **Open the MR with `tea`** rather than printing a compare URL:
`https://gitea.home.hrajfrisbee.cz/kacerr/fuj-management/compare/main...<branch>`
Do **not** use `tea`, `gh`, or call the Gitea API — the user opens and ```bash
merges the MR themselves. tea pr create \
5. **Do not merge or delete the branch** from the CLI. The user does that --title "<short title>" \
in Gitea. --description "<body>" \
--base main \
--head <branch>
```
`tea` is already authenticated against the Gitea instance; just run it.
Print the resulting PR URL for the user. If `tea` is unavailable for
some reason, fall back to printing the compare URL
(`https://gitea.home.hrajfrisbee.cz/kacerr/fuj-management/compare/main...<branch>`)
and let the user open the MR manually.
5. **Do not merge or delete the branch** from the CLI — neither via `tea`,
`gh`, nor `git push --delete`. The user does that in Gitea.
**Exceptions — when committing straight to `main` is fine:** **Exceptions — when committing straight to `main` is fine:**
- Small bug fixes / hotfixes the user describes as such. - Small bug fixes / hotfixes the user describes as such.

View File

@@ -4,7 +4,7 @@ Companion to [2026-05-03-2349-go-backend-rewrite.md](2026-05-03-2349-go-backend-
**Current milestone:** M2 — Pure-domain helpers **Current milestone:** M2 — Pure-domain helpers
**Started:** 2026-05-04 **Started:** 2026-05-04
**Last updated:** 2026-05-04 **Last updated:** 2026-05-06
## How to use ## How to use
@@ -44,12 +44,12 @@ Goal: every pure function from the Python backend exists in Go with a parity tes
Each task: port the function, write Go unit tests for fresh cases, hook into the Tier-1 parity runner. Each task: port the function, write Go unit tests for fresh cases, hook into the Tier-1 parity runner.
- [ ] **M2.1** `domain/czech.Normalize` — port [czech_utils.py](scripts/czech_utils.py) `normalize` (NFKD + combining-mark strip + lowercase) - [x] **M2.1** `domain/czech.Normalize` — port [czech_utils.py](scripts/czech_utils.py) `normalize` (NFKD + combining-mark strip + lowercase) — `20ade6d`
- [ ] **M2.2** `domain/czech.ParseMonthReferences` — port `parse_month_references` (45 month declensions, range wrap, year inference) - [x] **M2.2** `domain/czech.ParseMonthReferences` — port `parse_month_references` (45 month declensions, range wrap, year inference) — `0a8017f`
- [ ] **M2.3** `domain/fees.CalculateFee` — port [attendance.py](scripts/attendance.py) `calculate_fee` (constants table) - [x] **M2.3** `domain/fees.CalculateFee` — port [attendance.py](scripts/attendance.py) `calculate_fee` (constants table) — `0fc3b6d`
- [ ] **M2.4** `domain/fees.CalculateJuniorFee` — port `calculate_junior_fee` with `Expected{Value int; Unknown bool}` for the `"?"` sentinel - [x] **M2.4** `domain/fees.CalculateJuniorFee` — port `calculate_junior_fee` with `Expected{Value int; Unknown bool}` for the `"?"` sentinel — `0fc3b6d`
- [ ] **M2.5** `domain/money.ParseCZK` — port [infer_payments.py](scripts/infer_payments.py) `parse_czk_amount` (Czech locale: comma decimal, dot/space thousand separators) - [x] **M2.5** `domain/money.ParseCZK` — port [infer_payments.py](scripts/infer_payments.py) `parse_czk_amount` (Czech locale: comma decimal, dot/space thousand separators) — `d24d205`
- [ ] **M2.6** `domain/synch.GenerateSyncID` — port [sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) `generate_sync_id` (SHA-256, byte-stable hash; verify float string format against real sheet rows) - [x] **M2.6** `domain/synch.GenerateSyncID` — port [sync_fio_to_sheets.py](scripts/sync_fio_to_sheets.py) `generate_sync_id` (SHA-256, byte-stable hash; verify float string format against real sheet rows)
- [ ] **M2.7** `domain/matching.BuildNameVariants` + `MatchMembers` — port `_build_name_variants` and `match_members` from [match_payments.py](scripts/match_payments.py) (auto vs review confidence, common-surname filter) - [ ] **M2.7** `domain/matching.BuildNameVariants` + `MatchMembers` — port `_build_name_variants` and `match_members` from [match_payments.py](scripts/match_payments.py) (auto vs review confidence, common-surname filter)
- [ ] **M2.8** `domain/matching.InferTransactionDetails` — port `infer_transaction_details` (composes name + month parsing) - [ ] **M2.8** `domain/matching.InferTransactionDetails` — port `infer_transaction_details` (composes name + month parsing)
- [ ] **M2.9** `domain/matching.FormatDate` — port `format_date` (handles Google Sheets serial-day numbers since 1899-12-30) - [ ] **M2.9** `domain/matching.FormatDate` — port `format_date` (handles Google Sheets serial-day numbers since 1899-12-30)

View File

@@ -0,0 +1,205 @@
# Plan: Go rewrite — M2.2 `domain/czech.ParseMonthReferences`
## Context
M2.1 (`domain/czech.Normalize`) merged via PR #4 (`d9a61b3`) on
2026-05-05. Per the [progress tracker](2026-05-03-2349-go-backend-rewrite-progress.md),
**M2.2** is next: port `parse_month_references` from
[scripts/czech_utils.py](../../scripts/czech_utils.py) to Go as
`internal/domain/czech.ParseMonthReferences`.
This function is the second-most-load-bearing pure helper after
`reconcile`: every payment-message → month inference goes through it.
Risk #4 in the [parent plan](2026-05-03-2349-go-backend-rewrite.md)
specifically calls out its semantics — wrap-around year inference and
the `m >= 10 → previous year` standalone heuristic — as easy to mis-port.
This plan locks the test table against the live Python implementation
*before* coding, so the Go port has a verified parity baseline even
before the M3.1/M3.2 fixture infrastructure exists.
## Scope
- New file `go/internal/domain/czech/parse_month_references.go` in the
existing `czech` package (alongside [normalize.go](../../go/internal/domain/czech/normalize.go)).
- New file `go/internal/domain/czech/parse_month_references_test.go`
with the test table below.
- **Out of scope:** parity-fixture wiring (M3.1/M3.2); CLI hook-up
(M2.11/M2.12); any consumer call-sites.
- **No new dependencies** — stdlib `regexp`, `sort`, `strconv`, `strings`
plus the existing `czech.Normalize` cover everything.
## Recommended approach
### Python contract to mirror
Three regex passes, all run on `normalize(text)`:
1. `([\d+]+)\s*/\s*(\d{2,4})` — captures `"11+12/2025"`, `"01/26"`, `"1/26"`.
Split the months part on `+`, keep digit-only tokens, validate `1..12`.
Year < 100 → year + 2000.
2. `(\d{1,2})\s*\.\s*(\d{4})` — captures `"12.2025"`. **4-digit year only**
(so `"1.26"` does not match).
3. Czech month names. First the **range** sub-pass:
`(name)\s*-\s*(name)` finds pairs; walk start→end with `m % 12 + 1`,
stopping when `m == end_m`. Wrap rule: if `start_m > end_m`, months
`>= start_m` are `defaultYear - 1`, the rest are `defaultYear`. Both
matched names go into a `foundInRanges` set.
Then the **standalone** sub-pass: `\b(name)\b`, skipping any name in
`foundInRanges`. For each remaining match, `m >= 10 → defaultYear - 1`,
else `defaultYear`.
Output: sorted, deduplicated `[]string` of `"YYYY-MM"`.
### Go signature
```go
package czech
// ParseMonthReferences extracts YYYY-MM month references from Czech
// free text. defaultYear seeds two heuristics: standalone month names
// with m >= 10 are treated as defaultYear-1 (out-of-year backfill), and
// wrap-around ranges (e.g. listopad-leden) place months >= start in
// defaultYear-1.
func ParseMonthReferences(text string, defaultYear int) []string
```
Required `defaultYear` (no default value — Go convention).
### Implementation sketch
```go
var czechMonths = map[string]int{
"leden": 1, "ledna": 1, "lednu": 1,
"unor": 2, "unora": 2, "unoru": 2,
"brezen": 3, "brezna": 3, "breznu": 3,
"duben": 4, "dubna": 4, "dubnu": 4,
"kveten": 5, "kvetna": 5, "kvetnu": 5,
"cerven": 6, "cervna": 6, "cervnu": 6,
"cervenec": 7, "cervnce": 7, "cervenci": 7,
"srpen": 8, "srpna": 8, "srpnu": 8,
"zari": 9,
"rijen": 10, "rijna": 10, "rijnu": 10,
"listopad": 11, "listopadu": 11,
"prosinec": 12, "prosince": 12, "prosinci": 12,
}
// Sorted by descending length at init, so longer alternatives win in
// the regex (e.g. "cervenec" beats "cerven"). Mirrors Python's
// sorted(..., key=len, reverse=True).
var monthNameAlt = buildMonthNameAlt()
var (
numericRe = regexp.MustCompile(`([\d+]+)\s*/\s*(\d{2,4})`)
dotRe = regexp.MustCompile(`(\d{1,2})\s*\.\s*(\d{4})`)
rangeRe = regexp.MustCompile(`(` + monthNameAlt + `)\s*-\s*(` + monthNameAlt + `)`)
standRe = regexp.MustCompile(`\b(` + monthNameAlt + `)\b`)
)
```
Three Go-specific gotchas worth a code comment:
1. **RE2 alternation is leftmost-first**, same as Python `re`. Sorting
month names by descending length is therefore necessary (otherwise
`"cervenec"` matches as `"cerven"` + leftover `"ec"`). Mirror the
Python sort exactly.
2. **Map iteration is randomized in Go.** Build the alternation list
from a sorted slice of keys, not by iterating the map.
3. **`\d` and `\b`** in Go RE2 are ASCII-only, which matches the
effective behavior on `Normalize`'d input (NFKD already collapsed
any Unicode digits/letters that would matter; standalone Devanagari
digits in member messages aren't a real-world concern).
The walk loop uses a bounded counter (max 12 iterations) defensively in
Go; Python's `while True` is fine because every range terminates within
12 hops, but a future reader appreciates the bound.
### Test table (verified against live Python — `default_year=2026`)
Locked outputs from `PYTHONPATH=scripts:. python -c 'from czech_utils
import parse_month_references; print(parse_month_references(<input>, 2026))'`
on 2026-05-05.
| # | Input | Expected | Path exercised |
|---|---|---|---|
| 1 | `""` | `[]` | empty |
| 2 | `"11+12/2025"` | `["2025-11", "2025-12"]` | numeric, plus-split |
| 3 | `"1/2026"` | `["2026-01"]` | numeric, single |
| 4 | `"01/26"` | `["2026-01"]` | 2-digit year normalization |
| 5 | `"11+12/25"` | `["2025-11", "2025-12"]` | plus-split + 2-digit year |
| 6 | `"12+1+2/2026"` | `["2026-01", "2026-02", "2026-12"]` | sorting |
| 7 | `"12.2025"` | `["2025-12"]` | dot pattern |
| 8 | `"1.26"` | `[]` | dot pattern requires 4-digit year |
| 9 | `"leden"` | `["2026-01"]` | standalone, m<10 |
| 10 | `"prosinec"` | `["2025-12"]` | standalone, m≥10 → previous year |
| 11 | `"prosince"` | `["2025-12"]` | declension |
| 12 | `"lednu"` | `["2026-01"]` | declension |
| 13 | `"rijen"` | `["2025-10"]` | m≥10 boundary (10 itself) |
| 14 | `"zari"` | `["2026-09"]` | m<10 just below boundary |
| 15 | `"listopad-leden"` | `["2025-11", "2025-12", "2026-01"]` | wrap range Nov→Jan |
| 16 | `"rijen-leden"` | `["2025-10", "2025-11", "2025-12", "2026-01"]` | wrap from October |
| 17 | `"unor-kveten"` | `["2026-02", "2026-03", "2026-04", "2026-05"]` | non-wrap range |
| 18 | `"leden-leden"` | `["2026-01"]` | degenerate range |
| 19 | `"unor-listopad"` | `["2026-02", ..., "2026-11"]` (10 entries) | range spans m≥10 — heuristic does NOT fire (range exclusion) |
| 20 | `"cervenec-srpen"` | `["2026-07", "2026-08"]` | longest-match alt (`cervenec` not `cerven`+`ec`) |
| 21 | `"listopad-leden, prosinec"` | `["2025-11", "2025-12", "2026-01"]` | range + standalone, dedup |
| 22 | `"prosinec leden"` | `["2025-12", "2026-01"]` | two standalones, no range |
| 23 | `"11+12/2025, leden-brezen"` | `["2025-11", "2025-12", "2026-01", "2026-02", "2026-03"]` | numeric + range mix |
| 24 | `"11+12/25 a listopad"` | `["2025-11", "2025-12"]` | dedup across passes |
| 25 | `"prosince/2025"` | `["2025-12"]` | numeric pattern fails (no digits before `/`); standalone fires |
| 26 | `"listopad-prosinec/2025"` | `["2026-11", "2026-12"]` | range wins; numeric pattern fails |
| 27 | `"01.2026 / 02.2026"` | `["2026-01", "2026-02"]` | dot pattern only; numeric matches `(2026, 02)` but month 2026 is out of range |
| 28 | `"/12/2025"` | `["2025-12"]` | numeric matches at second `/` |
| 29 | `"PROSINEC"` | `["2025-12"]` | normalize lowercases |
| 30 | `"Žluťoučký prosinec"` | `["2025-12"]` | normalize strips diacritics |
| 31 | `"Únor - květen"` | `["2026-02", ..., "2026-05"]` | range tolerates spaces around `-`, diacritics survive normalize |
| 32 | `"platba 11/2025 a leden"` | `["2025-11", "2026-01"]` | mixed natural-language |
| 33 | `"December"` | `[]` | English month names not recognized |
| 34 | `"11+12/2025 11+12/2025"` | `["2025-11", "2025-12"]` | dedup of repeated input |
| 35 | `"leden 2026"` | `["2026-01"]` | trailing year is ignored unless dot/slash separator present |
35 cases is enough to lock semantics; the M3.x corpus will pile on
real-message fixtures later.
### Wire-up
- No `go.mod` changes (stdlib only).
- No CLI changes.
- `Normalize` is in the same package, so call it directly.
## Critical files
- New: [go/internal/domain/czech/parse_month_references.go](../../go/internal/domain/czech/parse_month_references.go)
- New: [go/internal/domain/czech/parse_month_references_test.go](../../go/internal/domain/czech/parse_month_references_test.go)
- Reference (read-only): [scripts/czech_utils.py](../../scripts/czech_utils.py) — the porting source
- Reference (read-only): [docs/plans/2026-05-03-2349-go-backend-rewrite.md](2026-05-03-2349-go-backend-rewrite.md) — risk #4
- Reuses: [go/internal/domain/czech/normalize.go](../../go/internal/domain/czech/normalize.go) — `Normalize` is called once at the top of `ParseMonthReferences`
## Verification
End-to-end checks before marking M2.2 done:
1. `cd go && go build ./...` — clean compile.
2. `cd go && go test ./internal/domain/czech/...` — all 35 table cases green.
3. `cd go && go test -race ./...` — race-clean (regex compiles are global; verify no init races).
4. `cd go && golangci-lint run` (or `make go-lint` from repo root) — clean, gofumpt-formatted.
5. **Spot parity** (manual, will be automated in M3.x): each test input has its expected output captured from the live Python implementation on 2026-05-05; the test table itself is the parity record. If any case diverges during implementation, re-run Python with the exact input to confirm the truth and update either the Go code or the test entry.
6. `make go-build && make go-test && make go-lint` from repo root — proves M1/M2.1 gate still passes.
## Branching & follow-up
Per [CLAUDE.md](../../CLAUDE.md), this is feature work → branch + Gitea MR via `tea`:
- Branch: `feat/m2-2-parse-month-references` off `main`.
- Single focused commit, Co-Authored-By trailer.
- Push with `-u`.
- Open MR with `tea pr create --title "feat(go/M2.2): port czech.ParseMonthReferences" --description ... --base main --head feat/m2-2-parse-month-references`. Print the MR URL for the user.
- User merges/deletes the branch in Gitea — never from the CLI.
After merge (small doc edits land straight on `main` per CLAUDE.md exception):
- Tick `M2.2` in the [progress tracker](2026-05-03-2349-go-backend-rewrite-progress.md) with the merge SHA.
- Add a one-line `CHANGELOG.md` entry (timestamp via `date "+%Y-%m-%d %H:%M %Z"`).
- Record any porting surprise (e.g. an unexpected diff between Go RE2 and Python `re`) in the tracker's "Notes & decisions" section.
Next task is **M2.3 `domain/fees.CalculateFee`** — straightforward constants table; no parser semantics to debate.

View File

@@ -0,0 +1,199 @@
# M2.5 — Port `parse_czk_amount` to `domain/money.ParseCZK`
> On execution, this plan should be moved to
> `docs/plans/2026-05-06-0928-go-m2-5-money-parse-czk.md` per project CLAUDE.md
> (`docs/plans/YYYY-MM-DD-HHMM-<slug>.md`). Plan mode forces it to live under
> `~/.claude/plans/` until then.
## Context
Continuing the Go backend rewrite tracked in
[2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md).
M2.1M2.4 are landed. Next leaf-level pure function is
`parse_czk_amount` from [scripts/infer_payments.py:17-45](../../srv/personal/fuj-management/scripts/infer_payments.py#L17-L45),
the Czech-locale amount parser used at [scripts/infer_payments.py:124](../../srv/personal/fuj-management/scripts/infer_payments.py#L124)
when reading the `Inferred Amount` column out of the payments sheet.
It's a small, isolated string→float helper, but its heuristic for
disambiguating `.` and `,` as decimal vs thousand separator is
non-obvious and needs to behave identically in Go to keep parity once
the Go infer pipeline lands in M4.8.
## Python behaviour (the spec)
```py
def parse_czk_amount(val) -> float:
if val is None or val == "":
return 0.0
if isinstance(val, (int, float)):
return float(val)
val = str(val)
val = val.replace("", "").replace("CZK", "").strip()
if "," in val:
# 1.500,00 -> 1500.00 — comma is decimal sep
val = val.replace(".", "").replace(" ", "").replace(",", ".")
else:
if val.count(".") > 1:
# 1.500.000 -> 1500000 — multiple dots = thousand sep
val = val.replace(".", "").replace(" ", "")
else:
# "1 500.00" -> "1500.00", "1.500" stays "1.500" (= 1.5)
val = val.replace(" ", "")
try:
return float(val)
except ValueError:
return 0.0
```
Key behavioural notes for the Go port:
1. Empty / None → 0, no error.
2. `"1.500"` (single dot, no comma) is parsed as **1.5**, not 1500.
The heuristic intentionally treats a lone dot as decimal.
3. `"1.500,00"` → 1500.0 (comma wins, dots are thousand seps).
4. `"1.500.000"` → 1500000.0 (multiple dots → all thousand seps).
5. `"1 500"` / `"1 500.00"` / `"500 Kč"` → spaces stripped.
6. Garbage → 0, no error in Python.
7. Strips literal substrings `"Kč"` and `"CZK"` (case-sensitive in Python).
## Approach
Create new package `internal/domain/money` mirroring the layout of
`internal/domain/fees` (single-file module + test file alongside).
### Signature
```go
// Package money ports Czech-locale currency parsing from
// scripts/infer_payments.py.
package money
// ParseCZK parses a Czech-locale amount string and returns the value
// in CZK as a float64.
//
// Mirrors scripts/infer_payments.py parse_czk_amount:
// - empty input → (0, nil)
// - "Kč"/"CZK" suffixes are stripped (case-sensitive, like Python)
// - if input contains ",", comma is the decimal separator and
// dots/spaces are thousand separators ("1.500,00" → 1500.0)
// - else if input contains 2+ dots, all dots are thousand seps
// ("1.500.000" → 1500000.0)
// - else single dot stays as the decimal point ("1.500" → 1.5,
// matching the Python heuristic)
// - on parse failure, returns (0, ErrInvalidAmount). Callers wanting
// Python-equivalent silent-zero behaviour can discard the error.
func ParseCZK(s string) (float64, error)
```
`ErrInvalidAmount` is a package-level sentinel:
```go
var ErrInvalidAmount = errors.New("money: invalid CZK amount")
```
Why `(float64, error)` instead of mirroring Python's silent zero:
- Go idiom prefers explicit errors.
- The single Python call site doesn't distinguish parse-fail from
empty-input (both → 0), so if we want byte-equal behaviour at the
Go infer site (M4.8), the caller can `v, _ := money.ParseCZK(s)`
and get exactly the Python result.
- Future callers (e.g. user-facing import flows) may want to surface
the error.
This matches the precedent set in M2.4 where we used
`Expected{Unknown bool}` rather than copying the Python `"?"` sentinel
verbatim — Go-idiomatic surface, parity-preserving semantics.
### Polymorphic input?
Python's `parse_czk_amount` also accepts raw int/float (passed through
unchanged) because Google Sheets API can return numeric cells as
`float64` rather than strings. **Skip this in Go.** The Sheets IO
adapter is M4.2, and that's where the `[]any` → string normalisation
will live. Keeping `ParseCZK` string-only keeps the leaf function tiny.
### Tests
`money_test.go` mirrors the existing `fees_test.go` table-driven style,
including the verification comment showing the Python command used to
confirm each expected value:
```sh
PYTHONPATH=scripts:. python -c '
from infer_payments import parse_czk_amount
for v in [None, "", "0", "500", "500 Kč", "500 CZK",
"1 500", "1500.00", "1 500.00",
"1.500,00", "1500,5", "1.500.000",
"1.500", "abc", " ", "100,5 Kč"]:
print(repr(v), "->", parse_czk_amount(v))
'
```
Cases to cover (all numeric outputs verified against the Python output
of the snippet above):
| input | expected |
|---|---|
| `""` | 0 |
| `"0"` | 0 |
| `"500"` | 500 |
| `"500 Kč"` | 500 |
| `"500 CZK"` | 500 |
| `"1 500"` | 1500 |
| `"1500.00"` | 1500 |
| `"1 500.00"` | 1500 |
| `"1.500,00"` | 1500 |
| `"1500,5"` | 1500.5 |
| `"1.500.000"` | 1500000 |
| `"1.500"` | 1.5 *(heuristic — single dot = decimal)* |
| `"100,5 Kč"` | 100.5 |
| `"abc"` | 0, returns `ErrInvalidAmount` |
| `" "` | 0, returns `ErrInvalidAmount` *(or 0 nil — confirm against Python; trim leaves `""`, then `float("")` raises → Python returns 0; Go test will assert whichever Python actually produces)* |
The `" "` row is the only one that needs the Python verification step
to settle — once verified, lock the behaviour in.
Also add a "documentation example" assertion in the test that
`v, _ := ParseCZK(s)` recovers the Python silent-zero contract for
every garbage input, so we don't lose that property at the Go infer
call site.
## Files to create
- `go/internal/domain/money/money.go` — package + `ParseCZK` + `ErrInvalidAmount`
- `go/internal/domain/money/money_test.go` — table-driven tests
No existing Go files need editing.
## Verification
```sh
cd go && go test ./internal/domain/money/...
make go-lint
make go-build # sanity: nothing else broke
```
Also run the Python snippet from the Tests section above and diff its
output against the test table to confirm parity.
## Out of scope (explicit non-goals)
- Polymorphic `any` input — leave for M4.2 IO adapter.
- Hooking into the Tier-1 parity runner — that comes with M3.5
(`-tags=parity` build constraint). M2.5 just needs unit tests.
- Any callsite migration — `infer_payments.py` keeps using its own
Python function until M4.8.
## Progress tracker + changelog
After the commit lands:
- Tick `M2.5` in [docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md)
with the commit SHA, mirroring the M2.4 entry style.
- Add a CHANGELOG.md entry at top:
`## YYYY-MM-DD HH:MM TZ — feat(go/M2.5): port domain/money.ParseCZK`.
Branch: `feat/m2-5-money-parse-czk` (per CLAUDE.md branch-per-feature
workflow). Push, open MR via `tea pr create`, leave merge to the user.

View File

@@ -0,0 +1,265 @@
## Context
Continuing the Go backend rewrite tracked in
[2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md).
M2.1M2.5 are landed. Next leaf-level pure function is `generate_sync_id`
from [scripts/sync_fio_to_sheets.py:62-77](../../srv/personal/fuj-management/scripts/sync_fio_to_sheets.py#L62-L77).
It computes a SHA-256 hash over a fixed seven-field projection of a Fio
transaction (`date|amount|currency|sender|vs|message|bank_id`) and is
the deduplication key written into column K (`Sync ID`) of the payments
sheet. The Go port must produce a **byte-identical** digest for the same
transaction; otherwise the Go-side sync (M4.7) would re-append rows
already written by the Python sync, double-counting payments.
The non-trivial part is the `amount` field's string serialisation:
upstream `fio_utils.py` always supplies `amount` as a Python `float`
(API path: `float(val(1) or 0)`; HTML path: `parse_czech_amount(...)`
which returns `float`). Python's `str(float)` produces `"500.0"` for
whole-valued floats; Go's `strconv.FormatFloat(f, 'g', -1, 64)` produces
`"500"`. This is the gotcha called out in the M2.6 line of the progress
tracker.
## Python behaviour (the spec)
```py
def generate_sync_id(tx: dict) -> str:
components = [
str(tx.get("date", "")),
str(tx.get("amount", "")),
str(tx.get("currency", "CZK")),
str(tx.get("sender", "")),
str(tx.get("vs", "")),
str(tx.get("message", "")),
str(tx.get("bank_id", "")),
]
raw_str = "|".join(components).lower()
return hashlib.sha256(raw_str.encode("utf-8")).hexdigest()
```
Behavioural notes for the Go port:
1. **Field order is load-bearing.** `date|amount|currency|sender|vs|message|bank_id` exactly.
2. **Separator is `"|"`.**
3. **Whole string is `.lower()`-ed before hashing** (so e.g. "ABC" sender vs "abc" hash identically). Unicode lower; in practice Fio data is ASCII + Czech diacritics.
4. **`currency` defaults to `"CZK"`** when missing from the dict (HTML scraper path never sets it). Other fields default to `""`.
5. **`amount` is a `float`.** Always. Real Fio data is `500.0`, `1234.56`, etc. — no NaN/Inf, but parity test must pin the format.
6. **Output is `hashlib.sha256(...).hexdigest()`** — 64-char lowercase hex.
7. **Encoding is UTF-8.**
### `str(float)` cases observed in real Fio amounts
| float64 | Python `str(f)` | Go `strconv.FormatFloat(f,'g',-1,64)` | Need |
|---|---|---|---|
| `500.0` | `"500.0"` | `"500"` | append `.0` |
| `1234.56` | `"1234.56"` | `"1234.56"` | matches |
| `0.0` | `"0.0"` | `"0"` | append `.0` |
| `-500.0` | `"-500.0"` | `"-500"` | append `.0` |
| `0.1` | `"0.1"` | `"0.1"` | matches |
| `99999.99` | `"99999.99"` | `"99999.99"` | matches |
For the Fio amount domain (signed CZK, ≤ ~7 digits, ≤2 decimal places),
the rule "`'g'` with prec -1, then append `.0` if result has no `.` and
no `e`/`E`" is exact. We do not need to handle Python's
scientific-notation crossover (`>= 1e16`) for real data, but the
implementation should still cope with it correctly via the same rule.
## Approach
Create new package `internal/domain/synch` mirroring the layout of
`internal/domain/money` (single-file module + test file alongside).
### Package + signature
```go
// Package synch ports the bank-sync deduplication helper from
// scripts/sync_fio_to_sheets.py.
package synch
// Transaction is the projection of a Fio transaction that participates
// in the Sync ID hash. Other fields (ks, ss, sender_account, …) are
// intentionally excluded — they are not part of the Python hash.
//
// Currency: leave "" to inherit the Python default of "CZK" (matches
// the HTML scraper path which omits the key entirely).
type Transaction struct {
Date string
Amount float64
Currency string
Sender string
VS string
Message string
BankID string
}
// GenerateSyncID returns the lowercase SHA-256 hex digest of
// "date|amount|currency|sender|vs|message|bank_id" (lower-cased), used
// as the dedup key in column K of the payments sheet.
//
// Byte-stable with scripts/sync_fio_to_sheets.py generate_sync_id.
func GenerateSyncID(tx Transaction) string
```
### `Currency` default
In Go every struct field is always present, so we lose Python's
"missing key vs empty string" distinction. Real-world data either sets
`currency = "CZK"` (API path) or omits the key (HTML path → `"CZK"`
default). Empty string never occurs in practice. The Go port collapses
the two by treating `Currency == ""` as "use `CZK`":
```go
currency := tx.Currency
if currency == "" {
currency = "CZK"
}
```
This is byte-equal to Python for every input we will ever see in
production, and avoids forcing callers to pass a `*string`.
### Float formatter
Internal helper, unexported:
```go
// formatAmount mimics Python's str(float) for the float values that
// appear in Fio transactions. For mundane decimal amounts the rule
// is: format with 'g' precision -1, then append ".0" if the result
// has no decimal point and no exponent.
func formatAmount(f float64) string {
s := strconv.FormatFloat(f, 'g', -1, 64)
if !strings.ContainsAny(s, ".eE") {
s += ".0"
}
return s
}
```
Tested explicitly (see Tests below) so the edge cases (`0`, whole
numbers, negatives, large/small with exponent) stay locked.
### Hash composition
```go
func GenerateSyncID(tx Transaction) string {
currency := tx.Currency
if currency == "" {
currency = "CZK"
}
raw := strings.ToLower(strings.Join([]string{
tx.Date,
formatAmount(tx.Amount),
currency,
tx.Sender,
tx.VS,
tx.Message,
tx.BankID,
}, "|"))
sum := sha256.Sum256([]byte(raw))
return hex.EncodeToString(sum[:])
}
```
(`crypto/sha256` + `encoding/hex` — both stdlib, no `go.mod` change.)
## Tests
`synch_test.go` mirrors `money_test.go`'s table-driven style with the
verification snippet at the top of the function. Two test functions:
### 1. `TestGenerateSyncID`
Each row's expected digest is computed from the Python source:
```sh
PYTHONPATH=scripts:. python -c '
from sync_fio_to_sheets import generate_sync_id
cases = [
{"date":"2026-01-15","amount":500.0,"currency":"CZK","sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"},
{"date":"2026-01-15","amount":500.0,"sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"}, # currency missing → CZK
{"date":"2026-02-10","amount":1234.56,"currency":"CZK","sender":"ABC SRO","vs":"","message":"FAKTURA 42","bank_id":"xyz"}, # mixed case → lowercased
{"date":"2026-03-01","amount":-500.0,"currency":"CZK","sender":"refund","vs":"","message":"","bank_id":""}, # negative
{"date":"2026-04-01","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""}, # zero amount
{}, # empty dict — every field falls back to default
]
for c in cases:
print(repr(c), "->", generate_sync_id(c))
'
```
Cases (one row per dict above), each asserting the exact 64-char hex
digest the snippet prints. Cover:
- Happy path with all fields set.
- `Currency: ""``"CZK"` default (parity with missing key).
- Mixed-case sender/message → lowercased before hashing.
- Negative amount.
- Zero amount.
- Zero-value `Transaction{}` — every field at Go zero, currency defaults
to `"CZK"`, hash matches Python `generate_sync_id({})`.
### 2. `TestFormatAmount`
Pin the float formatter against Python's `str(float)`:
```sh
PYTHONPATH=scripts:. python -c '
for v in [0.0, 500.0, -500.0, 0.1, 1234.56, 99999.99, 1500000.0, 1e16, 1e-5]:
print(repr(v), "->", repr(str(v)))
'
```
Table of `(float64, expected string)` pairs. Whole numbers must end in
`.0`; existing decimal representations pass through unchanged;
exponent-form floats (`1e16`, `1e-5`) keep their format.
## Files to create
- `go/internal/domain/synch/synch.go` — package, `Transaction`,
`GenerateSyncID`, internal `formatAmount`.
- `go/internal/domain/synch/synch_test.go``TestGenerateSyncID` +
`TestFormatAmount`.
No existing Go files need editing.
## Verification
```sh
cd go && go test ./internal/domain/synch/...
make go-lint
make go-build # sanity: nothing else broke
```
Plus run the two Python snippets in the Tests section and diff their
output against the test tables to confirm parity.
## Out of scope (explicit non-goals)
- **Hooking into the Tier-1 parity runner.** That comes with M3.5
(`-tags=parity` build constraint and `tests/fixtures/pure/`). M2.6
ships with hand-written, Python-verified test tables — same approach
used by M2.1M2.5.
- **A richer `Transaction` struct** covering ks/ss/note/sender_account.
Those fields aren't part of the hash. M4.4 (Fio IO adapter) will
decide whether to reuse `synch.Transaction` or define its own struct
and convert at the boundary.
- **Polymorphic input** (e.g. accepting a `map[string]any`). Python's
duck-typing is a non-goal in Go.
- **Any Python callsite migration.** `sync_fio_to_sheets.py` keeps using
its own `generate_sync_id` until M4.7 ports the sync service.
## Progress tracker + changelog
After the commit lands:
- Tick `M2.6` in
[docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md](../../srv/personal/fuj-management/docs/plans/2026-05-03-2349-go-backend-rewrite-progress.md)
with the commit SHA, mirroring the M2.5 entry style.
- Add a `CHANGELOG.md` entry at top:
`## YYYY-MM-DD HH:MM TZ — feat(go/M2.6): port domain/synch.GenerateSyncID`.
Branch: `feat/m2-6-synch-generate-sync-id` (per CLAUDE.md
branch-per-feature workflow). Push, open MR via `tea pr create`, leave
merge to the user.

View File

@@ -0,0 +1,154 @@
package czech
import (
"fmt"
"regexp"
"sort"
"strconv"
"strings"
)
var czechMonths = map[string]int{
"leden": 1, "ledna": 1, "lednu": 1,
"unor": 2, "unora": 2, "unoru": 2,
"brezen": 3, "brezna": 3, "breznu": 3,
"duben": 4, "dubna": 4, "dubnu": 4,
"kveten": 5, "kvetna": 5, "kvetnu": 5,
"cerven": 6, "cervna": 6, "cervnu": 6,
"cervenec": 7, "cervnce": 7, "cervenci": 7,
"srpen": 8, "srpna": 8, "srpnu": 8,
"zari": 9,
"rijen": 10, "rijna": 10, "rijnu": 10,
"listopad": 11, "listopadu": 11,
"prosinec": 12, "prosince": 12, "prosinci": 12,
}
var (
numericRe *regexp.Regexp
dotRe *regexp.Regexp
rangeRe *regexp.Regexp
standRe *regexp.Regexp
)
func init() {
// Sort by descending length so longer alternatives win in RE2 leftmost-first
// matching (e.g. "cervenec" is tried before "cerven").
names := make([]string, 0, len(czechMonths))
for name := range czechMonths {
names = append(names, name)
}
sort.Slice(names, func(i, j int) bool {
if len(names[i]) != len(names[j]) {
return len(names[i]) > len(names[j])
}
return names[i] < names[j]
})
alt := strings.Join(names, "|")
numericRe = regexp.MustCompile(`([\d+]+)\s*/\s*(\d{2,4})`)
dotRe = regexp.MustCompile(`(\d{1,2})\s*\.\s*(\d{4})`)
rangeRe = regexp.MustCompile(`(` + alt + `)\s*-\s*(` + alt + `)`)
standRe = regexp.MustCompile(`\b(` + alt + `)\b`)
}
// ParseMonthReferences extracts YYYY-MM month references from Czech free text.
//
// defaultYear seeds two heuristics: standalone month names with m >= 10 are
// treated as defaultYear-1 (out-of-year backfill), and wrap-around ranges
// (e.g. listopad-leden) place months >= start_m in defaultYear-1.
//
// Returns a sorted, deduplicated slice of "YYYY-MM" strings.
func ParseMonthReferences(text string, defaultYear int) []string {
normalized := Normalize(text)
seen := map[string]struct{}{}
add := func(year, m int) {
if m >= 1 && m <= 12 {
seen[fmt.Sprintf("%04d-%02d", year, m)] = struct{}{}
}
}
// Pass 1: numeric months — "11+12/2025", "01/26", "1/2026"
for _, groups := range numericRe.FindAllStringSubmatch(normalized, -1) {
monthsPart, yearStr := groups[1], groups[2]
year, err := strconv.Atoi(yearStr)
if err != nil {
continue
}
if year < 100 {
year += 2000
}
for mStr := range strings.SplitSeq(monthsPart, "+") {
mStr = strings.TrimSpace(mStr)
if mStr == "" {
continue
}
allDigits := true
for _, c := range mStr {
if c < '0' || c > '9' {
allDigits = false
break
}
}
if !allDigits {
continue
}
m, err := strconv.Atoi(mStr)
if err != nil {
continue
}
add(year, m)
}
}
// Pass 2: dot-separated month.year — "12.2025" (4-digit year only)
for _, groups := range dotRe.FindAllStringSubmatch(normalized, -1) {
m, _ := strconv.Atoi(groups[1])
year, _ := strconv.Atoi(groups[2])
add(year, m)
}
// Pass 3a: Czech month name ranges — "listopad-leden"
foundInRanges := map[string]struct{}{}
for _, groups := range rangeRe.FindAllStringSubmatch(normalized, -1) {
startName, endName := groups[1], groups[2]
foundInRanges[startName] = struct{}{}
foundInRanges[endName] = struct{}{}
startM := czechMonths[startName]
endM := czechMonths[endName]
wraps := startM > endM
m := startM
for range 12 {
year := defaultYear
if wraps && m >= startM {
year = defaultYear - 1
}
add(year, m)
if m == endM {
break
}
m = m%12 + 1
}
}
// Pass 3b: standalone Czech month names (not part of a range)
for _, groups := range standRe.FindAllStringSubmatch(normalized, -1) {
name := groups[1]
if _, inRange := foundInRanges[name]; inRange {
continue
}
m := czechMonths[name]
year := defaultYear
if m >= 10 {
year = defaultYear - 1
}
add(year, m)
}
result := make([]string, 0, len(seen))
for k := range seen {
result = append(result, k)
}
sort.Strings(result)
return result
}

View File

@@ -0,0 +1,244 @@
package czech
import (
"reflect"
"testing"
)
func TestParseMonthReferences(t *testing.T) {
t.Parallel()
// All expected outputs verified against live Python implementation on 2026-05-05:
// PYTHONPATH=scripts:. python -c 'from czech_utils import parse_month_references; print(parse_month_references("<input>", 2026))'
tests := []struct {
name string
input string
defaultYear int
want []string
}{
{
name: "empty",
input: "",
defaultYear: 2026,
want: []string{},
},
{
name: "numeric plus-split two months full year",
input: "11+12/2025",
defaultYear: 2026,
want: []string{"2025-11", "2025-12"},
},
{
name: "numeric single month full year",
input: "1/2026",
defaultYear: 2026,
want: []string{"2026-01"},
},
{
name: "numeric 2-digit year",
input: "01/26",
defaultYear: 2026,
want: []string{"2026-01"},
},
{
name: "numeric plus-split with 2-digit year",
input: "11+12/25",
defaultYear: 2026,
want: []string{"2025-11", "2025-12"},
},
{
name: "numeric three months sorted",
input: "12+1+2/2026",
defaultYear: 2026,
want: []string{"2026-01", "2026-02", "2026-12"},
},
{
name: "dot pattern",
input: "12.2025",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "dot pattern requires 4-digit year",
input: "1.26",
defaultYear: 2026,
want: []string{},
},
{
name: "standalone month below m10 threshold",
input: "leden",
defaultYear: 2026,
want: []string{"2026-01"},
},
{
name: "standalone month m10 heuristic",
input: "prosinec",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "declension prosince",
input: "prosince",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "declension lednu",
input: "lednu",
defaultYear: 2026,
want: []string{"2026-01"},
},
{
name: "standalone m10 boundary (rijen = October)",
input: "rijen",
defaultYear: 2026,
want: []string{"2025-10"},
},
{
name: "standalone m9 just below boundary (zari = September)",
input: "zari",
defaultYear: 2026,
want: []string{"2026-09"},
},
{
name: "range wrap Nov-Jan",
input: "listopad-leden",
defaultYear: 2026,
want: []string{"2025-11", "2025-12", "2026-01"},
},
{
name: "range wrap starting at October",
input: "rijen-leden",
defaultYear: 2026,
want: []string{"2025-10", "2025-11", "2025-12", "2026-01"},
},
{
name: "range no wrap",
input: "unor-kveten",
defaultYear: 2026,
want: []string{"2026-02", "2026-03", "2026-04", "2026-05"},
},
{
name: "degenerate range same month",
input: "leden-leden",
defaultYear: 2026,
want: []string{"2026-01"},
},
{
name: "range spanning m10 — heuristic does NOT fire for range members",
input: "unor-listopad",
defaultYear: 2026,
want: []string{"2026-02", "2026-03", "2026-04", "2026-05", "2026-06", "2026-07", "2026-08", "2026-09", "2026-10", "2026-11"},
},
{
name: "longest-match alternation cervenec beats cerven",
input: "cervenec-srpen",
defaultYear: 2026,
want: []string{"2026-07", "2026-08"},
},
{
name: "range plus standalone — range excludes, dedup",
input: "listopad-leden, prosinec",
defaultYear: 2026,
want: []string{"2025-11", "2025-12", "2026-01"},
},
{
name: "two standalones no range",
input: "prosinec leden",
defaultYear: 2026,
want: []string{"2025-12", "2026-01"},
},
{
name: "numeric plus range mix",
input: "11+12/2025, leden-brezen",
defaultYear: 2026,
want: []string{"2025-11", "2025-12", "2026-01", "2026-02", "2026-03"},
},
{
name: "dedup across numeric and standalone passes",
input: "11+12/25 a listopad",
defaultYear: 2026,
want: []string{"2025-11", "2025-12"},
},
{
name: "no digits before slash — standalone fires instead",
input: "prosince/2025",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "range with trailing slash-year — numeric fails, range wins",
input: "listopad-prosinec/2025",
defaultYear: 2026,
want: []string{"2026-11", "2026-12"},
},
{
name: "dot pattern only — numeric matches but month out of 1-12 range",
input: "01.2026 / 02.2026",
defaultYear: 2026,
want: []string{"2026-01", "2026-02"},
},
{
name: "leading slash — numeric matches at second slash",
input: "/12/2025",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "uppercase input normalized",
input: "PROSINEC",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "diacritics stripped by Normalize",
input: "Žluťoučký prosinec",
defaultYear: 2026,
want: []string{"2025-12"},
},
{
name: "diacritics in range with spaces around dash",
input: "Únor - květen",
defaultYear: 2026,
want: []string{"2026-02", "2026-03", "2026-04", "2026-05"},
},
{
name: "natural language mixed with numeric and standalone",
input: "platba 11/2025 a leden",
defaultYear: 2026,
want: []string{"2025-11", "2026-01"},
},
{
name: "English month name not recognized",
input: "December",
defaultYear: 2026,
want: []string{},
},
{
name: "duplicate input deduped",
input: "11+12/2025 11+12/2025",
defaultYear: 2026,
want: []string{"2025-11", "2025-12"},
},
{
name: "trailing year without separator ignored",
input: "leden 2026",
defaultYear: 2026,
want: []string{"2026-01"},
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := ParseMonthReferences(tc.input, tc.defaultYear)
if got == nil {
got = []string{}
}
if !reflect.DeepEqual(got, tc.want) {
t.Errorf("ParseMonthReferences(%q, %d)\n got %v\n want %v",
tc.input, tc.defaultYear, got, tc.want)
}
})
}
}

View File

@@ -0,0 +1,34 @@
// Package fees ports fee calculation from scripts/attendance.py.
package fees
const (
AdultFeeDefault = 700 // CZK fallback for 2+ practices when month not in AdultFeeMonthlyRate
AdultFeeSingle = 200 // CZK for exactly 1 practice
)
// AdultFeeMonthlyRate mirrors ADULT_FEE_MONTHLY_RATE in scripts/attendance.py.
// Months absent from this map fall back to AdultFeeDefault.
var AdultFeeMonthlyRate = map[string]int{
"2025-09": 750, "2025-10": 750, "2025-11": 750, "2025-12": 750,
"2026-01": 750, "2026-02": 750, "2026-03": 350,
"2026-04": 700, "2026-05": 700,
}
// CalculateFee returns the adult fee in CZK for attendanceCount practices in
// the given monthKey (format "YYYY-MM").
//
// 0 practices → 0
// 1 practice → AdultFeeSingle (200)
// 2+ → AdultFeeMonthlyRate[monthKey] or AdultFeeDefault
func CalculateFee(attendanceCount int, monthKey string) int {
if attendanceCount == 0 {
return 0
}
if attendanceCount == 1 {
return AdultFeeSingle
}
if rate, ok := AdultFeeMonthlyRate[monthKey]; ok {
return rate
}
return AdultFeeDefault
}

View File

@@ -0,0 +1,37 @@
package fees
import "testing"
func TestCalculateFee(t *testing.T) {
t.Parallel()
// All expected outputs verified against live Python implementation on 2026-05-06:
// PYTHONPATH=scripts:. python -c 'from attendance import calculate_fee; print([calculate_fee(c,m) for c,m in [(0,"2026-05"),(0,""),(1,"2026-05"),(1,"unknown"),(2,"2026-05"),(2,"2026-03"),(2,"2025-09"),(5,"2026-05"),(2,"2027-01"),(2,"")]])'
tests := []struct {
name string
count int
month string
want int
}{
{"zero short-circuits", 0, "2026-05", 0},
{"zero empty month", 0, "", 0},
{"single practice", 1, "2026-05", 200},
{"single ignores monthKey", 1, "unknown", 200},
{"two practices configured month", 2, "2026-05", 700},
{"two practices reduced march", 2, "2026-03", 350},
{"two practices early season", 2, "2025-09", 750},
{"high count same as two", 5, "2026-05", 700},
{"unknown future month falls back", 2, "2027-01", 700},
{"empty month falls back", 2, "", 700},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := CalculateFee(tc.count, tc.month)
if got != tc.want {
t.Errorf("CalculateFee(%d, %q) = %d, want %d", tc.count, tc.month, got, tc.want)
}
})
}
}

View File

@@ -0,0 +1,37 @@
package fees
const JuniorFeeDefault = 500 // CZK fallback for 2+ practices when month not in JuniorFeeMonthlyRate
// JuniorFeeMonthlyRate mirrors JUNIOR_MONTHLY_RATE in scripts/attendance.py.
// Months absent from this map fall back to JuniorFeeDefault.
var JuniorFeeMonthlyRate = map[string]int{
"2025-09": 250,
"2026-03": 250,
}
// Expected is the result of a junior fee calculation.
// When Unknown is true the fee requires manual review (Python returns "?");
// in that case Value is meaningless — always check Unknown first.
type Expected struct {
Value int
Unknown bool
}
// CalculateJuniorFee returns the junior fee for attendanceCount practices in
// the given monthKey (format "YYYY-MM").
//
// 0 practices → Expected{Value: 0}
// 1 practice → Expected{Unknown: true} (manual review; Python sentinel "?")
// 2+ → Expected{Value: JuniorFeeMonthlyRate[monthKey] or JuniorFeeDefault}
func CalculateJuniorFee(attendanceCount int, monthKey string) Expected {
if attendanceCount == 0 {
return Expected{Value: 0}
}
if attendanceCount == 1 {
return Expected{Unknown: true}
}
if rate, ok := JuniorFeeMonthlyRate[monthKey]; ok {
return Expected{Value: rate}
}
return Expected{Value: JuniorFeeDefault}
}

View File

@@ -0,0 +1,37 @@
package fees
import "testing"
func TestCalculateJuniorFee(t *testing.T) {
t.Parallel()
// All expected outputs verified against live Python implementation on 2026-05-06:
// PYTHONPATH=scripts:. python -c 'from attendance import calculate_junior_fee; print([calculate_junior_fee(c,m) for c,m in [(0,"2026-05"),(0,""),(1,"2026-05"),(1,"unknown"),(2,"2026-05"),(2,"2025-09"),(2,"2026-03"),(5,"2025-09"),(2,"2027-01"),(2,"")]])'
tests := []struct {
name string
count int
month string
want Expected
}{
{"zero short-circuits", 0, "2026-05", Expected{Value: 0}},
{"zero empty month", 0, "", Expected{Value: 0}},
{"single practice sentinel", 1, "2026-05", Expected{Unknown: true}},
{"single ignores monthKey", 1, "unknown", Expected{Unknown: true}},
{"two practices default month", 2, "2026-05", Expected{Value: 500}},
{"two practices reduced sept", 2, "2025-09", Expected{Value: 250}},
{"two practices reduced march", 2, "2026-03", Expected{Value: 250}},
{"high count same as two", 5, "2025-09", Expected{Value: 250}},
{"unknown future month falls back", 2, "2027-01", Expected{Value: 500}},
{"empty month falls back", 2, "", Expected{Value: 500}},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := CalculateJuniorFee(tc.count, tc.month)
if got != tc.want {
t.Errorf("CalculateJuniorFee(%d, %q) = %+v, want %+v", tc.count, tc.month, got, tc.want)
}
})
}
}

View File

@@ -0,0 +1,49 @@
// Package money ports Czech-locale currency parsing from scripts/infer_payments.py.
package money
import (
"errors"
"strconv"
"strings"
)
// ErrInvalidAmount is returned by ParseCZK when the input cannot be parsed.
var ErrInvalidAmount = errors.New("money: invalid CZK amount")
// ParseCZK parses a Czech-locale amount string and returns the value in CZK
// as a float64. Mirrors scripts/infer_payments.py parse_czk_amount:
//
// - empty input → (0, nil)
// - "Kč"/"CZK" suffixes stripped (case-sensitive, like Python)
// - comma present → comma is decimal sep, dots/spaces are thousand seps
// ("1.500,00" → 1500.0)
// - no comma, 2+ dots → all dots are thousand seps ("1.500.000" → 1500000.0)
// - no comma, ≤1 dot → dot is decimal sep ("1.500" → 1.5)
// - on parse failure → (0, ErrInvalidAmount); callers wanting Python's
// silent-zero behaviour can discard the error: v, _ := ParseCZK(s)
func ParseCZK(s string) (float64, error) {
if s == "" {
return 0, nil
}
s = strings.ReplaceAll(s, "Kč", "")
s = strings.ReplaceAll(s, "CZK", "")
s = strings.TrimSpace(s)
if strings.ContainsRune(s, ',') {
s = strings.ReplaceAll(s, ".", "")
s = strings.ReplaceAll(s, " ", "")
s = strings.ReplaceAll(s, ",", ".")
} else if strings.Count(s, ".") > 1 {
s = strings.ReplaceAll(s, ".", "")
s = strings.ReplaceAll(s, " ", "")
} else {
s = strings.ReplaceAll(s, " ", "")
}
v, err := strconv.ParseFloat(s, 64)
if err != nil {
return 0, ErrInvalidAmount
}
return v, nil
}

View File

@@ -0,0 +1,67 @@
package money
import (
"testing"
)
func TestParseCZK(t *testing.T) {
t.Parallel()
// All expected outputs verified against live Python implementation on 2026-05-06:
// PYTHONPATH=scripts:. python -c '
// from infer_payments import parse_czk_amount
// for v in [None, "", "0", "500", "500 Kč", "500 CZK",
// "1 500", "1500.00", "1 500.00",
// "1.500,00", "1500,5", "1.500.000",
// "1.500", "abc", " ", "100,5 Kč"]:
// print(repr(v), "->", parse_czk_amount(v))
// '
tests := []struct {
name string
input string
want float64
wantErr bool
}{
{"empty string", "", 0, false},
{"zero string", "0", 0, false},
{"plain integer", "500", 500, false},
{"with Kč suffix", "500 Kč", 500, false},
{"with CZK suffix", "500 CZK", 500, false},
{"space thousand sep", "1 500", 1500, false},
{"dot decimal", "1500.00", 1500, false},
{"space thousands dot decimal", "1 500.00", 1500, false},
{"dot thousand comma decimal", "1.500,00", 1500, false},
{"comma decimal no thousands", "1500,5", 1500.5, false},
{"multiple dot thousand seps", "1.500.000", 1500000, false},
{"single dot is decimal heuristic", "1.500", 1.5, false},
{"comma decimal with Kč", "100,5 Kč", 100.5, false},
{"garbage text", "abc", 0, true},
{"spaces only", " ", 0, true},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got, err := ParseCZK(tc.input)
if (err != nil) != tc.wantErr {
t.Errorf("ParseCZK(%q) error = %v, wantErr %v", tc.input, err, tc.wantErr)
}
if got != tc.want {
t.Errorf("ParseCZK(%q) = %v, want %v", tc.input, got, tc.want)
}
})
}
}
// TestParseCZKSilentZero documents that discarding the error recovers Python's
// silent-zero behaviour for any garbage input.
func TestParseCZKSilentZero(t *testing.T) {
t.Parallel()
for _, s := range []string{"abc", " ", "Kč", "CZK"} {
v, _ := ParseCZK(s)
if v != 0 {
t.Errorf("ParseCZK(%q) silent-zero: got %v, want 0", s, v)
}
}
}

View File

@@ -0,0 +1,65 @@
// Package synch ports the bank-sync deduplication helper from
// scripts/sync_fio_to_sheets.py.
package synch
import (
"crypto/sha256"
"encoding/hex"
"math"
"strconv"
"strings"
)
// Transaction is the projection of a Fio transaction that participates
// in the Sync ID hash. Other fields (ks, ss, sender_account, …) are
// intentionally excluded — they are not part of the Python hash.
//
// Currency: leave "" to inherit the Python default of "CZK" (matches
// the HTML scraper path which omits the key entirely).
type Transaction struct {
Date string
Amount float64
Currency string
Sender string
VS string
Message string
BankID string
}
// GenerateSyncID returns the lowercase SHA-256 hex digest of
// "date|amount|currency|sender|vs|message|bank_id" (lower-cased), used
// as the dedup key in column K of the payments sheet.
//
// Byte-stable with scripts/sync_fio_to_sheets.py generate_sync_id.
func GenerateSyncID(tx Transaction) string {
currency := tx.Currency
if currency == "" {
currency = "CZK"
}
raw := strings.ToLower(strings.Join([]string{
tx.Date,
formatAmount(tx.Amount),
currency,
tx.Sender,
tx.VS,
tx.Message,
tx.BankID,
}, "|"))
sum := sha256.Sum256([]byte(raw))
return hex.EncodeToString(sum[:])
}
// formatAmount mimics Python's str(float) for Fio transaction amounts.
// Python uses decimal notation for abs(f) in [1e-4, 1e16) and scientific
// notation outside that range, always adding ".0" to whole-valued decimals.
func formatAmount(f float64) string {
abs := math.Abs(f)
if abs != 0 && (abs < 1e-4 || abs >= 1e16) {
return strconv.FormatFloat(f, 'e', -1, 64)
}
s := strconv.FormatFloat(f, 'f', -1, 64)
if !strings.ContainsRune(s, '.') {
s += ".0"
}
return s
}

View File

@@ -0,0 +1,119 @@
package synch
import (
"testing"
)
// All expected digests verified against the live Python implementation on 2026-05-06:
//
// PYTHONPATH=scripts:. python -c '
// from sync_fio_to_sheets import generate_sync_id
// cases = [
// {"date":"2026-01-15","amount":500.0,"currency":"CZK","sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"},
// {"date":"2026-01-15","amount":500.0,"sender":"Jan Novak","vs":"123","message":"clenske 1/2026","bank_id":"abc123"},
// {"date":"2026-02-10","amount":1234.56,"currency":"CZK","sender":"ABC SRO","vs":"","message":"FAKTURA 42","bank_id":"xyz"},
// {"date":"2026-03-01","amount":-500.0,"currency":"CZK","sender":"refund","vs":"","message":"","bank_id":""},
// {"date":"2026-04-01","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""},
// {"date":"","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""},
// ]
// for c in cases: print(generate_sync_id(c))
// '
func TestGenerateSyncID(t *testing.T) {
t.Parallel()
cases := []struct {
name string
tx Transaction
want string
}{
{
name: "all fields set",
tx: Transaction{
Date: "2026-01-15", Amount: 500.0, Currency: "CZK",
Sender: "Jan Novak", VS: "123", Message: "clenske 1/2026", BankID: "abc123",
},
want: "4ac26598b6f23965380690172156a438a7e97a97dcedf222e5afe1afbe2c1bc4",
},
{
name: "currency empty defaults to CZK",
tx: Transaction{
Date: "2026-01-15", Amount: 500.0, Currency: "",
Sender: "Jan Novak", VS: "123", Message: "clenske 1/2026", BankID: "abc123",
},
want: "4ac26598b6f23965380690172156a438a7e97a97dcedf222e5afe1afbe2c1bc4",
},
{
name: "mixed-case fields lowercased before hashing",
tx: Transaction{
Date: "2026-02-10", Amount: 1234.56, Currency: "CZK",
Sender: "ABC SRO", VS: "", Message: "FAKTURA 42", BankID: "xyz",
},
want: "d40fa224d4fa572ffcd58e308e5c6508c4d5ca087b24ef6ff9284528fc128250",
},
{
name: "negative amount",
tx: Transaction{
Date: "2026-03-01", Amount: -500.0, Currency: "CZK",
Sender: "refund", VS: "", Message: "", BankID: "",
},
want: "0c630a407160367c396a2beec08efb94c319b4d84a8b90cc2be89e6ea10c391f",
},
{
name: "zero amount",
tx: Transaction{
Date: "2026-04-01", Amount: 0.0, Currency: "CZK",
Sender: "", VS: "", Message: "", BankID: "",
},
want: "6a23ce53717cd539064d550d2c2ec5de2e9bf81016d16852820ca9b8e259331f",
},
{
// Python equivalent: {"date":"","amount":0.0,"currency":"CZK","sender":"","vs":"","message":"","bank_id":""}
// Note: Python generate_sync_id({}) hashes "" for missing amount, not "0.0".
name: "zero-value Transaction",
tx: Transaction{},
want: "d33d7e391f5a43f0192bb5a34c0ec15715139125678ecef8e1324af7d943b21d",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
got := GenerateSyncID(tc.tx)
if got != tc.want {
t.Errorf("GenerateSyncID(%+v) = %q, want %q", tc.tx, got, tc.want)
}
})
}
}
// All expected strings verified against the live Python implementation on 2026-05-06:
//
// PYTHONPATH=scripts:. python -c '
// for v in [0.0, 500.0, -500.0, 0.1, 1234.56, 99999.99, 1500000.0, 1e16, 1e-5]:
// print(repr(v), "->", repr(str(v)))
// '
func TestFormatAmount(t *testing.T) {
t.Parallel()
cases := []struct {
in float64
want string
}{
{0.0, "0.0"},
{500.0, "500.0"},
{-500.0, "-500.0"},
{0.1, "0.1"},
{1234.56, "1234.56"},
{99999.99, "99999.99"},
{1500000.0, "1500000.0"},
{1e16, "1e+16"},
{1e-5, "1e-05"},
}
for _, tc := range cases {
got := formatAmount(tc.in)
if got != tc.want {
t.Errorf("formatAmount(%v) = %q, want %q", tc.in, got, tc.want)
}
}
}