Adds internal/domain/czech.Normalize, the first pure-domain function in the Go rewrite (M2 milestone). Matches Python czech_utils.normalize byte-for-byte: NFKD decompose via golang.org/x/text/unicode/norm, drop Mn-category combining marks (unicode.Mn, not IsMark, to match Python's unicodedata.combining() semantics), then strings.ToLower.

Includes 13-case table-driven test; all inputs spot-checked against the Python implementation before locking. Adds golang.org/x/text v0.36.0 as first external dependency.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
27 lines
586 B
Go
package czech

import (
	"strings"
	"unicode"

	"golang.org/x/text/unicode/norm"
)

// Normalize strips diacritics and lowercases s.
//
// Matches Python: unicodedata.normalize("NFKD", s) then filter out
// combining characters (unicode.Mn only, not Mc/Me, which have
// combining class 0 in Python's unicodedata.combining()).
func Normalize(s string) string {
	decomposed := norm.NFKD.String(s)
	var b strings.Builder
	b.Grow(len(decomposed))
	for _, r := range decomposed {
		if unicode.In(r, unicode.Mn) {
			continue
		}
		b.WriteRune(r)
	}
	return strings.ToLower(b.String())
}
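The choice of unicode.Mn over unicode.IsMark in the filter above can be seen in isolation with a small standard-library-only sketch. It is not part of the commit: the helper names stripMn and stripAllMarks are hypothetical, and the inputs are written in already-decomposed form so that golang.org/x/text is not needed here. It only illustrates how the two predicates differ on a spacing mark (category Mc), which a Czech string would not normally contain.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripMn (hypothetical helper) drops nonspacing marks (category Mn) and keeps
// every other rune. It assumes s is already decomposed; it does no NFKD itself.
func stripMn(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for _, r := range s {
		if unicode.Is(unicode.Mn, r) {
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// stripAllMarks (hypothetical helper) is the broader alternative: unicode.IsMark
// reports true for Mn, Mc, and Me, so spacing and enclosing marks are dropped too.
func stripAllMarks(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for _, r := range s {
		if unicode.IsMark(r) {
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	// Decomposed "c + caron, a + acute, p": both combining marks are Mn.
	czech := "c\u030Ca\u0301p"
	fmt.Println(stripMn(czech) == "cap")       // true
	fmt.Println(stripAllMarks(czech) == "cap") // true: no difference for Czech input

	// U+0915 DEVANAGARI LETTER KA followed by U+0903 DEVANAGARI SIGN VISARGA (Mc).
	visarga := "\u0915\u0903"
	fmt.Println(stripMn(visarga) == visarga)        // true: the Mc mark is kept
	fmt.Println(stripAllMarks(visarga) == "\u0915") // true: IsMark removes it
}
```

For Czech text the two filters agree. They diverge only on input carrying Mc or Me marks, which is where matching the Python behaviour depends on the narrower category.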