Add Bazoš.cz scraper + project docs #7

Merged
kacerr merged 2 commits from feature/bazos-scraper into main 2026-03-09 10:28:33 +00:00

Summary

  • Add scrape_bazos.py — new scraper for reality.bazos.cz (HTML parsing with regex, pagination, GPS from detail pages)
  • Integrate Bazoš into merge_and_map.py (deduplication), scrape_and_map.py (purple markers), and run_all.sh (pipeline step 6/7)
  • Add CLAUDE.md project documentation for automatic context in new sessions
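
The regex-based parsing mentioned above (GPS from Google Maps links on detail pages, area and floor from free text) might look roughly like the sketch below. This is not the code in `scrape_bazos.py`: the function names, the regular expressions, and the assumed link shape (`.../maps/place/<lat>,<lon>`) are all hypothetical illustrations.

```python
import re

# Assumed link shape: a Google Maps URL that carries "lat,lon" somewhere after /maps.
GPS_RE = re.compile(
    r"google\.[a-z.]+/maps[^\"'\s]*?(-?\d{1,2}\.\d+),\s*(-?\d{1,3}\.\d+)"
)
# Assumed free-text conventions: "82 m2" / "82 m²" and "4. patro" / "4. NP".
AREA_RE = re.compile(r"(\d{2,4})(?:[.,]\d+)?\s*m(?:2|²)", re.IGNORECASE)
FLOOR_RE = re.compile(r"(\d{1,2})\.\s*(?:patro|patře|np|podlaží)", re.IGNORECASE)


def extract_gps(html):
    """Return (lat, lon) from the first Google Maps link found, or None."""
    match = GPS_RE.search(html)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def parse_area_m2(text):
    """Return the first area in whole square metres mentioned in text, or None."""
    match = AREA_RE.search(text)
    return int(match.group(1)) if match else None


def parse_floor(text):
    """Return the first floor number mentioned in text, or None."""
    match = FLOOR_RE.search(text)
    return int(match.group(1)) if match else None
```

Returning `None` on a miss lets the caller decide whether to skip a listing or keep it with missing fields.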

Filters applied

  • Dispositions: 3+kk, 3+1, 4+kk, 4+1, 5+kk, 5+1, 6+kk, 6+1
  • Area ≥ 69 m², floor ≥ 2
  • Excludes panel buildings and sídliště
  • Max price 14M CZK, radius 25 km from Prague
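
As an illustration only, the filters listed above could be combined into a single predicate like the one below. The `Listing` fields, the Prague reference point, and the naive keyword check for panel buildings and sídliště are assumptions made for this sketch, not details taken from the scraper.

```python
import math
from dataclasses import dataclass

ALLOWED_DISPOSITIONS = {"3+kk", "3+1", "4+kk", "4+1", "5+kk", "5+1", "6+kk", "6+1"}
MIN_AREA_M2 = 69
MIN_FLOOR = 2
MAX_PRICE_CZK = 14_000_000
MAX_RADIUS_KM = 25
PRAGUE_CENTER = (50.0755, 14.4378)  # assumed reference point for the radius
EXCLUDED_KEYWORDS = ("panel", "sídliště")  # assumed: naive substring match


@dataclass
class Listing:  # hypothetical record shape
    disposition: str
    area_m2: float
    floor: int
    price_czk: int
    lat: float
    lon: float
    description: str = ""


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def passes_filters(listing):
    """Return True when the listing satisfies every filter above."""
    text = listing.description.lower()
    if any(keyword in text for keyword in EXCLUDED_KEYWORDS):
        return False
    distance = haversine_km(listing.lat, listing.lon, *PRAGUE_CENTER)
    return (
        listing.disposition in ALLOWED_DISPOSITIONS
        and listing.area_m2 >= MIN_AREA_M2
        and listing.floor >= MIN_FLOOR
        and listing.price_czk <= MAX_PRICE_CZK
        and distance <= MAX_RADIUS_KM
    )
```

A substring check like this would also reject a listing that merely says it is not a panel building, so a real implementation would likely need a more careful text match.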

Test plan

  • python3 scrape_bazos.py --max-pages 1 --max-properties 5: listings parsed, details fetched, GPS resolved
  • python3 scrape_bazos.py --max-pages 3 --max-properties 15: pagination works (60 unique listings from 3 pages)
  • python3 merge_and_map.py: Bazoš listings merged, map generated
  • Map shows purple Bazoš markers
  • Cache reuse works on re-run
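
For readers unfamiliar with the two flags used in the test plan, a minimal `argparse` sketch of such a command line follows. Only the flag names `--max-pages` and `--max-properties` come from the commands above; the defaults, help texts, and the `build_parser` helper are assumptions, and the actual script may define them differently.

```python
import argparse


def build_parser():
    """Build a parser for the two limit flags used in the test plan."""
    parser = argparse.ArgumentParser(description="Bazoš scraper (illustrative sketch)")
    # Defaults of None (meaning "no limit") are an assumption of this sketch.
    parser.add_argument("--max-pages", type=int, default=None,
                        help="stop after this many result pages")
    parser.add_argument("--max-properties", type=int, default=None,
                        help="stop after fetching details for this many listings")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"max_pages={args.max_pages} max_properties={args.max_properties}")
```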

🤖 Generated with Claude Code

littlemeat added 2 commits 2026-03-06 09:00:37 +00:00
New scraper for reality.bazos.cz with full HTML parsing (no API),
GPS extraction from Google Maps links, panel/sídliště filtering,
floor/area parsing from free text, and pagination fix for Bazoš's
numeric locality codes. Integrated into merge pipeline and map
with purple (#7B1FA2) markers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Provides automatic context loading for new Claude Code sessions,
documenting architecture, filters, sources, and conventions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
kacerr merged commit 212a561e65 into main 2026-03-09 10:28:33 +00:00
Reference: littlemeat/maru-hleda-byt#7