Add first_seen/last_updated timestamps to track property freshness

Each property record now carries two date fields:
- first_seen: date the listing first appeared (preserved across runs)
- last_updated: date of the most recent scrape that included it

All 6 scrapers (Sreality, Realingo, Bezrealitky, iDNES, PSN, CityHome)
set these fields during scraping. Cached results preserve first_seen and
refresh last_updated. PSN and CityHome gain a load_previous() helper to
track first_seen across runs (they lacked caching before).
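The caching behaviour described above can be sketched roughly as follows. Only `load_previous()` is named in this commit; the cache path, JSON layout, and the `stamp()` helper are illustrative assumptions, not the scrapers' actual code:

```python
import json
import os
from datetime import date

def load_previous(path="psn_cache.json"):
    """Load last run's listings keyed by URL (hypothetical cache file/layout)."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return {item["url"]: item for item in json.load(f)}

def stamp(listing, previous):
    """Set the two date fields: first_seen survives across runs via the cache,
    last_updated is always the current scrape date."""
    today = date.today().isoformat()
    prior = previous.get(listing["url"])
    if prior and prior.get("first_seen"):
        listing["first_seen"] = prior["first_seen"]  # preserved from earlier run
    else:
        listing["first_seen"] = today  # first time we've seen this listing
    listing["last_updated"] = today
    return listing
```

A listing found in the cache keeps its original `first_seen`; a brand-new one gets today's date for both fields.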

The merge script keeps the earliest first_seen and latest last_updated
when deduplicating listings across sources.
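The dedup rule can be shown in isolation. `merge_timestamps` is a hypothetical stand-in for the inline logic in the merge script; it relies on ISO-8601 dates ("YYYY-MM-DD") comparing correctly as plain strings, with empty strings meaning "unknown" and never winning:

```python
def merge_timestamps(a, b):
    """Combine two duplicate records: earliest first_seen, latest last_updated.
    Missing/empty dates are ignored rather than compared."""
    firsts = [d for d in (a.get("first_seen", ""), b.get("first_seen", "")) if d]
    updates = [d for d in (a.get("last_updated", ""), b.get("last_updated", "")) if d]
    return {
        "first_seen": min(firsts) if firsts else "",
        "last_updated": max(updates) if updates else "",
    }
```

Lexicographic `min`/`max` is safe here only because the dates are zero-padded ISO strings.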

The HTML map now shows dates in popups ("Přidáno: DD.MM.YYYY"), displays
a green "NOVÉ" badge on newly discovered listings, and adds a "Přidáno"
dropdown filter (24h / 3 days / 7 days / 14 days) for spotting new ones.
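The badge and filter both reduce to "how many days since first_seen". The map itself does this in JavaScript; the sketch below restates the date logic in Python with illustrative names, and approximates the 24h window as one day:

```python
from datetime import date

def days_since(first_seen, today=None):
    """Age of a listing in days, from an ISO 'YYYY-MM-DD' first_seen date."""
    today = today or date.today()
    return (today - date.fromisoformat(first_seen)).days

def is_new(first_seen, max_days=3, today=None):
    """Whether the listing gets the green 'NOVÉ' badge (threshold assumed)."""
    return days_since(first_seen, today) <= max_days

def matches_filter(first_seen, window_days, today=None):
    """'Přidáno' dropdown: keep listings first seen within the chosen window
    (1 / 3 / 7 / 14 days); None means the filter is off."""
    return window_days is None or days_since(first_seen, today) <= window_days
```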

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jan Novak
2026-02-15 21:03:08 +01:00
parent c6089f0da9
commit 0b95c847c4
9 changed files with 1604 additions and 11509 deletions

@@ -79,6 +79,19 @@ def main():
         if key in seen_keys:
             dupes += 1
             existing = seen_keys[key]
+            # Merge timestamps: keep earliest first_seen, latest last_updated
+            e_first = e.get("first_seen", "")
+            ex_first = existing.get("first_seen", "")
+            if e_first and ex_first:
+                existing["first_seen"] = min(e_first, ex_first)
+            elif e_first:
+                existing["first_seen"] = e_first
+            e_updated = e.get("last_updated", "")
+            ex_updated = existing.get("last_updated", "")
+            if e_updated and ex_updated:
+                existing["last_updated"] = max(e_updated, ex_updated)
+            elif e_updated:
+                existing["last_updated"] = e_updated
             # Log it
             print(f" Duplikát: {e['locality']} | {format_price(e['price'])} | {e.get('area', '?')} "
                   f"({e.get('source', '?')} vs {existing.get('source', '?')})")