docs: add caching outcomes summary to prompts directory
This commit is contained in:
29
prompts/outcomes/2026-03-10-cache-data-from-google-sheets.md
Normal file
29
prompts/outcomes/2026-03-10-cache-data-from-google-sheets.md
Normal file
@@ -0,0 +1,29 @@
|
|||||||
|
# Google Sheets Data Caching Implementation
|
||||||
|
|
||||||
|
**Date:** 2026-03-11
|
||||||
|
**Objective:** Optimize Flask application performance by heavily caching expensive Google Sheets data processing, avoiding redundant HTTP roundtrips to Google APIs, and ensuring rate limits are not exhausted during simple web app reloads.
|
||||||
|
|
||||||
|
## Implemented Features
|
||||||
|
|
||||||
|
### 1. File-Based JSON Caching (`cache_utils.py`)
|
||||||
|
- **Mechanism:** Implemented a new generic caching system that saves API responses and heavily calculated datasets as `.json` files directly to the local `/tmp/` directory.
|
||||||
|
- **Drive Metadata Checks:** The cache is validated by asking the Google Drive API (`drive.files().get`) for the remote `modifiedTime` of the target Sheet.
|
||||||
|
- **Cache Hit logic:** If the cached version on disk matches the remote `modifiedTime`, the application skips downloading the full CSV payload and computing tuples—instead serving the instant static cache via `json.load`.
|
||||||
|
|
||||||
|
### 2. Global API Auth Object Reuse
|
||||||
|
- **The Problem:** The `_get_drive_service()` and `get_sheets_service()` implementations were completely rebuilding `googleapiclient.discovery` objects for *every single file check*—re-seeking and exchanging Google Service Account tokens constantly.
|
||||||
|
- **The Fix:** Service objects (`_DRIVE_SERVICE`, `_SHEETS_SERVICE`) are now globally cached in application memory. The server authenticates exactly *once* when it wakes up, dramatically saving milliseconds and network resources across every web request. The underlying `httplib2` and `google-auth` intelligently handle silent token refreshes natively.
|
||||||
|
|
||||||
|
### 3. Graceful Configurable Rate Limiting
|
||||||
|
- **In-Memory Debouncing:** Implemented an internal memory state (`_LAST_CHECKED`) inside `cache_utils` that forcefully prevents checking the Drive API `modifiedTime` for a specific file if we already explicitly checked it within the last 5 minutes. This prevents flooding the Google Drive API while clicking wildly around the app GUI.
|
||||||
|
- **Semantic Mappings:** Created a `CACHE_SHEET_MAP` that maps friendly internal cache keys (e.g. `attendance_regular`) back to their raw 44-character Google Sheet IDs.
|
||||||
|
|
||||||
|
### 4. HTTP / Socket Timeout Safety Fix
|
||||||
|
- **The Bug:** Originally, `socket.setdefaulttimeout(10)` was used to prevent Google Drive metadata checks from locking up the worker pool. However, this brutally mutated the underlying Werkzeug/Flask default sockets globally. If fetching thousands of lines from Google *Sheets* (the payload logic) took longer than 10 seconds, Flask would just kill the request with a random `TimeoutError('timed out')`.
|
||||||
|
- **The Fix:** Removed the global mutation. Instantiated a targeted, isolated `httplib2.Http(timeout=10)` injected *specifically* into only the Google Drive API build. The rest of the app can now download massive files without randomly timing out.
|
||||||
|
|
||||||
|
### 5. Developer Experience (DX) Enhancements
|
||||||
|
- **Logging Line Origins:** Enriched the console logging format strings (`logging.basicConfig`) to output `[%(funcName)s]` and `%(filename)s:%(lineno)d` to easily trace exactly which exact file and function is executing on complex stack traces.
|
||||||
|
- **Improved VS Code Local Debugging:**
|
||||||
|
- Integrated `debugpy` launch profiles in `.vscode/launch.json` for "Python Debugger: Flask" (Launching) and "Python Debugger: Attach" (Connecting).
|
||||||
|
- Implemented a standard `make web-attach` target inside the Makefile via `uv run python -m debugpy --listen ...` to allow the background web app to automatically halt and wait for external debuggers before bootstrapping caching layers.
|
||||||
Reference in New Issue
Block a user