April 9: Killing the Composite God Function
Finally cracked open the composite route and split it into focused pipeline stages. Also gated window.jwst debug globals behind dev-only builds, replaced hand-rolled caches with TTLCache, bounded polling retries, and surfaced the errors that frontend components were quietly swallowing.
Developer Journal
Splitting the composite god function (#1091)
The /composite route was the classic 800-line procedural monster — load FITS, reproject, background-subtract, stretch, color-map, blend, luminance-combine, post-stretch, gamma-encode, write output, update job, return. All one function, all in one control flow, impossible to test in isolation.
Refactored into a staged pipeline where each stage is a named function with a clear input/output contract and its own test file. Breakdown:
- Load FITS → memmap, extract 2D, NaN→0, budget-aware downscale
- Reproject → common WCS grid via reproject_interp (bilinear)
- Background neutralization → sigma-clipped median subtraction (optional)
- Per-channel stretch → zscale/asinh/log/sqrt/power/histeq/linear + black/white points + gamma + tone curves
- Color mapping → hue-based or explicit RGB weights, global-max normalization
- Multi-instrument blending → CIELAB lerp with edge feathering
- Luminance blending → HSL lightness replacement (LRGB)
- Post-stack stretch → overall adjustments
- sRGB gamma encoding → display-ready output
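As an illustration of how small a single stage can be, here is a minimal sketch of the last step using the standard piecewise sRGB transfer function. The function name `srgb_encode` is hypothetical, and the journal does not say whether the real stage uses this piecewise form or a plain power-law approximation, so treat it as an assumption.

```python
def srgb_encode(linear: float) -> float:
    """Map a linear-light value to an sRGB-encoded value, both in [0, 1]."""
    # Clamp first so out-of-range pixels cannot produce values outside [0, 1].
    c = min(max(linear, 0.0), 1.0)
    if c <= 0.0031308:
        # Linear toe for very dark values.
        return 12.92 * c
    # Power-law segment for everything else.
    return 1.055 * c ** (1 / 2.4) - 0.055
```

In a real pipeline the same formula would be applied to whole arrays at once rather than one value at a time.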
The route handler now just builds the pipeline context, runs the stages in sequence, handles the job progress updates, and writes the final output. Each stage is independently testable and can be swapped or extended without touching the others. Issues #963 and #1002 were both asking for this refactor from different angles — both closed with this PR.
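The shape of that handler can be sketched roughly as follows. This is not the code from the PR: `PipelineContext`, `run_pipeline`, the two stand-in stages, and the progress callback are all hypothetical names, and plain lists stand in for image arrays so the sketch has no dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineContext:
    """Hypothetical state handed from one stage to the next."""
    pixels: list[float]
    completed: list[str] = field(default_factory=list)


Stage = Callable[[PipelineContext], PipelineContext]


def neutralize_background(ctx: PipelineContext) -> PipelineContext:
    # Stand-in stage: subtract the median and clip negatives to zero.
    ordered = sorted(ctx.pixels)
    median = ordered[len(ordered) // 2]
    ctx.pixels = [max(p - median, 0.0) for p in ctx.pixels]
    return ctx


def normalize(ctx: PipelineContext) -> PipelineContext:
    # Stand-in stage: scale so the brightest pixel becomes 1.0.
    peak = max(ctx.pixels) or 1.0  # avoid dividing by zero on a blank frame
    ctx.pixels = [p / peak for p in ctx.pixels]
    return ctx


def run_pipeline(
    ctx: PipelineContext,
    stages: list[Stage],
    on_progress: Callable[[float], None],
) -> PipelineContext:
    """Run each stage in order and report the completed fraction after each."""
    for index, stage in enumerate(stages, start=1):
        ctx = stage(ctx)
        ctx.completed.append(stage.__name__)
        on_progress(index / len(stages))
    return ctx
```

Because every stage has the same signature, a test can call one stage on a tiny context without going through the route at all, and reordering or replacing a stage is a change to the list rather than to the runner.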
TTLCache replaces hand-rolled caches (#1087)
mast/routes.py had three hand-rolled caches, each built from a dict, a lock, and an eviction timer. All three had the same bug: eviction fired on time elapsed since the last write rather than since the last read, so hot keys kept getting evicted. Replaced all three with cachetools.TTLCache, which handles time tracking and expiry correctly. TTLCache is not thread-safe on its own, so shared access still needs a lock.
Sized each cache based on expected cardinality × entry size and documented the budget in comments. Net code reduction: ~140 lines.
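A minimal sketch of what one of the replacement caches might look like. The cache name, the `maxsize` and `ttl` values, and the `search_mast` function are made up for illustration; only `TTLCache` and the `cached` decorator with its `lock` argument come from cachetools. Note that `TTLCache` counts each entry's lifetime from the moment it was inserted.

```python
import threading
import time

from cachetools import TTLCache, cached

# Hypothetical budget: at most 256 entries, each valid for 300 s after insertion.
search_cache = TTLCache(maxsize=256, ttl=300, timer=time.monotonic)
# TTLCache is not thread-safe by itself, so the decorator is given a lock.
search_lock = threading.Lock()


@cached(search_cache, lock=search_lock)
def search_mast(target: str) -> dict:
    # Placeholder for an expensive upstream request.
    return {"target": target, "fetched_at": time.monotonic()}
```

Repeated calls with the same argument inside the TTL window return the stored result without running the function body again.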
Gate window.jwst debug globals (#1086)
The window.jwst debug helpers were shipping to production. They were already wrapped in if (typeof window !== 'undefined'), but that just gates against SSR — in the browser they were always live, exposing internal state and functions to anyone who opened devtools. Gated behind import.meta.env.DEV so they only exist in dev builds. Tracking issue #1040 is still open for the broader cleanup (removing the helpers entirely before community release); this PR is the immediate "don't leak internals to prod users" fix.
Bound polling retries in useJobProgress (#1088)
useJobProgress retried failed polls indefinitely. If the backend went down mid-job, the tab sat there burning requests forever. Added an error budget: if the job endpoint fails N times in a row (default 5), surface the error to the UI, stop polling, and let the user retry manually. Transient failures (1–2 in a row) still get absorbed silently.
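The error budget boils down to a counter of consecutive failures that resets on any success. The real hook lives in the frontend; the sketch below expresses only the counting logic in Python, with hypothetical names (`PollErrorBudget`, `poll_once`) and a caller-supplied `fetch` function standing in for the job endpoint.

```python
from typing import Callable, Optional


class PollErrorBudget:
    """Track consecutive poll failures and say when to stop polling."""

    def __init__(self, max_consecutive_failures: int = 5) -> None:
        self.max_consecutive_failures = max_consecutive_failures
        self.failures = 0
        self.last_error: Optional[Exception] = None

    @property
    def exhausted(self) -> bool:
        return self.failures >= self.max_consecutive_failures

    def record_success(self) -> None:
        # Any successful poll wipes out earlier transient failures.
        self.failures = 0
        self.last_error = None

    def record_failure(self, error: Exception) -> None:
        self.failures += 1
        self.last_error = error


def poll_once(fetch: Callable[[], dict], budget: PollErrorBudget) -> Optional[dict]:
    """Poll once; return the status, or None if the poll failed or polling stopped."""
    if budget.exhausted:
        return None  # stopped: the caller surfaces budget.last_error and waits for a manual retry
    try:
        status = fetch()
    except Exception as error:
        budget.record_failure(error)
        return None
    budget.record_success()
    return status
```

A manual retry in this sketch would simply create a fresh budget, or call `record_success()`, before polling resumes.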
Surface errors instead of swallowing (#1090)
Several frontend components were swallowing errors with .catch(() => {}), sometimes with a console.error, sometimes not. The UI would just stay in its last rendered state, which looked indistinguishable from "still loading" or "worked fine, nothing changed." Audited all the bare catches and either wired them to the toast system (user-actionable errors) or escalated to the error boundary (unexpected errors that should break the component rather than stay silent).
Validate Tags list non-empty in BulkUpdateTags (#1089)
/api/targets/bulk-update-tags was accepting an empty Tags array and happily clearing tags on every selected target. Not the endpoint's contract — the contract is "replace tags with this non-empty set." Added validation that returns 400 if Tags is null or empty with a clear message. The frontend wasn't sending empty arrays, but the API contract shouldn't rely on client discipline.
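The check itself is tiny. The endpoint's actual language and framework are not shown in this entry, so the following is only a Python sketch of the rule, with a hypothetical function name and error message.

```python
from typing import Optional


def validate_bulk_update_tags(tags: Optional[list[str]]) -> tuple[int, str]:
    """Return an (HTTP status, message) pair for a bulk tag replacement request."""
    # Both a missing list and an empty list are rejected, because either one
    # would otherwise clear the tags on every selected target.
    if not tags:
        return 400, "Tags must contain at least one tag."
    return 200, "OK"
```

Rejecting the request before any target is touched keeps the "replace with a non-empty set" contract enforceable on the server, regardless of what a client sends.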
Side note: brain-dumping cadence
A friend asked about the open backlog and pointed out that always-on autonomous agents would be useful for grinding through code-quality issues. Usage limits make that not quite practical yet — the budget doesn't stretch to 24/7 agent time. Still, worth reconsidering if weekly limits get more generous. Also realized I hadn't been writing to the devblog in a while — writing about the work separately from doing the work helps the docs and the blog both.
What shipped
| PR | Title |
|---|---|
| #1091 | refactor: split composite route god function into focused pipeline stages (#963, #1002) |
| #1090 | fix: surface errors instead of silently swallowing in frontend components |
| #1089 | fix: validate Tags list is non-empty in BulkUpdateTags endpoint |
| #1088 | fix: handle polling failures with error limit in useJobProgress |
| #1087 | fix: replace hand-rolled caches with TTLCache in mast/routes.py |
| #1086 | fix: gate window.jwst debug globals behind import.meta.env.DEV |