A real-time data plane for product surfaces
Replacing a brittle polling pipeline with an event-driven plane that gave product teams live data and gave operators a system they could actually reason about.
- Senior Engineer
- 2023
- Data · Real-time · Event-driven · Reliability
The product wanted “live everything.” The existing pipeline was a quilt of polling jobs, fragile views, and ad-hoc caches that had grown organically into a system no one fully owned.
What was actually wrong
The architecture was not slow because compute was scarce. It was slow because the boundaries were wrong. State was being recomputed in places that should have been observing, and observed in places that should have been authoritative.
The shape of the answer
We treated the system as an event-driven plane: a small set of authoritative producers, a durable log, and consumers that materialised whatever shape each product surface needed. Read models lived close to the surface that used them.
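A minimal sketch of that shape, for illustration only and not the production code. The names `Event`, `EventLog`, `ReadModel`, and `order_totals` are hypothetical, the order events are an invented example domain, and the in-memory list stands in for whatever durable log is actually used.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Event:
    """An immutable fact emitted by an authoritative producer (hypothetical shape)."""
    kind: str
    key: str
    payload: dict[str, Any]


class EventLog:
    """In-memory stand-in for a durable, append-only log."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> int:
        """Append an event and return its offset in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int) -> list[Event]:
        """Return every event at or after the given offset."""
        return self._events[offset:]


Reducer = Callable[[dict[str, Any], Event], None]


class ReadModel:
    """A consumer that folds events into the shape one surface needs."""

    def __init__(self, log: EventLog, reducer: Reducer) -> None:
        self._log = log
        self._reducer = reducer
        self._offset = 0  # next log offset this consumer has not yet applied
        self.state: dict[str, Any] = {}

    def catch_up(self) -> int:
        """Apply all unseen events in order; return how many were applied."""
        events = self._log.read_from(self._offset)
        for event in events:
            self._reducer(self.state, event)
        self._offset += len(events)
        return len(events)


def order_totals(state: dict[str, Any], event: Event) -> None:
    """Example reducer: running order total per customer key."""
    if event.kind == "order_placed":
        state[event.key] = state.get(event.key, 0) + event.payload["amount"]
    elif event.kind == "order_cancelled":
        state[event.key] = state.get(event.key, 0) - event.payload["amount"]
    # Unknown event kinds are ignored so this consumer tolerates new producers.
```

The consumer only observes: it tracks its own offset and derives state from the log, so the producer stays the single authority and a read model can be rebuilt by replaying from offset zero.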
The harder work
The harder work was not technical. It was negotiating ownership: who owns the events, who owns the schemas, who is allowed to break them. We wrote that down before we wrote much code, and the architecture stayed honest as a result.
Outcome
End-to-end latency dropped from minutes to single-digit seconds. The on-call story improved more than the latency did, because the system stopped surprising people.