Systems That Surprised You
The most useful audits of long-lived software are the ones where you were wrong about what would last.
Not wrong about a feature or a deadline. Wrong about durability — the property that is hardest to evaluate at the moment of choice and the most expensive to misjudge.
Every system I have stayed with long enough has eventually surprised me. Something I expected to break held. Something I expected to last quietly stopped working in a way I did not notice for months. The lesson was rarely about the part. It was about the assumption underneath it.
These surprises are worth more than most retrospectives.
Unexpected durability
There is always a part of the system that should not have survived this long. An old script. A library someone wrote in a week. A schema that predates the team. A queue choice made over a lunch conversation.
On any clean evaluation, those pieces should have been replaced years ago. They were not. They quietly kept working, often through changes that should have broken them.
What those parts share is rarely cleverness. It is narrow scope. They did one thing. They had few callers. They made no assumptions about the rest of the system, and the rest of the system made few assumptions about them.
The components that surprised me by lasting were almost always the ones that refused to grow. Not impressive in isolation. Impressive in retention. They survived because nothing depended on them more than necessary, and because nothing about them invited expansion.
This is not protection from every failure mode. Narrow scope can carry its own buried assumptions — date formats, character encodings, integer widths — that one day get tested by a condition the original author never imagined. A small component is not a safe one. It is one whose failure modes have a smaller surface to find. Both can be true.
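A hedged sketch of that failure shape (the feed and its format are hypothetical): a parser small enough to survive untouched for a decade, carrying a two-digit-year assumption the whole time.

```python
from datetime import datetime

def parse_record(line: str) -> tuple[datetime, int]:
    """Parse a 'MM/DD/YY,count' line from a hypothetical export feed.

    Narrow scope is what kept this durable: one job, few callers.
    The same narrowness buried an assumption about two-digit years.
    """
    date_part, count_part = line.split(",")
    # strptime's %y maps 00-68 to 2000-2068 and 69-99 to 1969-1999.
    # That pivot rule stays invisible until the data crosses it.
    when = datetime.strptime(date_part, "%m/%d/%y")
    return when, int(count_part)

print(parse_record("03/14/19,42"))  # 2019-03-14: as intended, for years
print(parse_record("03/14/70,7"))   # 1970-03-14, not 2070: the buried assumption
```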
The honest reading is still uncomfortable. Most of the engineering effort I have spent worrying about durability would have been better spent ensuring the same restraint elsewhere. The durable parts were durable because somebody, sometimes accidentally, kept them small.
Quiet failure modes
The opposite surprise is more common. A system stops working correctly, but not in a way that triggers alarms. A scheduled job has been failing silently for weeks. A cache has been serving stale data since a deploy nobody remembers.
These are not the failures the team rehearses for. The team rehearses for the loud ones — the outage, the spike, the 5xx wall. The quiet failures are the ones the system was never instrumented to notice.
What is exposed in those moments is not a missing alert. It is a missing belief. The team did not believe this particular thing could fail. So no one wrote the check. So no one wired the alarm. The failure mode lived in plain sight for as long as the assumption held.
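A minimal sketch of the check that was never written, assuming the job records a last-success timestamp somewhere (the path and threshold here are invented): it alerts on the absence of success rather than on the presence of errors.

```python
import time
from pathlib import Path

# Hypothetical: the nightly job touches this file after each successful run.
HEARTBEAT = Path("/var/run/nightly-export.last-success")
MAX_AGE = 26 * 3600  # one run per day, plus slack, in seconds

def heartbeat_alert() -> str | None:
    """Dead man's switch: fires on missing success, not on visible errors.

    A job that fails silently, or silently stops being scheduled,
    never raises the error that error-based alerting waits for.
    """
    if not HEARTBEAT.exists():
        return "nightly-export has never recorded a success"
    age = time.time() - HEARTBEAT.stat().st_mtime
    if age > MAX_AGE:
        return f"nightly-export last succeeded {age / 3600:.1f}h ago"
    return None

if __name__ == "__main__":
    if (message := heartbeat_alert()) is not None:
        print(f"ALERT: {message}")
```

The check is trivial. What it encodes is the belief that was never written down: this job is supposed to succeed at least once a day.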
Every quiet failure mode I have found mapped back to an assumption nobody articulated. The fix was rarely the alarm. It was the conversation that surfaced the assumption.
This is the harder kind of stewardship: not catching failure when it happens, but noticing the categories of failure your monitoring is shaped to ignore.
Assumptions exposed by time
Time is the cheapest test the system has. It runs continuously. It exercises every assumption implicit in the original design, including the ones nobody wrote down.
A system in production for five years has been audited by reality more thoroughly than any review process could manage. The parts still standing have passed tests nobody designed.
What surprises me in those audits is not which assumptions failed. It is which assumptions turned out to have been load-bearing in the first place.
Assumptions about traffic volume that quietly shaped the data model (see the sketch below).
Assumptions about team size that determined how operations were structured.
Assumptions about deployment cadence that fixed the release pipeline.
Assumptions about the trustworthiness of upstream services that defined the error paths.
Most were never debated. They were the conditions of the moment. They became architecture without ever being decided.
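The first item on that list is the easiest to make concrete. A hedged sketch (the record format and field widths are invented): a data layout sized to the traffic of the day becomes a ceiling nobody remembers choosing.

```python
import struct

# Hypothetical wire format from the early days, when a few thousand
# events a day made a 32-bit id feel inexhaustible.
RECORD = struct.Struct("<IH")  # event_id: uint32, source_id: uint16

def encode(event_id: int, source_id: int) -> bytes:
    # Raises struct.error once event_id exceeds 2**32 - 1: the traffic
    # assumption became architecture without ever being decided.
    return RECORD.pack(event_id, source_id)

print(len(encode(1, 1)))  # 6 bytes per record, forever
# encode(2**32, 1) fails the day volume outgrows the original assumption
```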
When a long-lived system surprises you, the surprise is usually about one of these. A condition you treated as background turned out to be foreground.
What surprised me about surprise itself
The first few times a system surprised me, I read each surprise individually. A specific job that should not have lasted. A specific failure that should have been caught.
Eventually a pattern emerged. The surprises clustered around two recurring shapes.
Things that turned out to be more durable than expected almost always had less surface area than I remembered.
Things that turned out to be more fragile than expected almost always rested on an assumption I had stopped seeing.
This is not a satisfying lesson. It would have been more useful if certain technologies, patterns, or disciplines reliably produced durable systems. They did not. What produced durability was usually that something stayed small. What produced fragility was usually that something stayed unexamined.
Both findings resist optimization. You cannot decide to keep a system small from the outside; that decision has to survive years of pressure. You cannot list assumptions you have stopped seeing; by definition they are invisible. The best technique available is to assume some are there and invite people who did not build the system to point at them. They have not yet learned which assumptions to overlook.
What I do with the surprises now
Two adjustments have followed from this, both modest.
I treat any old component still in production as evidence of something worth preserving. Not the technology. The conditions that kept it small. Before replacing it, I try to understand whether the replacement will preserve those conditions or quietly remove them.
I treat anything that has not failed for a long time with mild suspicion. Especially anything that processes data, retries on errors, or handles silent edge cases. The absence of visible failure is not the same as correctness. It often just means the monitoring makes the same assumptions the code does.
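A hedged sketch of the shape I now distrust (the names are invented): a retry loop that converts every failure into a quiet default, so monitoring keyed on errors sees nothing.

```python
import logging
import time

log = logging.getLogger("sync")

def fetch_with_retry(fetch, attempts: int = 3):
    """The kind of code that 'has not failed in years'.

    Every exception is swallowed; exhausted retries return None, and the
    caller reads None as 'nothing to sync'. The upstream can be down for
    weeks while every dashboard stays green.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            # Logged at debug level, which production configs rarely ship.
            log.debug("fetch attempt %d failed", attempt)
            time.sleep(2 ** attempt)
    return None  # indistinguishable from a legitimately empty result
```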
Neither adjustment is dramatic. Neither shows up in a roadmap. Both have changed more decisions than I expected.
What the surprises were really about
Systems that surprised me did so on a single axis: my model of them was wrong in a specific, locatable way. The durable parts taught me I had over-valued sophistication. The fragile parts taught me I had under-valued the act of naming what the system was relying on.
There is no clean rule that follows. Long-lived systems do not reward rules.
What they reward is the habit of asking, regularly: which parts of this should not still be working, and why are they? Which parts have not failed recently, and what would it look like if they were failing in ways my instruments do not show?
Surprise is information. It is the system telling you which parts of your understanding have not been updated in a while.
That is worth listening to. Even when it means accepting that the parts of the system you invested the least in may have taught you more than the ones you spent years on.