
Making Data Contracts the Unit of Reliability

This article is in reference to: "Stop Shipping Processes. Ship Data Contracts." (as seen on cfcx.work)

Shipping trust, not just motion

The post exists because many organizations are running into the same uncomfortable pattern: they can automate almost everything except certainty. Workflows go live, dashboards fill with metrics, and yet quarter after quarter they are surprised by the same classes of failures—misbooked entries, broken integrations, unexplained gaps.

The “so what” is blunt: as automation spreads, the cost of being wrong compounds faster than the cost of being slow. The piece argues that most teams are still optimizing for motion—getting processes into production—while treating the trustworthiness of the underlying data as an implicit, human-managed detail. That trade works at small scale; at larger scale, it quietly turns speed into fragility.

This is why the essay insists on “data contracts” as the unit that should be shipped. It is not a tooling preference. It is a statement about where reliability actually lives: not in the elegance of a workflow diagram, but in the clarity of what the system is and is not allowed to trust.

The hidden cost of optimistic automation

The post responds to a quiet cultural shift in operations. Over the last decade, organizations have become very good at making work visible as process: swimlanes, RPA bots, low-code flows, scripts in ERPs, integrations stitched together with APIs. The bias is toward more automation, more coverage, more speed.

In that environment, success is often defined as getting a workflow “live.” Once transactions start flowing and throughput is measurable, the project is considered complete. The essay argues that this definition of done is structurally flawed, because it treats data as something that will probably be fine as long as people are careful enough.

The result is what the author calls compounding fragility. Each new automated flow assumes that upstream data is mostly right, mostly complete, mostly consistent. That optimism works—until a small inconsistency appears. A missing rate. A renamed file. A divergent code set between sandbox and production.

Those small inconsistencies do not stay small. Because more and more logic is layered on top of undefined data, each new automation amplifies the impact of any deviation. The system accumulates exceptions that are hard to trace and harder to repair. People experience this as a growing volume of reconciliation work, unexplained variances, and last-minute heroics at period end.

The deeper critique is that organizations are celebrating visible motion (automated steps) while ignoring invisible risk (uncertain assumptions). It is a warning against confusing orchestrated activity with engineered reliability.

Accountability gaps: when nobody owns the assumptions

Beneath the technical examples, the post is really about accountability design. Who owns the shape of the data? Who is responsible for saying what values are allowed, what “complete” means, and what should happen when that standard is not met?

The author highlights two structural gaps that show up across ERPs, data warehouses, and internal tools. Both are less about technology and more about how organizations decide where responsibility begins and ends.

Data as byproduct, not product

First, operational data is often treated as a byproduct of running the business. Teams focus on getting transactions into the system and reports out of it. The tables in between are assumed to be “handled” by someone upstream or by the vendor’s defaults.

When data is a byproduct, nobody is explicitly accountable for its contract. Field names drift, code sets diverge, refresh cadences are unclear, and forward coverage is accidental. The only real validation happens at the end, when a human checks a report and notices that something feels off.

This is a human-first safety net in a machine-first environment. It relies on attention and experience to catch what the system was never told to question. The post pushes against that model, arguing that in a world of large-scale automation, treating data as a side effect is no longer tenable.

Approval confused with definition

The second gap is subtler: mistaking stakeholder approval for machine-readable definition. In many projects, a group reviews sample outputs, agrees that they “look right,” and signs off. That approval becomes the basis for deployment.

But “looks right” is a social judgment, not a technical guarantee. The system still does not know what makes it right. It has no explicit list of required fields, allowed values, failure conditions, or coverage expectations. Tests confirm plausibility rather than correctness.

The piece reframes definition as a contract rather than a meeting outcome. A contract is something the system can enforce: a set of constraints that are either met or violated. This is a shift from consensus to codification, from “everyone agrees this seems okay” to “the system refuses to proceed if the assumptions are not satisfied.”
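The shift from consensus to codification can be made concrete with a small sketch: a gate that checks a record against explicitly declared constraints and refuses to proceed when any are violated. The field names and allowed values below are illustrative assumptions, not taken from the original post.

```python
# A minimal sketch of "the system refuses to proceed": explicit
# constraints that are either met or violated, checked before any
# downstream step runs. All names here are hypothetical.

REQUIRED_FIELDS = {"entity_id", "currency", "amount"}
ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}


class ContractViolation(Exception):
    """Raised when input data breaks the declared contract."""


def enforce_contract(record: dict) -> dict:
    """Return the record unchanged if it satisfies the contract,
    otherwise raise instead of optimistically continuing."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ContractViolation(f"missing required fields: {sorted(missing)}")
    if record["currency"] not in ALLOWED_CURRENCIES:
        raise ContractViolation(f"currency not allowed: {record['currency']!r}")
    if record["amount"] is None:
        raise ContractViolation("amount must be populated")
    return record  # now safe to hand to the downstream process
```

The point is not the specific checks but where they live: in code the system executes on every record, rather than in a sign-off meeting that happened once.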

Data contracts as trust boundaries

The insistence on “shipping data contracts” is really an insistence on explicit trust boundaries. A process is just a set of rules applied to inputs. If the inputs are unconstrained, the best-designed process can only be optimistic. It assumes rather than verifies.

By contrast, a data contract formalizes what the system is allowed to assume. It specifies, in a form both humans and machines can understand, which fields exist and how they are named, which values are valid, what constitutes a complete record, how fresh the data must be, and who may change those definitions.

The original post grounds this in a concrete NetSuite example, but the pattern is general. A contract-driven approach shifts the locus of design from “how do we move through these steps?” to “what must be true about the data for any step to be safe?” It treats the interface between components as an asset to design, not an accident to discover later.

What looks restrictive on paper is, in practice, a trade of invisible chaos for visible structure. Many organizations are already paying the cost—through exceptions, reconciliations, and operational fatigue—but in a way that does not improve the system. A contract makes that cost upfront and productive.

From tribal knowledge to engineered operations

Beneath the technical posture, the essay is about how organizations choose to scale knowledge. It contrasts two models of reliability: one where stability lives in people’s memories, and another where it lives in the system’s boundaries.

When processes ship without data contracts, reliability lives in people’s heads. Certain individuals know which file is “really” final, which subsidiary has special currency logic, which tax codes are safe to ignore for now. The system functions as long as those individuals remain present, attentive, and uninterrupted.

This model makes operations dependent on continuity of memory rather than clarity of design. It rewards heroics over architecture. The organization survives through exceptional effort rather than predictable behavior.

By advocating for data contracts, the author is making a different bet: that operations can and should behave like engineered systems. In that world, the interface between teams is not an email thread or a shared understanding; it is a defined contract. Failure states are expected and classified, not treated as personal shortcomings.

In the end: making data non-negotiable

Ultimately, the piece is a call to reframe where teams spend their design energy. Instead of perfecting steps, it argues for making data non-negotiable: named, owned, validated, and enforced as a first-class object in operational design.

This is framed as a question of control, not complexity. A “working process” is not the same as a controllable system. The former can move transactions; the latter can be reasoned about, tested, and safely changed. Data contracts are presented as the bridge between the two—a way to turn tacit assumptions into explicit boundaries.

Looking ahead, the signal is clear: as organizations layer more automation on top of shared systems, the limiting factor will not be how many workflows they can draw, but how precisely they can define and enforce what those workflows are allowed to assume. The practical shift is subtle but concrete: before celebrating a process as “live,” ask what its data contract is—and where, specifically, the system refuses to proceed when that contract is broken. The answer to that question is where operational reliability actually begins.