
At some point in almost every serious information backbone conversation, the question comes up: could we build this ourselves?
It is a reasonable question. Your organization has engineering capability. You have existing systems to integrate with. You understand your own domain better than any vendor does. And there is an understandable instinct - particularly in regulated sectors where information is a competitive and compliance asset - to keep strategic infrastructure in-house.
This post works through that question honestly. The answer is not always "buy." But the conditions under which "build" is the right answer are narrower than most organizations initially believe - because the true cost of building a governed information backbone, purpose-built for high-trust environments, is almost always underestimated.
Bottom line up front:
Building a governed information backbone is not a data engineering project. It is a sustained investment in model-centric infrastructure, SME-native tooling, semantic APIs, and operational model governance - including the ability to update your information model programmatically as your domain, regulations, and organization evolve. The organizations that underestimate this do not just overspend. They end up with something that cannot do the job.
The first mistake in most build estimates is a failure to define the scope accurately. When organizations say "we could build this," they typically have in mind something like: a graph database, a data model, some APIs, and a front end for SMEs to interact with. That is the visible part. It is not the hard part.
A governed information backbone for a regulated, high-trust environment has to do several things that are easy to underestimate individually, and genuinely demanding in combination:
Of all the dimensions above, the one that most consistently surprises organizations in the build analysis is the last one: ongoing model governance, and in particular the cost and operational risk of updating the information model as things change.
In a regulated environment, your model is not static. Regulations change. Product categories evolve. New substances get classified. Reference data standards are updated. An organization operating across multiple jurisdictions or product lines will face model change as a recurring operational reality, not an occasional exception.
When you build your own backbone, every model change is a program of work. Engineering effort to update the schema. Impact analysis to understand what downstream systems are affected. Testing to verify that the change has propagated correctly. Coordination with the teams whose systems depend on the model. This is not a problem that goes away as the backbone matures - it is a structural property of in-house builds that do not have model change designed in from the outset.
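To make the impact-analysis step concrete, here is a minimal sketch in Python. It assumes a hypothetical dependency map recording which systems consume which model elements; the element and system names are invented for illustration and do not come from any real deployment. The sketch walks the map to find everything downstream of a single model change.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each key is a model element or system, each value lists its direct consumers.
DEPENDENTS = {
    "Substance.classification": ["labelling-service", "compliance-reports"],
    "labelling-service": ["customer-portal"],
    "compliance-reports": [],
    "customer-portal": [],
}


def impacted_by(changed, dependents):
    """Return every direct or indirect consumer of a changed element, sorted."""
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in dependents.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


affected = impacted_by("Substance.classification", DEPENDENTS)
print(affected)
```

Even in this toy form, the dependency map itself has to be built, kept current, and trusted, which is part of the recurring cost described above.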
Data Graphs approaches this differently. The platform supports programmatic model updates via API - meaning that model changes can be structured, versioned, and executed through a controlled programmatic interface, rather than requiring bespoke engineering effort each time. Reference data changes, relationship updates, new classification schemes: these are operations the backbone is designed to handle, not exceptions that require a project to manage.
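As a rough illustration of what "structured, versioned, and executed through a controlled programmatic interface" can mean in practice, the following Python sketch models a change as data and applies it to produce a new model version. This is not the Data Graphs API: `ModelChange`, `ModelVersion`, and `apply_change` are hypothetical names chosen for this example, and the schema is a plain dictionary.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelChange:
    """One structured, reviewable change to the model (hypothetical shape)."""
    entity: str
    property_name: str
    allowed_values: tuple
    reason: str


@dataclass
class ModelVersion:
    number: int
    schema: dict
    changelog: list = field(default_factory=list)


def apply_change(current, change):
    """Return a new version; the previous version is left untouched for audit."""
    schema = deepcopy(current.schema)
    entity = schema.setdefault(change.entity, {})
    entity[change.property_name] = list(change.allowed_values)
    return ModelVersion(
        number=current.number + 1,
        schema=schema,
        changelog=current.changelog + [change],
    )


v1 = ModelVersion(number=1, schema={"Substance": {"hazard_class": ["A", "B"]}})
change = ModelChange(
    entity="Substance",
    property_name="hazard_class",
    allowed_values=("A", "B", "C"),
    reason="New classification published",
)
v2 = apply_change(v1, change)
print(v2.number, v2.schema["Substance"]["hazard_class"])
```

The design choice worth noting is that a change is a record with a stated reason, and applying it yields a new version rather than editing the old one in place, so earlier versions remain available for comparison and audit.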
For organizations operating in fast-moving regulatory environments, this is not a marginal efficiency gain. It is the difference between an information backbone that remains current - and therefore trustworthy - and one that drifts behind the operational reality it is supposed to govern.
There is a second cost dimension that deserves specific attention, because it tends to be systematically underestimated in build proposals: the cost of building tooling that domain experts can actually use.
The requirement is easy to specify. Subject matter experts need a way to work with the information model - to populate it, validate it, maintain reference data, and manage classifications - without requiring IT involvement for routine operations. The spec is clear. The build is not.
Building tooling that genuinely puts SMEs in control of the governed model requires the tooling to be designed around the information model itself - not around what a development team found natural to build, and not around a generic data management UI that has been adapted. When this sequencing is reversed - when the UX is built before the model is sound - the tooling embeds assumptions about meaning that the model then has to contort itself to accommodate. The result is a system that sits one step away from the backbone it was supposed to expose.
Most organizations that have attempted to build SME tooling for an information backbone have built it more than once. The first version does not survive contact with real domain experts working at scale. The second is better. The third is usually when the scope of the challenge becomes clear. That iteration cost is rarely in the original build estimate.
This section maps the primary cost dimensions against the build and buy options. The intent is not to present a specific financial model - costs vary significantly by organization, scale, and domain - but to make the comparison across dimensions explicit.
Model-centric design & storage
Model-centric UI for SMEs
API surface (semantic, not just data)
Infrastructure & hosting
Model updates & evolution
Reference data management
Regulated environment readiness
The pattern across these dimensions is consistent: in a build scenario, each line item requires dedicated engineering, architecture, or operational investment. In a Data Graphs deployment, each is a platform capability that is configured, not built - with the exception of the domain-specific model work, which requires genuine subject matter expertise regardless of the path taken. An in-house build also takes time and resources just to reach parity with the platform, and during that time a bought platform keeps adding new features.
The most useful framing for the build vs buy decision is not "can we build this?" - the answer to that question is almost always yes, in principle.
The more useful question is:
"What would we need to maintain in perpetuity to keep this trustworthy - and is that the best use of our engineering capacity?"
A governed information backbone in a high-trust environment is not a system you build and hand over. It is an operational capability that requires ongoing investment: in model governance, in reference data management, in tooling that keeps SMEs productive, and in the ability to update the model as the world changes. Every one of those demands is a permanent claim on engineering and governance resources. For most organizations, those resources are better directed at the domain knowledge and product differentiation that only they can build - not at recreating infrastructure that already exists as a mature, purpose-built platform.
Data Graphs is built to make that total cost manageable - not by removing the domain expertise requirement, which cannot be outsourced, but by providing the infrastructure, the tooling, and the model governance capability that would otherwise have to be engineered from scratch, and maintained indefinitely.
The backbone itself is the investment that compounds. The platform that hosts it should not be.
For more on this topic, have a look at our "Machine-Readable Information Backbone" series.
Post 1: Preparing for true machine-readable digital product labels - Machine readability is a meaning problem, not a format problem. Most organizations focus on file formats and miss the foundational architecture problem entirely. This post sets out what machine readability actually demands from your organization.
Post 2: Machine-readable isn't just for machines - Better information architecture foundations improve the experience of the humans who work with product data every day - your SMEs, compliance teams, and end users. Better foundations for machines are better foundations for people.
Post 3: Why AI needs stable meaning - AI operating on ungoverned data is making guesses, and in regulated environments that isn't good enough. A proper information backbone eliminates hallucination for the facts that matter, and gives COOs the freedom to swap AI providers without operational friction.
Post 4: What is an information backbone? A plain-language definition for operational leaders - Written for organizations that already have systems, already have data, and are still asking why none of it feels reliable.