Platforms are delivering on their promise. Faster deployments, self-service environments, golden paths that reduce cognitive overhead for engineering teams. The developer experience conversation has matured considerably in the last few years, and it shows.
But there is a quieter conversation that most platform teams haven't had yet — and it tends to surface at the worst possible moment, usually somewhere between midnight and 3am.
The question isn't whether your platform can deploy fast. It's whether your platform knows what it's running, what changed, and what the consequences are when something goes wrong. For most platforms today, the honest answer is: not really.
Most Internal Developer Platforms — IDPs — are built with the developer experience as the primary lens. That's not wrong. But a platform that only speaks to developers isn't a platform for the organisation. The operational teams, service managers, and business stakeholders who depend on what the platform runs are equally its customers. Treating it as a platform for the whole organisation, and building it accordingly, changes what you prioritise and who it truly serves.
Part of that broader conversation starts with a question the industry has been asking loudly: do we actually know what we're running? SBOM has risen as a serious answer to exactly that — and it's a sound one, as far as it goes.
SBOM Rose for Good Reason
Software Bill of Materials has become a genuine priority across the industry — and rightly so. Supply chain risk is real: your application pulls in open-source libraries, third-party packages, and frameworks, each of which pulls in its own dependencies. Somewhere in that chain, a component could be compromised or vulnerable — and because it sits several layers deep, you may not know it's there until it's too late. SBOM gives you a structured, machine-readable inventory of what your software is made of, and it is the foundation that vulnerability scanners and Software Asset Management processes build upon. If you've invested in it, that investment is sound.
The point isn't that SBOM is wrong. It's that SBOM is a snapshot of what was built — not intelligence about what is running. It has no memory of what changed yesterday, no awareness of which services depend on each other, and no connection to the incident that woke someone up at 2am. As a security and compliance foundation it earns its place. As operational intelligence, it was never designed to be that.
SBOM tells you what's in your software. The operational questions — what changed, what's connected, what broke, and why — sit in an entirely different layer. To run a platform with confidence, you need to know both.
The Operational Questions Your Platform Can't Answer
Imagine an incident. A critical service is degraded. Your SBOM is up to date, your security scans are clean, your pipelines are green. And yet something is broken — and no one knows why. These are the questions that matter in that moment — and that most platforms have no good answer for:
What changed before this broke?
A reasonable instinct here is: "we have Git, we have pull requests — surely that tells us what changed?" It tells you part of the story. A PR documents what a developer intended to change in the code, who reviewed it, and when it merged. What it doesn't tell you is what actually deployed to production and when, what configuration or infrastructure changed outside of any PR — feature flags, Terraform applies, Helm values, environment variables — or what three other teams deployed in the same 20-minute window. An automated change record captures the actual deployment event, the pipeline that ran it, the exact production timestamp, and the full context of everything else changing around it. When you're diagnosing an incident, you're not reading diffs. You're asking what changed across all systems in the last two hours — and that answer lives in change records, not commit history.
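The distinction above can be made concrete. Below is a minimal sketch of what an automated change record might capture at deploy time, and how the "what changed in the last two hours?" question becomes a query rather than an archaeology exercise. The `ChangeRecord` shape, the in-memory store, and the helper names are illustrative assumptions, not a real platform API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class ChangeRecord:
    service: str
    change_type: str       # "deployment", "config", "infrastructure", "feature_flag"
    pipeline_run: str      # the CI/CD run that executed the change
    deployed_at: datetime  # the actual production timestamp, not the merge time
    triggered_by: str
    summary: str

# An in-memory list stands in for whatever store a real platform would use.
CHANGE_LOG: list[ChangeRecord] = []

def record_change(record: ChangeRecord) -> None:
    CHANGE_LOG.append(record)

def changes_since(hours: float) -> list[ChangeRecord]:
    """The 2am question: what changed, across all systems, in the last N hours?"""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=hours)
    return sorted(
        (c for c in CHANGE_LOG if c.deployed_at >= cutoff),
        key=lambda c: c.deployed_at,
        reverse=True,
    )
```

The point of the sketch is the query, not the storage: every deployment, config change, and Terraform apply lands in one log, so the incident responder asks one question instead of checking five systems.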
What does this service depend on?
The degraded service calls three downstream APIs and shares a config service with four others. Without a live service map, that blast radius is invisible until you have already felt it.
What is the business impact?
Which customers are affected? Which SLAs are at risk? Which teams need to know? These are operational questions. The platform should be able to answer them without a dozen Slack messages.
Have we seen this before?
The same failure mode surfaced six months ago after a similar deployment pattern. Without incident history correlated to change, that knowledge lives only in someone's memory — if at all.
"SBOM tells your security team what's inside the box. CMDB tells your ops team what happens to the business when the box breaks."
The Three Layers Your Platform Is Missing
Closing this gap doesn't require abandoning the principles that made modern platforms valuable in the first place. It requires adding three layers of operational intelligence that most platforms treat as optional — or as someone else's problem.
Change: Context, Not Control
Every deployment is a change. Every configuration update is a change. Every infrastructure modification is a change. Most platforms already know this — they just don't record it in a way that is useful at speed. When change context is baked into the platform as a first-class concern, the question "what changed before this broke?" has an answer within seconds — not because someone manually logged a ticket, but because the platform captured the deployment event, the pipeline, the timestamp, and the surrounding activity as part of normal operation.
This isn't about reintroducing change approval bureaucracy. It's about giving every change a correlation ID that connects deployment pipelines, configuration state, and incident records into a single coherent timeline. The overhead is minimal. The diagnostic value — especially at 2am — is significant.
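A correlation ID scheme like the one described can be sketched in a few lines. Everything here is hypothetical: the `chg-` prefix, the event shape, and the in-memory event list stand in for whatever event bus or database a real platform would use.

```python
import uuid
from datetime import datetime, timezone

def new_correlation_id() -> str:
    # One ID minted per change, carried by every downstream event.
    return f"chg-{uuid.uuid4().hex[:12]}"

# Events emitted by different systems, joined only by the correlation ID.
EVENTS: list[dict] = []

def emit(correlation_id: str, source: str, detail: str) -> None:
    EVENTS.append({
        "correlation_id": correlation_id,
        "source": source,  # "pipeline", "config", "incident", ...
        "detail": detail,
        "at": datetime.now(timezone.utc),
    })

def timeline(correlation_id: str) -> list[dict]:
    """One coherent timeline for one change, across all systems."""
    return sorted(
        (e for e in EVENTS if e["correlation_id"] == correlation_id),
        key=lambda e: e["at"],
    )
```

The design choice that matters is that the pipeline mints the ID and everything else merely carries it; no system needs to know about any other, yet the timeline assembles itself.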
CMDB: The Platform's Map of Itself
A Configuration Management Database is not a spreadsheet that someone updates manually every quarter. Done properly, it is a live service map — a constantly refreshed understanding of what services exist, what they depend on, what depends on them, and what the operational and business context of each relationship is.
SBOM answers one set of questions: what libraries does this service contain? What versions? Are there known CVEs flagged against them by your scanner?
CMDB answers a different set: what does this service depend on? What depends on it? Who owns it? What is the blast radius if it degrades?
When integrated with your platform, CMDB transforms incident response. Instead of manually tracing dependencies under pressure, the platform surfaces the service map automatically — upstream and downstream impact, owning teams, related recent changes. The difference in mean time to resolution is not marginal. It is structural.
CMDB also gives you something SBOM fundamentally cannot: visibility of service impact across your organisation. Not just "this component has a vulnerability" but "this component underpins these five services, two of which are customer-facing, one of which is in scope for your upcoming audit." That is the intelligence that changes how you prioritise — and it is the intelligence that ops teams and service managers need, not just engineering teams.
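A live service map is, at its core, a dependency graph, and blast radius is a traversal of that graph's inverse. The sketch below assumes a small invented topology; a real CMDB would populate the edges continuously from discovery and deployment data rather than a hand-written dictionary.

```python
from collections import deque

# Dependency edges: service -> services it calls (illustrative topology).
DEPENDS_ON = {
    "checkout": ["payments", "config-svc"],
    "payments": ["config-svc"],
    "search": ["config-svc"],
    "catalog": ["config-svc", "search"],
}

# Invert the edges: who is affected when a given service degrades.
DEPENDENTS: dict[str, list[str]] = {}
for svc, deps in DEPENDS_ON.items():
    for dep in deps:
        DEPENDENTS.setdefault(dep, []).append(svc)

def blast_radius(service: str) -> set[str]:
    """All services transitively affected if `service` degrades."""
    seen: set[str] = set()
    queue = deque(DEPENDENTS.get(service, []))
    while queue:
        svc = queue.popleft()
        if svc not in seen:
            seen.add(svc)
            queue.extend(DEPENDENTS.get(svc, []))
    return seen
```

In this toy map, a degraded `config-svc` touches all four consumers; that is exactly the shared-config scenario from the incident example earlier, computed in milliseconds instead of traced by hand under pressure.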
Incident: The Feedback Loop That Closes the Circle
Incidents contain enormous operational knowledge. Every post-incident review, every timeline reconstruction, every root cause analysis is signal. Most organisations store that signal in tickets, documents, and collective memory — disconnected from the change records and configuration state that caused the incident in the first place.
When incident management is integrated into the platform, that feedback loop closes. Incidents are correlated to changes automatically. Recurring patterns become visible. Configuration drift that precedes failures gets flagged before it becomes an incident. Over time, the platform develops operational memory — not just speed.
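Automatic incident-to-change correlation can be as simple as a time-window query over the change log, scoped to the affected service and its direct dependencies. The data and the `correlated_changes` helper below are invented for illustration; they are a sketch of the mechanism, not a product feature.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical dependency map and change feed; in practice both would come
# from the platform's CMDB and its automated change records.
DEPENDS_ON = {"checkout": ["payments", "config-svc"]}

CHANGES = [
    {"service": "config-svc", "summary": "rotated TLS certificate",
     "at": datetime(2024, 5, 1, 13, 40, tzinfo=timezone.utc)},
    {"service": "search", "summary": "index rebuild",
     "at": datetime(2024, 5, 1, 13, 50, tzinfo=timezone.utc)},
    {"service": "checkout", "summary": "deploy v2.3.1",
     "at": datetime(2024, 5, 1, 12, 10, tzinfo=timezone.utc)},
]

def correlated_changes(service: str, incident_start: datetime,
                       window_hours: float = 2.0) -> list[dict]:
    """Changes on the service or its direct dependencies in the window
    before the incident began: the first thing to surface on a timeline."""
    in_scope = {service, *DEPENDS_ON.get(service, [])}
    cutoff = incident_start - timedelta(hours=window_hours)
    return sorted(
        (c for c in CHANGES
         if c["service"] in in_scope and cutoff <= c["at"] <= incident_start),
        key=lambda c: c["at"],
    )
```

Note what the scoping buys you: the unrelated `search` change is filtered out automatically, while the dependency's certificate rotation, the kind of change no PR review would surface, lands on the incident timeline unprompted.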
Each incident that is properly correlated and resolved adds to the platform's operational intelligence. After twelve months, the difference between a platform with this integration and one without it isn't just faster MTTR — it's organisational resilience that compounds.
What This Means in Practice
The value lands differently depending on where you sit.
For engineering and platform teams: not more process, more signal.
- Deployments automatically generate change context — no manual logging
- Incidents surface correlated changes without investigation overhead
- Configuration drift is visible before it becomes a failure
- Post-incident reviews have a factual timeline to build from
- Audit trails emerge from normal platform operation
For operations and service teams: visibility instead of surprise.
- Service maps show upstream and downstream dependencies in real time
- Change context is available before picking up the phone
- Business impact is assessable without chasing multiple teams
- Recurring failure patterns become visible across the estate
- Handovers carry operational context, not just ticket numbers
How to Start — Without a Big Bang
The instinct when reading this is often to think of a large ITSM transformation programme. That instinct is understandable and almost always wrong. The goal is not to retrofit a legacy process model onto a modern platform. It's to instrument what you are already doing.
Capture deployment metadata as a first-class artefact — what deployed, when, who triggered it, what changed relative to the previous state. This costs almost nothing and pays back immediately.
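As a sketch, capturing that metadata can be a single step at the end of a deploy job: assemble the record and write it out as a build artifact. The environment variable names below follow common CI conventions but are assumptions; substitute whatever your CI system actually exposes.

```python
import json
import os
from datetime import datetime, timezone

def capture_deployment_metadata(service: str, environment: str) -> dict:
    """Assemble a deployment record and write it as a build artifact.
    GIT_COMMIT, CI_PIPELINE_ID, and CI_TRIGGER_USER are placeholder
    variable names, not a specific CI system's contract."""
    meta = {
        "service": service,
        "environment": environment,
        "deployed_at": datetime.now(timezone.utc).isoformat(),
        "git_sha": os.environ.get("GIT_COMMIT", "unknown"),
        "pipeline_run": os.environ.get("CI_PIPELINE_ID", "unknown"),
        "triggered_by": os.environ.get("CI_TRIGGER_USER", "unknown"),
    }
    with open("deployment-record.json", "w") as f:
        json.dump(meta, f, indent=2)
    return meta
```

Even this much, one JSON file per deploy, shipped to wherever your change records live, is enough to start answering "what actually deployed, and when?" without reading commit history.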
Seed the CMDB with your ten most critical services: owners, dependencies, downstream consumers. Keep it live by making it part of onboarding new services to the platform. Grow from there.
Even a simple integration — surfacing recent deployments on an incident timeline — changes the diagnostic conversation immediately. You do not need full correlation before you see value.
After each significant incident, ensure the contributing changes and configuration state are recorded. Build pattern recognition incrementally rather than trying to retrofit history.
Once the instrumentation is in place, operational intelligence accumulates naturally. The platform learns from your estate rather than requiring manual curation at every step.
The Bigger Picture: Platform for Everything, Engine for Anywhere
In our last post, we made the case for platform engineering as the foundation for multi-cloud strategy and workload repatriation — the operational foundation that gives you genuine freedom to put workloads where they belong.
That argument only holds if the platform can actually govern what it's running across all of those environments. A platform that can deploy anywhere but can't tell you what changed, what depends on what, or what broke and why is not a platform for all your applications. It is a fast way to distribute complexity.
The moment you are operating across multiple clouds — or repatriating workloads from hyperscalers back to on-premises or alternative infrastructure — the operational stakes rise sharply. Configuration drift becomes harder to detect. Blast radius becomes harder to assess. Incident correlation becomes harder to achieve manually. The organisations that navigate this successfully are the ones whose platforms have operational resilience and discipline built in — not bolted on after the fact.
"If you want your platform to be the engine for all your applications — cloud-native, legacy, multi-cloud, or repatriated — it needs more than pipelines and a developer portal. Change, CMDB, and Incident aren't ITSM relics. They're the foundation of operational resilience and discipline at scale."
SBOM is a necessary part of that picture — the inventory foundation that security scanning and asset management processes build upon. But an inventory of what you've built is not the same as intelligence about what you're running. The organisations that mistake one for the other will find out the hard way — usually during an incident — what the missing layers actually cost.
Know the why. Find the what.
If this resonates — if you're wondering where your platform sits against this picture, or just want to think it through with someone who's worked both sides of the ITSM and DevOps divide — we're happy to have that conversation. No agenda, just clarity.