Ecosystem Surface Deep Dive

Semantic Infrastructure.

The universal navigation layer that makes fragmented data usable by humans, platforms, and AI agents.

Every other surface — CDP, DSP, retail media, publisher, BI — produces data in its own language. Semantic infrastructure is the layer that gives those fragments shared meaning: consistent definitions, identity, metadata, ontology, and governance. It is what lets a person, a platform, or an AI agent ask a question and get a trustworthy answer across silos, instead of five tools that quietly disagree.

Semantic Infrastructure, the lit node in the activation flow: governed data in, a governed decision out.

Decision fit

Fast read.

Best when: Fragmentation across silos stops humans, platforms, or agents from getting one trustworthy answer.
Not when: You have a single source of truth already, or the need is one report rather than shared meaning.
Primary buyer: Data, platform, architecture, and AI / analytics-engineering leaders.
Primary output: One consistent layer of meaning — definitions, identity, metadata, governance — across every surface.
Main risk: Treating it as another dashboard or a one-off project instead of infrastructure.
Best next step: Define your metrics and entities once, in a semantic layer, then expose them under governance.

The problem

The problem: fragmentation has no shared language.

Every surface in this playbook is a silo with its own vocabulary. Nothing on top of them can be trusted until the meaning underneath is shared.

Data is scattered
Customer, media, commerce, and measurement data live in different platforms, clouds, and clean rooms.
Each silo speaks its own language
The same word — “customer”, “conversion”, “active” — means something different in every tool.
Metrics quietly disagree
Two dashboards show two numbers for the same thing, and no one can say which is right.
Agents can move data but not understand it
Protocols let an agent reach the data; without shared meaning it returns confident, wrong answers.
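
To make the disagreement concrete, here is a minimal Python sketch. The two definitions of "active" and the sample rows are invented for illustration and do not describe any particular CDP or BI tool.

```python
from datetime import date

# Hypothetical event rows: (customer_id, event_type, event_date).
EVENTS = [
    ("c1", "login", date(2026, 5, 30)),
    ("c2", "purchase", date(2026, 5, 10)),
    ("c3", "login", date(2026, 3, 15)),
    ("c4", "purchase", date(2026, 5, 28)),
]
TODAY = date(2026, 6, 1)


def active_in_cdp(events, today):
    """Silo A (assumed): 'active' means any event in the last 30 days."""
    return {cid for cid, _, day in events if (today - day).days <= 30}


def active_in_bi(events, today):
    """Silo B (assumed): 'active' means a purchase in the last 7 days."""
    return {
        cid
        for cid, kind, day in events
        if kind == "purchase" and (today - day).days <= 7
    }


cdp_active = active_in_cdp(EVENTS, TODAY)
bi_active = active_in_bi(EVENTS, TODAY)

# Same word, same rows, two different counts.
print(len(cdp_active), len(bi_active))
```

Neither function is wrong on its own terms. The problem is that both answers are labelled "active customers", and nothing tells the reader which definition produced which number.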

The solution

The solution: one navigation layer over all of it.

Semantic infrastructure does not replace the surfaces. It sits above them and gives every consumer — human, platform, or agent — one consistent way to navigate.

Shared definitions
Each metric and entity is defined once, owned, and reused everywhere — so tools and agents agree.
Identity & resolution
People, accounts, and SKUs resolve to consistent entities across silos.
Metadata & ontology
A catalog, lineage, and an ontology describe what the data is and how it relates.
Governance & policy
Access, consent, and output policy travel with the meaning rather than being bolted on afterwards.
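
As a rough illustration of "defined once, owned, and reused everywhere", the sketch below keeps one definition per metric name and refuses a second one. The class names, fields, and the example metric are assumptions made for this sketch, not the API of any semantic-layer product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    owner: str        # the team accountable for this definition
    description: str
    expression: str   # illustrative only; not tied to any query engine


class MetricRegistry:
    """Holds exactly one definition per metric name."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        existing = self._metrics.get(metric.name)
        if existing is not None:
            # A second definition is how per-tool drift starts, so refuse it.
            raise ValueError(
                f"{metric.name!r} is already defined and owned by {existing.owner}"
            )
        self._metrics[metric.name] = metric

    def get(self, name):
        return self._metrics[name]


registry = MetricRegistry()
registry.register(
    MetricDefinition(
        name="active_customers",
        owner="analytics-engineering",
        description="Customers with at least one purchase in the last 30 days.",
        expression="COUNT(DISTINCT customer_id) WHERE purchases_30d > 0",
    )
)

# A dashboard and an agent both read the same object, so they cannot disagree.
for_dashboard = registry.get("active_customers")
for_agent = registry.get("active_customers")
```

The design choice worth noting is the hard failure on a duplicate name: a tool that wants a different calculation has to register it under a different, explicit name rather than silently overriding the shared one.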

Why now

Market context: from BI feature to critical infrastructure.

The “semantic layer” went from a BI feature to what analysts now call critical infrastructure for the AI era.
“Context layer for AI” is the fast-rising 2026 reframing of the same idea — meaning plus governance for agents.
Open standards arrived: the Open Semantic Interchange (OSI) spec for portable, vendor-neutral metric definitions (2025).
Agent protocols matured: the Model Context Protocol (MCP) became an open standard and is now governed under the Linux Foundation.
The catalog / metadata layer consolidated fast through 2025 acquisitions — validate current ownership before betting on a vendor.
The durable lesson: protocols move bytes between agents; only a semantic layer supplies shared meaning.

Evolution

Semantic-layer evolution.

01
BI semantic models
Per-tool models (LookML and kin) defined metrics inside one BI platform.
02
Headless / metrics layer
Definitions pulled out of BI so every tool shares one metric.
03
Catalog & active metadata
Cataloging, lineage, and governance describe and control the data.
04
Knowledge graph & ontology
Entities and relationships modelled so meaning is explicit, not implied.
05
Context layer for AI
Meaning + governance exposed to agents via MCP and open standards (OSI).
06
Agent-ready meaning layer
Humans, platforms, and agents navigate every silo through one layer.

Landscape

Who plays here — examples, not a ranking.

The “meaning” half (semantic / metrics layer, knowledge graphs) and the “governance” half (catalogs, active metadata) are converging. This layer consolidated fast in 2025, with several catalogs acquired, so validate current ownership and names before relying on this list.

Semantic / metrics layer

dbt Semantic Layer (MetricFlow)
Cube
AtScale
Looker (LookML)
Malloy (experimental)

Catalog / metadata / governance

Atlan
Collibra
Alation
Microsoft Purview
Databricks Unity Catalog

Knowledge graph / ontology

Neo4j
Stardog
TigerGraph
RelationalAI
Palantir (Foundry / Ontology)
GraphDB (Graphwise)

Open standards & agent protocols

Open Semantic Interchange (OSI)
Model Context Protocol (MCP)
Agent2Agent (A2A)

Capability map

What it does — and where it quietly fails.

What to weigh — and where it bites. Validate current support and ownership per platform.

Capability	What it means	Why it matters	Watch-out
Shared metric definitions	One definition per metric.	Tools and agents agree.	Per-tool definitions drift apart.
Identity / entity resolution	Consistent people / accounts / SKUs.	Joins across silos.	Weak match quality and inconsistent keys.
Metadata / catalog	What the data is.	Discoverable, trustable.	Stale or partial coverage.
Lineage	Where a number came from.	Trust and debugging.	Breaks at tool boundaries.
Ontology / knowledge graph	Entities and relationships.	Explicit meaning for reasoning.	Modelling effort is real.
Governance / policy	Access, consent, output rules.	Safe by design.	Governance bolted on afterwards never holds.
Semantic-for-agents (MCP)	Meaning served to agents.	Agents answer, not guess.	Protocol without a semantic layer.
Interoperability / standards	Portable definitions (OSI).	Avoid lock-in.	Proprietary-only semantics.
Active metadata	Metadata that drives action.	Automation and alerts.	Hype vs real workflows.
Access control	Who sees / queries what.	Governed self-serve and agents.	NL / agent access bypassing it.
Observability / trust	Freshness and quality signals.	Know when to trust a number.	No signal = silent errors.
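
The identity / entity resolution row can be sketched as a small union-find over match pairs: any identifiers linked by a chain of matches collapse into one entity. The identifiers and pairs below are hypothetical, and the sketch assumes the matches are already trusted, which is exactly the match-quality question the watch-out column raises.

```python
from collections import defaultdict

# Hypothetical match pairs produced upstream (for example by deterministic joins).
MATCHES = [
    ("crm:1001", "email:ana@example.com"),
    ("dsp:abc", "email:ana@example.com"),
    ("retail:77", "crm:2002"),
]


def resolve_entities(matches):
    """Group identifiers that are linked by any chain of matches."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in matches:
        union(a, b)

    groups = defaultdict(set)
    for identifier in parent:
        groups[find(identifier)].add(identifier)
    return list(groups.values())


entities = resolve_entities(MATCHES)
print(entities)
```

Here the CRM record and the DSP identifier never match each other directly; they resolve to one entity only because both match the same email. A single bad pair would merge two real people the same way, which is why match quality matters more than the grouping logic.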

First-party data

How every surface plugs into the semantic layer.

This is the one surface that sits above the others. Each surface registers its meaning here so the whole ecosystem can be navigated as one.

From data platforms
- Catalog, lineage, and active metadata
- Governed access to sources
From the metrics layer
- One owned definition per metric
- Portable via open standards (OSI)
From identity
- Resolved entities across silos
- Consistent keys for joins
To agents
- Meaning served via MCP, not raw bytes
- Governance and lineage on every answer
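
One way to picture "meaning served, not raw bytes" is an answer envelope that carries the definition, owner, lineage, and policy alongside the number. The sketch below does not use any MCP SDK; it only illustrates the kind of payload a tool handler might return. The catalog entry, field names, and value are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical catalog entry; in practice this would come from the semantic layer.
CATALOG = {
    "net_revenue": {
        "definition": "Gross revenue minus refunds, in EUR.",
        "owner": "finance-data",
        "lineage": ["warehouse.orders", "warehouse.refunds"],
        "policy": "aggregate-only",
    }
}


@dataclass
class GovernedAnswer:
    metric: str
    value: float
    definition: str
    owner: str
    lineage: list
    policy: str


def answer(metric, value, catalog=CATALOG):
    """Wrap a raw number with the meaning and provenance that explain it."""
    if metric not in catalog:
        raise KeyError(f"{metric!r} has no governed definition; refusing to answer")
    return GovernedAnswer(metric=metric, value=value, **catalog[metric])


# The value is a stand-in for whatever the governed query returned.
payload = json.dumps(asdict(answer("net_revenue", 125000.0)), indent=2)
print(payload)
```

An agent receiving this payload can cite where the number came from and who owns it, and a metric without a catalog entry produces an explicit refusal rather than an unlabelled figure.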

How it connects

What the layer gives each consumer.

Consumer	What it gets	Without it
Humans	Trusted, consistent answers	Every dashboard disagrees
Platforms	Interoperable definitions and identity	Brittle point-to-point integrations
AI agents	Meaning, not just bytes (MCP + semantic layer)	Confident, wrong answers
Clean rooms	Shared definitions for matching	Incomparable outputs
BI / MMM	One metric definition	Models built on shifting sand

Agentic shift

Why agents need a semantic layer.

Agent protocols — MCP for tools and data, A2A for agent-to-agent workflows — let agents reach almost anything. But a protocol moves bytes; it does not supply meaning. Without a consistent semantic layer underneath, an agent can fetch the data and still answer the wrong question confidently.
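
A minimal sketch of the difference between reaching data and understanding it: before querying, the agent resolves the user's phrase to a governed metric, and asks for clarification when no definition exists. The glossary entries and function names are assumptions made for this example.

```python
# Hypothetical glossary: business phrases mapped to canonical metric names.
SYNONYMS = {
    "active customers": "active_customers_30d",
    "active users": "active_customers_30d",
    "conversions": "purchase_conversions",
}


def resolve_term(phrase, synonyms=SYNONYMS):
    """Return the canonical metric for a phrase, or None if it is undefined."""
    return synonyms.get(phrase.strip().lower())


def agent_plan(phrase):
    """Decide whether the agent may query or must ask what the user means."""
    metric = resolve_term(phrase)
    if metric is None:
        # No shared meaning: ask instead of guessing.
        return {
            "action": "clarify",
            "reason": f"no governed definition for {phrase!r}",
        }
    return {"action": "query", "metric": metric}


print(agent_plan("Active Users"))
print(agent_plan("engaged shoppers"))
```

The protocol would happily carry either request to a warehouse. Only the lookup against shared definitions tells the agent that the second phrase has no agreed meaning yet.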

Agent use cases

Natural-language access across silos
Agent-to-agent workflows (A2A)
Governed tool / data access via MCP
Grounding LLMs in ontology / GraphRAG
Automated metric lookups with lineage
Cross-surface questions answered once

Agent risks

Confident, wrong answers without shared meaning
Ungoverned MCP / agent access to data
Definitional drift between tools and agents
Governance bypass via natural-language queries
Betting on one vendor mid-consolidation

Governance needed

A semantic layer as the source of meaning
Open standards (OSI) to avoid lock-in
Lineage and provenance on every answer
Access and output policy on agent queries
Human approval on consequential actions

From meaning to execution

Portable meaning for agentic activation.

Semantic infrastructure gives fragmented data ecosystems a shared language. Signal containerization turns that shared language into an executable object. The semantic layer defines what a signal means; the container carries that meaning into activation, policy, evaluation, and agent workflows.

Semantic definition
What the signal means and which business intent it maps to.
Execution route
Where the signal can run: clean room, SSP, DSP, deal, bidstream, or measurement layer.
Governed output
What the signal can produce and how the result is audited.
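
The three parts above can be sketched as a single object. This is an illustrative data structure under stated assumptions, not a published container format: the class name, field names, and the example signal are hypothetical, while the route names follow the list above and the output types follow the output-policy question later in this playbook.

```python
from dataclasses import dataclass

# Routes named in the text above; the identifiers themselves are assumptions.
ROUTES = frozenset({"clean_room", "ssp", "dsp", "deal", "bidstream", "measurement"})


@dataclass(frozen=True)
class SignalContainer:
    name: str
    meaning: str                # semantic definition
    business_intent: str
    allowed_routes: frozenset   # execution route
    allowed_outputs: frozenset  # governed output, e.g. {"aggregate", "segment"}

    def __post_init__(self):
        unknown = self.allowed_routes - ROUTES
        if unknown:
            raise ValueError(f"unknown routes: {sorted(unknown)}")

    def can_execute(self, route, output):
        """True only if both the route and the output type are permitted."""
        return route in self.allowed_routes and output in self.allowed_outputs


signal = SignalContainer(
    name="lapsed_buyers",
    meaning="Customers with no purchase in the last 90 days.",
    business_intent="win-back",
    allowed_routes=frozenset({"clean_room", "dsp"}),
    allowed_outputs=frozenset({"aggregate", "segment"}),
)

print(signal.can_execute("clean_room", "aggregate"))
print(signal.can_execute("bidstream", "aggregate"))
```

Because the meaning and the permissions live on the same object, an activation system or agent that receives the container can check what it is allowed to do without consulting a separate policy document.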

MCP moves the bytes. The semantic layer supplies the meaning.

Agent protocols like MCP and A2A are now open standards governed under the Linux Foundation, and they let agents reach data, tools, and each other. But reaching data is not understanding it. Gartner has predicted that a majority of agentic-analytics efforts relying on MCP alone will fail without a consistent semantic layer. The protocol is the plumbing; the semantic layer is what turns reachable data into trustworthy answers. Build the meaning layer, then connect the agents. (Validate current standards and adoption.)

Strategic read

SWOT.

Strengths

One definition of truth
Cross-silo interoperability
Agent-ready meaning
Open standards emerging (OSI / MCP)

Weaknesses

Modelling and ownership effort
Easy to mistake for another dashboard
Fast vendor consolidation
Value is indirect / foundational

Opportunities

Context layer for AI agents
GraphRAG and ontology grounding
Portable definitions via OSI
Governed self-serve and automation
A true single source of meaning

Threats

Protocol-only (MCP without meaning)
Proprietary semantic lock-in
Definitional drift
Ownership churn from acquisitions
Governance bypass via NL / agents

The stack

The semantic stack.

Top to bottom: who consumes meaning, the governance and definitions that create it, and the sources it is built on.

Output-led decision rules

Design backward from the output.

Output needed	Better-fit pattern	Watch-out
Trusted cross-silo answers	Semantic / metrics layer over all sources	Skip it and every tool disagrees.
Agent-ready data	MCP + semantic layer + governance	A protocol without meaning misleads.
Portable definitions	Adopt an open standard (OSI)	Proprietary-only semantics lock you in.
Entity resolution	Identity layer / knowledge graph	Weak match quality and inconsistent keys.
Governed access	Policy + lineage + audit on every query	NL / agent access bypassing governance.

First moves

What to build first.

01
Define your core metrics and entities once, in a semantic layer — owned, not per-tool.
02
Stand up catalog and lineage so every number is discoverable and traceable.
03
Adopt an open standard (OSI) so definitions stay portable across vendors.
04
Expose meaning to agents via MCP under governance — not raw, ungoverned data.

Anti-patterns

Where this goes wrong.

Building another dashboard instead of a meaning layer.
Relying on MCP / protocols alone, with no semantic layer underneath.
Letting each tool define its own metrics.
Giving agents ungoverned access to data.
Treating semantic infrastructure as a one-off project, not infrastructure.

POC to production

12 questions before the POC becomes production.

01
Business decision
What single decision does this surface improve?
02
Data inputs
What data feeds it, who owns it, and where does it live?
03
Identity logic
How are people / accounts / SKUs resolved and matched?
04
Consent / governance
What is the consent basis and the output policy?
05
Metric definition
Are the metrics defined, owned, and comparable?
06
Output policy
What can leave — aggregate, score, segment, report, API?
07
Activation rights
Is the output eligible to activate, and where?
08
Measurement method
How is the result measured, and is the method defensible?
09
Technical owner
Who builds and runs the pipeline?
10
Commercial owner
Who owns the budget / commercial outcome?
11
Feedback loop
How do results flow back into the model and the decision?
12
Production path
What turns the POC into a governed, repeatable workflow?

Watch-outs

Practical caveats.

01
A protocol that reaches the data is not the same as a layer that understands it.
02
Gartner predicts that a majority of agentic-analytics efforts relying on MCP alone will fail without a consistent semantic layer.
03
Definitions must be owned — unowned metrics drift back into disagreement.
04
“Context layer” is largely the same idea as “semantic layer”, relabelled for the AI era.
05
This layer is consolidating fast — validate current ownership before committing.

Capability validation note

Product names, ownership, and availability across these surfaces change quickly. Treat this as an advisory fit guide, not procurement documentation — validate current capabilities and access against official sources before implementation.

Market references last validated: June 6, 2026. Revalidate before pitch use.

Need help connecting this surface to the operating model?

The surface only creates value when data, semantics, governance, activation, and measurement are designed together.
