Semantic Layer / Metrics
A semantic layer is the place where business definitions like "revenue," "active user," and "gross margin" are defined once, in code, so every dashboard, query, and AI tool gives the same answer.
A semantic layer is the place where business definitions live. "Revenue," "active user," "gross margin," "customer lifetime value" — these sound like simple terms, but in any company larger than ten people they are interpretations, not facts. Does revenue include refunds? Does active user mean logged in this month or this week? Does gross margin subtract fulfillment costs or only COGS? The semantic layer's job is to answer these questions once, in code, so that every dashboard, every SQL query, and every AI tool gives the same answer.
The metaphor: a semantic layer is the dictionary your data speaks. Without one, every team makes up its own definitions, half the dashboards disagree by 7%, and the CFO cannot trust any number she sees. With one, "revenue" means exactly one thing, and the calculation lives in one file that everyone references.
This is one of the oldest unsolved problems in data infrastructure. It is also one of the most contentious, because every BI vendor, warehouse vendor, and metrics startup wants to own it.
The idea of a semantic layer is at least 35 years old. Business Objects, founded in 1990 in France, shipped the first widely used commercial semantic layer in 1991: a feature called the "universe," which let analysts define business terms in a GUI on top of a relational database. SAP acquired Business Objects in 2007 for $6.8 billion, mostly to get its hands on this idea.
Through the 1990s and 2000s, the semantic layer was synonymous with OLAP cubes: pre-aggregated, multi-dimensional data structures (Microsoft Analysis Services, Oracle Essbase, IBM Cognos TM1, Mondrian) that materialized common metric calculations into fast lookup tables. The cube was the semantic layer. You modeled your dimensions and measures in MDX, you built the cube overnight, and analysts queried it through Excel. This model dominated enterprise BI for two decades.
Then the cloud warehouses arrived. Snowflake, BigQuery, and Redshift made raw SQL queries fast and cheap enough that you no longer needed pre-aggregated cubes for performance. The cube vendors looked legacy almost overnight. But the problem the cube solved — consistent business definitions — did not go away. It just needed a new home.
The new home turned out to be Looker and its modeling language, LookML. Founded in 2012, Looker was the first BI tool to insist that every metric was defined in version-controlled code rather than in a drag-and-drop GUI. This was both Looker's genius and its original sin: it made consistency possible, but it locked all the business logic into a proprietary language inside a proprietary tool. When Google bought Looker in 2019 for $2.6 billion, they bought one of the most defensible semantic layers in the industry.
Around 2020, a new generation of vendors decided to pull the semantic layer out of the BI tool entirely. The pitch: make the semantic layer a standalone service that any BI tool, notebook, or AI agent can query. This is sometimes called "headless BI," and the leading exponents are Cube, Transform (acquired by dbt Labs in 2023), AtScale, and the dbt Semantic Layer itself. None has decisively won.
A semantic layer typically holds three kinds of definitions:
1. Dimensions. Things you slice by. Date, country, product category, customer segment.
2. Measures. Things you aggregate. Revenue, order count, active users, conversion rate. A measure is not a column — it is a calculation. Revenue might be SUM(orders.total) - SUM(refunds.amount) joined across two tables. The measure definition hides this join.
3. Joins and entities. How tables relate to each other. The semantic layer knows that orders joins to customers on customer_id, and that customers joins to regions on region_id, so when an analyst asks for "revenue by region" the layer figures out the joins automatically.
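As a minimal sketch, the three kinds of definitions can be captured in a few plain data structures. The table names, columns, and join keys below are invented for illustration; they are not any particular vendor's modeling format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dimension:
    name: str    # business name, e.g. "region"
    table: str   # table that owns the column
    column: str

@dataclass(frozen=True)
class Measure:
    name: str    # business name, e.g. "revenue"
    sql: str     # a calculation, not a column

@dataclass(frozen=True)
class Join:
    left: str
    right: str
    on: str      # join key shared by both tables

# One canonical definition of each business term, in code.
DIMENSIONS = {
    "region": Dimension("region", "regions", "region_name"),
    "order_date": Dimension("order_date", "orders", "created_at"),
}
MEASURES = {
    # Revenue nets out refunds and spans two tables; the
    # definition hides that complexity from every consumer.
    "revenue": Measure("revenue", "SUM(orders.total) - SUM(refunds.amount)"),
    "order_count": Measure("order_count", "COUNT(orders.id)"),
}
JOINS = [
    Join("orders", "customers", "customer_id"),
    Join("customers", "regions", "region_id"),
    Join("orders", "refunds", "order_id"),
]
```

The point of the structure is that "revenue by region" never requires the asker to know about the refunds table or the two-hop path from orders to regions; both live only in these definitions.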
When a BI tool or AI agent asks the semantic layer "give me revenue by region for the last 30 days," the semantic layer compiles that request into a SQL query against the underlying warehouse, runs it, and returns the result. The user never wrote the SQL. The user never had to know that orders and regions are different tables. The semantic layer encoded all of that.
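A toy version of that compilation step might look like the following. The join graph, the metric definition, and the generated SQL are all hypothetical; real semantic layers handle many-to-many joins, fan-out, and dialect differences that this sketch ignores.

```python
# Toy compiler: turn "give me <measure> by <dimension> for the
# last N days" into a SQL string. All names are illustrative.
MEASURE_SQL = {"revenue": "SUM(orders.total)"}
DIMENSION_COL = {"region": "regions.region_name"}

# Join path from the fact table (orders) out to each dimension table.
JOIN_PATH = [
    ("customers", "orders.customer_id = customers.id"),
    ("regions", "customers.region_id = regions.id"),
]

def compile_query(measure: str, dimension: str, last_n_days: int) -> str:
    """Compile a metric request into SQL the warehouse can run."""
    select = (
        f"{DIMENSION_COL[dimension]} AS {dimension}, "
        f"{MEASURE_SQL[measure]} AS {measure}"
    )
    joins = " ".join(f"JOIN {t} ON {cond}" for t, cond in JOIN_PATH)
    return (
        f"SELECT {select} FROM orders {joins} "
        f"WHERE orders.created_at >= CURRENT_DATE - {last_n_days} "
        f"GROUP BY 1"
    )

print(compile_query("revenue", "region", 30))
```

The caller supplies only business terms; the join resolution and the date filter are mechanical consequences of the definitions, which is why two callers asking the same question get byte-identical SQL.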
The semantic layer was a sleepy enterprise BI feature for twenty years. Then in 2020-2024 it became one of the hottest categories in data infrastructure. Three things changed:
1. dbt made data modeling cool. dbt trained an entire generation of data analysts to write SQL transformations as version-controlled code. Once you have a clean dbt project, the obvious next question is: "where do I define metrics on top of these models?" That natural extension is exactly what a semantic layer does.
2. Headless BI became viable. With cheap warehouse compute and real-time SQL, you no longer need a fat BI tool to host the semantic layer. You can run it as a microservice that exposes a SQL or REST or GraphQL endpoint. Cube and Supergrain were the first to do this seriously.
3. LLMs need a semantic layer to give consistent answers. This is the new and most important driver. When a business user asks an AI agent "what was revenue last quarter?", the LLM needs one canonical definition of revenue. Without a semantic layer, the LLM will write whatever SQL it feels like, and two users asking the same question can get two different answers. Every serious AI-on-data product (TextQL included) needs a semantic layer underneath, which has dramatically increased demand for the category.
The semantic layer market is genuinely contested in 2026.
The honest take: no one has won and probably no one will win cleanly. The warehouse vendors (Snowflake, Databricks) want to host the semantic layer themselves. The BI vendors (Looker, Power BI, Tableau) want to keep owning it. The independent vendors (Cube, AtScale, dbt) want to sit between them. All three positions are commercially defensible and the market is large enough to support multiple winners. Expect the next five years to look like a slow, ugly knife fight.
TextQL Ana consumes whatever semantic layer the customer already has. If you use LookML, Ana queries through LookML. If you use Cube, Ana queries through Cube. If you use the dbt Semantic Layer, Ana uses MetricFlow. TextQL also ships its own Ontology, a semantic layer designed specifically for AI agents, for customers who do not yet have one. The bet is that the semantic layer is the most important piece of infrastructure for trustworthy AI on data, and TextQL works with whatever flavor the customer prefers.