Semantic layers are finally getting opinionated enough to be useful

A semantic layer isn't new; it's the "business translation" that turns raw data into actionable insights for real decisions. What has changed is AI: models and agents now consume those definitions too, which turns the semantic layer from nice-to-have into necessary.

40% of Databricks users still don't use dbt. Every BI tool in use at your company has its own definition of "revenue." There are dozens of dashboards, and none of them line up. What happened?

AtScale, Stardog, Databricks Unity Catalog Metrics, and other semantic layers fix this by defining metrics once and exposing them to SQL, DAX, MDX, Python, and even AI agents.

The secret isn't "no-code BI." It's no-drift semantics: metrics mean the same thing to analysts, ML engineers, and LLMs.

Your dashboards and model training data should both use the same "revenue" metric.
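
Here's a minimal Python sketch of that define-once pattern. The `Metric` class, the expression, and the table names are all hypothetical; real layers (AtScale, MetricFlow, Unity Catalog Metrics) have their own definition formats, but the shape of the idea is the same.

```python
# A minimal sketch of "define once, consume everywhere". All names
# here are invented for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    name: str
    expression: str  # aggregation over a governed source table
    source: str

    def to_sql(self, group_by: str) -> str:
        # Every consumer gets SQL generated from the same definition,
        # so "revenue" cannot drift between surfaces.
        return (
            f"SELECT {group_by}, {self.expression} AS {self.name} "
            f"FROM {self.source} GROUP BY {group_by}"
        )

# One definition...
revenue = Metric(
    name="revenue",
    expression="SUM(net_amount - refunds)",
    source="sales.orders",
)

# ...consumed by the dashboard:
print(revenue.to_sql(group_by="order_month"))

# ...and by the feature pipeline, from the same object:
print(revenue.to_sql(group_by="customer_id"))
```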

The AtScale + Databricks "Semantic Lakehouse" model gets this right:

  • No data movement
  • Automatic aggregates
  • Unified metric definitions
  • Direct integration with Unity Catalog and Spark

Together, these give AI a stable source of business truth to reason against.
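
Here's what "query in place" can look like from Python, using the open-source Databricks SQL connector. The workspace details and the `finance.metrics.revenue_by_month` view name are placeholders; the point is that the governed metric resolves inside the lakehouse instead of being extracted into a cube.

```python
# Sketch: resolve the governed "revenue" metric where it lives rather
# than extracting rows into a BI cube. Connection details and the
# metric-view name are placeholders for your own workspace.
from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/your-warehouse-id",
    access_token="dapi-...",  # prefer OAuth / a secrets manager in practice
) as connection:
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT order_month, revenue "
            "FROM finance.metrics.revenue_by_month "
            "ORDER BY order_month"
        )
        for row in cursor.fetchall():
            print(row[0], row[1])
```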

My new TIL post, "Semantic Layer Solutions in Modern Data Architecture," has a clear walkthrough: what a semantic layer is, why 40% of Databricks users still don't use dbt, how vendors like AtScale, Stardog, and Timbr attack the "truth problem" in analytics, the integrations (Unity Catalog Metrics, Power BI), and the Databricks + AtScale partnership that makes "semantic lakehouse" more than just a buzzword.

It's clear to me now that the semantic layer isn't just a BI issue; it's the missing link between data, metrics, and AI. Once "revenue" has the same meaning in SQL, DAX, MDX, and Python, you've built a base for both human and machine reasoning. That semantic consistency carries over to everything else, such as dashboards, copilots, and evaluations.

"AtScale's main selling point is that it stops data from moving..." It queries data in place within Databricks, creates and manages aggregates independently, accelerates performance through intelligent caching, and maintains a single source of truth without duplication. (AtScale x Databricks blog)

You create drift as soon as you pull data into a BI cube. AtScale's Databricks integration closes that loop by bringing together technical lineage (through Unity Catalog) and business-facing semantics. It's not flashy, but it's the kind of foundation AI architectures will need.
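
One cheap guardrail, sketched below with made-up numbers: reconcile the BI tool's cached extract against the value the semantic layer returns, and fail loudly when they diverge.

```python
# Sketch: detect extract drift by comparing a BI cache against the
# governed metric. The values are stand-ins; in practice they come
# from the semantic layer and the BI tool's extract.
def check_drift(layer_value: float, extract_value: float,
                tolerance: float = 0.001) -> None:
    if layer_value == 0:
        raise ValueError("governed metric returned zero; check the query")
    rel_diff = abs(layer_value - extract_value) / abs(layer_value)
    if rel_diff > tolerance:
        raise AssertionError(
            f"extract drifted {rel_diff:.2%} from the semantic layer "
            f"({extract_value:,.2f} vs {layer_value:,.2f})"
        )

try:
    check_drift(layer_value=1_204_500.00, extract_value=1_198_730.00)
except AssertionError as err:
    print("DRIFT:", err)  # a day-old extract no longer matches the layer
```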

The next step is to use the same layer to train and test the AI model. The same "revenue" metric that powers dashboards should also drive model features and evaluation metrics. That's how you stop AI from learning about the business from one definition while executives use another.
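
A sketch of that loop, with `run_query` standing in for a real client (for example, the Databricks SQL connector above) and the schema name invented for illustration:

```python
# Sketch: training features and evaluation targets both come from the
# same governed metric query, so the model never learns a private
# definition of "revenue".
def run_query(query: str) -> list[tuple[str, float]]:
    # Stand-in result set; in practice this executes against the
    # semantic layer's SQL endpoint.
    print("executing:", query)
    return [("cust_1", 1200.0), ("cust_2", 340.0), ("cust_3", 87.5)]

governed_sql = (
    "SELECT customer_id, revenue FROM finance.metrics.revenue_by_customer"
)
rows = run_query(governed_sql)

# One result set feeds both sides of the ML loop:
features = {customer: [rev] for customer, rev in rows}    # model inputs
eval_targets = {customer: rev for customer, rev in rows}  # scoring truth
print(features, eval_targets)
```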

Pushback

  • "Single source of truth" is just a saying unless you hard-gate BI against the layer. You'll end up with "two truths and a hope" if teams can still point tools directly at warehouse tables.
  • It sounds great that MDX, DAX, and LookML will all work the same way. However, in reality, edge-case functions and time-intelligence logic do not map one-to-one. Set aside time for testing equivalence.
  • It's true that vendor lock-in happens. If your BI surface area is small, Unity Catalog Metrics and dbt/MetricFlow-style semantics might be "good enough." When you're "multi-surface and politically decentralized," AtScale is worth it.
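
As promised in the first pushback item, here's one hedged way to hard-gate: a CI-style check that rejects any dashboard query touching warehouse tables directly instead of a governed metrics schema. The schema names and dashboard definitions are hypothetical; a real gate would pull queries from your BI tool's API.

```python
# Sketch of a hard gate: fail the build when a dashboard bypasses the
# governed metrics schema. All names here are invented for illustration.
import re
import sys

GOVERNED_SCHEMA = "finance.metrics"  # the only sanctioned entry point
RAW_TABLE = re.compile(
    r"\b(?:FROM|JOIN)\s+((?:raw|staging|warehouse)\.\S+)", re.IGNORECASE
)

dashboards = {
    "exec_overview": "SELECT order_month, revenue FROM finance.metrics.revenue_by_month",
    "rogue_report": "SELECT SUM(amount) FROM warehouse.orders",  # drift!
}

failed = False
for name, query in dashboards.items():
    match = RAW_TABLE.search(query)
    if match:
        failed = True
        print(f"FAIL {name}: queries {match.group(1)!r}, "
              f"not the governed {GOVERNED_SCHEMA} schema")

sys.exit(1 if failed else 0)
```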

Refer to the full TIL for details on the vendor breakdown, query protocols, and implementation patterns.