Semantic layers are finally getting opinionated enough to be useful

Summary

The semantic layer isn't a new idea: it's the "business translation" between raw data and real decisions. What's new is that AI makes it necessary.

40% of Databricks users still don't use dbt. Each BI tool in your org has its own definition of "revenue." There are dozens of dashboards, but none of them line up. What happened?

AtScale, Stardog, Databricks Unity Catalog Metrics, and other semantic layers tackle this by defining each metric once and exposing it to SQL, DAX, MDX, Python, and even AI agents.

The secret isn't "no-code BI." It's no-drift semantics: metrics mean the same thing to analysts, ML engineers, and LLMs.

Your dashboards and model training data should both use the same "revenue" metric.
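
To make "define once, use everywhere" concrete, here is a minimal sketch in plain Python. It is not AtScale's or Databricks' API: the Metric class, the orders table, and the status filter are all hypothetical, chosen only to show one definition rendering SQL for two different consumers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """Hypothetical metric definition, not any vendor's real object model."""
    name: str
    expression: str          # aggregate expression, e.g. SUM(amount)
    table: str               # source table (assumed name)
    filters: tuple = ()      # business rules that every consumer inherits

    def to_sql(self, group_by=()):
        """Render the metric as SQL, grouped by the requested dimensions."""
        select_cols = list(group_by) + [f"{self.expression} AS {self.name}"]
        sql = f"SELECT {', '.join(select_cols)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        if group_by:
            sql += " GROUP BY " + ", ".join(group_by)
        return sql


# One definition of "revenue" (the filter is an illustrative assumption).
REVENUE = Metric(
    name="revenue",
    expression="SUM(amount)",
    table="orders",
    filters=("status = 'completed'",),
)

dashboard_sql = REVENUE.to_sql(group_by=("region",))      # BI consumer
feature_sql = REVENUE.to_sql(group_by=("customer_id",))   # ML consumer

print(dashboard_sql)
print(feature_sql)
```

Both queries inherit the same filter, so a change to what counts as revenue happens in one place instead of in every dashboard and notebook.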

The AtScale + Databricks "Semantic Lakehouse" model gets this right:

  • No moving data
  • Automatic aggregates
  • Unified metric definitions
  • Direct integration with Unity Catalog and Spark

It gives AI a stable source of business truth.

My new TIL post, "Semantic Layer Solutions in Modern Data Architecture," explains what a semantic layer is, why 40% of Databricks users still don't use dbt, and how tools like AtScale and Databricks Unity Catalog Metrics are working to solve the "truth problem" in analytics. It covers vendors (AtScale, Stardog, Timbr), integrations (Unity Catalog Metrics, Power BI), and the Databricks + AtScale partnership that makes "semantic lakehouse" more than a buzzword.

The semantic layer isn't just a BI issue. Once "revenue" has the same meaning in SQL, DAX, MDX, and Python, you've built a base for both human and machine reasoning. That consistency then carries over to every surface that consumes the metric, dashboards included.

"AtScale's main selling point is that it stops data from moving..." It queries data in place within Databricks, creates and manages aggregates independently, accelerates performance through intelligent caching, and maintains a single source of truth without duplication. (AtScale x Databricks blog)

You create drift the moment you pull data into a BI cube. AtScale's Databricks integration closes that loop by bringing together technical lineage (through Unity Catalog) and business-facing semantics. It's not flashy, but it's the foundation AI architectures need.

The next step is to use the same layer to train and test the AI model. The same "revenue" metric that powers dashboards should also drive model features and evaluation metrics. That's how you stop AI from learning about the business from one definition while executives use another.
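
A rough way to picture this, using Python's built-in sqlite3 module as a stand-in for the warehouse. The orders table, its columns, and the "completed orders only" rule are assumptions for illustration; the point is only that the dashboard rollup and the model feature are computed from the same expression and therefore reconcile.

```python
import sqlite3

# Hypothetical shared definition: revenue counts completed orders only.
REVENUE_EXPR = "SUM(CASE WHEN status = 'completed' THEN amount ELSE 0 END)"


def revenue_by(conn, dimension):
    """Return {dimension value: revenue} using the shared expression."""
    sql = f"SELECT {dimension}, {REVENUE_EXPR} FROM orders GROUP BY {dimension}"
    return dict(conn.execute(sql).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id TEXT, region TEXT, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("c1", "EMEA", "completed", 100.0),
        ("c1", "EMEA", "refunded", 40.0),   # excluded by the definition
        ("c2", "AMER", "completed", 250.0),
        ("c3", "AMER", "completed", 50.0),
    ],
)

dashboard = revenue_by(conn, "region")        # what executives see
features = revenue_by(conn, "customer_id")    # what the model trains on

# Same definition, different grain: the totals must agree.
assert sum(dashboard.values()) == sum(features.values())
print(dashboard, features)
```

If the feature pipeline had its own copy of the expression (say, one that forgot to exclude refunds), that final check is where the drift would show up.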

Pushback

  • "Single source of truth" is just a slogan unless you hard-gate BI against the layer. You'll end up with "two truths and a hope" if teams can still point tools directly at warehouse tables.
  • MDX-to-DAX-to-LookML equivalence sounds clean in theory; edge-case functions and time-intelligence logic do not map one-to-one. Set aside time for testing.
  • Vendor lock-in is real. If your BI surface is small, Unity Catalog Metrics and dbt/MetricFlow-style semantics might be "good enough." When you're "multi-surface and politically decentralized," AtScale is worth it.
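
On the first point, "hard-gating" can start as something as simple as a lint over saved BI queries. The sketch below is an assumption-heavy illustration: it supposes governed metric views live in a schema named "semantic", and it uses a naive regex that will not handle CTEs, quoted identifiers, or comments. A real gate would more likely rely on warehouse permissions than on string matching.

```python
import re

# Assumption: governed metric views live in a schema named "semantic";
# any other table reference counts as going around the layer.
ALLOWED_SCHEMAS = {"semantic"}

# Naive pattern: the identifier that follows FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def ungoverned_tables(sql):
    """Return table references in `sql` that bypass the allowed schemas."""
    refs = TABLE_REF.findall(sql)
    return [r for r in refs if r.split(".")[0].lower() not in ALLOWED_SCHEMAS]


governed = "SELECT region, revenue FROM semantic.revenue_by_region"
bypass = (
    "SELECT r.region, SUM(o.amount) "
    "FROM semantic.revenue_by_region r JOIN raw.orders o ON r.region = o.region"
)

print(ungoverned_tables(governed))
print(ungoverned_tables(bypass))
```

An empty list means the query only touches the layer; anything else is a candidate "second truth" worth a conversation.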

See the full TIL for vendor details, query protocols, and implementation patterns.