Semantic Layer Solutions in Modern Data Architecture
Meaning
The semantic layer is the "business translation" layer that sits between raw data and business users. It maps technical data structures to business terms everyone understands, letting non-technical users access and analyze data without deep technical knowledge.
What this means in real life:
"No movement" is about more than saving money; it is also about controlling risk. Every extract to a cube or data mart is another stale copy, another governance boundary, and another breach surface. If a vendor can build aggregates in place (as AtScale does on Databricks) and cache them intelligently, you get a cube-like user experience without duplicating data.
Why it matters
A shared metric contract is the missing link between data stores, BI, and AI agents. If "revenue" means the same thing in SQL, MDX, DAX, and an API, the bulk of the governance drama never starts.
The Problem in the Market Right Now
The Gap
- 40% of Databricks users have not yet adopted dbt.
- Different BI tools define their metrics in different ways, producing conflicting "truths."
- Business users cannot query warehouses or lakehouses directly.
- Data stays locked in technical formats.
How Semantic Layers Help
- One place to find all the metrics (a universal definition of "revenue")
- Business-friendly access to complex data
- Automated management of aggregates for performance
- Consistent definitions across all BI tools
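The "one place for all metrics" idea can be made concrete with a small sketch. The class and method names below (`MetricRegistry`, `to_sql`) are hypothetical, not any vendor's API; the point is that "revenue" is defined exactly once and every downstream consumer compiles queries from that single definition.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    """A single, canonical metric definition."""
    name: str
    expression: str     # aggregation over a physical column
    source_table: str
    description: str = ""

class MetricRegistry:
    """Single source of truth: one definition per metric name."""
    def __init__(self) -> None:
        self._metrics: dict[str, Metric] = {}

    def register(self, metric: Metric) -> None:
        if metric.name in self._metrics:
            raise ValueError(f"metric '{metric.name}' already defined")
        self._metrics[metric.name] = metric

    def to_sql(self, name: str, group_by: list[str] | None = None) -> str:
        """Compile the canonical definition into a SQL query."""
        m = self._metrics[name]
        dims = ", ".join(group_by or [])
        select = f"{dims + ', ' if dims else ''}{m.expression} AS {m.name}"
        sql = f"SELECT {select} FROM {m.source_table}"
        if dims:
            sql += f" GROUP BY {dims}"
        return sql

registry = MetricRegistry()
registry.register(Metric(
    name="revenue",
    expression="SUM(order_amount)",
    source_table="sales.orders",
    description="Gross revenue before refunds",
))

print(registry.to_sql("revenue", group_by=["region"]))
# SELECT region, SUM(order_amount) AS revenue FROM sales.orders GROUP BY region
```

Because every BI tool and API compiles from the same registry entry, changing the definition once changes it everywhere, which is precisely the consistency guarantee the bullets above describe.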
Big Semantic Layer Companies
Standalone solutions
- AtScale: builds automated aggregates and a unified access layer.
- Stardog: models complex relationships with a knowledge graph.
- Timbr: Uses semantic SQL knowledge graphs.
Built into the platform
- Power BI: has a built-in semantic model.
- Databricks Unity Catalog Metrics: a new feature that lets AI access metrics through SQL.
Ways to Put It into Action
Vendor Implementation
Partner with AtScale or Stardog to deploy their tools on client systems. Less differentiated, but faster to market.
Personalized Semantic Experiences
Build custom metric stores with tailored interfaces for your business. Requires product thinking, but yields higher value and stronger differentiation.
Semantic Layer Focused on AI
Create semantic layers purpose-built for AI/ML use cases, where consistent metric definitions are critical for training and evaluating models.
Value for Strategy
Semantic layers are broader than any single use case such as AI QA: every Databricks or Snowflake customer needs one, which opens high-value consulting opportunities for projects worth $500,000 or more.
AtScale-Databricks Partnership: The Semantic Lakehouse Architecture
AtScale builds a "Semantic Lakehouse" by placing a semantic layer between Databricks Lakehouse storage and business intelligence tools. With this architecture, business users can query complex data using business terms they already know, without moving the data out of Databricks.
Main Points of Integration
Technical Integration
- Works with both Databricks SQL and Apache Spark engines.
- Integrates with Unity Catalog for governance and lineage.
- Easy setup through Databricks Partner Connect.
- Direct connection to SQL warehouses and compute clusters.
Support for Query Protocol
AtScale extends Databricks' native SQL with additional query protocols:
- MDX (to work with OLAP cubes)
- DAX (for native Power BI integration)
- REST and Python APIs for programmatic access
- LookML for Looker integration
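The multi-protocol idea is easiest to see as one canonical definition projected into several dialects. The sketch below is illustrative only; it mimics the concept behind AtScale's protocol support, not its actual implementation, and all function names are made up.

```python
# One canonical metric definition (assumed shape, not a vendor schema).
METRIC = {
    "name": "revenue",
    "agg": "SUM",
    "table": "orders",
    "column": "order_amount",
}

def to_sql(m: dict) -> str:
    """Project the definition into an ANSI SQL query."""
    return f"SELECT {m['agg']}({m['column']}) AS {m['name']} FROM {m['table']}"

def to_dax(m: dict) -> str:
    """Project the same definition into a DAX measure for Power BI."""
    return f"{m['name'].title()} := {m['agg']}({m['table']}[{m['column']}])"

def to_lookml(m: dict) -> str:
    """Project it into a LookML measure block for Looker."""
    return (f"measure: {m['name']} {{\n"
            f"  type: sum\n"
            f"  sql: ${{TABLE}}.{m['column']} ;;\n"
            f"}}")

print(to_sql(METRIC))     # SELECT SUM(order_amount) AS revenue FROM orders
print(to_dax(METRIC))     # Revenue := SUM(orders[order_amount])
print(to_lookml(METRIC))
```

Each dialect is generated from the same source of truth, so "revenue" cannot drift between SQL, Power BI, and Looker users.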
The "No Movement" Promise
AtScale's core benefit is eliminating data movement. Rather than extracting data from Databricks into separate OLAP cubes or BI-specific data marts, AtScale:
- Queries data directly in Databricks.
- Automatically creates and manages aggregates in the lakehouse.
- Improves query performance with smart caching.
- Keeps a single source of truth without duplication.
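The aggregate-routing and caching behavior above can be sketched in a few lines. This is a toy model of the "no movement" idea, not AtScale's engine: it assumes an additive metric column named after the metric exists on both the base table and the pre-aggregated table, and it caches query plans rather than result sets (a real system would cache results too). All table names are hypothetical.

```python
from functools import lru_cache

# Registered in-lakehouse aggregates: (metric, dims) -> pre-aggregated table.
AGGREGATES = {
    ("revenue", ("region",)): "sales.agg_revenue_by_region",
}

BASE_TABLE = "sales.orders"

def plan_query(metric: str, dims: tuple[str, ...]) -> str:
    """Route to a matching aggregate table; otherwise hit the base table."""
    table = AGGREGATES.get((metric, dims), BASE_TABLE)
    cols = ", ".join(dims)
    return f"SELECT {cols}, SUM({metric}) AS {metric} FROM {table} GROUP BY {cols}"

@lru_cache(maxsize=128)
def cached_plan(metric: str, dims: tuple[str, ...]) -> str:
    # Stand-in for "smart caching": identical requests reuse the plan.
    return plan_query(metric, dims)

print(cached_plan("revenue", ("region",)))    # served from the aggregate
print(cached_plan("revenue", ("customer",)))  # falls back to the base table
```

Note that both query plans point at tables inside the lakehouse: the aggregate is just another governed table, not an extract living outside Databricks.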
Effects on Business
The partnership addresses the gap affecting 40% of Databricks users: making data usable by non-technical business users. AtScale fills it with a business-friendly interface over technical data structures, enabling self-service analytics without giving up governance.
Related context worth reading after this post:
- Headless BI / metric store patterns (semantics served across multiple query dialects)
- Knowledge-graph semantics for complex relationship logic (Stardog/Timbr)
- Platform-native semantics (Power BI datasets; Unity Catalog Metrics) for teams standardizing on one surface