Making SAP Data Smarter in Databricks with Semantic Metadata Delta Sharing


SAP datasets are incredibly rich, but let’s be honest: they’re not always easy to interpret. Technical table names like VBAK (sales document headers) or fields like KUNNR (customer number) may be precise, but they rarely tell a clear business story on their own. As a result, teams often spend hours translating this data into something meaningful, then store that context in spreadsheets, documents, or simply internal know-how.

That gap between raw data and business understanding is exactly what the SAP–Databricks collaboration aims to fix.

Automatic Syncing of Business Context

With the general availability of semantic metadata syncing between SAP Business Data Cloud (BDC) and Databricks Unity Catalog, SAP data becomes far more intuitive to use.

Here’s what changes:

  • Semantic metadata is now automatically shared at the table level whenever SAP Delta Share data is accessed.
  • Business-friendly names, descriptions, and context appear instantly in Unity Catalog.
  • Updates made in SAP BDC are reflected automatically in Unity Catalog, so BDC remains the single source of truth.

In simple terms, instead of decoding cryptic SAP structures, users (and even AI tools) can immediately understand what the data represents—no manual mapping or back-and-forth needed.
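
To make this concrete, here is a minimal sketch of how a team might read the synced descriptions back out of Unity Catalog. It assumes the business descriptions surface in the comment field of the catalog's information_schema.columns view; the catalog, schema, table, and sample rows are hypothetical, and the helper functions are illustrative rather than part of any SAP or Databricks API.

```python
from typing import Iterable, Mapping, Optional


def build_comment_query(catalog: str, schema: str, table: str) -> str:
    """Build a query listing column names and comments for one table.

    Assumption: the catalog exposes Unity Catalog's information_schema.
    Identifiers are checked so only plain names end up in the SQL text.
    """
    for part in (catalog, schema, table):
        if not part.replace("_", "").isalnum():
            raise ValueError(f"unexpected identifier: {part!r}")
    return (
        "SELECT column_name, comment "
        f"FROM {catalog}.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}' "
        "ORDER BY ordinal_position"
    )


def describe_columns(
    rows: Iterable[Mapping[str, Optional[str]]],
) -> dict[str, str]:
    """Map each column name to its description, flagging missing ones."""
    return {
        row["column_name"]: (row.get("comment") or "(no description synced)")
        for row in rows
    }


# Hypothetical rows, shaped like the result of the query above.
sample_rows = [
    {"column_name": "SalesOrder", "comment": "Sales order number"},
    {"column_name": "SoldToParty", "comment": None},
]
glossary = describe_columns(sample_rows)
query = build_comment_query("sap_bdc", "sales", "salesorder")
```

In a Databricks notebook, the rows could come from something like spark.sql(query).collect(), with each row converted via asDict(). Columns that show up as "(no description synced)" are a quick way to spot where context is still missing.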

This builds on BDC Connect, which already allows SAP teams to publish governed data into Databricks via Delta Sharing. Now, with metadata and governance tags included, teams can seamlessly combine SAP data with other enterprise sources for analytics and AI—without rebuilding context from scratch.

Why This Matters for AI

This isn’t just about making life easier for data engineers—it’s a big step forward for AI.

AI systems rely heavily on context. Without it, they struggle to interpret relationships between datasets, leading to weak or inaccurate outputs. SAP’s semantic metadata fills that gap by embedding decades of business logic directly into the data layer.

For example:

  • Relationships like primary and foreign keys are clearly defined.
  • Column-level descriptions provide meaning, not just labels.
  • AI tools like Databricks Assistant can use this context to generate more accurate, ready-to-use queries.

So instead of guessing how SalesOrder connects to SalesOrderItem, the system already knows—and can answer natural language questions like:

“What’s the relationship between these two tables?”

The result? Faster insights, fewer errors, and much more reliable AI-driven analytics.
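
As a rough illustration of why declared keys matter, the sketch below turns a foreign-key definition into a join instead of leaving the join columns to guesswork. The ForeignKey class, the table names, and the key column are hypothetical; in practice the relationship would come from the constraint metadata that Unity Catalog holds for the shared tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    """A foreign-key relationship between a child table and its parent."""

    child_table: str
    parent_table: str
    # Each pair is (child_column, parent_column).
    column_pairs: tuple[tuple[str, str], ...]


def build_join(fk: ForeignKey) -> str:
    """Render a JOIN that follows the declared key columns."""
    if not fk.column_pairs:
        raise ValueError("a foreign key needs at least one column pair")
    conditions = " AND ".join(
        f"child.{child_col} = parent.{parent_col}"
        for child_col, parent_col in fk.column_pairs
    )
    return (
        f"SELECT * FROM {fk.parent_table} AS parent "
        f"JOIN {fk.child_table} AS child ON {conditions}"
    )


# Hypothetical relationship: each item row points back to its order header.
item_to_order = ForeignKey(
    child_table="salesorderitem",
    parent_table="salesorder",
    column_pairs=(("SalesOrder", "SalesOrder"),),
)
join_sql = build_join(item_to_order)
```

An AI assistant works along similar lines: with the key columns declared in metadata, it can build the join condition from facts rather than infer it from column names.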

Built-In Governance, No Extra Work

Another key advantage is automated governance.

SAP BDC now passes data classification tags (such as those in the PersonalData namespace) directly to Unity Catalog. These tags help with:

  • Compliance and regulatory requirements
  • Access control
  • Responsible AI usage

And the best part—this happens automatically. No need for manual tagging or additional setup.
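
Once the tags are in Unity Catalog, they can drive simple checks. The sketch below groups columns carrying a PersonalData tag by table, which could feed a masking rule or an exclusion list for AI workloads. It assumes rows shaped like Unity Catalog's information_schema.column_tags view (table_name, column_name, tag_name, tag_value); the specific tag names and sample rows are made up for illustration.

```python
from typing import Iterable, Mapping


def sensitive_columns(
    tag_rows: Iterable[Mapping[str, str]],
    tag_prefix: str = "PersonalData",
) -> dict[str, set[str]]:
    """Group tagged column names by table for tags starting with tag_prefix."""
    result: dict[str, set[str]] = {}
    for row in tag_rows:
        if row["tag_name"].startswith(tag_prefix):
            result.setdefault(row["table_name"], set()).add(row["column_name"])
    return result


# Hypothetical rows; the tag names and values are not taken from SAP BDC.
sample_tags = [
    {"table_name": "customer", "column_name": "EmailAddress",
     "tag_name": "PersonalData.ExampleTag", "tag_value": "true"},
    {"table_name": "customer", "column_name": "Country",
     "tag_name": "Geo.ExampleTag", "tag_value": "true"},
    {"table_name": "salesorder", "column_name": "SoldToParty",
     "tag_name": "PersonalData.ExampleTag", "tag_value": "true"},
]
flagged = sensitive_columns(sample_tags)
```

Here flagged lists EmailAddress for customer and SoldToParty for salesorder, while Country is left out because its tag sits in a different namespace.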

The Bottom Line

By bringing semantic metadata into Databricks, SAP data becomes more than just structured information—it becomes understandable, usable, and AI-ready from the start.

No more decoding table names.
No more scattered documentation.
Just clear, connected data that works for both humans and machines.