Section 4 is one idea seen from two directions: move data without copying it. This lesson is the outbound half — you have data a partner needs. The old way (export a CSV, run an ETL into their system) is stale the moment it lands, ungoverned, and racks up transfer cost. Delta Sharing lets the recipient read your live data directly from your cloud storage, no copy made.
The spine
Beat 1 — the anchor, and the one decision
Anchor. Delta Sharing is an open protocol that lets a recipient read your data directly from your cloud storage — no copy, always live, governed by Unity Catalog. Everything else hangs on one decision: who is the recipient — another Unity Catalog org, or someone outside Databricks?
Beat 2 — the fork: who's the recipient?
Predict: Partner A runs Databricks + Unity Catalog. Partner B is a pandas/Power BI shop, no Databricks. Can you share the same things to both?
…
No — and that's the whole exam axis:
| | Databricks-to-Databricks (D2D) | Open sharing (D2O, open protocol) |
|---|---|---|
| Recipient | another Unity Catalog Databricks org | any tool/platform (non-Databricks) |
| Can share | tables + notebooks, volumes, ML models | Delta tables only |
| Identifies recipient by | their sharing identifier | a credential/token file |
- D2D is the richer path: because both sides have UC, you can share notebooks, volumes, and models, not just tables. You set it up with the recipient's sharing identifier, a unique reference to their UC metastore of the form `<cloud>:<region>:<metastore-uuid>`, and no token is exchanged.
- Open sharing (D2O) reaches recipients who aren't on Databricks at all (pandas, Spark, Power BI via the open connector), but it's Delta tables only: no volumes, models, or notebooks.
Tells: "share with a non-Databricks partner / open tools" → open (D2O), tables only. "share tables and notebooks/models with another UC org" → D2D.
Lock it. No-copy, live, UC-governed. Recipient on UC → D2D (tables + notebooks/volumes/models, via sharing identifier). Recipient off Databricks → open/D2O (Delta tables only, token file).
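The fork can be written down as a tiny lookup, which is a handy way to self-test. This is a minimal sketch, not Databricks code: `plan_share`, `SharingPlan`, and `parse_sharing_identifier` are hypothetical helpers, the regex is only an illustrative check of the `<cloud>:<region>:<metastore-uuid>` shape, and the identifier in the usage lines is made up.

```python
import re
from dataclasses import dataclass

# Shape from the lesson: <cloud>:<region>:<metastore-uuid>.
# Illustrative check only, not an official validator.
_SHARING_ID = re.compile(
    r"^(?P<cloud>[a-z]+):(?P<region>[a-z0-9-]+):"
    r"(?P<metastore>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$"
)


@dataclass(frozen=True)
class SharingPlan:
    mode: str            # "D2D" or "open"
    shareable: tuple     # asset types the lesson says this mode can carry
    identified_by: str   # how the provider identifies the recipient


def plan_share(recipient_has_unity_catalog: bool) -> SharingPlan:
    """Apply the lesson's one decision: is the recipient on Unity Catalog?"""
    if recipient_has_unity_catalog:
        return SharingPlan(
            "D2D", ("tables", "notebooks", "volumes", "models"), "sharing identifier"
        )
    return SharingPlan("open", ("tables",), "credential/token file")


def parse_sharing_identifier(value: str) -> dict:
    """Split a sharing identifier into cloud, region, and metastore UUID."""
    match = _SHARING_ID.match(value.strip())
    if match is None:
        raise ValueError(f"not in <cloud>:<region>:<metastore-uuid> form: {value!r}")
    return match.groupdict()


# Partner A (Databricks + UC) versus Partner B (pandas / Power BI).
print(plan_share(True))
print(plan_share(False))
# Made-up identifier, for illustration only.
print(parse_sharing_identifier("aws:us-west-2:19a84bee-54bc-43a2-87de-023d0ec16016"))
```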
The dials (skim now; return when a question needs one)
◆ Creating and populating a share
You must be a metastore admin or hold the CREATE SHARE privilege (for the UC privilege model, see [Unity Catalog privileges — the three-level traversal and delegation](/lessons/s7-uc-privileges/)). Define a share with CREATE SHARE …, then add objects to it with ALTER SHARE … ADD TABLE ….
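As a rough sketch of that two-step flow, the helper below only builds the SQL text, so it can be read and checked without a workspace. The function names and the table and recipient names are hypothetical. The recipient step (`CREATE RECIPIENT … USING ID` followed by `GRANT SELECT ON SHARE`) goes beyond what this lesson covers and is included only to show where the sharing identifier is used.

```python
import re

# Plain or dotted names only (for example catalog.schema.table).
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def _checked(name: str) -> str:
    """Return name unchanged, or raise if it is not a plain dotted identifier."""
    if not _NAME.match(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def share_statements(share: str, tables: list[str]) -> list[str]:
    """CREATE SHARE, then one ALTER SHARE ... ADD TABLE per table."""
    share = _checked(share)
    statements = [f"CREATE SHARE IF NOT EXISTS {share}"]
    statements += [f"ALTER SHARE {share} ADD TABLE {_checked(t)}" for t in tables]
    return statements


def d2d_recipient_statements(share: str, recipient: str, sharing_identifier: str) -> list[str]:
    """Beyond the lesson text: register a D2D recipient and grant it read access."""
    if "'" in sharing_identifier:
        raise ValueError("sharing identifier must not contain quotes")
    return [
        f"CREATE RECIPIENT IF NOT EXISTS {_checked(recipient)} USING ID '{sharing_identifier}'",
        f"GRANT SELECT ON SHARE {_checked(share)} TO RECIPIENT {_checked(recipient)}",
    ]


# Hypothetical names. In a Databricks notebook each statement could be
# passed to spark.sql(stmt); here they are only printed.
for stmt in share_statements("sales_share", ["main.sales.products", "main.sales.orders"]):
    print(stmt)
```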
◆ WITH HISTORY — time travel, streaming, CDF, and read speed
To let a D2D recipient do time travel, streaming reads, or read the Change Data Feed on a shared table — and for better read performance — share it WITH HISTORY:
ALTER SHARE sales_share ADD TABLE products WITH HISTORY;
It shares the table's version history so the recipient can query past versions and stream. Tell: "recipient needs time travel / streaming / CDF" → WITH HISTORY. And for optimal read performance specifically, the tested combo is WITH HISTORY + enable CDF + no partitioning on the shared table.
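A hedged sketch of both sides of that combo, again as SQL text built in Python. The provider side enables the Change Data Feed through the `delta.enableChangeDataFeed` table property and then adds the table WITH HISTORY. The recipient side shows a time-travel query and a `table_changes` query. The function names and the recipient's catalog name are hypothetical, and the sketch assumes the table is not already in the share.

```python
def provider_history_statements(share: str, table: str) -> list[str]:
    """Enable CDF on the source table, then share it WITH HISTORY."""
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES (delta.enableChangeDataFeed = true)",
        f"ALTER SHARE {share} ADD TABLE {table} WITH HISTORY",
    ]


def recipient_history_queries(table: str, version: int, cdf_start_version: int) -> dict[str, str]:
    """Queries a D2D recipient could run once history is shared."""
    return {
        "time_travel": f"SELECT * FROM {table} VERSION AS OF {int(version)}",
        "change_feed": f"SELECT * FROM table_changes('{table}', {int(cdf_start_version)})",
    }


# Provider side, using the lesson's share and table names.
for stmt in provider_history_statements("sales_share", "products"):
    print(stmt)

# Recipient side; "partner_catalog.sales.products" is a hypothetical name for
# the shared table as it appears in the recipient's own catalog.
for label, query in recipient_history_queries("partner_catalog.sales.products", 3, 1).items():
    print(f"{label}: {query}")
```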
◆ How open sharing reaches the bytes (the mechanism)
An open-sharing recipient reading a table without history still reads straight from your storage — via temporary, scoped-down security credentials from the cloud storage, restricted to the root directory of the shared Delta table. So they get short-lived, least-privilege access to exactly that table's files, nothing else.
◆ Egress cost — and the R2 trick
The recipient pulls bytes directly from the provider's storage, so a cross-cloud or cross-region share incurs egress fees (the provider's cloud charges outbound transfer). Tested mitigation: store the shared dataset in Cloudflare R2, which charges zero egress, before sharing widely across AWS/Azure/GCP. Tell: "minimize/eliminate egress cost sharing across clouds" → R2.
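The cost logic is simple arithmetic: bytes leaving the provider's storage times the per-GB egress rate. A back-of-the-envelope sketch follows. The dataset size, the read count, and the 0.09 USD/GB rate are made-up inputs for illustration, not quoted prices, and `monthly_egress_usd` is a hypothetical helper.

```python
def monthly_egress_usd(dataset_gb: float, full_reads_per_month: int, usd_per_gb: float) -> float:
    """Egress bill if every read pulls the whole dataset out of the provider's storage."""
    return dataset_gb * full_reads_per_month * usd_per_gb


ASSUMED_CLOUD_RATE = 0.09  # hypothetical USD per GB; check the provider's price list
R2_EGRESS_RATE = 0.0       # the lesson's premise: R2 charges no egress

dataset_gb, reads = 500, 20  # made-up workload
print(f"cross-cloud: ${monthly_egress_usd(dataset_gb, reads, ASSUMED_CLOUD_RATE):,.2f}")
print(f"via R2:      ${monthly_egress_usd(dataset_gb, reads, R2_EGRESS_RATE):,.2f}")
```

The sketch covers egress only; storage and request charges still apply wherever the data lives.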
◆ Interop cousin — Delta UniForm
Adjacent to sharing: Delta UniForm makes a Delta table readable by Iceberg (and Hudi) tools by generating their metadata alongside the Delta table, so an Iceberg-only tool reads your Delta table with no copy or conversion. Tell: "let external Iceberg tools read this Delta table" → enable UniForm with Iceberg as the target format.
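For reference, UniForm is switched on through table properties. The sketch below builds an `ALTER TABLE … SET TBLPROPERTIES` statement with the properties commonly documented for the Iceberg target. Treat the exact set as an assumption, since prerequisites vary with the runtime and Delta version. `enable_uniform_statement` and the table name are hypothetical.

```python
# Properties commonly documented for UniForm with an Iceberg target.
# Assumption: the exact set required depends on the runtime / Delta version.
UNIFORM_ICEBERG_PROPERTIES = {
    "delta.columnMapping.mode": "name",
    "delta.enableIcebergCompatV2": "true",
    "delta.universalFormat.enabledFormats": "iceberg",
}


def enable_uniform_statement(table: str) -> str:
    """Build the ALTER TABLE statement that sets the UniForm properties."""
    props = ", ".join(f"'{key}' = '{value}'" for key, value in UNIFORM_ICEBERG_PROPERTIES.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({props})"


# Hypothetical table name.
print(enable_uniform_statement("main.sales.products"))
```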
◆ Name the collision (with the next lesson)
- Delta Sharing: share your data out, in place (no copy).
- Lakehouse Federation ([Lakehouse Federation — query external data in place](/lessons/s4-federation/)): query others' data in, in place (no copy).
Same "no copy" principle, opposite directions.
Takeaways (rebuild it from these)
- Delta Sharing = open protocol, live data, no copy, UC-governed. Decision axis: who is the recipient.
- D2D (recipient on UC) shares tables + notebooks/volumes/models, via the recipient's sharing identifier (their metastore ref), no token. Open/D2O reaches non-Databricks tools but is Delta tables only (token file).
- Creating shares needs metastore admin / `CREATE SHARE`; add tables with `ALTER SHARE … ADD TABLE`. `WITH HISTORY` enables time travel / streaming / CDF (best read perf = `WITH HISTORY` + CDF + no partitioning). Open sharing serves bytes via temporary scoped credentials to the table's root dir.
- Cross-cloud/region = egress cost; Cloudflare R2 (zero egress) avoids it. Delta UniForm = let Iceberg/Hudi tools read a Delta table.
Before you move on — say these without scrolling up
- The one decision that drives every Delta Sharing question.
- Partner is on UC vs not — what can you share to each, and how is the recipient identified?
- Recipient needs streaming + CDF + time travel on a shared table — what clause, and what else helps read speed?
- Sharing across clouds runs up cost — what is it, and the fix?
Next: flip the direction — querying external data into Databricks without copying it → [Lakehouse Federation — query external data in place](/lessons/s4-federation/).