Governance isn't only who can touch the data — it's people being able to find and understand it. A catalog with thousands of undocumented tables is technically governed and practically useless. This lesson is the toolkit for that second half: document it (comments), classify it (tags), and inspect it (DESCRIBE).
The spine
Beat 1 — the anchor: describe, comment, tag
Anchor. Discoverability metadata is three moves — comments (human descriptions on tables/columns), tags (queryable key–value labels for classification, e.g. PII), and the inspection commands that read it all back. Every question here is one of those three.
Beat 2 — DESCRIBE EXTENDED shows everything at once
Predict: you need one command to confirm a table's column comments, a contains_pii property, and a CHECK constraint. Which?
…
DESCRIBE EXTENDED (= DESCRIBE TABLE EXTENDED) — it returns the complete picture in one output. The narrower commands each show only a slice:
| Command | Shows | Misses |
|---|---|---|
| DESCRIBE EXTENDED | columns + comments, table comment, properties, constraints | nothing (the complete view) |
| SHOW TBLPROPERTIES | just the properties | every comment |
| DESCRIBE DETAIL | file/format/location details | the column list and column comments |
| DESCRIBE HISTORY | the version log ([How Delta Lake works — the transaction log](/lessons/f2-delta-transaction-log/)) | schema annotations |
Lock it. "Confirm comments + properties + constraints together" → DESCRIBE EXTENDED. The others are slices.
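A minimal sketch of that comparison, assuming a hypothetical Delta table main.finance.payments with an amount column. Only the contains_pii property comes from the question above; the catalog, schema, column, and constraint names are placeholders, and the exact output layout depends on the runtime version.

```sql
-- Hypothetical setup: table, column, and constraint names are illustrative.
ALTER TABLE main.finance.payments SET TBLPROPERTIES ('contains_pii' = 'true');
ALTER TABLE main.finance.payments ADD CONSTRAINT amount_positive CHECK (amount > 0);
ALTER TABLE main.finance.payments ALTER COLUMN amount COMMENT 'Settled amount in the account currency';

-- One command to review columns, comments, properties, and constraints together.
DESCRIBE EXTENDED main.finance.payments;

-- The narrower views, for comparison.
SHOW TBLPROPERTIES main.finance.payments;  -- key/value properties only
DESCRIBE DETAIL main.finance.payments;     -- format, location, size, and other table-level details
DESCRIBE HISTORY main.finance.payments;    -- one row per table version
```

Running the four commands against the same table is a quick way to see which slice each one returns.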
The dials (skim now; return when a question needs one)
◆ Comments — and AI-generated comments at scale
Document a table/column with COMMENT (CREATE TABLE payments … COMMENT 'settled payments', or column comments). Writing hundreds by hand is the bottleneck — so Catalog Explorer offers AI-generated comments (the "AI Generate" option): an LLM inspects column names, types, and sample values and drafts descriptions you review and accept. Tell: "improve discoverability across hundreds of tables with minimal manual effort" → AI-generated comments in Catalog Explorer.
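As a rough illustration of the manual route, the statements below set a table comment and column comments at creation time and then revise them afterwards. Only the payments table name and the 'settled payments' comment come from the text; the columns and the other comment strings are made up for the example.

```sql
-- Hypothetical columns; only the table name and table comment come from the text.
CREATE TABLE payments (
  payment_id BIGINT COMMENT 'Unique identifier for the payment',
  amount DECIMAL(12, 2) COMMENT 'Settled amount in the account currency'
)
COMMENT 'settled payments';

-- Add or replace the table comment later.
COMMENT ON TABLE payments IS 'Settled payments, one row per payment';

-- Add or replace a single column comment later.
ALTER TABLE payments ALTER COLUMN amount COMMENT 'Settled amount, net of refunds';
```

Multiply this by hundreds of tables and the appeal of reviewing AI-drafted comments instead of writing each one becomes clear.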
◆ Tags — classification you can query
Tags are key–value labels for governance classification (mark PII, domain, sensitivity). Syntax for multiple tags:
ALTER TABLE t SET TAGS ('key1' = 'value1', 'key2' = 'value2');
Plural TAGS with parentheses for a set of key = value pairs (a single tag can also be set). It's programmatic (works inside an automated ETL step) and queryable, so "all PII tables" becomes a metadata query.
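The sketch below shows both halves of that claim, assuming a Unity Catalog workspace where the system.information_schema.table_tags view is available. The table main.crm.customers, its email column, and the tag keys are hypothetical.

```sql
-- Hypothetical table, column, and tag names.
ALTER TABLE main.crm.customers SET TAGS ('pii' = 'true', 'domain' = 'crm');

-- Tags can also be attached to an individual column.
ALTER TABLE main.crm.customers ALTER COLUMN email SET TAGS ('pii' = 'true');

-- Remove a tag by key.
ALTER TABLE main.crm.customers UNSET TAGS ('domain');

-- "All PII tables" as a metadata query.
SELECT catalog_name, schema_name, table_name
FROM system.information_schema.table_tags
WHERE tag_name = 'pii' AND tag_value = 'true';
```

Because these are ordinary SQL statements, the SET TAGS step can run at the end of a pipeline task, and the SELECT can feed an audit report.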
◆ Two more governance facts
- ALTER TABLE … RENAME TO updates the metastore reference only: the underlying data files don't move or rewrite (the catalog-vs-files split from [How Delta Lake works — the transaction log](/lessons/f2-delta-transaction-log/)). Tell: "rename a table: what happens to the data?" → only the metastore pointer changes.
- Govern policy centrally, once. When different teams hand-roll their own masking on the same columns, the fix is a single governed UDF applied centrally as a column mask ([Row filters and column masks — access control inside a table](/lessons/s7-row-col-masks/)): one source of truth that prevents inconsistent exposure. Governance favours "define once, apply everywhere."
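Both facts can be sketched in a few statements. Everything named here is an assumption for illustration: the tables, the main.governance schema, the mask_email function, and the pii_readers group. The masking logic is deliberately simplistic.

```sql
-- Rename: a catalog-level change (hypothetical names).
ALTER TABLE main.finance.payments RENAME TO main.finance.settled_payments;

-- One governed masking function, owned centrally.
CREATE OR REPLACE FUNCTION main.governance.mask_email(email STRING)
RETURNS STRING
RETURN CASE
  WHEN is_account_group_member('pii_readers') THEN email
  ELSE '***'
END;

-- Apply the same function wherever the column appears.
ALTER TABLE main.crm.customers ALTER COLUMN email SET MASK main.governance.mask_email;
ALTER TABLE main.sales.leads ALTER COLUMN contact_email SET MASK main.governance.mask_email;
```

Changing the rule later means editing one function rather than chasing down each team's copy.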
Takeaways (rebuild it from these)
- Discoverability = comments (describe) + tags (classify) + DESCRIBE (inspect).
- DESCRIBE EXTENDED (= DESCRIBE TABLE EXTENDED) shows column + table comments, properties, and constraints together; SHOW TBLPROPERTIES / DESCRIBE DETAIL / DESCRIBE HISTORY each show only a slice.
- AI-generated comments (Catalog Explorer "AI Generate") draft descriptions across many tables: the scale answer for documentation.
- ALTER TABLE t SET TAGS ('k'='v', …): programmatic, queryable classification (e.g. PII).
- RENAME changes only the metastore pointer (files untouched); govern shared policy with one central UDF, not per-team copies.
Before you move on — say these without scrolling up
- One command to confirm comments + a property + a constraint together — which, and why not the others?
- Document hundreds of tables with minimal effort — what feature?
- The multi-tag syntax, and two things tags let you do (automated + queryable).
- RENAME a table: what happens to the data files?
That completes Section 8's governance story: how grants cascade (inheritance) → make the governed data discoverable (metadata).