Use system tables (billing usage, audit access, query history) to build observability, cost-attribution, and compliance reporting with SQL.

The Spark UI forgets everything when the cluster stops; run history expires in 60 days. So how does a platform lead answer "what did each team spend last quarter" or an auditor's "who read the customer table in the last 90 days"? Those need a surface that is durable, account-wide, and queryable with plain SQL — system tables. If the last lesson was the map, this is its one durable continent.

The spine

Beat 1 — the anchor: monitoring becomes a `SELECT`

Anchor. System tables are Databricks-managed Delta tables, in the system catalog, that record what happens across your whole account — cost, access, query execution, job runs — queried with ordinary SQL. No agent, no export, no extra infra. The skill is knowing which table holds your answer and which columns to group by.

Layout is system.<schema>.<table>, three-part names, read-only (Databricks writes; you query):

Table	Holds	Classic use
`system.billing.usage`	one row per workload's usage, in DBUs	cost attribution / chargeback
`system.billing.list_prices`	price per SKU over time	convert DBUs → dollars
`system.access.audit`	every permission-checked action (who/what/when)	compliance, security
`system.query.history`	every query's text, duration, rows	find slow/expensive queries
*`system.compute.` / `system.lakeflow.`*	cluster, warehouse, job metadata	correlate cost to compute/jobs

(Some schemas must be enabled by an account admin — treat "is it enabled" as a prerequisite.)

Beat 2 — `system.billing.usage`, and the #1 trap

The exam leans on this table. Its columns:

usage_date — the day (group by for daily reports).
sku_name — what kind of compute (all-purpose / jobs / SQL-warehouse / serverless — the menu from [Job & environment configuration — compute and Spark tuning](/lessons/s1-job-env-config/)).
usage_quantity — how much, in DBUs (a Databricks Unit — the normalized billing unit).
identity_metadata.run_as — who ran it (user or service principal).

Predict: a query sums usage_quantity per team and labels the result "cost in dollars." What's wrong?

…

usage_quantity is in DBUs, not money. To get dollars you join list_prices on sku_name and multiply. A "cost" question whose query sums usage_quantity alone is reporting DBUs — the correct answer includes the price join. (That's the single most-tested trap here.)

The attribution pattern (both exam questions are this):

SELECT usage_date,
       identity_metadata.run_as AS run_as,   -- who
       sku_name,                             -- which compute
       SUM(usage_quantity)      AS dbus      -- how much (DBUs)
FROM system.billing.usage
WHERE usage_date >= DATE_SUB(CURRENT_DATE, 30)
GROUP BY usage_date, run_as, sku_name
ORDER BY usage_date DESC, dbus DESC;

GROUP BY run_as, sku_name per day = per-user, per-compute consumption → build a dashboard, schedule a daily refresh, share with execs. (For dollars, join list_prices.)

Lock it. billing.usage: usage_date / sku_name / usage_quantity (DBUs) / run_as. Attribution = GROUP BY run_as, sku_name. DBUs ≠ dollars → join list_prices.

The dials (skim now; return when a question needs one)

◆ The other two you'll be asked about

system.access.audit — the authoritative who-did-what: timestamp, user_identity, action (SELECT/INSERT/…), object. The answer to "auditor wants who accessed/modified table X" and "which table is queried most." It's the account-wide audit log from the collision (distinct from cluster/pipeline event logs), tied to governance in [Access control — least privilege and the object permission ladders](/lessons/s7-access-control/).
system.query.history — every query's text, duration, rows scanned. The durable, cross-warehouse cousin of the Query Profile: Query Profile explains one query interactively ([Reading the evidence — Query Profile & Spark UI](/lessons/s6-query-profile/)); query.history finds the slowest queries of the week with a SELECT.

(Note: a full cost+usage dashboard often draws on both billing and access — usage for spend, audit for who/what.)

◆ Gotchas

DBUs ≠ dollars — join list_prices (the #1 trap).
System tables have their own retention — copy out for very long compliance windows.
Querying them uses compute — schedule/materialize heavy monitoring queries rather than rescanning constantly.

Takeaways (rebuild it from these)

System tables = Databricks-managed Delta tables in system, account-wide, durable, read-only, queried with SQL. Monitoring becomes a SELECT.
system.billing.usage: usage_date, sku_name, usage_quantity in DBUs, identity_metadata.run_as. Attribution = GROUP BY run_as, sku_name per day → dashboard + daily refresh.
DBUs are not dollars — join system.billing.list_prices on sku_name.
system.access.audit = who-did-what (compliance, most-accessed table); system.query.history = durable per-query record (slowest queries).
Watch retention (copy out for long compliance) and cost (queries use compute); some schemas need admin enablement.

Before you move on — say these without scrolling up

Where does durable, account-wide activity live, and how do you read it?
A "cost per team" query sums usage_quantity — why is that not cost, and the fix?
Which columns give you per-user, per-compute attribution?
"Who accessed table X for the auditor" vs "the slowest queries this week" — which system table each?

Next: once a monitoring query returns a number, how do you get told when it crosses a line — SQL Alerts → [SQL Alerts — the single-value rule that makes or breaks them](/lessons/s5-sql-alerts/).

System tables — the account's durable, queryable memory