You've built the monitoring queries — cost per user from system tables, failed-records from the pipeline event log, null-rate on a table. But nobody watches a dashboard at 3 a.m. A SQL Alert runs a query on a schedule and notifies you when the result crosses a line. Simple idea — with exactly one sharp edge that every exam question here pokes at.
The spine
Beat 1 — the anchor: the alert only compares; the query must think
Predict: you want to be paged when "missing-email % > 15% AND invalid-phone % > 10%." Can you just hand the alert those two columns and an AND?
…
No — and that "no" is the whole lesson:
Anchor. A SQL Alert evaluates a single value — one field from the query result — against a threshold, and notifies when true. It cannot natively "trigger when A > x AND B > y." All the intelligence lives in the SQL query: write a query whose one returned field is true exactly when you want to be paged. The alert only compares; the query must think.
Internalize the inversion and every SQL-alert question becomes: how do I make the query emit a single field meaning "alert now"?
Beat 2 — the consequence: collapse conditions into one field
The most-tested move. Two conditions → one field, two clean ways:
1. A CASE returning 1/0 — evaluate the boolean in SQL, alert on = 1:
SELECT CASE
WHEN new_tickets > 200 AND under_processing > 150 THEN 1 ELSE 0
END AS should_alert
FROM ( SELECT SUM(CASE WHEN status='new' THEN 1 ELSE 0 END) AS new_tickets,
SUM(CASE WHEN status='in_progress' THEN 1 ELSE 0 END) AS under_processing
FROM tickets );
-- Alert condition: should_alert = 1
Both thresholds live inside the CASE; the alert checks one scalar. (The call-center/ticket question and its siblings.)
2. Compute the metrics as aggregate columns and point the alert's evaluated field at the one that matters. Same principle — query does the arithmetic, alert reads one number.
The wrong answers always return multiple columns/rows and expect the alert to AND them — it can't.
Lock it. Multiple conditions →
CASE WHEN c1 AND c2 THEN 1 ELSE 0 END, alert on= 1(or a single aggregate). One field carries the verdict.
The dials (skim now; return when a question needs one)
◆ Grouped queries — query decides what, alert config decides how often
When the query groups (e.g. mean(temperature) > 120 per sensor_id), the alert evaluates its configured column against the threshold across the returned rows — firing per group as framed. If it sends three consecutive notifications, that's the alert re-evaluating on its schedule and re-notifying while the condition holds — cadence is an alert setting, not query logic. Keep straight: the query decides what is compared; the alert schedule/notification config decides how often you hear it. (So "average temp for at least one sensor exceeded 120 on three consecutive executions" = the query's per-sensor condition held across three scheduled runs.)
◆ Temporal conditions — push the time logic into the query
"Alert when the daily average of percent_null exceeds 5% for three consecutive days, notify once":
- Daily average + three-consecutive-days = data logic → in the query (
GROUP BY DATE_TRUNC('DAY', window.end)for the daily average, then window/self-join to check the run of days), emitting one true/false field. - Notify once = an alert configuration choice.
That percent_null typically comes from a Lakehouse Monitoring profile-metrics table — Databricks auto-profiles a table and writes metrics (percent_null, drift) to a metrics table on a schedule, which you then alert on. The standard "monitor a column's null rate over time" pipeline.
◆ The current model — Alerts v2 (freshness)
- Alerts v2 define an evaluation: a
source(which column to evaluate), acomparison_operator(GREATER_THAN,EQUAL, …), athreshold(a literal or another column). States:OK/TRIGGERED/ERROR— legacyUNKNOWNis gone. - Legacy alerts framed it as "trigger when a value passes a threshold / when any row is returned." Single-value principle identical; only the config surface and state names changed.
Channels (email, Slack, PagerDuty, webhooks), schedule, destinations are configuration. And tune thresholds / "notify once" to avoid alert fatigue — an ignored alert is worse than none.
Takeaways (rebuild it from these)
- A SQL Alert compares one field to a threshold — the alert only compares; the query must do the thinking.
- Multiple conditions → collapse into one field:
CASE WHEN c1 AND c2 THEN 1 ELSE 0 END, alert on= 1. Returning many columns for the alert to AND is the classic wrong answer. - Grouped query → alert evaluates the chosen column per result; notification cadence ("three consecutive", "notify once") is an alert setting, not query logic.
- Temporal conditions → time logic in the query (
GROUP BY DATE_TRUNC, window/self-join); often fed by Lakehouse Monitoring profile-metrics (percent_null, drift). - Alerts v2 =
source+comparison_operator+threshold; statesOK/TRIGGERED/ERROR(noUNKNOWN).
Before you move on — say these without scrolling up
- Why can't an alert do "A > x AND B > y" directly — and what do you do instead?
- A grouped alert fires three times — is that the query or the alert config?
- "Daily avg over 3 consecutive days, notify once" — which part is query, which is alert setting?
- Alerts v2 evaluation — the three pieces, and the states.
Next: the operational side — driving jobs via REST/CLI, notifications, and retry settings → [Operational job monitoring — REST/CLI, notifications, retry policy](/lessons/s5-job-monitoring/).