SELECT is blunt: grant it and the user sees every row and every column value; withhold it and they see nothing. But real policy is finer — "sales reps see only their region," "everyone but HR sees a masked SSN."
Predict: you could keep a separately-filtered copy of the table per group. Why is that a bad idea?
…
Data duplication nightmare — N copies to keep in sync. Unity Catalog does it in place, at query time, with two native tools.
The spine
Beat 1 — the anchor: a SQL UDF attached to the table, run per caller
Anchor. Both tools are the same mechanism: write a SQL UDF, then attach it to the table. On every query Databricks runs it as the caller's identity/group, transparently — a row filter returns a boolean (is this row visible?); a column mask returns the (possibly redacted) value for a column. No duplicated tables; the user's SQL is unchanged — results just differ by who runs it.
The most useful tell is the return type: row filter → boolean ("show this row?"); column mask → the column's value ("what should this cell read?").
Beat 2 — the two, side by side
Row filter — UDF returns BOOLEAN, attach with SET ROW FILTER … ON (cols):
CREATE FUNCTION region_filter(region STRING)
RETURN IF(IS_ACCOUNT_GROUP_MEMBER('admin'), true, region = 'US');
ALTER TABLE sales SET ROW FILTER region_filter ON (region);
Admins get true for every row; everyone else sees only region = 'US'. Rows returning false vanish from every query — no user WHERE needed. Columns in ON (…) are passed as the function's arguments.
Column mask — UDF receives the value, returns masked-or-not, attach with ALTER COLUMN … SET MASK:
CREATE FUNCTION ssn_mask(ssn STRING)
RETURN CASE WHEN IS_ACCOUNT_GROUP_MEMBER('HRAdmin') THEN ssn ELSE 'XXX-XX-XXXX' END;
ALTER TABLE users ALTER COLUMN ssn SET MASK ssn_mask;
HRAdmin sees the real SSN; others see the literal. A mask can even be dynamic against a mapping table — the UDF looks the user up in a group_access table to decide what to reveal.
They compose: "sales see only their region and non-admins see masked email" = a row filter on region + a column mask on email, both on the same table.
Lock it. Filter → boolean (which rows); mask → value (which cell contents). Both = a UDF attached to the table, evaluated per caller. Branch on
IS_ACCOUNT_GROUP_MEMBER.
The dials (skim now; return when a question needs one)
◆ The older alternative — dynamic views (and the collision)
Before native filters/masks, a dynamic view did the same with CASE + a group function:
CREATE VIEW students_vw AS
SELECT * FROM students
WHERE CASE WHEN is_member('instructors') THEN TRUE
ELSE is_active IS FALSE END;
Read that carefully (a favourite question): instructors get TRUE for every row → all students; non-members fall to is_active IS FALSE → only inactive students. (A column version wraps each column in CASE WHEN is_member(...) THEN col ELSE 'REDACTED' END.)
Collision — two ways to do fine-grained access:
- Native row filter / column mask — attached to the table; protection follows the table, transparent to all queries. Current, recommended.
- Dynamic view — a separate view object; users must query the view, and the base table must still be locked down (or they bypass it). Older, still valid.
◆ The group-function collision
IS_ACCOUNT_GROUP_MEMBER('g') = account-level membership (preferred for UC filters/masks); is_member('g') = workspace-level (the classic dynamic-view function). Same shape, different scope — recognise which the question uses.
◆ The at-rest caveat (sets up the next lesson)
Masks are query-time, not at-rest. They change what a query returns, but the files still hold the real values — anyone with direct object-storage access reads them unmasked. So masks pair with access control; where the stored value must be protected, you transform it (hashing/pseudonymization) — the next lesson.
Takeaways (rebuild it from these)
- Row filters/column masks = a SQL UDF attached to the table, run per query as the caller. Filter → boolean; mask → the column value. They compose.
- Row filter:
RETURN <boolean>+SET ROW FILTER fn ON (cols). Column mask:RETURN <value>+ALTER COLUMN c SET MASK fn. - Branch on
IS_ACCOUNT_GROUP_MEMBER(account, preferred) vsis_member(workspace, dynamic-view classic). - Dynamic view = older alternative: separate view +
CASE/group-function; users must query the view and the base table must be locked down. Native filters/masks live on the table (transparent). - Masks are query-time, not at-rest — protect stored values via hashing/pseudonymization (
[Anonymization — hashing, pseudonymization, and protecting values at rest](/lessons/s7-anonymization/)).
Before you move on — say these without scrolling up
- Why not just keep a filtered copy per group — and what does UC do instead?
- Row filter vs column mask — what does each UDF return?
- Native filters/masks vs dynamic views — where does each live, and which is recommended?
- Why isn't a column mask enough to protect data at rest?
Next: protecting the values themselves — hashing, tokenization, and the SHA-2 details the exam checks → [Anonymization — hashing, pseudonymization, and protecting values at rest](/lessons/s7-anonymization/).