[Python project structure for Databricks Asset Bundles](/lessons/s1-dabs-project/) gave the project structure; this lesson is the deploy. You built a job by clicking through the Jobs UI in dev. Now you need the exact same job — tasks, schedule, cluster — in staging and prod, versioned so every change is reviewed and reversible.
Predict: the naive way is to rebuild it by hand in three workspaces. What three things does that cost you?
…
Slow, error-prone, and no history (no review, no rollback). That's the problem bundles solve.
The spine
Beat 1 — the anchor: resources as YAML, deployed to any target
Anchor. A bundle declares your Databricks resources — jobs, pipelines, apps — and their config as YAML in your repo, and deploys that single definition to any target workspace. Infrastructure-as-code for Databricks: the source of truth is a file in git, not clicks in a workspace. (Current name Declarative Automation Bundles; the exam says Databricks Asset Bundles — same thing, acronym still DAB.)
The `databricks.yml` at the root defines three things: the bundle (name and settings), its resources (jobs, pipelines, apps, volumes), and its targets (named environments: dev, staging, prod). A resource can even carry a permissions mapping, so ACLs are code too (see [Access control — least privilege and the object permission ladders](/lessons/s7-access-control/)). Resources say what exists; targets say where and how it deploys.
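To make the three top-level mappings concrete, here is a minimal sketch of a `databricks.yml`. The bundle name, job key, notebook path, group name, and workspace hosts are all hypothetical placeholders, and compute settings are left out for brevity, so treat it as an illustration of the shape rather than a deployable file.

```yaml
# databricks.yml (illustrative sketch; every name and URL below is a placeholder)
bundle:
  name: my_bundle                      # hypothetical bundle name

resources:                             # WHAT exists
  jobs:
    nightly_etl:                       # hypothetical resource key
      name: nightly-etl
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ./src/etl.py   # hypothetical path inside the repo
      permissions:                     # ACLs as code
        - level: CAN_VIEW
          group_name: analysts         # hypothetical group

targets:                               # WHERE it deploys
  dev:
    default: true
    workspace:
      host: https://dev.example.cloud.databricks.com    # placeholder host
  prod:
    workspace:
      host: https://prod.example.cloud.databricks.com   # placeholder host
```

The same `nightly_etl` definition is reused by every target; only the `targets` entry chosen at deploy time changes where it lands.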
Beat 2 — the lifecycle, and the CI/CD trap
| Command | Does | Note |
|---|---|---|
| `databricks bundle init` | scaffold from a template | one-time only |
| `databricks bundle validate` | check config is well-formed | before every deploy |
| `databricks bundle deploy -t <target>` | push resources to the target | |
| `databricks bundle run <key> -t <target>` | run a bundle job/pipeline in that target | |
Predict: which of these four is not part of a CI/CD pipeline?
…
init — the bundle already lives in the repo, and re-running init would overwrite/reset it. So CI/CD = validate → deploy → run (never init).
Lock it. Bundle = resources + config as YAML, deployed per target (`-t` selects the environment; dev mode isolates resources under a per-user name prefix and pauses schedules, prod deploys as-is). CI/CD sequence = validate → deploy → run.
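The dev/prod difference is set per target with the `mode` key. The fragment below is a sketch of just the `targets` mapping; the target names are the ones used in this lesson, and production mode may apply extra validation that depends on your CLI version, so check the behavior against your own setup.

```yaml
# targets fragment of databricks.yml (sketch; other target settings omitted)
targets:
  dev:
    mode: development   # isolates: per-user name prefix on resources; pauses schedules and triggers
    default: true       # used when no -t flag is given
  staging:
    mode: production
  prod:
    mode: production    # resources are deployed as declared
```

With this in place, `databricks bundle deploy -t prod` and `databricks bundle deploy -t dev` push the same resource definitions but with different deployment behavior.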
The dials (skim now; return when a question needs one)
◆ Adopt an EXISTING job — generate, then bind
The scenario that trips people: a production job built in the UI, and you want to manage it as code without recreating it or making a duplicate. Two steps:
1. `databricks bundle generate job --existing-job-id <id>`: capture the live job's config into YAML (and download referenced files) into your bundle.
2. `databricks bundle deployment bind <bundle_job> <remote-job-id>`: link the bundle resource to the existing remote job by id, so future deploys update the real job instead of creating a second one.
Tell: "adopt an existing job into a bundle without losing/duplicating it" → generate, then bind.
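As a rough picture of what the generate step leaves in the bundle, the sketch below shows a job resource file. The file name, resource key, job name, and notebook path are hypothetical, and the real output depends on the job being captured.

```yaml
# resources/nightly_etl.job.yml (hypothetical file name; generate chooses the actual one)
resources:
  jobs:
    nightly_etl:              # resource key: this is what <bundle_job> refers to in bind
      name: nightly-etl       # display name taken from the live job (placeholder here)
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ../src/etl.py   # hypothetical path to a downloaded file
```

Under these assumptions the bind call would read `databricks bundle deployment bind nightly_etl <remote-job-id>`, tying the `nightly_etl` key to the job that already exists in the workspace.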
◆ CI/CD identity, and where bundles sit
In an automated pipeline (GitHub Actions), authenticate as a service principal via OAuth token federation — short-lived, no stored long-lived secret — not a PAT ([Secrets — storing credentials, redaction, and scope ACLs](/lessons/s7-secrets/)). Bundles are the recommended CI/CD path; the lighter alternative deploys code only — a production Git folder ([Git Folders & CI/CD — version control inside the workspace](/lessons/s9-git-cicd/)). Bundles manage resources; a Git folder just syncs files.
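A GitHub Actions workflow that follows the validate → deploy → run sequence with federated auth might look like the sketch below. The workflow name, branch, job key, and repository variables are hypothetical, and the `DATABRICKS_AUTH_TYPE: github-oidc` setting and the `databricks/setup-cli` action reflect the Databricks documentation as I understand it, so verify them against the current docs before relying on this.

```yaml
# .github/workflows/deploy-bundle.yml (hypothetical file name)
name: deploy-bundle
on:
  push:
    branches: [main]          # assumption: prod deploys from main

permissions:
  id-token: write             # allows the job to request a GitHub OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      DATABRICKS_AUTH_TYPE: github-oidc                          # token federation, no stored secret
      DATABRICKS_HOST: ${{ vars.DATABRICKS_HOST }}               # hypothetical repository variable
      DATABRICKS_CLIENT_ID: ${{ vars.DATABRICKS_CLIENT_ID }}     # service principal's client id
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate -t prod
      - run: databricks bundle deploy -t prod
      - run: databricks bundle run nightly_etl -t prod           # hypothetical job key
```

Note that `init` never appears: the bundle is already in the repository that `actions/checkout` pulls.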
Takeaways (rebuild it from these)
- A bundle = Databricks resources + config as versioned YAML (`databricks.yml`), deployed to any target. Current name Declarative Automation Bundles; exam says Databricks Asset Bundles.
- Lifecycle: `init` (one-time) → `validate` → `deploy -t` → `run -t`. CI/CD = validate → deploy → run (never `init`).
- Targets = named environments; `-t`/`--target` picks one; dev mode isolates/pauses, prod deploys as-is.
- Adopt an existing job: `bundle generate` (capture to YAML) → `bundle deployment bind` (link by id → deploys update, don't duplicate).
- Bundles carry permissions (ACLs as code) + variables; CI/CD authenticates as a service principal via OAuth.
Before you move on — say these without scrolling up
- What a bundle declares, and what a "target" is.
- The four lifecycle commands — and which is not in CI/CD, and why.
- Adopt a UI-built job into a bundle without duplicating it — the two commands.
- Bundle vs production Git folder — what does each deploy?
Next: the version-control layer underneath — Git Folders, branching, and how a .py file becomes a notebook → [Git Folders & CI/CD — version control inside the workspace](/lessons/s9-git-cicd/).