Use Git Folders for branch-based development, collaboration, and CI/CD; understand the notebook source format and how it enables testing and version control.

Two analysts edit the same Databricks notebook, relying on built-in revision history. It works until one overwrites the other — no branches, no review, no way back. Real teams solved this with Git; Git Folders bring that discipline inside the workspace.

The spine

Beat 1 — the anchor: a Git client in the workspace

Anchor. A Git Folder is Databricks' built-in visual Git client: clone a remote repo into the workspace and develop notebooks/files with the full Git workflow — branches, commits, push/pull, merges, diffs. Version control, isolated collaboration, and the on-ramp to CI/CD. (Git Folders were called Repos — the exam uses both.)

Beat 2 — collaboration = branches, not overwrites

Predict: you can't push straight to main, and you don't want to touch anyone else's work. How do you share your changes?

…

Feature branch → commit → push → open a pull request. Each engineer works on their own branch (often in their own Git folder mapped to the same repo), isolated until merged. Two gotchas:

A branch isn't in the dropdown? Your local Git folder hasn't fetched it — pull from the remote to refresh the branch list.
Merge conflict? Resolve it in the Git Folders UI: manually edit out the `<<<<<<<` / `=======` / `>>>>>>>` conflict markers (or accept incoming/current), then mark resolved.

Lock it. Feature branch → commit → push → PR; rarely push to main. Missing branch → pull; conflicts → resolve in the UI.
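To make "accept incoming/current" concrete, here is a minimal Python sketch of what picking one side of a conflict means for the file's text. It is an illustration only, not how the Git Folders UI is implemented; `resolve_conflicts` and the sample notebook lines are hypothetical, and it assumes the default two-way marker style (no `|||||||` base section).

```python
def resolve_conflicts(text: str, keep: str = "current") -> str:
    """Resolve Git conflict blocks by keeping one side wholesale.

    keep="current"  keeps the lines between <<<<<<< and =======
    keep="incoming" keeps the lines between ======= and >>>>>>>
    """
    if keep not in ("current", "incoming"):
        raise ValueError("keep must be 'current' or 'incoming'")

    resolved = []
    side = None  # None = outside a conflict, else "current" or "incoming"
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            side = "current"           # start of the current (local) side
        elif line.startswith("=======") and side == "current":
            side = "incoming"          # switch to the incoming (remote) side
        elif line.startswith(">>>>>>>") and side == "incoming":
            side = None                # end of the conflict block
        elif side is None or side == keep:
            resolved.append(line)      # shared line, or a line from the kept side
    return "\n".join(resolved)


# Hypothetical conflicted notebook cell
conflicted = """\
df = spark.table("sales")
<<<<<<< HEAD
df = df.filter("region = 'EU'")
=======
df = df.filter("region = 'US'")
>>>>>>> feature/us-filter
display(df)"""

print(resolve_conflicts(conflicted, keep="incoming"))
```

Keeping one side wholesale is the blunt option; when both edits matter, edit the block by hand so the merged text contains both changes and none of the markers.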

The dials (skim now; return when a question needs one)

◆ The magic first line — how a `.py` file is a notebook

Open a Databricks Python notebook as plain text and the first line is `# Databricks notebook source`. That magic comment (written with each language's own comment prefix, e.g. `--` for SQL and `//` for Scala) is what marks a plain source file (.py, .sql, .scala, .r) as a Databricks notebook ("source format"), which is why notebooks live in Git as reviewable, diffable text. Contrast .ipynb (Jupyter): heavier, but it preserves outputs and dashboard/visualization definitions that source format drops. Tells: "what makes a .py a notebook" → the `# Databricks notebook source` first line; "keep dashboards/outputs in git" → .ipynb.
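A small sketch of what source format looks like on disk, and why it diffs cleanly. The header line comes from the text above; the `# COMMAND ----------` cell separator and `# MAGIC` prefix are how Databricks exports Python notebooks. The helper names (`is_source_notebook`, `split_cells`) and the sample notebook are hypothetical.

```python
HEADER = "# Databricks notebook source"
CELL_DELIMITER = "# COMMAND ----------"


def is_source_notebook(text: str) -> bool:
    """True if the file's first line is the Databricks notebook header."""
    lines = text.splitlines()
    return bool(lines) and lines[0].strip() == HEADER


def split_cells(text: str) -> list[str]:
    """Split a source-format Python notebook into its non-empty cells."""
    if not is_source_notebook(text):
        raise ValueError("not a Databricks source-format notebook")
    cells, current = [], []
    for line in text.splitlines()[1:]:      # skip the header line
        if line.strip() == CELL_DELIMITER:
            cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip())  # last cell has no trailing delimiter
    return [cell for cell in cells if cell]


# Hypothetical two-cell notebook: a markdown cell, then a code cell
notebook = """\
# Databricks notebook source
# MAGIC %md
# MAGIC # Daily sales

# COMMAND ----------

total = 2 + 3
print(total)
"""

for i, cell in enumerate(split_cells(notebook), start=1):
    print(f"--- cell {i} ---\n{cell}")
```

Remove the first line and the same file is just an ordinary Python file, which is exactly the distinction the exam "tell" relies on.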

◆ Why this unlocks testing

Recall [Unit & integration testing on Databricks](/lessons/s1-testing/): to run pytest, functions must live in importable .py modules, not be trapped in notebooks. Git Folders let you keep arbitrary .py files alongside notebooks — factor logic into modules, import, unit-test (with the sys.path care from [Python project structure for Databricks Asset Bundles](/lessons/s1-dabs-project/)). Version control + testability together. Tell: "unit test functions in the workspace" → define + test functions in Files in Git Folders/Repos.
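A minimal sketch of the "factor logic into modules, import, unit-test" pattern. In a Git folder the two parts would be separate files; they are shown in one block here so the example runs on its own. The file names (`pricing.py`, `test_pricing.py`) and the `add_tax` function are hypothetical, and a real test file would begin with `from pricing import add_tax`.

```python
# --- pricing.py (hypothetical plain module: no notebook header, so it is importable) ---
def add_tax(amount: float, rate: float = 0.2) -> float:
    """Return amount plus tax, rounded to 2 decimal places."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return round(amount * (1 + rate), 2)


# --- test_pricing.py (hypothetical; pytest collects functions named test_*) ---
def test_add_tax_default_rate():
    assert add_tax(100.0) == 120.0


def test_add_tax_custom_rate():
    assert add_tax(50.0, rate=0.1) == 55.0


def test_add_tax_rejects_negative():
    try:
        add_tax(-1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative amount")
```

The tests use plain `assert` statements, so pytest can discover and run them without any changes, and a notebook in the same Git folder can import `add_tax` rather than redefining it.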

◆ CI/CD — two paths (name the collision)

Declarative Automation Bundles (recommended) — deploy resources (jobs, pipelines) and code as one versioned unit ([Declarative Automation Bundles — deploying Databricks as code](/lessons/s9-dabs-deploy/)). The primary path.
Production Git folder (code-only) — an admin creates a top-level folder cloned to a branch; a GitHub Action on merge (or scheduled job) updates the Git folder to the latest commit. No resource management — just synced files.

Git Folder syncs files/code; a bundle deploys resources. "Deploy just the notebooks" → production Git folder; "deploy the jobs and pipelines as code" → a bundle.
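For the production Git folder path, the "update to the latest commit" step run by a GitHub Action or scheduled job can be sketched as a single call to the Databricks Repos REST API (`PATCH /api/2.0/repos/{repo_id}`). This is a sketch under assumptions, not a definitive pipeline: the host, token, and repo ID below are placeholders, `build_update_request` is a hypothetical helper, and the request is only built and printed, not sent.

```python
import json
import urllib.request


def build_update_request(host: str, token: str, repo_id: int, branch: str) -> urllib.request.Request:
    """Build the request that checks out `branch` at its latest commit in a Git folder."""
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values: substitute your workspace URL, a token, and the Git folder's ID
req = build_update_request(
    host="https://example.cloud.databricks.com",
    token="dapi-REDACTED",
    repo_id=123,
    branch="main",
)
print(req.get_method(), req.full_url)
# To actually send it from CI: urllib.request.urlopen(req)
```

Note what is absent: nothing here creates or changes a job or pipeline. The call only moves files in the workspace, which is why deploying resources needs a bundle instead.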

Takeaways (rebuild it from these)

Git Folder (was Repos) = the workspace's built-in Git client: clone, branch, commit, push/pull, merge, diff.
Collaboration = feature branch → commit → push → PR; rarely push to main. Missing branch → pull; conflicts → resolve in the UI.
`# Databricks notebook source` (first line) marks a plain .py/.sql/… file as a notebook (source format). .ipynb preserves outputs/dashboards.
Git Folders let you keep importable .py modules → makes pytest-style unit testing possible ([Unit & integration testing on Databricks](/lessons/s1-testing/)).
CI/CD: bundles (deploy resources + code) vs a production Git folder (code-only, synced via GitHub Actions/scheduled job). Git folder = files; bundle = resources.

Before you move on — say these without scrolling up

Share your work without pushing to main or touching others — the workflow?
Expected branch missing from the dropdown — what do you do?
What single line makes a .py file a Databricks notebook — and when do you use .ipynb instead?
Production Git folder vs bundle — which deploys files, which deploys resources?

That completes Section 9: version control (Git Folders) → automated deployment (bundles).

Git Folders & CI/CD — version control inside the workspace