Learn

Deep, connected lessons. Read one, then practice its questions in the same place.

Structured Streaming & the state model
Compare Spark Structured Streaming and Lakeflow Spark Declarative Pipelines to determine the optimal approach; build reliable batch and streaming pipelines. (Underlies streaming across Sections 1 and 3.)
21 Q
Lakeflow Spark Declarative Pipelines
Build and manage reliable, production-ready batch and streaming pipelines using Lakeflow Spark Declarative Pipelines and Auto Loader; use expectations for quality.
4 Q
Streaming tables vs materialized views
Explain the advantages and disadvantages of streaming tables compared to materialized views.
2 Q
APPLY CHANGES — CDC and SCD, declaratively
Use APPLY CHANGES APIs to simplify CDC in Lakeflow Spark Declarative Pipelines.
7 Q
Jobs & orchestration — multi-task, dependencies, control flow
Create and automate ETL workloads using Jobs via UI/APIs/CLI; create pipeline components that use control-flow operators (if/else, for-each).
Jobs via REST API and CLI
Create and automate ETL workloads using Jobs via UI/APIs/CLI.
Job & environment configuration — compute and Spark tuning
Choose the appropriate configs for environments and dependencies, high memory for notebook tasks, and auto-optimization to disallow retries.
6 Q
UDFs — Python vs Pandas, and why the type is everything
Develop User-Defined Functions (UDFs) using Pandas/Python UDF.
5 Q
Managing third-party libraries
Manage and troubleshoot external third-party library installations and dependencies in Databricks, including PyPI packages, local wheels, and source archives.
3 Q
Python project structure for Databricks Asset Bundles
Design and implement a scalable Python project structure optimized for Databricks Asset Bundles (DABs), enabling modular development, deployment automation, and CI/CD integration.
2 Q
Unit & integration testing on Databricks
Develop unit and integration tests using assertDataFrameEqual, assertSchemaEqual, DataFrame.transform, and testing frameworks, to ensure code correctness, including a built-in debugger.
1 Q
Reference card — job parameters & secrets in notebooks
Understand the notebook development environment, variable management, and creating secure, configurable code.
3 Q
Reference card — pipeline control-flow operators
Create a pipeline component that uses control-flow operators (e.g., if/else, for-each).
1 Q