
Updated June 2026

Apache Airflow Programming: Developing, Configuring, and Automating Workflows

Class Duration

28 hours of live training delivered over 4-5 days.

Student Prerequisites

  • Practical experience with Python.
  • Familiarity with containerization and container orchestration.
  • Basic Linux command line skills.

Target Audience

Python developers, data engineers, and DevOps practitioners who need to build, automate, and maintain production data pipelines and workflows with Apache Airflow 3.

Description

Delivered over four to five days, this course immerses participants in Apache Airflow 3's architecture and configuration, guiding them through setting up environments, choosing executors, and developing robust DAGs in Python with the airflow.sdk authoring interface. Through hands-on exercises—ranging from dynamic task mapping, asset-based scheduling, and deferrable operators to cloud integrations and custom plugin development—attendees will master best practices for automating, monitoring, and optimizing production-ready workflows.

Participants learn how to set up and manage Airflow 3 environments, configure executors, and develop DAGs in Python using the airflow.sdk authoring interface. The course covers essential components such as tasks, operators, variables, connections, and assets, as well as advanced topics such as asset-based scheduling, DAG versioning, dynamic task mapping, deferrable operators, and custom plugins. Hands-on exercises include running DAGs, scheduling tasks, integrating cloud providers, testing DAGs, and monitoring workflows through logs and the modern Airflow UI. By the end of the course, participants will be equipped to build, automate, and optimize data pipelines using Airflow 3.

Learning Outcomes

  • Understand Apache Airflow 3's architecture—api-server, scheduler, dag processor, triggerer, and workers—and how it automates distributed workflows.
  • Set up and configure Airflow using different execution modes and database backends.
  • Identify and use key Airflow components, including DAGs, tasks, operators, variables, connections, and assets.
  • Develop and run DAGs using the airflow.sdk @dag and @task decorators, the Operator API, and dynamic task mapping.
  • Integrate Airflow with cloud providers such as AWS and Azure.
  • Utilize built-in operators, sensors, and deferrable operators to automate task execution and monitoring.
  • Extend Airflow by creating custom operators, providers, and plugins.
  • Apply best practices for asset-based scheduling, DAG versioning, testing, logging, debugging, and optimizing workflows.

Training Materials

Comprehensive courseware is distributed online at the start of class. All students receive a downloadable MP4 recording of the training.

Software Requirements

Students will need a free, personal GitHub account to access the courseware. They will also need permission to install Python, Python packages, Visual Studio Code, and Visual Studio Code extensions on their computers. If students are unable to configure a local environment, a cloud-based environment can be provided.

Training Topics

What is Apache Airflow?

  • Distributed Task Automation
  • Compared to Cron Jobs
  • Compared to Celery
  • Scalability and Reliability
  • Directed Acyclic Graphs (DAGs)
  • Workflows as Code
  • What's New in Airflow 3
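
As a rough illustration of the "Directed Acyclic Graphs" and "Workflows as Code" topics above, the sketch below orders a small set of tasks by their dependencies using only the Python standard library. It is not Airflow code and is not taken from the courseware: the task names and the execution_order helper are hypothetical, chosen only to show the idea that a workflow is a graph whose edges decide what may run first.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(deps):
    """Return one valid run order for the graph.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why a workflow graph has to be acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A scheduler such as Airflow's does considerably more than this (retries, scheduling, distributed execution), but the dependency-ordering idea is the same.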

Workflows as Code (no programming)

  • Anatomy of a DAG
  • Directed Acyclic Graphs
  • Operators
  • Tasks
  • Variables
  • XComs
  • Providers
  • Connections
  • Assets (formerly Datasets)
  • DAG Versioning
  • Explore how DAG parts connect to the new UI
  • DAG Serialization
  • Schedulers
  • Pools

Installation and Configuration

  • Python Virtual Environment
  • Install Airflow
  • Airflow Constraints File
  • Standalone Mode
  • Run the API Server, Scheduler, and DAG Processor Independently
  • SQLite vs PostgreSQL
  • Configure with PostgreSQL
  • Executors: Local, Celery, Kubernetes, and Edge
  • Airflow and Kubernetes (Helm Chart)

Developing DAGs with the Task SDK

  • The airflow.sdk Authoring Interface
  • @dag and @task Decorators (TaskFlow API)
  • Operator API
  • Defining Task Dependencies
  • Passing Data with XComs
  • Templating with Jinja
  • Params and Runtime Context
  • Using Connections and Variables in Code

Scheduling and Assets

  • Cron Expressions and Timetables
  • Asset-Based Scheduling
  • The @asset Decorator
  • Event-Driven Scheduling with Asset Watchers
  • Backfills
  • DAG Versioning in Practice

Dynamic and Deferrable Tasks

  • Dynamic Task Mapping (expand and partial)
  • Mapping over Task Output
  • Branching and Trigger Rules
  • Sensors
  • Deferrable Operators and the Triggerer

Cloud Integration and Custom Plugins

  • Provider Packages
  • AWS and Azure Integrations
  • Custom Operators and Hooks
  • Building Custom Plugins

Testing and Monitoring

  • Testing DAGs with dag.test()
  • Unit Testing Tasks
  • Debugging DAGs
  • Review Task Logs in the UI
  • Notifications and Callbacks
  • Deadline Alerts (replacing SLAs)