Scheduling in Airflow

To Nha Notes | Nov. 17, 2022, 7:29 p.m.

Execution dates in Airflow

Figure: Time represented in terms of Airflow’s scheduling intervals, assuming a daily interval with a start date of 2019-01-01.

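Airflow defines the execution date of a DAG run as the start of the corresponding schedule interval. Conceptually, this makes sense if we consider that the execution date marks the schedule interval rather than the moment the DAG is actually executed. The naming is confusing, though, because a run for a given interval only starts after that interval has ended, so the execution date is always earlier than the time the run actually happens.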
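To make the distinction concrete, here is a minimal sketch in plain Python (no Airflow required) that reproduces the arithmetic for a daily interval starting on 2019-01-01. The helper names `execution_date_for_run` and `earliest_trigger_time` are made up for illustration and are not part of Airflow's API.

```python
from datetime import datetime, timedelta

# Assumed setup, matching the example above: daily interval, start date 2019-01-01.
start_date = datetime(2019, 1, 1)
interval = timedelta(days=1)


def execution_date_for_run(run_index: int) -> datetime:
    """Execution date of the n-th run (0-based): the START of its interval."""
    return start_date + run_index * interval


def earliest_trigger_time(run_index: int) -> datetime:
    """Earliest moment the n-th run can start: the END of its interval."""
    return execution_date_for_run(run_index) + interval


print(execution_date_for_run(0))  # 2019-01-01 00:00:00
print(earliest_trigger_time(0))   # 2019-01-02 00:00:00
```

The first run carries the execution date 2019-01-01 even though it cannot start before 2019-01-02, once its interval has fully passed.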

Because Airflow defines execution dates as the start of the corresponding schedule intervals, they can be used to derive the start and end of a specific interval (see the figure below). For example, when executing a task, the start and end of the corresponding interval are given by the execution_date parameter (the start of the interval) and the next_execution_date parameter (the start of the next interval). Similarly, the previous schedule interval can be derived from the prev_execution_date and execution_date parameters.

Figure: The current interval can be derived from a combination of the execution_date and the next_execution_date, which signifies the start of the next interval and thus the end of the current one.
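The interval derivation described above can be sketched as follows. This is an illustration only: the three datetimes are hard-coded stand-ins for the values Airflow would pass to a task, and `Interval`, `current_interval`, and `previous_interval` are hypothetical helpers, not Airflow functions.

```python
from datetime import datetime
from typing import NamedTuple


class Interval(NamedTuple):
    start: datetime  # inclusive
    end: datetime    # exclusive: this is the start of the following interval


def current_interval(execution_date: datetime, next_execution_date: datetime) -> Interval:
    """The interval a run covers: from its execution date up to the next one."""
    return Interval(execution_date, next_execution_date)


def previous_interval(prev_execution_date: datetime, execution_date: datetime) -> Interval:
    """The interval just before the current one."""
    return Interval(prev_execution_date, execution_date)


# Stand-in values for a daily schedule (assumed for this example).
prev_execution_date = datetime(2019, 1, 2)
execution_date = datetime(2019, 1, 3)
next_execution_date = datetime(2019, 1, 4)

current = current_interval(execution_date, next_execution_date)
previous = previous_interval(prev_execution_date, execution_date)

print(current.start, "->", current.end)
print(previous.start, "->", previous.end)
```

Treating the end as exclusive means consecutive intervals share a boundary without overlapping: the end of the previous interval is exactly the start of the current one.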

References

Chapter 3 of the ebook Data Pipelines with Apache Airflow