Testing Airflow code can be difficult. I am trying to write unit tests for some of the tasks built with the Airflow TaskFlow API. Here is a simplified task where I download a file from S3 (there is more going on in the real task, but I removed it for this example):

`updates_file_path = utils.download_file_from_s3_bucket(files.get("updates_file"))`

Now I am trying to write a test case that checks the `except` clause, something like `def test_file_not_found_error(self, download_file_from_s3_bucket, get_current_context, log): ...`, where `download_file_from_s3_bucket` is mocked. I tried multiple approaches, for example creating a DagRun or running only the task function, but nothing is helping. I also tinkered with capturing exceptions, but I could not find a workaround by myself. This works inside an Airflow worker, because there the exceptions are captured and the task state is changed, but not in a plain test run.

How do you ensure your workflows work before deploying to production? In this talk I'll go over various ways to assure your code works as intended, on both the task level and the DAG level, including how to test and debug tasks locally. As mentioned above, to carry out tests on the data project we must first provide the external dependencies to Airflow. There are six ways we can wire dynamism into an Airflow DAG, for example using Airflow variables. For testing, a pytest fixture can override DAG settings, e.g. setting `dag.schedule_interval = timedelta(days=1)` so a once-only schedule does not get skipped. Below is a function you can use in a pytest test case that will run the tasks of your DAG in order.

The term "integrity test" was popularized by the blog post "Data's Inferno: 7 Circles of Data Testing Hell with Airflow". It is a simple and common test that helps DAGs avoid unnecessary deployments and provides a faster feedback loop.
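The question above, exercising an `except` clause by mocking the S3 helper, can be sketched without Airflow at all. Note that `process_updates` and the `utils` object here are hypothetical stand-ins for the author's task body; only the mocking pattern is the point:

```python
# A minimal sketch of testing an "except" branch by mocking the S3 helper.
# process_updates and the utils module are assumptions, not the author's code.
from unittest import mock


def process_updates(files, utils):
    """Hypothetical task body: download a file, handle a missing one."""
    try:
        return utils.download_file_from_s3_bucket(files.get("updates_file"))
    except FileNotFoundError:
        # The except clause we want the test to reach.
        return None


def test_file_not_found_is_handled():
    fake_utils = mock.Mock()
    # Make the mocked download raise, so the except branch runs.
    fake_utils.download_file_from_s3_bucket.side_effect = FileNotFoundError
    assert process_updates({"updates_file": "s3://bucket/key"}, fake_utils) is None
```

Because the task logic is a plain function, the `except` path can be tested directly; wiring it into a TaskFlow task only wraps this same callable.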
This works correctly when I have no failures, but if the mock failing operator or the `count_successes` task is triggered, the exceptions break the code. I have an operator that counts failures in a specific way and, if it is not happy, raises an error whose message starts with `f"Failure rate too high. Only ..."`; in the test DAG the failing task is created with `python_callable=fail(True)`. A helper `get_most_recent_dag_run(dag_id) -> DagRun` sorts the runs with `dag_runs.sort(key=lambda x: x.execution_date, reverse=True)` and returns `dag_runs[0] if dag_runs else None`. The test itself, `test_my_custom_operator_execute_no_trigger(dag, expected_state)`, is parametrized on `"expected_state"`, fetches the task instance with `count_instance = run.get_task_instance(task_id=TEST_TASK_ID)`, and asserts `count_instance.state == expected_state`.

To set up `dag.test()`, add these two lines to the bottom of your DAG file: `if __name__ == '__main__': dag.test()`, and that's it! You can add arguments such as `execution_date` if you want to test argument-specific DAG runs, but otherwise you can run or debug DAGs as needed.

Using the test mode configuration: Airflow has a fixed set of test mode configuration options. You can load these at any time by calling `airflow.configuration.load_test_config()`. Some options, for example `DAGS_FOLDER`, are loaded before you have a chance to call `load_test_config()`. Please note that this operation is not reversible.
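The `get_most_recent_dag_run` helper mentioned above can be sketched with the standard library alone. The `DagRun` dataclass here is a stand-in for `airflow.models.DagRun` (which you would query via the metadata database in real code); only the sorting logic mirrors the snippet:

```python
# Stdlib-only sketch of the get_most_recent_dag_run helper.
# DagRun is a stand-in dataclass, not airflow.models.DagRun.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DagRun:
    dag_id: str
    execution_date: datetime
    state: str = "success"


def get_most_recent_dag_run(dag_runs):
    """Return the run with the latest execution_date, or None if empty."""
    runs = sorted(dag_runs, key=lambda r: r.execution_date, reverse=True)
    return runs[0] if runs else None
```

A test can then fetch the most recent run and assert on its `state`, which is the same shape as the parametrized `expected_state` assertion above.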