Secoda and Airflow integration

June 1, 2021

We’re excited to announce that Secoda now integrates with Airflow. Airflow is an open source workflow management tool that was created at Airbnb in 2014 to manage complex workflows. It gives data engineers the ability to programmatically schedule their workflows and monitor them through the Airflow interface. The Airflow integration pulls information about Airflow jobs and puts it on its own page in the Secoda UI. Airflow is an important part of the data landscape and should be included in any data discovery tool that wants to encompass the entire data stack. The Airflow DAG page will display the following information for documentation purposes:

Basic information

  • Name
  • Description
  • Owner(s)
  • Tags
  • Scheduled interval
  • Last run
  • Last run status
  • README

Task list associated with DAG (similar to columns in Table details view)

  • Name
  • Owner(s)
  • Last run
  • Last run status

DAG associations

  • Table → Airflow DAG association (coming soon)

Jobs will be a new entity in Secoda, searchable by the metadata listed above, so data users can easily find all the information related to their Airflow workflows. Integrated jobs can be found through both the Secoda search bar and the Secoda Slack app.
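To make the idea of searching jobs by their metadata concrete, here is a minimal sketch in plain Python. It is not Secoda's implementation: the `JobRecord` class, the `search_jobs` function, and the sample jobs are all hypothetical, and the matching is a simple "every term must appear somewhere" check over a few of the fields listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobRecord:
    """Hypothetical record holding some of the 'Basic information' fields."""
    name: str
    description: str = ""
    owners: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    schedule_interval: str = ""
    last_run_status: str = ""

    def searchable_text(self) -> str:
        # Join every metadata field into one lowercase string to match against.
        parts = [self.name, self.description, self.schedule_interval,
                 self.last_run_status, *self.owners, *self.tags]
        return " ".join(parts).lower()


def search_jobs(jobs: List[JobRecord], query: str) -> List[JobRecord]:
    """Return the jobs whose metadata contains every term in the query."""
    terms = query.lower().split()
    return [job for job in jobs
            if all(term in job.searchable_text() for term in terms)]


# Sample data, invented purely for illustration.
jobs = [
    JobRecord("daily_orders_load", "Loads orders into the warehouse",
              owners=["data-eng"], tags=["orders"],
              schedule_interval="@daily", last_run_status="success"),
    JobRecord("weekly_churn_model", "Retrains the churn model",
              owners=["analytics"], tags=["ml"],
              schedule_interval="@weekly", last_run_status="failed"),
]

matches = search_jobs(jobs, "orders data-eng")
print([job.name for job in matches])  # prints ['daily_orders_load']
```

A query can mix fields freely: "orders data-eng" matches on a tag and an owner at once, which is the kind of lookup a searchable job entity is meant to support.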

Searchable jobs in the Secoda UI

Why add Airflow to Secoda?

Over the last few years, producing and storing data has become increasingly cheap and easy. Organizations are now flooded with data and context spread across separate data warehouses, data lakes, and tools like Airflow and dbt. These data assets are becoming more difficult to manage and protect. For consumers outside of the data team, this complexity has made it increasingly difficult to understand what data exists, what to trust and how to use it.

Even with great data practices, many organizations still struggle to get value out of their data: up to 73% of all enterprise data goes unused. One of the big contributors to this problem is that organizations create data silos by not documenting and centralizing their data in a place where every employee can access the information.

With Secoda and Airflow, you can verify DAGs and manage them in one central place. You can also enhance your Airflow jobs with additional details like tags, owners and related tables. All details from your DAGs are automatically transferred to Secoda, allowing you to document your data in code and allowing your business users to search for it through a simple, intuitive UI.

A DAG presented in the Secoda UI

Additionally, Secoda connects to Slack, which Airflow users can use to stay updated about new DAGs, changes and new documentation that other members are creating. This feature can help teams stay informed about changes made across their infrastructure.

The integration works with Airflow version 2.0 and above. For Secoda, this integration enhances our ability to provide information on how the data should be interpreted, how it is created and used, and how often and in what ways it is updated. This will help us build a more intuitive and relevant search experience for everyone who integrates Airflow with Secoda. In the future, we're going to add the ability to associate this information with the tables in Secoda.

Our goal is to continue to simplify data discovery while building the most intuitive data discovery platform for all users. To make things even more exciting, we're offering teams that connect Airflow to Secoda a 20% discount on the product. We're super excited to help data teams find clarity in the sea of data and are continuing to work towards integrations and features that can make data discovery simple.