This article explores using Airflow 2 in environments with multiple teams (tenants) and concludes with a brief overview of out-of-the-box features to be (potentially) delivered in Airflow 3.
Airflow 2 does not natively support multi-tenancy. For strong isolation, it's recommended that each team be provided with its own separate Airflow environment. If you self-manage these deployments, some resources can be shared while maintaining isolation. Additionally, workarounds allow multiple teams to use a single Airflow instance, but these solutions typically require trust between users and may not offer complete isolation (it is not entirely possible to prevent users from impacting one another).
Airflow 3 is expected to eventually support multi-team deployment, as the relevant AIP-67 has been accepted, but the specifics are not yet fully defined or implemented.
In environments with multiple teams, companies often want a single Airflow deployment in which each team has access only to the subset of resources that belongs to it. The main motivations are reducing costs and simplifying management compared to maintaining multiple separate deployments. While Airflow is a robust platform for orchestrating workflows, its architecture needs adaptation to support diverse team needs, access controls, and workloads.
A thorough review was conducted across various sources, including PRs, GitHub issues and discussions, Stack Overflow questions, articles, blog posts, official Airflow documentation, and presentations from the Airflow Summit. The aim was not only to explore the possible setups of Airflow for multiple teams but also to identify best practices, examine how other organizations are solving this problem in real-world scenarios, and uncover any potential "hacks" or innovative approaches that could be applied to enhance the solution.
A sincere thank you to all the Airflow community members and everyone who provided thoughtful feedback to help shape this document!
This is the only approach that offers full isolation.
Each team has its dedicated Airflow instance, running in isolation from other teams. This solution involves deploying a complete instance of Airflow for each team, including all required components (database, web server, scheduler, workers, and configuration) as well as the execution environment (libraries, operating system, variables, and connections).
With self-managed Airflow deployments, some resources can be "shared" to reduce infrastructure costs while keeping Airflow instances team-specific; for example, Airflow's metadata database for each Airflow instance can be hosted in a separate database schema on a single database server. This "shared database" approach ensures full isolation from the user's perspective, though resource contention might occur at the infrastructure level. It's important to note that this setup is better suited for self-managed Airflow deployments - it requires control over the environment configuration, which is typically not feasible with managed services like Google Cloud Composer, AWS MWAA, or Astronomer. Managed services usually provide a single database instance per Airflow environment, making it impossible to create separate schemas for different teams. Additionally, managed Airflow services often require a separate, dedicated Kubernetes cluster provisioned by the platform, which means users cannot deploy Airflow on a cluster of their choice, further limiting flexibility and customization.
Provision multiple Airflow environments according to your organization's policies, whether deploying via a Helm chart or using a managed service like Composer. These options offer enough flexibility in configuration to ensure that each team's environment is tailored to their specific needs, from resource allocation to custom execution environments.
When self-hosting Airflow, e.g., on Kubernetes (not using managed Airflow services), a separate namespace should be created within the Kubernetes cluster for each Airflow instance. If sharing a database server, the database for each Airflow instance should be hosted in a separate database schema. You can also consider sharing the same Kubernetes cluster for executing task workloads (see the Kubernetes Executor).
Authentication can be streamlined across these multiple instances. With features like Airflow’s Auth Manager, a centralized authentication proxy, such as Keycloak, can provide unified access to all web server instances. This allows for simplified authentication management across teams with a single sign-on (SSO) experience and consistent URL structures for accessing the UI.
Leveraging Airflow’s Helm chart simplifies Kubernetes orchestration by automating deployment, scaling, and configuration across environments. Additionally, using infrastructure-as-code (IaC) tools like Terraform enables consistent, repeatable infrastructure setups, making it easier to manage infrastructure changes and track versions. Implementing centralized logging, monitoring with tools like Prometheus or Grafana, and automated backups further helps streamline operations and reduces the overhead of maintaining multiple Airflow instances.
This solution is ideal for prioritizing cost-efficiency and simplicity over strict team isolation. Managed Airflow services like Google Composer or AWS MWAA align well with this approach since they're optimized for running a single, shared Airflow instance.
A single Airflow instance is shared across multiple teams, with all resources (database, web server, scheduler, and workers) being used collectively. While each team can have separate execution environments, task queues, and DAG processing configurations, they still operate within the same core infrastructure, especially the database. This lack of true isolation introduces potential risks, as one team’s actions could inadvertently impact another’s workflows or configurations.
Achieving full tenant isolation with a single Airflow deployment is NOT possible out-of-the-box because each Airflow environment uses a single database, and Airflow lacks fine-grained access control for the database. As a result, there is no built-in mechanism to prevent DAG authors from changing or deleting any database contents, including altering the state or return values of other DAGs, modifying or removing DAGs created by different teams, and accessing critical elements like Connections, Variables, and the overall metadata store. There is no way to fully prevent a malicious actor from one team from impacting another, as each DAG author has unrestricted access to the entire database, and limitations are hard to enforce effectively.
Some control can be enforced through custom CI/CD checks or by imposing user restrictions, such as using DAG factories to limit unauthorized operations in the code. However, this approach shifts the access control issue elsewhere, requiring additional external implementations. For example, while CI/CD might validate that DAGs access only permitted data, it raises a new question: who oversees and approves the CI/CD process itself?
Setting up a shared Airflow instance for multiple teams involves configuring Airflow to handle multiple execution environments, with the potential need to manage multiple task queues depending on specific requirements. This can be done using Cluster Policies to enforce separation between team workloads and operators, such as setting team-specific Kubernetes Pod Templates or Celery queues. Separate node pools should also be used to avoid performance bottlenecks.
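For illustration, below is a minimal sketch of such a cluster policy, placed in airflow_local_settings.py, that routes each team's tasks to its own Celery queue based on where the DAG file lives. The folder names and queue names are assumptions, not Airflow defaults:

```python
# airflow_local_settings.py - a minimal sketch of a cluster policy that routes
# each team's tasks to a team-specific Celery queue based on the DAG file
# location. The folder names and queue names below are placeholders.
from airflow.models.baseoperator import BaseOperator

TEAM_QUEUES = {
    "team_a": "team_a_queue",
    "team_b": "team_b_queue",
}


def task_policy(task: BaseOperator) -> None:
    """Applied to every task at parse time: pin it to its team's queue."""
    dag = getattr(task, "dag", None)
    fileloc = dag.fileloc if dag else ""
    for team, queue in TEAM_QUEUES.items():
        if f"/dags/{team}/" in fileloc:
            task.queue = queue
            break
```

Each team's Celery workers would then be started to listen only on that team's queue, so tasks parsed from one team's folder never run on another team's workers.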
Managing user roles and restricting access to specific DAGs or Airflow components in the UI can be done with built-in Airflow mechanisms. However, because the database and core resources are shared, you must carefully manage permissions and configurations to prevent teams from interfering with each other's workflows (see the `Important disclaimer` above).
Teams can use a separate set of DAG file processors for each team's subfolder within the Airflow DAGs folder. Multiple DAG processors can be run with the --subdir option.
Currently, the only way to provide strong isolation is for each team/tenant to have its own instance of Airflow that does not share any resources with other Airflow instances. This is because, in Airflow 2, there is no built-in mechanism that can prevent a DAG author from accessing and altering anything in the database.
Managed Airflow (Composer, MWAA, Astro, etc.) usually takes care of authentication for you and it's available out of the box. By default, Airflow uses Flask AppBuilder (FAB) auth manager for authentication/authorization so that you can create users, roles and assign permissions. You can also easily provide your own auth manager when needed.
Airflow has a mechanism for access control that lets you decide what DAGs a user can see, edit, delete, etc., but its limitations mean such a scenario can only be prevented at the UI level.
In more detail, Airflow UI Access Control only controls visibility and access in Airflow UI, DAG UI, and stable API in Airflow 2. Airflow UI Access Control does not apply to other interfaces that are available to users, such as Airflow CLI commands. This model is not enforced in the DAG and tasks code. For example, you can deploy a DAG that changes Airflow roles and user assignments for these roles.
As mentioned above, Airflow does not have a mechanism to enforce this. Airflow trusts DAG authors, so any Variable or Connection can be easily accessed, altered, or deleted from within a DAG.
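To make the trust assumption concrete, the hedged sketch below shows a task that reads and then deletes a Variable that notionally belongs to another team; the variable key is a made-up placeholder, and nothing in Airflow 2 blocks this:

```python
# A hedged illustration of why DAG authors must be trusted in a shared
# instance: any task code can read or delete any Variable in the metastore.
# The variable key "other_team_api_key" is a made-up placeholder.
import pendulum
from airflow.decorators import dag, task
from airflow.models import Variable


@dag(schedule=None, start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def variable_access_demo():
    @task
    def read_and_delete() -> bool:
        value = Variable.get("other_team_api_key")  # readable by any DAG code
        Variable.delete("other_team_api_key")       # ...and deletable, too
        return value is not None

    read_and_delete()


variable_access_demo()
```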
In such cases, using an external secrets backend or implementing a custom solution, such as assigning a separate Service Account (SA) for each DAG ID and requiring DAG authors to request access to that specific SA, can help achieve better isolation and security.
It's also worth considering establishing conventions and a mutual trust agreement between users, ensuring that even though they technically have access to each other's secrets, they agree not to misuse this access. In the end, this approach can make setup and maintenance easier, while significantly reducing costs by sharing a single Airflow instance across multiple users or teams.
Some control can also be enforced through custom CI/CD checks. However, this approach shifts the access control issue elsewhere, requiring additional external implementations. For example, while CI/CD might validate that DAGs access only permitted connections, it raises a new question: Who oversees and approves the CI/CD process itself?
There is no built-in Airflow mechanism designed to prevent that scenario from happening.
One option is to have some CI/CD checks. Another option is to use Airflow's cluster policies, which can verify, for example, that a DAG comes from a specific directory (using dag.fileloc) and adheres to predefined rules (e.g., that the dag_id starts with the name of the directory the team has access to). With such validation in place, one team will not be able to overwrite another team's DAG. The cluster policy can either skip such a DAG or change its dag_id (though changing the dag_id can break workflows that trigger DAGs by dag_id).
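A hedged sketch of such a validation policy, again placed in airflow_local_settings.py, could look like the following; the DAGs root path, the per-team folder layout, and the "<team>_" prefix convention are assumptions:

```python
# airflow_local_settings.py - a sketch of a dag_policy that rejects DAGs whose
# dag_id does not start with the name of the team folder they were loaded
# from. The DAGs root path and naming convention are assumptions.
from pathlib import Path

from airflow.exceptions import AirflowClusterPolicyViolation
from airflow.models.dag import DAG

DAGS_ROOT = Path("/opt/airflow/dags")  # assumed deployment path


def dag_policy(dag: DAG) -> None:
    path = Path(dag.fileloc).resolve()
    if DAGS_ROOT not in path.parents:
        return  # e.g. example DAGs living outside the per-team folders
    team_dir = path.relative_to(DAGS_ROOT).parts[0]  # dags/team_a/x.py -> team_a
    if not dag.dag_id.startswith(f"{team_dir}_"):
        raise AirflowClusterPolicyViolation(
            f"DAG '{dag.dag_id}' from {dag.fileloc} must have a dag_id "
            f"prefixed with '{team_dir}_'"
        )
```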
Apart from implementing some access policies in the UI that prevent the team from seeing anything but their own DAGs, some access policies to the files themselves must be implemented. Whether you are using standalone or managed Airflow, in the dags/ directory, you can create subdirectories for each team and only grant them permissions to their own directory. Airflow should then process all the files from the dags/ folder and all its subdirectories.
The scheduler and dag_processor processes should be run separately. For high availability, you can run multiple schedulers concurrently. You can also run multiple DAG processors, e.g., one per team directory (with the --subdir option).
Each team can use its own Service Account (SA) key, mounted on Kubernetes. Even when secrets are mounted manually, it is necessary to prevent the DAG author from choosing the “wrong” executor. You can utilize the pod_mutation_hook to dynamically modify and append the necessary configuration to worker definitions at runtime, preventing users from tampering with the worker setup at the DAG level. This restriction can also be enforced by using DAG factories or CI/CD checks that ensure a team uses only its own secrets and connections, or by conducting code reviews to verify compliance.
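Below is a minimal, hedged sketch of such a pod_mutation_hook; the team naming convention, the service account names, and the reliance on the KubernetesExecutor's dag_id pod label are assumptions about the deployment:

```python
# airflow_local_settings.py - a sketch of pod_mutation_hook that forces every
# KubernetesExecutor worker pod to run under its team's Kubernetes service
# account, derived from the pod's dag_id label. Team prefixes and service
# account names are placeholders for your own conventions.
from kubernetes.client import models as k8s

TEAM_SERVICE_ACCOUNTS = {
    "team_a": "airflow-worker-team-a",
    "team_b": "airflow-worker-team-b",
}


def pod_mutation_hook(pod: k8s.V1Pod) -> None:
    dag_id = (pod.metadata.labels or {}).get("dag_id", "")
    for team, service_account in TEAM_SERVICE_ACCOUNTS.items():
        if dag_id.startswith(team):
            # Overwrite whatever the DAG author put in executor_config so the
            # pod cannot run under another team's service account.
            pod.spec.service_account_name = service_account
            break
```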
Alternatively, you can configure a separate set of workers (and DAG processors) for each team, assigning them to specific queues. A combination of cluster policies and DAG file processors with subdirectory organization can limit DAGs from one folder to specific workers. While not completely foolproof, this method restricts queues and enforces some separation.
Grant the Service Account of one team permission to access only the specific presentation layer of the other team’s data. This ensures controlled and intentional data sharing.
Instead of relying on Airflow Variables or Connections, which lack fine-grained permissions, sensitive information can be stored and managed using Kubernetes secrets. These secrets can be manually mounted and tied to specific teams. However, additional security checks should be implemented in the deployment process to ensure that secrets are only accessible to their respective teams.
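As an illustration, the hedged sketch below mounts a team-specific Kubernetes Secret into a single task's worker pod via executor_config, assuming the KubernetesExecutor; the secret name, mount path, and DAG/task names are placeholders:

```python
# A hedged sketch: instead of an Airflow Connection/Variable, team credentials
# live in a Kubernetes Secret that is mounted only into this task's worker
# pod. Secret name, mount path and DAG name are placeholders; the
# KubernetesExecutor (or CeleryKubernetesExecutor) is assumed.
import pendulum
from airflow.decorators import dag, task
from kubernetes.client import models as k8s

team_a_volume = k8s.V1Volume(
    name="team-a-credentials",
    secret=k8s.V1SecretVolumeSource(secret_name="team-a-credentials"),
)
team_a_mount = k8s.V1VolumeMount(
    name="team-a-credentials", mount_path="/secrets/team-a", read_only=True
)


@dag(schedule=None, start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def team_a_secret_example():
    @task(
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        # "base" is the main container in Airflow's default
                        # worker pod template.
                        k8s.V1Container(name="base", volume_mounts=[team_a_mount]),
                    ],
                    volumes=[team_a_volume],
                )
            )
        }
    )
    def use_credentials() -> int:
        # The task reads the mounted secret file rather than an Airflow secret.
        with open("/secrets/team-a/credentials.json") as f:
            return len(f.read())

    use_credentials()


team_a_secret_example()
```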
Astronomer recommends providing separate Airflow instances (url1, url2, url3)
Google provided a list of pros and cons of single-tenant and multi-tenant Composer (url)
Amazon recommends providing separate Airflow instances but provides some security steps when using a single Airflow for multiple teams (url)
Disclaimer: This analysis is based on publicly available information about how companies are deploying Airflow, such as recordings, blog posts, and other shared materials. It does not imply that the described practices represent the entirety of their deployment strategies.
Apple (url)
Adobe (url)
… without providing direct access to Airflow for end users.
… with built-in Airflow RBAC (with some small custom extensions sometimes)
Balyasny Asset Management (BAM) (url)
Delivery Hero (url [11:20])
DXC Technology (url)
Fanduel (url [7:55])
kiwi.com (url)
Shopify (url, lessons learned blog post)
Snap (url)
Based on publicly available resources, real-world implementations of multi-team Airflow deployments vary widely. However, the majority opt for separate Airflow environments, typically deployed on Kubernetes or managed services for better isolation and scalability.
Some companies use a single Airflow instance as the foundation but restrict direct user access through custom interfaces or DSLs. This approach enforces team separation at the application level, enabling granular control that Airflow alone cannot provide. However, this method often involves building an entirely separate application layer that merely leverages Airflow as a backend, requiring significant effort to develop and maintain over time.
Others adopt a hybrid approach, sharing a single Airflow instance for less critical teams while providing separate, isolated deployments for more sensitive or high-impact workloads. This approach enables cost savings when trust between users can be established while balancing resource efficiency with the need for strict separation of critical operations.
These diverse strategies highlight the trade-offs between isolation, operational complexity, and budget constraints, providing valuable lessons for designing Airflow solutions in multi-team settings.
Although work on Airflow 3 is underway, and some AIPs have been accepted, details still need to be resolved. While key concepts have been agreed upon, there's no guarantee that everything will be implemented exactly as planned or described in this document.
The main AIP for multi-team setup in Airflow is AIP-67 (inside it, you can find the list of other AIPs that make it possible, e.g., AIP-72, which replaced AIP-44 and will provide fine-grained access to the database only via the Task SDK).
You can track work progress on Airflow 3 here, but it is important to note that AIP-67 is NOT scheduled to be delivered in 3.0 but in a later release.
Some bullet points from AIP-67 (what will most probably be possible in the future):
Choosing the right Airflow setup requires careful consideration of your team's needs for isolation, resource management, and collaboration. In its current form, Airflow 2 lacks native multi-tenancy support, meaning that organizations looking for strong separation between teams should ideally provide each team with its own dedicated Airflow environment. This approach ensures that issues or failures within one team's workflows don't impact others. However, if resource sharing is a priority, there are ways to deploy Airflow such that teams can share an instance, though this comes with trade-offs in terms of security and isolation. Trust between teams is essential in such setups, as preventing one team's actions from affecting another is difficult.
Looking ahead, Airflow 3 promises better support for multi-team environments. This forthcoming feature could streamline deployment for organizations managing multiple teams, though the implementation details are still evolving. When designing your Airflow architecture, it's important to weigh current limitations against future capabilities.
Cluster policies - check or mutate DAGs or Tasks on a cluster-wide level