
Set Up Starlake with Airflow or Dagster

Starlake supports Airflow and Dagster as orchestration engines. The starlake-docker repository provides Docker Compose files that install and configure Starlake, the web UI, and the chosen orchestrator in a single command. This setup is designed for developers and data engineers who want a ready-to-use pipeline environment without manual orchestrator configuration.

The core project is open source: starlake-ai/starlake.

Deploy Starlake with Airflow or Dagster via Docker Compose

Prerequisites

  • Docker and Docker Compose installed and running
  • Git installed (to clone the repository)

Step-by-Step Setup

  1. Clone the starlake-docker repository
git clone https://github.com/starlake-ai/starlake-docker.git
  2. Navigate to the docker directory
cd starlake-docker/docker
  3. Start the stack

For Airflow:

docker compose up

For Dagster:

docker compose -f docker-compose-dagster.yml up
  4. Open the Starlake UI: navigate to http://localhost in your browser.

To run on a different port, set the SL_UI_PORT environment variable:

SL_UI_PORT=8080 docker compose up
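The start commands above can be wrapped in a small helper so that the orchestrator choice is explicit. This is a minimal sketch, not part of starlake-docker: the function names and the "airflow"/"dagster" argument values are hypothetical, and ui_url assumes the UI is served at http://localhost:<port> when SL_UI_PORT is set.

```shell
#!/bin/sh
# Hypothetical helper: print the compose command for the chosen orchestrator.
# Only the compose file name and the commands themselves come from this guide.
compose_up_command() {
  case "$1" in
    airflow) echo "docker compose up" ;;
    dagster) echo "docker compose -f docker-compose-dagster.yml up" ;;
    *) echo "unknown orchestrator: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print the URL where the UI is expected.
# Assumption: a custom SL_UI_PORT simply changes the port in the URL.
ui_url() {
  port="${1:-}"
  if [ -z "$port" ]; then
    echo "http://localhost"
  else
    echo "http://localhost:$port"
  fi
}
```

From starlake-docker/docker, something like SL_UI_PORT=8080 sh -c "$(compose_up_command dagster)" would then start the Dagster stack, and ui_url 8080 prints the address to open.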

Stop the Starlake Docker Compose Stack

To stop all services, run the following from the same directory:

docker compose down
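Docker Compose decides which services to stop from the compose file it is given, so a stack started with -f is normally stopped with the same -f option. The sketch below builds the matching command; the function name is hypothetical and the optional argument is the compose file used at startup.

```shell
#!/bin/sh
# Hypothetical helper: print the "down" command that matches how the stack
# was started. With no argument, Docker Compose uses its default file.
compose_down_command() {
  compose_file="${1:-}"
  if [ -n "$compose_file" ]; then
    echo "docker compose -f $compose_file down"
  else
    echo "docker compose down"
  fi
}
```

For example, compose_down_command docker-compose-dagster.yml prints the command to stop a Dagster stack, while compose_down_command alone prints the Airflow variant shown above.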

Mount External Starlake Projects in Docker

If you have existing Starlake projects and want to access them from the Docker setup, mount their parent folder as an NFS volume.

Steps for macOS

  1. Run setup_mac_nfs.sh to expose your folder via NFS. Modify the root folder to share if needed (default: /user).
  2. In docker-compose.yml, comment out - external_projects_data:/external_projects.
  3. Uncomment - starlake-prj-nfs-mount:/external_projects.
  4. At the bottom of the file, modify the volume path to point to your parent folder.
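Steps 2 and 3 can be scripted rather than edited by hand. The following is a rough sketch that assumes the two volume lines appear in docker-compose.yml exactly as quoted above, with the NFS line commented out by a leading "#". The function name is hypothetical. It writes the result to standard output and leaves the input file untouched, so the output can be reviewed first. Step 4 (the volume path) still has to be edited manually.

```shell
#!/bin/sh
# Hypothetical helper: swap the default volume mount for the NFS mount.
# Assumes the lines match the text quoted in steps 2 and 3 of this guide.
switch_to_nfs_mount() {
  sed \
    -e 's|^\([[:space:]]*\)- external_projects_data:/external_projects|\1# - external_projects_data:/external_projects|' \
    -e 's|^\([[:space:]]*\)#[[:space:]]*- starlake-prj-nfs-mount:/external_projects|\1- starlake-prj-nfs-mount:/external_projects|' \
    "$1"
}
```

A cautious way to use it is switch_to_nfs_mount docker-compose.yml > docker-compose.nfs.yml, then compare the two files before replacing the original.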

Expected Folder Structure

The mounted parent folder must contain one or more Starlake project directories, each with a metadata/ subdirectory:

my_parent_folder/
├── sl_project_1/
│   └── metadata/
│       └── ...
├── sl_project_2/
│   └── metadata/
│       └── ...

If you have multiple parent folders, create a separate volume for each one.
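Before mounting, it can help to confirm that a parent folder matches this layout. The sketch below lists every immediate subdirectory that lacks a metadata/ folder. The function name is hypothetical, and the check only looks for the metadata/ directory described above, not its contents.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each immediate subdirectory of the
# parent folder that has no metadata/ subdirectory. Prints nothing if all do.
find_invalid_projects() {
  parent="$1"
  for dir in "$parent"/*/; do
    # Skip the unexpanded pattern when the parent has no subdirectories.
    [ -d "$dir" ] || continue
    if [ ! -d "${dir}metadata" ]; then
      basename "$dir"
    fi
  done
}
```

Running find_invalid_projects my_parent_folder and getting no output suggests every project directory has the expected metadata/ subdirectory.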

Known Limitations

  • You cannot mount Starlake projects directly under /external_projects. Place them inside a subfolder.
  • This feature has been tested on macOS and Linux only.

Deploy Starlake on Kubernetes

Kubernetes deployment is available via Helm charts. Contact the Starlake engineering team on Slack for the latest Helm chart and deployment instructions.

Deploy Starlake on AWS, GCP, or Azure

Cloud deployment via Terraform is available. Contact the Starlake engineering team on Slack for the latest Terraform scripts targeting AWS, GCP, or Azure.

Frequently Asked Questions

How do I set up Starlake with Airflow?

Clone the starlake-docker repository, navigate to the docker folder, and run docker compose up. Airflow is automatically installed and configured. Access the Starlake UI at http://localhost.

How do I set up Starlake with Dagster instead of Airflow?

Use the Dagster-specific compose file: docker compose -f docker-compose-dagster.yml up. Dagster replaces Airflow as the orchestrator in this configuration.

Can I deploy Starlake on Kubernetes?

Kubernetes deployment is available. Contact the Starlake engineering team on Slack to get the latest Helm chart.

What orchestration tools does Starlake support?

Starlake supports Airflow and Dagster. Both are pre-configured in the Docker Compose setup.

How do I mount external projects in the Starlake Docker setup?

On macOS, run setup_mac_nfs.sh to expose your folder via NFS. Then edit docker-compose.yml to use the NFS volume mount instead of the default volume. The mounted parent folder must contain one or more Starlake project directories, each with a metadata/ subdirectory.

How do I change the port for the Starlake UI?

Set the SL_UI_PORT environment variable before running Docker Compose. Example: SL_UI_PORT=8080 docker compose up.