Skip to main content

Starlake with Airflow / Dagster

tip

This is a Docker-based setup for Starlake with Airflow or Dagster pre-configured.

This is ideal for developers and data engineers who want to quickly set up a data pipeline environment without the need to separately install and configure Airflow or Dagster. Airflow and Dagster wxill be automatically installed and configured as part of the Starlake setup. It provides a streamlined setup process, allowing you to focus on building and managing your data projects rather than dealing with infrastructure configurations.

The core project is open source and can be found at starlake-ai/starlake

Starlake may be installed on your own infrastructure, whether on your on-premises or in the cloud.

You may install it on a server or a developer workstation using the docker-compose file provided to run the Starlake UI with Airflow or Dagster. Three installation options are available for the following platforms:

  • Starlake on Docker Compose
  • Starlake on Kubernetes
  • Starlake on AWS / GCP / Azure

Starlake on Docker Compose

  1. Clone this repository
git clone https://github.com/starlake-ai/starlake-docker.git
  1. Change directory to the cloned repository
cd starlake-docker/docker
  1. Run the following command to start Starlake UI with Airflow
docker compose up

to run Starlake UI with Dagster, run the following command

docker compose -f docker-compose-dagster.yml up

To run on a different port, set the SL_UI_PORT environment variable. For example, to run on port 8080, run the following command

SL_UI_PORT=8080 docker-compose up
  1. Open your browser and navigate to http://localhost or if you chose a different port http://localhost:$SL_UI_PORT to access Starlake UI

That's it! You have successfully started Starlake on your own infrastructure.

Mounting external projects

If you have any starlake container projects and want to mount it:

  • run setup_mac_nfs.sh if you are on mac in order to expose your folder via NFS. Modify the root folder to share if necessary. By default it is set to /user. This change is not specific to starlake and may be used in other container.
  • comment - external_projects_data:/external_projects
  • uncomment - starlake-prj-nfs-mount:/external_projects
  • go to the end of the file and modify the path of the volume to point to the starlake container folder

Starlake container folder should contain the starlake project folder:

 my_container_to_mount
|
- sl_project_1
|
- metadata
- ...
|
- sl_project_2
|
- metadata
- ...

If you have many container projects, create as many volume as needed.

Limit

  • Currently, we cannot mount the starlake projects directory directly under the mounted /external_projects. Subfolders of the mounted external project can't be accessed correctly.
  • This feature has been tested only on MacOS and linux at the moment

Stopping Starlake UI

To stop Starlake UI, run the following command in the same directory

docker compose down