Deploy Starlake Pipelines

Deploying a Starlake project to production involves three steps: setting environment variables, copying metadata to cloud storage, and deploying generated DAGs to your orchestrator. The Starlake Docker container runs on GCP, AWS, Azure, or on-premise with no vendor lock-in.

Deployment overview

  1. Set environment variables -- Configure SL_ENV and SL_ROOT in your Starlake Docker container.
  2. Copy metadata to cloud storage -- Upload your metadata folder to a cloud storage bucket.
  3. Deploy DAGs to your orchestrator -- Copy metadata/dags/generated to your orchestrator's DAGs folder.
  4. Verify -- Run a dry-run to confirm the pipeline works before enabling scheduling.

Set environment variables

Configure the Starlake Docker container with the target environment:

Note: Use environment variables to abstract your project from the target environment, as described in the Environment configuration guide.

Set SL_ENV to the target environment name and SL_ROOT to the project root directory inside the container.
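As a minimal sketch, the two variables can be passed to the container with `docker run -e`; the image tag, project path, and `load` command below are illustrative assumptions, not the exact names from your installation:

```shell
# Hypothetical invocation: image name, project path, and command are illustrative.
# SL_ENV selects the target environment; SL_ROOT is the project root inside the container.
docker run --rm \
  -e SL_ENV=PROD \
  -e SL_ROOT=/app/myproject \
  starlake/starlake:latest \
  load
```

Any value your metadata references through `${...}` substitution can be injected the same way, which keeps the project files identical across environments.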

Copy metadata to cloud storage

Copy your metadata folder to a cloud storage bucket accessible by the Starlake Docker container. The container runs load and transform commands against the metadata stored in cloud storage.

This approach is identical across cloud providers: the container reads the metadata from the bucket and executes the pipeline commands, so only the bucket URI changes between GCP, AWS, and Azure.
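The copy itself is one command per provider. The bucket and storage-account names below are hypothetical placeholders:

```shell
# Hypothetical bucket/account names; pick the command for your provider.
gsutil cp -r metadata gs://my-starlake-bucket/metadata    # GCP
aws s3 sync metadata s3://my-starlake-bucket/metadata     # AWS
az storage blob upload-batch \
  --account-name mystorageacct \
  --destination metadata \
  --source metadata                                       # Azure
```

Re-run the same command whenever the metadata changes; the container always picks up the latest copy from the bucket.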

Deploy DAGs to your orchestrator

Copy the contents of the metadata/dags/generated folder to your orchestrator's DAGs folder. The orchestrator schedules and triggers Starlake commands via the Docker container.
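Where the DAGs folder lives depends on the orchestrator. The paths and bucket names below are illustrative assumptions for the common Airflow deployments:

```shell
# Hypothetical paths and buckets: adjust to your orchestrator's DAGs location.

# Self-managed Airflow reads DAGs from a local folder:
cp -r metadata/dags/generated/* "$AIRFLOW_HOME/dags/"

# Cloud Composer (GCP) reads DAGs from a bucket:
gsutil cp -r metadata/dags/generated/* gs://my-composer-bucket/dags/

# Amazon MWAA reads DAGs from S3:
aws s3 sync metadata/dags/generated s3://my-mwaa-bucket/dags
```

After the copy, the orchestrator picks up the new DAG files on its next scan and begins scheduling them.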

For DAG generation details, see the Orchestration Tutorial. For customization, see Customizing DAG Generation.

Frequently Asked Questions

How do I deploy a Starlake project to production?

Copy the metadata folder to a cloud storage bucket, then run load/transform commands from the Starlake Docker container on any cloud provider.

What environment variables need to be configured?

SL_ENV (target environment) and SL_ROOT (project root directory) must be defined in the Starlake Docker container.

How do I deploy DAGs to the orchestrator?

Copy the contents of the metadata/dags/generated folder to your orchestrator's DAGs folder.

Does Starlake work with any cloud provider?

Yes. The Starlake Docker container can run on GCP, AWS, Azure, or on-premise.

How do I abstract differences between environments?

Use environment variables to parameterize your project. Environment configuration is described in the Environment section of the documentation.

What is the typical deployment flow?

  1. Configure SL_ENV and SL_ROOT.
  2. Copy metadata to cloud storage.
  3. Copy the generated DAGs to the orchestrator.
  4. The orchestrator runs Starlake commands via the Docker container.
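The flow above can be condensed into a single deployment script. This sketch assumes GCP with Cloud Composer; all bucket, path, and image names are illustrative:

```shell
# End-to-end deployment sketch (GCP example; names are hypothetical).

# 1. Environment variables for the Starlake container.
export SL_ENV=PROD
export SL_ROOT=/app/myproject

# 2. Metadata to cloud storage.
gsutil cp -r metadata gs://my-starlake-bucket/metadata

# 3. Generated DAGs to the orchestrator's DAGs bucket.
gsutil cp -r metadata/dags/generated/* gs://my-composer-bucket/dags/

# 4. The orchestrator then triggers Starlake commands via the container, e.g.:
#    docker run --rm -e SL_ENV -e SL_ROOT starlake/starlake:latest load
```

The same script ports to AWS or Azure by swapping the copy commands for `aws s3 sync` or `az storage blob upload-batch`.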