Orchestrate Load Jobs

Starlake generates orchestration DAGs automatically from your YAML load and transform definitions. You do not write DAG code manually. The starlake dag-generate command analyzes job dependencies and produces ready-to-deploy DAG files for your orchestrator.

Supported orchestrators:

  • Airflow -- Generates Python DAG files.
  • Dagster -- Generates Dagster job definitions.
  • Snowflake Tasks -- Generates Snowflake Task definitions.

Why automatic DAG generation matters

In a traditional data pipeline, you must write and maintain orchestration code alongside your data logic. Starlake eliminates this duplication: your YAML definitions already declare the tables, domains and dependencies. The dag-generate command derives the execution graph from these declarations.

This means:

  • Zero DAG code to write or maintain.
  • Dependencies between load and transform jobs are resolved automatically.
  • Adding a new table or domain updates the DAG on the next generation.

How to generate and deploy DAGs

  1. Ensure all YAML configurations are complete -- Table, domain and transform definitions must be in place.
  2. Run the generation command:

     starlake dag-generate

  3. Deploy the generated files -- Copy the output to your Airflow DAGs folder, Dagster repository or Snowflake Tasks environment.

The command analyzes dependencies between your load and transform jobs and produces the correct execution order.
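The dependency resolution described above can be sketched in a few lines of Python. This is a conceptual illustration only: the job names and the dictionary structure below are hypothetical, not Starlake's actual YAML schema or internal implementation. Given jobs that declare which upstream tables they read from, a topological sort yields an execution order in which every job runs after its dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies, as they might be declared across YAML files:
# each transform lists the tables it reads from; load jobs have none.
dependencies = {
    "load.sales.orders": set(),
    "load.sales.customers": set(),
    "transform.kpi.revenue": {"load.sales.orders", "load.sales.customers"},
    "transform.kpi.top_customers": {"transform.kpi.revenue"},
}

# A topological sort of this graph gives a valid execution order:
# both load jobs first, then revenue, then top_customers.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Adding a new table or transform to the declarations simply adds a node (and its edges) to this graph, which is why regeneration picks it up without any hand-written orchestration code.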

Next Steps

Continue with the Orchestration Tutorial and DAG Customization pages.

Frequently Asked Questions

How do I generate DAGs for load jobs in Starlake?

Run the starlake dag-generate command. It analyzes dependencies between load and transform jobs and produces DAG files for the configured orchestrator.

Which orchestrators does Starlake support for load jobs?

Starlake supports Airflow, Dagster and Snowflake Tasks for load job orchestration.

Do I need to write Airflow DAGs manually for Starlake?

No. The starlake dag-generate command automatically generates DAGs from the YAML definitions in your project.

Are dependencies between load and transform jobs handled automatically?

Yes. DAG generation analyzes the dependencies declared in YAML files and produces a correct execution graph.

Where can I find the full orchestration documentation for Starlake?

In the Orchestration Tutorial and DAG Customization pages.