📄️ SQL Transform Tutorial: Create KPI Tables with Starlake
Step-by-step tutorial: transform data with SQL SELECT statements in Starlake. Create revenue, product, and order KPI tables. Covers automatic dependency resolution, the lineage graph, and recursive execution. Works with DuckDB, BigQuery, Snowflake.
📄️ Starlake Transform YAML Configuration: Write Strategies, Partitioning, ACL
Complete YAML configuration reference for Starlake transforms. Set write strategies (overwrite, upsert, append), table partitioning, clustering, row-level security, expectations, and cross-database writes.
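A minimal sketch of what such a transform configuration might look like. The file path, task name, and key columns below are hypothetical; field names follow the options named in the description (write strategy, partitioning, clustering), but consult the reference page for the exact schema.

```yaml
# Hypothetical transform config, e.g. metadata/transform/kpi/revenue_summary.sl.yml
# (illustrative only -- verify key names against the Starlake reference)
task:
  name: revenue_summary
  writeStrategy:
    type: UPSERT_BY_KEY      # or OVERWRITE / APPEND
    key: [order_id]          # assumed upsert key for this example
  sink:
    partition: [order_date]  # table partitioning column
    clustering: [customer_id]
```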
📄️ SQL Transform Syntax: SELECT, Incremental Models, and Custom SQL in Starlake
Write SQL transforms in Starlake using standard SELECT statements. Configure incremental models with sl_start_date/sl_end_date, document calculated columns, and use custom MERGE/INSERT with parseSQL: false.
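As a sketch, an incremental model of the kind described might bound its input on the `sl_start_date`/`sl_end_date` variables, which Starlake substitutes at run time. The table and column names here are assumptions for illustration.

```sql
-- Hypothetical incremental transform: only rows in the current
-- execution window are processed on each run.
SELECT order_id, customer_id, amount, order_date
FROM sales.orders
WHERE order_date BETWEEN '{{ sl_start_date }}' AND '{{ sl_end_date }}'
```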
📄️ Python Transforms in Starlake: PySpark DataFrame Pipelines
Run Python transforms in Starlake with PySpark. Pass arguments via --options, register the resulting DataFrame as the SL_THIS temporary view, and materialize results to any table. Same YAML config as SQL transforms.
📄️ Export Starlake Transform Results to CSV, Parquet, or Another Database
Export Starlake transform results to CSV, JSON, Parquet, or Avro files. Write to cloud storage (GCS, S3) or another database using sink.connectionRef. YAML configuration examples included.
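A minimal sketch of an export configuration using `sink.connectionRef` as named in the description. The connection name, format key, and task name are hypothetical; check the export page for the exact sink schema.

```yaml
# Hypothetical export task: write transform output as Parquet
# to a connection defined elsewhere (e.g. an S3 bucket).
task:
  name: export_kpis
  sink:
    connectionRef: s3_exports  # assumed connection name
    format: parquet            # or csv / json / avro
```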
📄️ Orchestrate Starlake Transforms: Automatic DAG Generation for Airflow, Dagster, Snowflake Tasks
Generate execution DAGs for Starlake SQL and Python transforms. Automatic dependency resolution from SQL analysis. Supports Airflow, Dagster, and Snowflake Tasks. Includes load and transform jobs in the correct order.