Getting Started

Starlake Skills is an open-source Claude Code plugin that provides 48 specialized skills for building, configuring, and operating Starlake data pipelines.

Whether you're setting up a new data project, configuring ingestion pipelines, writing transformations, or deploying orchestration DAGs, Starlake Skills gives your AI assistant detailed, task-specific knowledge of the Starlake platform.

What You Can Do

| Category | Skills | Examples |
| --- | --- | --- |
| Ingestion & Loading | 9 skills | Auto-infer schemas, load CSV/JSON/XML, Kafka, Elasticsearch |
| Transformation | 2 skills | SQL/Python transformations with write strategies |
| Extraction | 5 skills | Extract schemas and data from BigQuery, JDBC sources |
| Schema Management | 5 skills | Bootstrap projects, Excel-to-YAML, DDL generation |
| Data Quality | 1 skill | Expectations with Jinja2 macros and validation patterns |
| Lineage | 4 skills | Column-level, table-level, and ACL dependency tracking |
| Operations | 8 skills | Validation, metrics, freshness, GizmoSQL, migrations |
| Security | 2 skills | IAM policies, RLS, CLS, privacy transformations |
| Orchestration | 2 skills | Airflow and Dagster DAG generation and deployment |
| Utilities | 5 skills | Parquet conversion, comparisons, site generation |

Supported Platforms

Data Warehouses

  • BigQuery — Native and Spark loaders
  • Snowflake — JDBC connectivity
  • DuckDB — Embedded SQL engine
  • PostgreSQL — JDBC connectivity
  • Redshift — JDBC connectivity
  • Databricks — FS and Spark engines

Processing Engines

  • Spark — Distributed processing
  • Native — Built-in Starlake engine
  • DuckDB — Embedded analytical SQL

Orchestration

  • Apache Airflow — Python DAG generation
  • Dagster — Asset-based orchestration

Data Formats

CSV, JSON, XML, Parquet, Elasticsearch indices, Kafka topics

How It Works

Starlake Skills installs into Claude Code as a plugin. Once it is installed, you can ask Claude questions about any Starlake topic in natural language and receive guidance along with production-ready configurations.

You: How do I load CSV files from GCS into BigQuery with deduplication?

Claude: [Uses the `load` skill to provide complete YAML configuration
with UPSERT_BY_KEY_AND_TIMESTAMP write strategy, domain config,
and schema definitions]
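
To make the deduplication in the exchange above concrete, here is a minimal Python sketch of what an upsert keyed on a business key and a timestamp generally means: for each key, keep only the row with the most recent timestamp. It illustrates the semantics only and is not Starlake's implementation or its configuration syntax. The function name `upsert_by_key_and_timestamp` and the field names `order_id` and `updated_at` are hypothetical.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def upsert_by_key_and_timestamp(
    existing: Iterable[Row],
    incoming: Iterable[Row],
    key: str,
    timestamp: str,
) -> List[Row]:
    """Merge incoming rows into existing rows, keeping the newest row per key.

    Illustrative only: a real warehouse would do this with a MERGE statement
    or an equivalent engine-specific operation.
    """
    latest: Dict[Any, Row] = {}
    # Existing rows are visited first, so an incoming row with an equal
    # timestamp replaces the stored one (>= comparison).
    for row in list(existing) + list(incoming):
        current = latest.get(row[key])
        if current is None or row[timestamp] >= current[timestamp]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical sample data: ISO-8601 date strings compare correctly as text.
existing_rows = [
    {"order_id": 1, "updated_at": "2024-01-01", "status": "new"},
    {"order_id": 2, "updated_at": "2024-01-05", "status": "paid"},
]
incoming_rows = [
    {"order_id": 1, "updated_at": "2024-01-03", "status": "shipped"},  # newer, replaces
    {"order_id": 2, "updated_at": "2024-01-02", "status": "new"},      # older, ignored
    {"order_id": 3, "updated_at": "2024-01-04", "status": "new"},      # new key, inserted
]

merged = upsert_by_key_and_timestamp(
    existing_rows, incoming_rows, key="order_id", timestamp="updated_at"
)
for row in merged:
    print(row)
```

In this sketch a late-arriving, older record for an existing key is dropped rather than overwriting newer data, which is the property that makes a key-plus-timestamp strategy suitable for deduplicating repeated file deliveries.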

Each skill contains detailed knowledge about:

  • CLI command syntax and all available options
  • YAML configuration patterns with examples
  • Write strategies, sink configurations, and engine-specific behaviors
  • Best practices for production deployments
  • Troubleshooting guidance

Next Steps

  • Quickstart — Install and use your first skill in 5 minutes
  • Setup — Detailed installation and configuration options
  • Skills Catalog — Browse all 48 skills by category