Native Load Mode

Native load mode bypasses the Spark validation layer and sends files directly to the data warehouse engine (BigQuery, Snowflake, Databricks or DuckDB). This improves ingestion speed when type validation and attribute scripts are not required.

How Starlake loads data

By default, Starlake uses Spark for data validation (type checking, regex patterns, attribute scripts) and then delegates the write to the target data warehouse engine. This two-step process guarantees data quality but adds overhead.

Native load mode skips the Spark validation step entirely. The data warehouse's own engine handles the file ingestion directly. Use native mode when:

  • Source data quality is already guaranteed upstream.
  • You need maximum ingestion throughput.
  • You do not rely on computed columns, type validation or row rejection.

Trade-offs: native vs Spark mode

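| Capability                   | Spark mode | Native mode |
|------------------------------|------------|-------------|
| Type validation (regex)      | Yes        | No          |
| Computed columns (script)    | Yes        | No          |
| Row rejection to audit table | Yes        | No          |
| Ingestion speed              | Standard   | Faster      |
| File metadata columns        | Yes        | No          |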

When you need validation features, keep the default Spark mode. See Type Validation for details on what native mode disables.
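The trade-off table can be read as a simple check: if a table depends on any capability that only Spark mode provides, native mode is not an option. The sketch below is a hypothetical helper, not part of Starlake; the feature labels (`type_validation`, `computed_columns`, and so on) are names invented here to mirror the table rows.

```python
# Hypothetical labels for the capabilities the trade-off table marks
# "Yes" for Spark mode and "No" for native mode. Not Starlake identifiers.
SPARK_ONLY_FEATURES = frozenset({
    "type_validation",
    "computed_columns",
    "row_rejection",
    "file_metadata_columns",
})


def can_use_native(required_features):
    """Return (ok, blockers).

    ok is True when none of the required features is Spark-only;
    blockers lists the Spark-only features that were requested, sorted.
    """
    blockers = sorted(set(required_features) & SPARK_ONLY_FEATURES)
    return (not blockers, blockers)


if __name__ == "__main__":
    # A table that needs row rejection cannot use native mode.
    print(can_use_native(["row_rejection"]))
    # A table with no Spark-only requirements can.
    print(can_use_native([]))
```

A check like this could run in a review step before switching a table to `loader: native`, so that a dependency on validation or audit features is caught early.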

Configuration priority

The loader setting follows a Table > Domain > Project priority. The most specific configuration wins.
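To make the priority rule concrete, here is a minimal sketch of how the effective loader could be resolved from the three levels. It illustrates the rule only and is not Starlake's actual implementation: the function name `effective_loader` is hypothetical, and the fallback value `"spark"` is an assumption based on Spark being the default mode.

```python
from typing import Optional


def effective_loader(
    table: Optional[str] = None,
    domain: Optional[str] = None,
    project: Optional[str] = None,
    default: str = "spark",  # assumed name for the default Spark mode
) -> str:
    """Return the loader from the most specific level that defines one.

    Levels are checked in Table > Domain > Project order; None means
    the level does not set a loader.
    """
    for value in (table, domain, project):
        if value is not None:
            return value
    return default


if __name__ == "__main__":
    # Project enables native load, one table opts back out.
    print(effective_loader(table="spark", project="native"))
    # Domain-level setting applies when the table sets nothing.
    print(effective_loader(domain="native"))
```

In this sketch, a level that leaves the loader unset falls through to the next, broader level, which matches the statement that the most specific configuration wins.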

Table level

Set loader: native in the table YAML to enable native load for a single table:

metadata/load/<domain>/<table>.sl.yml
table:
  metadata:
    ...
    loader: native
    ...
  attributes:
    ...

Domain level

Set loader: native in the domain configuration to enable native load for all tables in the domain:

metadata/load/<domain>/_config.sl.yml
load:
  metadata:
    ...
    loader: native
    ...
  tables:
    ...

Tables within the domain can override this setting individually.

Project level

Set loader: native in application.sl.yml to enable native load for the entire project:

metadata/application.sl.yml
application:
  ...
  loader: native
  ...

Individual domains or tables can override this setting.

Frequently Asked Questions

What is native load mode in Starlake?

Native load mode lets you load files directly into the data warehouse without Spark validation. This can improve performance when validation is not needed.

How do I enable native load for a single table?

Add loader: native in the metadata section of the table YAML file (metadata/load/<domain>/<table>.sl.yml).

Can I enable native load mode for an entire domain?

Yes. Set loader: native in the _config.sl.yml file of the domain. All tables in the domain will use the native loader unless overridden at the table level.

How do I enable native load for the whole project?

Set loader: native in application.sl.yml. All tables in the project will use the native loader unless overridden at the domain or table level.

What is the configuration priority order for the loader?

Table > Domain > Project. The most specific configuration wins.

When should I avoid native load mode?

When you need type validation, on-the-fly transformations (attribute scripts), rejection of invalid rows to the audit table, or file metadata columns. Native load mode skips the Spark validation step, so none of these features are available.