extract-rest-data

Synopsis

starlake extract-rest-data [options]

Description

Extracts data from REST API endpoints into CSV files (the default) or JSON Lines files when --outputFormat jsonl is set. Supports pagination (offset, cursor, link header, page number), authentication (bearer, API key, basic, OAuth2), rate limiting, and parent-child endpoint relationships.

The extracted CSV files can then be ingested using starlake load.

Examples

starlake.sh extract-rest-data --config my-rest-api --outputDir /tmp/api-data
starlake.sh extract-rest-data --config my-rest-api --outputDir /tmp/api-data --limit 1000

Parameters

Parameter	Cardinality	Description
--config `<value>`	Required	REST API extraction config file (in metadata/extract/)
--outputDir `<value>`	Required	Where to output CSV files
--limit `<value>`	Optional	Limit number of records per endpoint
--parallelism `<value>`	Optional	Parallelism level for endpoint extraction. Default: 16
--incremental `<value>`	Optional	Only extract new data since last extraction. Uses incrementalField from endpoint config.
--resume `<value>`	Optional	Resume extraction from where a previous run failed, skipping already-extracted pages.
--outputFormat `<value>`	Optional	Output format: csv (default) or jsonl (JSON Lines, preserves nested structures)
--reportFormat `<value>`	Optional	Report format: console, json, html
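
As an illustration of how the optional parameters above combine, the following sketch assembles a command that writes JSON Lines output, lowers the parallelism, and requests a JSON report. It only prints the command (a dry run) and does not execute it. The config name and output directory are reused from the Examples section; the parallelism value of 4 and the choice of report format are arbitrary assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: build the command line and print it without executing it.
# CONFIG and OUTPUT_DIR reuse the values from the Examples section.
CONFIG="my-rest-api"
OUTPUT_DIR="/tmp/api-data"

# --outputFormat jsonl preserves nested structures (see the table above).
# --parallelism 4 is an arbitrary value chosen for illustration (the default is 16).
# --reportFormat json is one of the listed report formats.
CMD="starlake.sh extract-rest-data --config $CONFIG --outputDir $OUTPUT_DIR --outputFormat jsonl --parallelism 4 --reportFormat json"

printf '%s\n' "$CMD"
```

To actually run the extraction, execute the printed command in a shell where starlake.sh is available.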
