
extract-rest-data

Synopsis

starlake extract-rest-data [options]

Description

Extract data from REST API endpoints into CSV files. Supports pagination (offset, cursor, link header, page number), authentication (bearer, API key, basic, OAuth2), rate limiting, and parent-child endpoint relationships.

The extracted CSV files can then be ingested using starlake load.
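The extract-then-load sequence described above can be sketched as a small wrapper script. This is only an illustration: the `run` and `extract_then_load` helpers and the `STARLAKE` and `DRY_RUN` variables are hypothetical names, not part of starlake, and the sketch assumes the project is configured so that `starlake load` picks up the extracted files. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: extract REST data to CSV, then ingest it with `starlake load`.
# STARLAKE and DRY_RUN are hypothetical helper variables, not starlake settings.
STARLAKE="${STARLAKE:-starlake}"   # override if your launcher is named differently
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = extraction config name, $2 = output directory for the CSV files.
extract_then_load() {
  run "$STARLAKE" extract-rest-data --config "$1" --outputDir "$2" &&
  # Assumption: the project is set up so that load finds the extracted CSV files.
  run "$STARLAKE" load
}

extract_then_load my-rest-api /tmp/api-data
```

Chaining the two steps with `&&` means the load step is skipped when the extraction fails.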

Examples

starlake.sh extract-rest-data --config my-rest-api --outputDir /tmp/api-data
starlake.sh extract-rest-data --config my-rest-api --outputDir /tmp/api-data --limit 1000

Parameters

| Parameter | Cardinality | Description |
|---|---|---|
| --config <value> | Required | REST API extraction config file (in metadata/extract/) |
| --outputDir <value> | Required | Where to output CSV files |
| --limit <value> | Optional | Limit number of records per endpoint |
| --parallelism <value> | Optional | Parallelism level for endpoint extraction. Default: 16 |
| --incremental <value> | Optional | Only extract new data since last extraction. Uses incrementalField from endpoint config. |
| --reportFormat <value> | Optional | Report format: console, json, html |
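
As a rough sketch of how the optional parameters combine with the two required ones, the helper below assembles a command line and prints it without running anything. The `build_extract_cmd` function and the `STARLAKE` variable are hypothetical names introduced for illustration; only the flags themselves come from the table above. `--incremental` is left out because its accepted values are not stated here.

```shell
#!/bin/sh
# Sketch: print an extract-rest-data command line, appending optional flags
# only when a value is supplied. Nothing is executed.
STARLAKE="${STARLAKE:-starlake}"   # hypothetical override for the launcher name

build_extract_cmd() {
  # $1 = config name, $2 = output directory (both required)
  # $3 = limit, $4 = parallelism, $5 = report format (optional; empty = omit)
  # Note: values are not shell-quoted, so this sketch assumes no spaces in them.
  cmd="$STARLAKE extract-rest-data --config $1 --outputDir $2"
  if [ -n "${3:-}" ]; then cmd="$cmd --limit $3"; fi
  if [ -n "${4:-}" ]; then cmd="$cmd --parallelism $4"; fi
  if [ -n "${5:-}" ]; then cmd="$cmd --reportFormat $5"; fi
  printf '%s\n' "$cmd"
}

# Cap each endpoint at 1000 records, use 4 parallel extractions, JSON report.
build_extract_cmd my-rest-api /tmp/api-data 1000 4 json
```

Passing an empty string for an optional position, for example `build_extract_cmd my-rest-api /tmp/api-data 1000 "" json`, leaves that flag out so the documented default applies.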