I/O Subsystem¶
Load and extract data into/from CrateDB.
About¶
Using the InfluxDB and MongoDB I/O subsystems, you can transfer data from InfluxDB and MongoDB to CrateDB and CrateDB Cloud.
What’s inside¶
A one-stop command ctk load table
to load data into database tables.
Installation¶
Latest release.
pip install --upgrade 'cratedb-toolkit[all]'
Development version.
pip install --upgrade 'cratedb-toolkit[all] @ git+https://github.com/crate/cratedb-toolkit.git'
General Notes¶
By default, the table name will be derived from the name of the input resource.
If you would like to specify a different table name, use the --table
command
line option, or the CRATEDB_TABLE
environment variable.
When aiming to write into a table in a different database schema, use the
--schema
command line option, or the CRATEDB_SCHEMA
environment variable.
When omitting this parameter, the default value doc
will be used.
Cloud Import API¶
Using the CrateDB Cloud Import API, you can import files in CSV, JSON, and Parquet formats.
Prerequisites¶
Authenticate with CrateDB Cloud using one of those identity providers: Cognito, Azure AD, GitHub, Google.
croud login --idp azuread
To discover the list of available database clusters.
croud clusters list
Define the cluster id of your CrateDB Cloud Cluster you are aiming to connect to, and its connection credentials.
export CRATEDB_CLOUD_CLUSTER_ID='e1e38d92-a650-48f1-8a70-8133f2d5c400'
Define authentication credentials, to satisfy the ctk shell
command you will
use to inspect the database content after transferring data.
export CRATEDB_USERNAME='admin'
export CRATEDB_PASSWORD='3$MJ5fALP8bNOYCYBMLOrzd&'
Usage¶
Load data into database table.
ctk load table 'https://cdn.crate.io/downloads/datasets/cratedb-datasets/cloud-tutorials/data_weather.csv.gz'
ctk load table 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/data_marketing.json.gz'
ctk load table 'https://cdn.crate.io/downloads/datasets/cratedb-datasets/timeseries/yc.2019.07-tiny.parquet.gz'
Query and aggregate data using SQL.
ctk shell --command="SELECT * FROM data_weather LIMIT 10;"
ctk shell --command="SELECT * FROM data_weather LIMIT 10;" --format=csv
ctk shell --command="SELECT * FROM data_weather LIMIT 10;" --format=json
Backlog¶
Exercise data imports from AWS S3 and other Object Storage providers.