ClickHouse

Preparation

To allow Shaped to connect to your ClickHouse database, create a read-only user and provide its credentials through the Create Dataset endpoint. The following commands create a read-only user on a ClickHouse database:

-- 1. Create a new user with a password
CREATE USER read_only_user IDENTIFIED BY 'secure_password1!';

-- 2. Grant SELECT privileges on the database you want to access
GRANT SELECT ON database_name.* TO read_only_user;

-- 3. Alternatively, to restrict access to a specific table, grant SELECT on that table instead of step 2:
GRANT SELECT ON database_name.table_name TO read_only_user;

Dataset Configuration

Required fields

| Field | Example | Description |
| --- | --- | --- |
| schema_type | CLICKHOUSE | Specifies the connector schema type, in this case "CLICKHOUSE". |
| table | events | The name of the table to sync. |
| user | read_only_user | Access account username. |
| password | secure_password1! | Access account password. |
| host | clickhouse.example.com | Database hostname. |
| port | 9440 | Database port. The example, 9440, is ClickHouse's default secure native TCP port; the default for ClickHouse HTTPS is 8443 and for HTTP is 8123. |
| replication_key | created_at | The name of the column that contains a datetime key or ascending ID used to order data during incremental syncs. |
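
To illustrate what a replication key is for, here is a minimal Python sketch of incremental selection. It is not Shaped's implementation, which this page does not describe. The function `incremental_batch`, the `last_synced` bookmark, and the in-memory rows are hypothetical names introduced only for this example.

```python
from datetime import datetime


def incremental_batch(rows, replication_key, last_synced=None):
    """Return rows whose replication key is greater than the bookmark, in ascending order.

    Hypothetical helper: a generic illustration of incremental syncing,
    not the logic Shaped actually runs.
    """
    if last_synced is not None:
        rows = [r for r in rows if r[replication_key] > last_synced]
    return sorted(rows, key=lambda r: r[replication_key])


rows = [
    {"eventId": 1, "created_at": datetime(2024, 1, 1, 9, 0)},
    {"eventId": 2, "created_at": datetime(2024, 1, 1, 10, 0)},
    {"eventId": 3, "created_at": datetime(2024, 1, 1, 11, 0)},
]

# First sync: no bookmark yet, so every row is selected.
first = incremental_batch(rows, "created_at")
bookmark = first[-1]["created_at"]

# A new row arrives; the next sync selects only rows past the bookmark.
rows.append({"eventId": 4, "created_at": datetime(2024, 1, 1, 12, 0)})
second = incremental_batch(rows, "created_at", last_synced=bookmark)
print([r["eventId"] for r in second])
```

Under this model, a column that only ever increases, such as an insertion timestamp or an auto-incrementing ID, is a natural choice: rows with a key at or below the bookmark are never selected again.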

Optional fields

| Field | Example | Description |
| --- | --- | --- |
| database | analytics | The name of the database that contains the table to sync. If not specified, the default database is used. |
| columns | ["userId", "eventType", "timestamp", "properties"] | The names of the columns you wish to sync from ClickHouse into Shaped. If not specified, all columns are synced. |
| unique_keys | ["eventId"] | A list of columns that uniquely identify a row in the table. If duplicate rows are inserted with these keys, the latest row is used. |
| description | "User events data" | A description of the dataset. |
| schedule_interval | "@hourly" | The schedule on which to sync data. Defaults to "@hourly". |
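
The effect of unique_keys can be pictured with a short Python sketch. It assumes that "latest" means the row with the greatest replication key value, which this page does not state explicitly. The helper `dedupe_latest` and the sample rows are hypothetical.

```python
def dedupe_latest(rows, unique_keys, replication_key):
    """Keep one row per unique-key combination, preferring the greatest replication key.

    Hypothetical helper: assumes "latest" is decided by the replication key.
    """
    latest = {}
    for row in sorted(rows, key=lambda r: r[replication_key]):
        key = tuple(row[k] for k in unique_keys)
        latest[key] = row  # a later row overwrites an earlier one with the same key
    return list(latest.values())


rows = [
    {"eventId": "e1", "eventType": "view", "created_at": 1},
    {"eventId": "e2", "eventType": "click", "created_at": 2},
    {"eventId": "e1", "eventType": "purchase", "created_at": 3},
]

result = dedupe_latest(rows, ["eventId"], "created_at")
for row in result:
    print(row["eventId"], row["eventType"])
```

Here the two rows that share eventId "e1" collapse into the one with the higher created_at value, while "e2" is kept unchanged.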

Dataset Creation Example

Below is an example of a ClickHouse dataset connector configuration:

name: clickhouse_events_dataset
schema_type: CLICKHOUSE
table: users
user: read_only_user
password: secure_password1!
host: clickhouse.example.com
port: 9440
database: analytics
replication_key: updated_at
unique_keys:
- userId
columns:
- userId
- eventType
- timestamp
- properties
- created_at
- updated_at

The following command uses the Shaped CLI to create the ClickHouse dataset from the configuration above (saved here as dataset.yaml) and begin syncing data from ClickHouse into Shaped:

shaped create-dataset --file dataset.yaml
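
Before running the command, it can help to confirm that the configuration contains every required field. The Python sketch below is a hypothetical local check, not part of the Shaped CLI. It takes the field names from the required fields table and uses a plain dictionary in place of the parsed YAML file to stay dependency-free.

```python
# Field names taken from the "Required fields" table on this page.
REQUIRED_FIELDS = (
    "schema_type",
    "table",
    "user",
    "password",
    "host",
    "port",
    "replication_key",
)


def missing_required_fields(config):
    """Return the required fields that are absent or empty in the config (hypothetical helper)."""
    return [f for f in REQUIRED_FIELDS if config.get(f) in (None, "")]


# Mirrors the example configuration above.
config = {
    "name": "clickhouse_events_dataset",
    "schema_type": "CLICKHOUSE",
    "table": "users",
    "user": "read_only_user",
    "password": "secure_password1!",
    "host": "clickhouse.example.com",
    "port": 9440,
    "database": "analytics",
    "replication_key": "updated_at",
}

# A copy with two required fields removed, to show what the check reports.
incomplete = {k: v for k, v in config.items() if k not in ("host", "port")}

print(missing_required_fields(config))
print(missing_required_fields(incomplete))
```

An empty list means every field listed as required is present. The check says nothing about whether the values themselves are valid for your ClickHouse instance.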