Snowflake
Preparation
To allow Shaped to connect to your Snowflake database, you need to create a read-only user and share those credentials through the Create Dataset endpoint. You can create a read-only user on a Snowflake database instance with the following commands:
-- 1. Create a role that will be used to grant access to the specific table.
CREATE ROLE read_only_role;
-- 2. Create a user and grant the read-only role to the user.
CREATE USER read_only_user
PASSWORD = '<password>'
DEFAULT_ROLE = read_only_role
MUST_CHANGE_PASSWORD = FALSE;
-- 3. Grant the newly created role to the user.
GRANT ROLE read_only_role TO USER read_only_user;
-- 4. Grant the SELECT permission on the specific table to the read-only role.
GRANT SELECT ON TABLE <database_name>.<schema_name>.<table_name> TO ROLE read_only_role;
-- 5. Grant USAGE permission on the schema and database containing the table.
GRANT USAGE ON SCHEMA <database_name>.<schema_name> TO ROLE read_only_role;
GRANT USAGE ON DATABASE <database_name> TO ROLE read_only_role;
Replace password, database_name, schema_name, and table_name with the appropriate values for your Snowflake account.
Dataset Configuration
Required fields
Field | Example | Description |
---|---|---|
schema_type | SNOWFLAKE | Specifies the connector schema type, in this case "Snowflake". |
table | analytics.events | The name of the table to sync, prefixed with the schema name if applicable. |
user | your_user | Access account username. |
password | pAssw0rd1! | Access account password. |
account | snowflake-access | Your Snowflake account identifier. See Account Identifiers. |
database | movielens | The name of the database that contains table to sync. |
database. | replication_key | updated_at | The name of the column that contains a datetime key or ascending id for ordering data during incremental syncs. Note if your column name is uppercase, you should convert to lowercase.
Optional fields
Field | Example | Description |
---|---|---|
columns | ["productId", "color", "brand", "stockLevel"] | The name of the columns you wish to sync from Snowflake into Shaped. If not specified, all columns will be synced. Note if your column name(s) are uppercase, you should convert to lowercase. |
database_schema | public | The name of the schema that contains table to sync. |
role | read_only_role | The name of the role to use when connecting to the database. |
warehouse | your_warehouse | The name of the warehouse to use when connecting to the database. |
unique_keys | ["productId"] | Specify a list of columns that uniquely identify a row in the table, if duplicate rows are inserted with these keys, the latest row will be used. |
batch_size | 100000 | The number of rows to fetch from the database in each batch, changing this can improve throughput for large tables. The default is 10000. |
Dataset Creation Example
Below is an example of a Snowflake dataset connector configuration:
name: your_snowflake_dataset
schema_type: SNOWFLAKE
table: movies
user: your_user
password: pAssw0rd1!
account: snowflake-access
database: movielens
replication_key: updated_at
The following payload will create a Snowflake dataset and begin syncing data from Shaped using the Shaped CLI.
shaped create-dataset --file dataset.yaml