Skip to main content

PostgreSQL

Preparation

To allow Shaped to connect to your Postgres database, you need to create a read-only user and pass the details into a Create Dataset request. You can create a read-only user on your Postgres DB with the following commands:

# 1. Create a new user. 
CREATE USER read_only_user WITH PASSWORD 'secure_password1!';

# 2. Grant the user connection to the database.
GRANT CONNECT ON DATABASE database_name TO read_only_user;

# 3. Grant the user usage of the schema.
GRANT USAGE ON SCHEMA public TO read_only_user;

# 4. Grant the user read access to all the tables in the schema. Note you can also
# restrict this to your specific user, item and interaction views.
GRANT SELECT ON ALL TABLES IN SCHEMA public TO read_only_user;

# 5. Grant the group access to future tables in the schema.
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO read_only_user;

Dataset Configuration

Required fields

FieldExampleDescription
schema_typePOSTGRESSpecifies the connector schema type, in this case "Postgres".
tablemoviesThe name of the table to sync.
useryour_userAccess account username.
passwordpAssw0rd1!Access account password.
hostmy-postgres-db.xxxxxxx.us-east-2.rds.amazonaws.comDatabase hostname.
port5432Database port (the default for Postgres is 5432).
databasemovielensThe name of the database that contains table to sync.
replication_keyupdated_atThe name of the column that contains a datetime key or ascending id for ordering data during incremental syncs.

Optional fields

FieldExampleDescription
database_schemapublicThe name of the schema that contains table to sync.
columns["productId", "color", "brand", "stockLevel"]The name of the columns you wish to sync from Postgres into Shaped. If not specified, all columns will be synced.
unique_keys["productId"]Specify a list of columns that uniquely identify a row in the table, if duplicate rows are inserted with these keys, the latest row will be used.
ssl_certificate_authority(certificate content)The certificate authority that should be used to verify the server's identity.
ssl_client_certificate(certificate content)The certificate that should be used to verify your identity to the server.
ssl_client_private_key(key content)The private key for the certificate you provided.
ssh_tunnel_hostssh.example.comThe hostname of the SSH server to tunnel through.
ssh_tunnel_port22The port of the SSH server (default is 22).
ssh_tunnel_usernameyour_ssh_usernameThe username for SSH authentication.
ssh_tunnel_passwordyour_ssh_passwordThe password for SSH authentication (if not using private key).
ssh_tunnel_private_key(key content)The content of the private key for SSH authentication. Can be used instead of private_key_path.
ssh_tunnel_private_key_password(password)The password for the private key (if using ssh_tunnel_private_key).

Dataset Creation Example

Below is an example of a Postgres dataset connector configuration:

name: your_postgres_dataset
schema_type: POSTGRES
table: movies
user: your_user
password: pAssw0rd1!
host: my-postgres-db.xxxxxxx.us-east-2.rds.amazonaws.com
port: 5432
database: movielens
replication_key: updated_at

The following payload will create a Postgres dataset and begin syncing data from Shaped using the Shaped CLI.

shaped create-dataset --file dataset.yaml