Skip to main content

BigQuery

Shaped supports BigQuery through a Dataset Connector which will automatically sync data between your BigQuery data warehouse and Shaped.

Preparation

To allow Shaped to connect to your BigQuery data warehouse, you need to grant Shaped’s GCP service account read-only access to your project, dataset, and tables. You can do this through the GCP portal or with the following steps:

  1. Install the gcloud command line tool.
  2. Contact us for our service account via email.
  3. Grant our service account permission to access your project, dataset, and tables via the roles/bigquery.dataViewer, roles/bigquery.jobUser and roles/bigquery.readSessionUser roles.
gcloud projects add-iam-policy-binding <YOUR_PROJECT> \
--member='serviceAccount:<OUR_SERVICE_ACCOUNT>' \
--role='roles/bigquery.dataViewer'

gcloud projects add-iam-policy-binding <YOUR_PROJECT> \
--member='serviceAccount:<OUR_SERVICE_ACCOUNT>' \
--role='roles/bigquery.jobUser'

gcloud projects add-iam-policy-binding <YOUR_PROJECT> \
--member='serviceAccount:<OUR_SERVICE_ACCOUNT>' \
--role='roles/bigquery.readSessionUser'

Dataset Configuration

Below are the fields required for the BigQuery dataset connector:

FieldExampleDescription
schema_typeBIGQUERYSpecifies the connector schema type, in this case "BIGQUERY".
config.tablebq-project.shaped.dataSpecifies the Google BigQuery table, qualified with the project and dataset. If your project name contains invalid YAML characters, you must surround it with backticks and double-quotes.
config.columns["productId", "color", "brand", "stockLevel", "createdAt"]The name of the columns you wish to sync from BigQuery into Shaped.
config.datetime_keycreatedAtThe name of the column that contains a datetime key for ordering data during incremental syncs.
config.start_datetime2020-01-01T00:00:00ZOptionally specify a start datetime for the initial sync. If not specified, the initial sync will start from the earliest datetime in the table.
config.filters["event_type IN ['viewed', 'liked']" ]Optionally specify a list of filters to apply to the table. Filters must be in the form of a SQL WHERE clause.

Dataset Creation Example

Below is an example of a BigQuery dataset connector configuration:

dataset_name: your_bigquery_dataset
schema_type: BIGQUERY
config:
table: "`bq-project`.shaped.`data`"
columns: ["productId", "color", "brand", "stockLevel", "createdAt"]
datetime_key: "createdAt"
start_datetime: "2020-01-01T00:00:00Z"
filters: ["stockLevel > 0"]

The following payload will create a BigQuery dataset and begin syncing data from Shaped using the Shaped CLI.

shaped create-dataset --file dataset.yaml