The S3 connector allows you to create a Shaped model directly from a set of Parquet, CSV, or JSONL files within an S3 bucket.

The schema of your files (for example, the Parquet schema) must match the user, item, and interaction schemas mapped in the Create Model call.
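
A quick local check can catch schema mismatches before upload. The sketch below compares a CSV header against a set of expected columns. The column names (`user_id`, `item_id`, `label`, `created_at`) and the helper `missing_columns` are hypothetical and only for illustration; substitute the columns you actually map in your Create Model call.

```python
import csv
import io


def missing_columns(csv_text: str, expected: set[str]) -> set[str]:
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return expected - {name.strip() for name in header}


# Hypothetical interaction columns; replace with the ones your model maps.
EXPECTED = {"user_id", "item_id", "label", "created_at"}

sample = "user_id,item_id,label\nu1,i1,1\n"
print(missing_columns(sample, EXPECTED))
```

An empty result means every expected column is present in the header; it does not validate column types.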

Shaped fetches data from the given S3 bucket each time the model is trained. To ensure the model is trained on the most recent data, push your latest data to S3 on a regular schedule.
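
One way to keep the bucket current is a small upload script run on a schedule (cron, for example). The sketch below is an assumption about how you might do this, not part of the Shaped API: `push_file` is a hypothetical helper, the key layout (`prefix/filename`) is illustrative, and `s3_client` is expected to be a boto3 S3 client, of which only `upload_file` is used.

```python
from pathlib import Path


def push_file(s3_client, local_path: str, bucket: str, prefix: str = "") -> str:
    """Upload one local file to S3 and return the object key it was written to.

    `s3_client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    The key layout (prefix + file name) is an illustrative choice.
    """
    name = Path(local_path).name
    prefix = prefix.strip("/")
    key = f"{prefix}/{name}" if prefix else name
    # boto3 signature: upload_file(Filename, Bucket, Key, ...)
    s3_client.upload_file(local_path, bucket, key)
    return key


# Usage (requires boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   push_file(boto3.client("s3"), "exports/interactions.parquet", "your_bucket", "shaped")
```

Uploading to the same key on each run overwrites the previous file, so the next training run reads the latest export.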

Resource Access

Shaped needs access to the S3 bucket that contains your files. This can be done by granting explicit read access to the Shaped AWS Customer Data Access Role.

To grant access:

  1. Create a bucket policy on your S3 bucket that grants the following permissions:
    1. s3:GetObject
    2. s3:ListBucket
  2. In this policy, set the Shaped AWS Customer Data Access Role as the principal that receives these permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::{account_id}:role/CustomerS3DataAccessRole"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::{your_bucket}",
        "arn:aws:s3:::{your_bucket}/*"
      ]
    }
  ]
}

The Shaped AWS account ID to substitute for {account_id} is available on request.
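
If you manage policies from code, the policy document can be generated rather than edited by hand. The following is a minimal sketch: `build_bucket_policy` is a hypothetical helper, and the account ID and bucket name in the usage line are placeholders, not real Shaped values. Note that the IAM policy element listing the ARNs is the singular `Resource`.

```python
import json


def build_bucket_policy(account_id: str, bucket: str) -> dict:
    """Build a policy granting the Shaped data access role read access to `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{account_id}:role/CustomerS3DataAccessRole"
                },
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # ListBucket applies to the bucket ARN, GetObject to the objects in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


# Placeholder values; request the real account ID from Shaped.
policy = build_bucket_policy("123456789012", "your_bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the bucket's permissions settings or applied with your usual infrastructure tooling.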

Connector Config Definition

Below are the fields required for the File connector in connector_configs:

"connector_configs": [{
    "type": "File",
    "id": "file"
}]
| Field | Example | Description |
| ----- | ------- | ----------- |
| type  | "File"  | Specifies the connector type, in this case "File". |
| id    | "file"  | Specifies the connector id, in this case "file". |
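
When the configuration is assembled in code, serializing a dictionary avoids hand-written JSON mistakes such as trailing commas. This sketch builds only the `connector_configs` fragment described above; `check_connector_configs` is a hypothetical helper, and the rest of the Create Model payload is deliberately left out because it is not covered here.

```python
import json

# The two fields listed as required for the File connector.
REQUIRED_FIELDS = ("type", "id")


def check_connector_configs(payload: dict) -> list[str]:
    """Return human-readable problems found in payload["connector_configs"]."""
    problems = []
    for index, config in enumerate(payload.get("connector_configs", [])):
        for field in REQUIRED_FIELDS:
            if field not in config:
                problems.append(f"connector_configs[{index}] is missing '{field}'")
    return problems


payload = {"connector_configs": [{"type": "File", "id": "file"}]}
assert check_connector_configs(payload) == []
print(json.dumps(payload, indent=4))
```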