
Kinesis

Our real-time data streaming infrastructure is built on top of the AWS Kinesis Data Streams service.

As part of this integration, Shaped can receive arbitrary JSON events that conform to a schema, for customers who want to deliver data in real time. Events can be sent either directly to the Kinesis Data Stream or through the Shaped Dataset Insert API method.

Dataset Configuration

Required fields

Field | Example | Description
schema_type | KINESIS | Specifies the connector schema type, in this case "KINESIS".
column_schema | See below | Specifies the schema of the dataset.
unique_keys | ["id"] | Specifies a list of columns that uniquely identify a row in the dataset. If duplicate rows are inserted with these keys, the latest row is used.
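
The unique_keys rule can be pictured as last-write-wins deduplication. The sketch below only illustrates that semantics on an in-memory list; it is not Shaped's implementation, the function name is hypothetical, and it assumes "latest" means the row inserted last.

```python
def latest_by_unique_keys(rows, unique_keys):
    """Keep one row per unique-key combination, preferring the row seen last."""
    latest = {}
    for row in rows:
        # Build a hashable key from the configured unique-key columns.
        key = tuple(row[column] for column in unique_keys)
        # A later row with the same key overwrites the earlier one.
        latest[key] = row
    return list(latest.values())


rows = [
    {"id": 1, "event_name": "view"},
    {"id": 2, "event_name": "click"},
    {"id": 1, "event_name": "purchase"},  # duplicate id, inserted later
]
deduplicated = latest_by_unique_keys(rows, ["id"])
```

Here the row with id 1 and event_name "purchase" replaces the earlier row with the same id.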

Optional fields

Field | Example | Description
tenant_aws_account_id | "123456789012" | Specifies the AWS Account ID of the tenant that will be sending data to the Kinesis Data Stream. If this field is provided, the Kinesis Access Role ARN will be provided to the tenant to assume in order to send events to the Kinesis Data Stream directly.

Creation Example

For the supported column data-types, see the Custom Dataset documentation.

custom_kinesis_dataset.yaml
name: my_custom_dataset
schema_type: KINESIS
unique_keys: [id]
column_schema:
  event_name: String
  event_time: DateTime
  user_id: String
  event_weight: Float
  event_value: Int32
  event_tags: Array(String)
  json_payload: String
tenant_aws_account_id: "123456789012"
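
To make the column_schema above concrete, here is a minimal sketch of an event that fits it, together with a local pre-flight check. The mapping from column types to Python values is an assumption made for illustration (in particular, that DateTime values are sent as ISO 8601 strings); the Custom Dataset documentation is the authority on accepted formats. The helper names are hypothetical.

```python
from datetime import datetime

# Column types copied from the column_schema in custom_kinesis_dataset.yaml.
COLUMN_SCHEMA = {
    "event_name": "String",
    "event_time": "DateTime",
    "user_id": "String",
    "event_weight": "Float",
    "event_value": "Int32",
    "event_tags": "Array(String)",
    "json_payload": "String",
}


def _is_iso_datetime(value):
    # Assumption: DateTime columns are represented as ISO 8601 strings.
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def _is_int32(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and -2**31 <= value < 2**31
    )


# Assumed Python-side checks for each column type used in the schema.
CHECKS = {
    "String": lambda v: isinstance(v, str),
    "DateTime": _is_iso_datetime,
    "Float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "Int32": _is_int32,
    "Array(String)": lambda v: isinstance(v, list)
    and all(isinstance(item, str) for item in v),
}


def schema_errors(event, schema=COLUMN_SCHEMA):
    """Return a list of problems; an empty list means the event fits the schema."""
    errors = []
    for column, type_name in schema.items():
        if column not in event:
            errors.append(f"missing column: {column}")
        elif not CHECKS[type_name](event[column]):
            errors.append(f"{column}: expected {type_name}")
    return errors


event = {
    "event_name": "purchase",
    "event_time": "2024-01-01T12:00:00",
    "user_id": "user-42",
    "event_weight": 1.5,
    "event_value": 3,
    "event_tags": ["mobile", "promo"],
    "json_payload": "{\"sku\": \"A1\"}",
}
```

Calling schema_errors(event) before sending can catch a misspelled column or a wrongly typed value early.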

A Kinesis dataset is provisioned asynchronously. Its status transitions from SCHEDULING to DEPLOYING, and then to ACTIVE once the Kinesis Data Stream is ready to receive events.

CLI

shaped create-dataset --file custom_kinesis_dataset.yaml

Delivering Data via Shaped API

The Shaped API provides the Dataset Insert endpoint, which inserts events into the dataset using your account API key.
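
As a rough sketch of what such a call might look like, the snippet below builds an HTTP POST with the Python standard library. The endpoint URL is left as a parameter, and the header name (x-api-key) and the body shape ({"data": [...]}) are assumptions made for illustration; take the real values from the Dataset Insert API reference. The function name is hypothetical.

```python
import json
import urllib.request


def build_insert_request(insert_url, api_key, rows):
    """Build (but do not send) a POST request carrying rows as JSON."""
    # Assumption: the endpoint accepts a JSON object with the rows under "data".
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        insert_url,
        data=body,
        method="POST",
        headers={
            # Assumption: the account API key is passed in an x-api-key header.
            "x-api-key": api_key,
            "Content-Type": "application/json",
        },
    )


request = build_insert_request(
    "https://example.invalid/dataset-insert",  # placeholder, not a real endpoint
    "YOUR_API_KEY",
    [{"event_name": "purchase", "user_id": "user-42"}],
)
```

Passing the request to urllib.request.urlopen would send it; that step is left out here because the URL above is only a placeholder.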

Delivering Data via Kinesis SDK

When the Kinesis dataset is provisioned, you will be provided with the following information from the Shaped team:

  • Kinesis Stream ARN, the Amazon Resource Name of the Kinesis Data Stream provisioned for your account.
  • Kinesis Access Role ARN (only if a tenant_aws_account_id is provided), the Amazon Resource Name of the IAM Role that you will need to assume in order to send events to the Kinesis Data Stream directly.

To forward events to the Kinesis Data Stream using the AWS Kinesis SDK, perform the following steps in your environment:

  1. Assume the IAM Role provided by the Shaped team. The role is configured so that your principal is permitted to assume it.
  2. Use the appropriate AWS Kinesis SDK to send events to the provided Kinesis stream.
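
The two steps above can be sketched with boto3, the AWS SDK for Python. This is a minimal sketch under stated assumptions, not a definitive producer: it assumes each event is sent as one UTF-8 JSON object per record, that user_id is a reasonable partition key, and that the installed boto3 is recent enough for put_records to accept StreamARN. The function names and the role session name are hypothetical.

```python
import json

# PutRecords accepts at most 500 records per request.
MAX_RECORDS_PER_CALL = 500


def region_from_arn(arn):
    """Extract the region from an ARN such as arn:aws:kinesis:<region>:<account>:stream/<name>."""
    return arn.split(":")[3]


def to_kinesis_records(events, partition_key_field):
    """Convert event dicts into the record entries expected by PutRecords."""
    return [
        {
            # Assumption: one JSON object per record, encoded as UTF-8.
            "Data": json.dumps(event).encode("utf-8"),
            "PartitionKey": str(event[partition_key_field]),
        }
        for event in events
    ]


def send_events(events, stream_arn, role_arn, partition_key_field="user_id"):
    """Assume the access role, then write events to the stream in batches.

    Returns the number of records Kinesis reported as failed.
    """
    import boto3  # imported here so the helpers above work without boto3

    # Step 1: assume the IAM Role to obtain temporary credentials.
    credentials = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName="shaped-kinesis-producer",  # hypothetical session name
    )["Credentials"]

    # Step 2: create a Kinesis client with those credentials and send records.
    kinesis = boto3.client(
        "kinesis",
        region_name=region_from_arn(stream_arn),
        aws_access_key_id=credentials["AccessKeyId"],
        aws_secret_access_key=credentials["SecretAccessKey"],
        aws_session_token=credentials["SessionToken"],
    )
    records = to_kinesis_records(events, partition_key_field)
    failed = 0
    for start in range(0, len(records), MAX_RECORDS_PER_CALL):
        response = kinesis.put_records(
            StreamARN=stream_arn,
            Records=records[start:start + MAX_RECORDS_PER_CALL],
        )
        failed += response["FailedRecordCount"]
    return failed
```

A non-zero return value means some records were rejected (for example, by throttling), so a production producer would typically retry those records with backoff.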

See: Writing Data to Kinesis Data Streams for additional information.