Kafka

This connector is in Beta. Reach out to our support (hello@shaped.ai) if you face any issues when integrating your Kafka cluster with Shaped.

Preparation

In order for Shaped to successfully connect to your Kafka cluster, you need to:

  • ensure your Kafka cluster is publicly available. To use private connectivity options (such as AWS PrivateLink) reach out to our support (hello@shaped.ai).
  • enable SSL for your cluster and use a publicly-verifiable certificate. To use private certificates reach out to our support (hello@shaped.ai).
  • create a set of SASL credentials that have sufficient permissions to read from your Kafka topic and to create consumer groups with IDs of the format shaped-*.

Dataset Configuration

Required fields

Field | Example | Description
schema_type | KAFKA | Specifies the connector schema type, in this case "KAFKA".
column_schema | See below | Specifies the schema of the dataset.
unique_keys | ["id"] | Specifies a list of columns that uniquely identify a row in the dataset. If duplicate rows are inserted with these keys, the latest row is used.
topic | your-kafka-topic | Specifies the Kafka topic to which the connector should subscribe. If your topic name contains invalid YAML characters, you must surround it with backticks and double-quotes.
username | JSW2WAMXW9CMTCVO | Username used for SASL authentication with your cluster.
password | pAssw0rd1! | Password used for SASL authentication with your cluster.
bootstrap_server | xxxxxxx.us-east-2.aws.confluent.cloud:9092 | Address (host:port) of the bootstrap server of your Kafka cluster.
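
As a quick pre-flight check, a configuration can be tested for the required fields listed above before it is submitted. The following is a minimal Python sketch; the helper name `missing_required_fields` and the sample dictionary are hypothetical and not part of any Shaped tooling.

```python
# Required fields for a Kafka dataset, taken from the table above.
REQUIRED_FIELDS = (
    "schema_type",
    "column_schema",
    "unique_keys",
    "topic",
    "username",
    "password",
    "bootstrap_server",
)


def missing_required_fields(config: dict) -> list[str]:
    """Return the required fields that are absent or empty in `config`."""
    return [field for field in REQUIRED_FIELDS if not config.get(field)]


# Hypothetical configuration that forgets the SASL credentials.
config = {
    "schema_type": "KAFKA",
    "column_schema": {"id": "String", "event_time": "DateTime"},
    "unique_keys": ["id"],
    "topic": "your-kafka-topic",
    "bootstrap_server": "xxxxxxx.us-east-2.aws.confluent.cloud:9092",
}

print(missing_required_fields(config))  # ['username', 'password']
```

This only checks that the fields are present. It does not validate credentials or connectivity.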

Dataset Creation Example

Below is an example of a Kafka dataset connector configuration:

name: your_kafka_dataset
schema_type: KAFKA
unique_keys: [id]
column_schema:
  event_name: String
  event_time: DateTime
  user_id: String
  event_weight: Float
  event_value: Int32
  event_tags: Array(String)
  json_payload: String
username: JSW2WAMXW9CMTCVO
password: pAssw0rd1!
bootstrap_server: xxxxxxx.us-east-2.aws.confluent.cloud:9092
topic: your-kafka-topic

The following command uses the Shaped CLI to create the Kafka dataset from this configuration, saved as dataset.yaml, and begins syncing data from your Kafka topic into Shaped.

shaped create-dataset --file dataset.yaml

Delivering Data

After setting up the connector, you need to publish your data to the specified Kafka topic in JSON format. If formatted correctly, data will be loaded into Shaped automatically.
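
To illustrate what a JSON message for the example column_schema might look like, here is a minimal Python sketch that serializes one event. The function name `build_event` and the sample values are hypothetical, and the ISO 8601 timestamp format is an assumption: this page does not state which DateTime encodings Shaped accepts, so confirm the expected format before relying on it.

```python
import json
from datetime import datetime, timezone


def build_event(event_name: str, user_id: str, event_time: datetime) -> bytes:
    """Serialize one event as UTF-8 JSON, keyed by the column_schema field names."""
    record = {
        "event_name": event_name,
        # Assumption: an ISO 8601 string is accepted for the DateTime column.
        "event_time": event_time.isoformat(),
        "user_id": user_id,
        "event_weight": 1.0,
        "event_value": 3,
        "event_tags": ["homepage", "promo"],
        # The column is declared as String, so nested JSON is stored as text.
        "json_payload": json.dumps({"referrer": "email"}),
    }
    return json.dumps(record).encode("utf-8")


message = build_event(
    "click", "user-42", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
)
print(message.decode("utf-8"))

# Publishing is left to your Kafka client of choice. With the confluent-kafka
# package, for example, it would look roughly like this (not run here):
# producer.produce("your-kafka-topic", value=message)
```

Each message carries one row, with keys matching the column names declared in column_schema.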