Kafka
This connector is in Beta. Reach out to our support (hello@shaped.ai) if you face any issues when integrating your Kafka cluster with Shaped.
Preparation
In order for Shaped to successfully connect to your Kafka cluster, you need to:
- ensure your Kafka cluster is publicly available. To use private connectivity options (such as AWS PrivateLink) reach out to our support (hello@shaped.ai).
- enable SSL for your cluster and use a publicly-verifiable certificate. To use private certificates reach out to our support (hello@shaped.ai).
- create a set of SASL credentials with sufficient permissions to read from your Kafka topic and to create consumer groups with IDs of the format `shaped-*`.
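Before creating the dataset, it can help to confirm that your broker is reachable and that its certificate validates against public certificate authorities. The following is a minimal sketch using only Python's standard library. The helper names `public_ca_context` and `check_broker_tls` are illustrative and not part of Shaped or Kafka, and the host in the usage comment is the placeholder used on this page. It assumes the listener on the given port expects a TLS handshake immediately, as SSL-enabled Kafka listeners do.

```python
import socket
import ssl


def public_ca_context() -> ssl.SSLContext:
    """Return a TLS context that uses the system's default CA bundle."""
    # create_default_context() turns on certificate and hostname verification.
    return ssl.create_default_context()


def check_broker_tls(host: str, port: int = 9092, timeout: float = 10.0):
    """Open a TLS connection to the broker and return the negotiated TLS version.

    Raises ssl.SSLCertVerificationError if the certificate cannot be verified,
    and OSError if the broker cannot be reached from this machine.
    """
    context = public_ca_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


# Example (placeholder host, replace with your own bootstrap server):
# print(check_broker_tls("xxxxxxx.us-east-2.aws.confluent.cloud", 9092))
```

A successful check only shows that the broker is reachable and verifiable from the machine running the script. It does not test the SASL credentials or their permissions.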
Dataset Configuration
Required fields
Field | Example | Description |
---|---|---|
schema_type | KAFKA | Specifies the connector schema type, in this case "KAFKA". |
column_schema | See below | Specifies the schema of the dataset. |
unique_keys | ["id"] | Specifies the list of columns that uniquely identify a row in the dataset. If duplicate rows are inserted with the same keys, the latest row is used. |
topic | your-kafka-topic | Specifies the Kafka topic to which the connector should subscribe. If your topic name contains characters that are invalid in YAML, you must surround it with backticks and double quotes. |
username | JSW2WAMXW9CMTCVO | Username used for SASL authentication to your cluster. |
password | pAssw0rd1! | Password used for SASL authentication to your cluster. |
bootstrap_server | xxxxxxx.us-east-2.aws.confluent.cloud:9092 | Address (host:port) of the bootstrap server of your Kafka cluster. |
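To illustrate the unique_keys behavior described in the table, the sketch below keeps only the last row seen for each key. It is a plain Python illustration of last-write-wins deduplication, not Shaped's implementation. `dedupe_latest` is a hypothetical helper, and "latest" is assumed here to mean the row that arrives last.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple


def dedupe_latest(
    rows: Iterable[Dict[str, Any]], unique_keys: Sequence[str]
) -> List[Dict[str, Any]]:
    """Keep one row per unique key combination; later rows overwrite earlier ones."""
    latest: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for row in rows:
        key = tuple(row[k] for k in unique_keys)
        latest[key] = row  # a later row with the same key replaces the earlier one
    return list(latest.values())


rows = [
    {"id": "a", "event_value": 1},
    {"id": "b", "event_value": 2},
    {"id": "a", "event_value": 3},  # same id as the first row, so it replaces it
]
print(dedupe_latest(rows, ["id"]))
# [{'id': 'a', 'event_value': 3}, {'id': 'b', 'event_value': 2}]
```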
Dataset Creation Example
Below is an example of a Kafka dataset connector configuration:
name: your_kafka_dataset
schema_type: KAFKA
unique_keys: [id]
column_schema:
  event_name: String
  event_time: DateTime
  user_id: String
  event_weight: Float
  event_value: Int32
  event_tags: Array(String)
  json_payload: String
username: JSW2WAMXW9CMTCVO
password: pAssw0rd1!
bootstrap_server: xxxxxxx.us-east-2.aws.confluent.cloud:9092
topic: your-kafka-topic
The following command uses the Shaped CLI to create the Kafka dataset from the configuration above, saved as dataset.yaml, and begins syncing data from your Kafka topic into Shaped.
shaped create-dataset --file dataset.yaml
Delivering Data
After setting up the connector, you need to publish your data to the specified Kafka topic in JSON format. If formatted correctly, data will be loaded into Shaped automatically.
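As a rough sketch of what publishing a JSON message might look like, the example below uses the third-party confluent-kafka Python client (`pip install confluent-kafka`). Several details are assumptions rather than documented Shaped requirements: the `PLAIN` SASL mechanism, the ISO 8601 string used for `event_time`, and the presence of an `id` field to match the `unique_keys` setting. The helper names `build_event` and `publish` are hypothetical, and the connection values are the placeholders used on this page.

```python
import json
from datetime import datetime, timezone


def build_event(event_id: str, user_id: str, event_name: str) -> bytes:
    """Serialize one event as UTF-8 encoded JSON using the example column names."""
    record = {
        "id": event_id,  # assumption: matches unique_keys: [id]
        "event_name": event_name,
        "event_time": datetime.now(timezone.utc).isoformat(),  # assumed timestamp format
        "user_id": user_id,
        "event_weight": 1.0,
        "event_value": 1,
        "event_tags": ["example"],
        "json_payload": json.dumps({"source": "docs-example"}),  # nested JSON kept as a string
    }
    return json.dumps(record).encode("utf-8")


def publish(payload: bytes) -> None:
    """Send one message to the topic the connector subscribes to."""
    # Imported here so build_event can be used without the client installed.
    from confluent_kafka import Producer

    producer = Producer({
        "bootstrap.servers": "xxxxxxx.us-east-2.aws.confluent.cloud:9092",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",  # assumption: depends on your cluster setup
        "sasl.username": "JSW2WAMXW9CMTCVO",
        "sasl.password": "pAssw0rd1!",
    })
    producer.produce("your-kafka-topic", value=payload)
    producer.flush()  # block until the message is delivered or fails


# Example:
# publish(build_event("evt-1", "user-42", "click"))
```

Each message carries a single JSON object whose keys match the column names in the dataset's column_schema.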