KSQL stream from a Kafka topic: maintain the same partition values

Question

I am creating a KSQL stream from a Kafka topic. The source topic has 50 partitions and the target stream also has 50 partitions, but records from source partition 1 end up in a seemingly random partition of the target stream (for example, partition 10).

Schema: CREATE STREAM SCHEMA_BASE ( ID VARCHAR, Timestamp VARCHAR, CITY VARCHAR, Partition INTEGER) WITH ( KAFKA_TOPIC = 'SPARK_EVENTS', VALUE_FORMAT = 'JSON', TIMESTAMP_FORMAT = 'yyyy-MM-dd''T''HH:mm:ss.SSSSSSS''Z''', TIMESTAMP = 'Timestamp' );

Stream: CREATE STREAM spark_event_streams AS SELECT ID, Timestamp, CITY, Partition FROM SCHEMA_BASE PARTITION BY Partition;

Is there a way to force the target stream to keep exactly the same partitioning as the source topic?

Did you use a custom partitioner while producing your data into the main stream? What are the keys in the SPARK_EVENTS topic? It seems your main stream is not partitioned by PARTITION. — Ran Lupovich

Matthias J. Sax · Accepted Answer · 2021-08-11T00:00:06

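Custom partitioning is not supported in ksqlDB. ksqlDB always uses the default partitioner, which hashes the message key to choose the partition and falls back to a round-robin strategy if the message key is null. PARTITION BY Partition therefore only sets the message key; the target partition is derived from the hash of that key, not from its numeric value.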
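To illustrate why a key of 1 does not land in partition 1, here is a minimal Python sketch of the hash-then-modulo step performed by the default partitioner in Kafka's Java client (32-bit murmur2, made non-negative, modulo the partition count). The function names are hypothetical, and the key encoding is an assumption: the sketch supposes the INTEGER key is serialized as a 4-byte big-endian value, which may not match your key format.

```python
import struct


def murmur2(data: bytes) -> int:
    """32-bit MurmurHash2 in the variant used by Kafka's Java client.

    Returns the hash as an unsigned 32-bit integer.
    """
    seed = 0x9747B28C
    m = 0x5BD1E995
    r = 24
    mask = 0xFFFFFFFF
    length = len(data)
    h = (seed ^ length) & mask

    # Mix the input four bytes at a time, read as little-endian words.
    for i in range(0, length - length % 4, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & mask
        k ^= k >> r
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k

    # Fold in the remaining 1 to 3 bytes, if any.
    tail = length % 4
    base = length - tail
    if tail == 3:
        h ^= data[base + 2] << 16
    if tail >= 2:
        h ^= data[base + 1] << 8
    if tail >= 1:
        h ^= data[base]
        h = (h * m) & mask

    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def default_partition(key_bytes: bytes, num_partitions: int) -> int:
    """Partition chosen for a non-null key: positive hash modulo partition count."""
    return (murmur2(key_bytes) & 0x7FFFFFFF) % num_partitions


def target_partitions(num_partitions: int = 50) -> dict:
    """Map each source partition number (used as the key) to its target partition.

    Assumption: the INTEGER key is encoded as a 4-byte big-endian int.
    """
    return {
        p: default_partition(struct.pack(">i", p), num_partitions)
        for p in range(num_partitions)
    }


if __name__ == "__main__":
    for source, target in target_partitions().items():
        print(f"key {source} -> partition {target}")
```

The mapping is deterministic, so all records with the same Partition value still end up together in one target partition; that partition is simply not guaranteed to carry the same number as the source partition.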

I filed https://github.com/confluentinc/ksql/issues/7984 to request this as a possible new ksqlDB feature.

