
I have a consumer application deployed in several environments (dev, test, stage, and preprod). They all consume the same Kafka topic (i.e. they act as multiple consumers of the same topic).

I have a separate producer application for each environment (dev, test, stage, and preprod). Each message payload includes a field that identifies the producer's environment.

Our requirement is that the dev environment's consumer should only consume messages produced by the dev environment's producer application, and likewise for the other environments.

My question is: should I go with consumer-side filtering? Will that satisfy our requirement, and how?

Thanks in advance.


2 Answers


You have multiple options for dealing with this requirement. However, I don't think it is generally a good idea to share one topic across different environments; from a data-protection and access-permissions standpoint it is not a good design.

Anyway, I see the following options.

Option 1: Use the environment (dev, test, ...) as the message key and have the consumer filter by key.
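
For example, a minimal consumer-side sketch of Option 1 in Java (the topic name, broker address, and string serialization are assumptions for illustration):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DevOnlyConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "dev-consumer");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my_source_topic")); // assumed shared topic name
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofMillis(500))) {
                    // Drop every record that was not keyed with this consumer's environment.
                    if (!"dev".equals(rec.key())) {
                        continue;
                    }
                    System.out.printf("dev message: %s%n", rec.value());
                }
            }
        }
    }
}

Note that this consumer still fetches every environment's records over the network and merely discards the ones it does not want.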

Option 2: Have each environment's producer send its data to a dedicated partition, and have each environment's consumer read only from that partition.
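
A sketch of Option 2, assuming a fixed env-to-partition mapping (dev on partition 0, which is hypothetical) and a topic created with at least as many partitions as environments:

import java.time.Duration;
import java.util.List;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

public class PartitionPerEnv {
    static final int DEV_PARTITION = 0; // hypothetical mapping: dev -> partition 0

    // Producer side: pin every record to the environment's partition explicitly.
    static void produceToDev(KafkaProducer<String, String> producer, String payload) {
        producer.send(new ProducerRecord<>("my_source_topic", DEV_PARTITION, "dev", payload));
    }

    // Consumer side: assign() instead of subscribe(), so this consumer reads
    // only the dev partition and bypasses consumer-group rebalancing.
    static void consumeDev(KafkaConsumer<String, String> consumer) {
        consumer.assign(List.of(new TopicPartition("my_source_topic", DEV_PARTITION)));
        consumer.poll(Duration.ofMillis(500))
                .forEach(rec -> System.out.println(rec.value()));
    }
}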

But before implementing Option 2, I would rather do Option 3: Have one topic per environment and let the producers/consumers write to/read from their environment's topic.
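
A sketch of Option 3; the APP_ENV variable and the foo.<env> topic-naming convention are assumptions:

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EnvTopicConsumer {
    public static void main(String[] args) {
        // Derive the topic name from the deployment environment.
        String env = System.getenv().getOrDefault("APP_ENV", "dev");
        String topic = "foo." + env; // foo.dev, foo.test, foo.stage, foo.preprod

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "consumer-" + env);
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Each environment only ever subscribes to its own topic,
            // so no filtering is needed at all.
            consumer.subscribe(List.of(topic));
            // ... poll loop as usual ...
        }
    }
}

The producer derives the topic name the same way, so the isolation comes from the topic layout itself rather than from any runtime filtering.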


I agree with mike that using a single topic across environments is not a good idea.

However, if you are going to do this, then I would suggest you use a stream processor to create separate topics for your consumers. You can do this in Kafka Streams, ksqlDB, etc.

In ksqlDB it would look like this:

-- Declare stream over existing topic
CREATE STREAM FOO_ALL_ENVS WITH (KAFKA_TOPIC='my_source_topic', VALUE_FORMAT='AVRO'); 

-- Create derived stream & new topic populated with messages just for DEV
-- You can explicitly provide the target Kafka topic name.
CREATE STREAM FOO_DEV WITH (KAFKA_TOPIC='foo_dev') AS SELECT * FROM FOO_ALL_ENVS WHERE ENV='DEV';

-- Create derived stream & new topic populated with messages just for PROD
-- If you don't specify a Kafka topic name it will inherit from the 
-- stream name (i.e. `FOO_PROD`)
CREATE STREAM FOO_PROD AS SELECT * FROM FOO_ALL_ENVS WHERE ENV='PROD';
-- etc

Now you have your producer writing to a single topic (if you must), but your consumers can consume from a topic that is specific to their environment. The ksqlDB statements are continuous queries, so they will process all existing messages in the source topic as well as every new message that arrives.
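
For completeness, a rough Kafka Streams equivalent of the ksqlDB statements above (a sketch; envOf() is a hypothetical helper that extracts the ENV field from your payload format):

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class EnvSplitter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "env-splitter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> all = builder.stream("my_source_topic");

        // One filtered branch per environment, each writing to its own topic.
        all.filter((k, v) -> "DEV".equals(envOf(v))).to("foo_dev");
        all.filter((k, v) -> "PROD".equals(envOf(v))).to("foo_prod");
        // ... repeat for test, stage, preprod ...

        new KafkaStreams(builder.build(), props).start();
    }

    // Hypothetical helper: parse the ENV field out of the message payload.
    private static String envOf(String value) {
        return value.contains("DEV") ? "DEV" : "PROD";
    }
}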