I have an architecture with two separate applications. The original source is a SQL database. App1 listens to CDC tables to track changes to tables in that database, then normalizes and serializes those changes. It sends the serialized messages to a Kafka topic. App2 listens to that topic, adapts the messages to different formats, and sends the adapted messages to their respective destinations via HTTP.
So our streaming architecture looks like:
SQL (CDC event) -> App1 (normalizes events) -> Kafka -> App2 (adapts events to endpoints) -> various endpoints
We're looking to add error handling in case of failure, and we cannot tolerate duplicate events, missing events, or reordering. Given the architecture above, all we really care about is that exactly-once applies to messages getting from App1 to App2 (our separate producer and consumer).
Everything I'm reading, and every example I've found of the transactional API, points to "streaming". It looks like the Kafka Streams API is meant for an individual application that takes input from a Kafka topic, does its processing, and writes the output to another Kafka topic, which doesn't seem to match our use of Kafka. Here's an excerpt from Confluent's docs:
Now, stream processing is nothing but a read-process-write operation on a Kafka topic; a consumer reads messages from a Kafka topic, some processing logic transforms those messages or modifies state maintained by the processor, and a producer writes the resulting messages to another Kafka topic. Exactly once stream processing is simply the ability to execute a read-process-write operation exactly one time. In this case, “getting the right answer” means not missing any input messages or producing any duplicate output. This is the behavior users expect from an exactly once stream processor.
I'm struggling to wrap my head around how we can use exactly-once with our Kafka topic, or if Kafka's exactly-once is even built for non-"streaming" use cases. Will we have to build our own deduplication and fault tolerance?
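To make "our own deduplication" concrete, here is a minimal sketch of the kind of consumer-side logic I imagine App2 would need. It is an illustration under assumptions, not something we run: `Record`, `DedupingForwarder`, and the `send` callback are hypothetical names, the watermark lives only in memory, and it relies on Kafka offsets increasing within a partition. It only catches redelivery of the same record; a producer retry that lands at a new offset would need a source-side key (for example a CDC sequence number) instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a consumed Kafka record."""
    topic: str
    partition: int
    offset: int
    value: bytes


class DedupingForwarder:
    """Forwards each (topic, partition, offset) at most once, in order.

    Assumption: the watermark is kept in memory only. A real version
    would have to persist it durably, which this sketch does not attempt.
    """

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send  # hypothetical callback, e.g. an HTTP POST
        self._last_applied: Dict[Tuple[str, int], int] = {}

    def handle(self, record: Record) -> bool:
        key = (record.topic, record.partition)
        # Offsets are monotonic per partition, so anything at or below
        # the watermark has already been forwarded and is skipped.
        if record.offset <= self._last_applied.get(key, -1):
            return False
        self._send(record.value)
        self._last_applied[key] = record.offset
        return True


if __name__ == "__main__":
    delivered = []
    forwarder = DedupingForwarder(delivered.append)
    forwarder.handle(Record("events", 0, 0, b"a"))
    forwarder.handle(Record("events", 0, 0, b"a"))  # redelivered, skipped
    forwarder.handle(Record("events", 0, 1, b"b"))
    print(delivered)
```

Even this leaves a gap: if App2 crashes between `send` and updating the watermark, the event goes out twice after a restart, which is exactly the sort of thing I was hoping Kafka's exactly-once support would cover for us.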