I am reading data from a RESTful API that exposes dependent entities: from /students I get student objects and from /teachers I get teacher objects. Each student is linked to a teacher (the student record holds the teacher ID). I produce the results of /students to a Kafka topic named students and the results of /teachers to a topic named teachers. When I join the two topics with Kafka Streams, a student event sometimes arrives before its teacher event, so no joined student-teacher record is produced for that early student. A windowed join is not a good fit, because I want to keep receiving student updates indefinitely.

  1. How do I synchronize the events so that dependent objects can be resolved?
  2. I currently poll the API manually and produce the results to Kafka. Is there a simple way to use Kafka Connect with the REST API as a source instead?

1 Answer

The following approach should help:

  1. Create a KStream for the teachers topic, since incoming teacher records will be stable.
  2. To handle the incoming flow of students, create a KTable for the students topic.
  3. Perform a non-windowed join between teachers and students.
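
To make the idea of a non-windowed join concrete, here is a minimal in-memory sketch in plain Java. It is not the Kafka Streams API, and all class and method names are hypothetical. It also assumes something slightly beyond the steps above: both sides are kept as latest-value tables, so a joined record is emitted whichever side arrives first, and later student updates are joined as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is NOT the Kafka Streams API.
public class StudentTeacherJoin {
    // Hypothetical record shapes: a student carries the ID of its teacher.
    public static final class Student {
        final String id, name, teacherId;
        public Student(String id, String name, String teacherId) {
            this.id = id; this.name = name; this.teacherId = teacherId;
        }
    }

    public static final class Teacher {
        final String id, name;
        public Teacher(String id, String name) { this.id = id; this.name = name; }
    }

    private final Map<String, Student> students = new LinkedHashMap<>(); // latest student per student ID
    private final Map<String, Teacher> teachers = new LinkedHashMap<>(); // latest teacher per teacher ID
    private final List<String> output = new ArrayList<>();

    // Student insert or update: store it, then join if its teacher is already known.
    public void onStudent(Student s) {
        students.put(s.id, s);
        Teacher t = teachers.get(s.teacherId);
        if (t != null) {
            output.add(s.name + " -> " + t.name);
        }
    }

    // Teacher insert or update: store it, then join every stored student that references it.
    public void onTeacher(Teacher t) {
        teachers.put(t.id, t);
        for (Student s : students.values()) {
            if (s.teacherId.equals(t.id)) {
                output.add(s.name + " -> " + t.name);
            }
        }
    }

    public List<String> output() { return output; }

    public static void main(String[] args) {
        StudentTeacherJoin join = new StudentTeacherJoin();
        join.onStudent(new Student("s1", "Alice", "t1"));    // arrives early, nothing emitted yet
        join.onTeacher(new Teacher("t1", "Mr. Smith"));      // teacher arrives, Alice is joined
        join.onStudent(new Student("s1", "Alice B.", "t1")); // later update is joined right away
        join.output().forEach(System.out::println);
    }
}
```

Because no window is involved, the early student is never dropped: it simply waits in the table until a matching teacher shows up.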

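A KTable is a changelog stream, so every incoming record is treated as an insert or an update for its key. The table therefore always holds the latest version of each student, however early it arrived.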
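
The changelog behaviour can be pictured with a small plain-Java sketch. This is only an illustration of the semantics, not Kafka Streams code, and the class and method names are hypothetical. One detail added here that the answer does not mention: in Kafka, a record with a null value is treated by a table as a delete (a "tombstone").

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of changelog semantics; NOT the Kafka Streams API.
public class ChangelogTable {
    private final Map<String, String> latest = new LinkedHashMap<>();

    // Apply one changelog record: a new key is an insert, an existing key is
    // an update, and a null value deletes the key (tombstone).
    public void apply(String key, String value) {
        if (value == null) {
            latest.remove(key);
        } else {
            latest.put(key, value);
        }
    }

    // Current state of the table: one (latest) value per key.
    public Map<String, String> snapshot() {
        return new LinkedHashMap<>(latest);
    }

    public static void main(String[] args) {
        ChangelogTable students = new ChangelogTable();
        students.apply("s1", "Alice, teacher t1"); // insert
        students.apply("s1", "Alice, teacher t2"); // update replaces the old value
        students.apply("s2", "Bob, teacher t1");   // insert
        students.apply("s2", null);                // delete
        System.out.println(students.snapshot());
    }
}
```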

You can refer to this documentation.