I am learning Kafka, and it makes sense to me to use Avro so that a Kafka topic has a schema.

But I am missing something when it comes to where to put the schema definition:

  • If I do not use the schema registry, but have the Avro file inside my project, I can generate Java classes from it and use them as an abstraction layer when sending messages. This is very nice, but now I have multiple copies of this file in multiple projects, and I can imagine that keeping them in sync will hurt.

  • If I use the schema registry, the problem above is solved. But now I do not see a way to profit from the schema definition when producing the message: I need to manually generate the GenericRecord object to send to Kafka and I would not have any way to see if the message I generated matches the schema.

  • I also see no way to use the schema in order to deserialize the message on the consumer side.

Is there any way to profit from the schema definition when serializing and deserializing the message?

I cannot find any example that does this on both ends, especially using the schema registry.

1 Answer

You are right: you should use the schema registry to avoid schema versioning issues.

"I would not have any way to see if the message I generated matches the schema"

Why is that? You can easily write unit tests that validate your GenericRecord on the producer side.

Beyond that, I recommend using KafkaAvroSerializer and KafkaAvroDeserializer on the producer and consumer sides, respectively.

Both connect to the schema registry through a SchemaRegistryClient implementation: CachedSchemaRegistryClient, or MockSchemaRegistryClient (intended for unit tests).

  • Serializer/Deserializer can be found here: io.confluent:kafka-avro-serializer:3.2.0
  • SchemaRegistryClient implementation can be found here: io.confluent:kafka-schema-registry-client:3.2.0

If you use Maven:

<dependency>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-avro-serializer</artifactId>
    <version>3.2.0</version>
</dependency>

<dependency>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-schema-registry-client</artifactId>
    <version>3.2.0</version>
</dependency>
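
With those dependencies in place, the serializer and deserializer are usually wired in through the client configuration. The fragments below are a minimal sketch, not a definitive setup: the host names, ports, and the group id `example-consumer` are assumptions for a local environment, and the String key (de)serializers are only one possible choice.

Producer configuration:

```properties
# Assumed local broker and schema registry addresses; adjust to your environment.
bootstrap.servers=localhost:9092
schema.registry.url=http://localhost:8081
# Keys are plain strings in this sketch; values are Avro records.
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
```

Consumer configuration:

```properties
# Assumed local broker and schema registry addresses; adjust to your environment.
bootstrap.servers=localhost:9092
schema.registry.url=http://localhost:8081
# Hypothetical consumer group name, used only for illustration.
group.id=example-consumer
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
# Return instances of generated classes instead of GenericRecord.
specific.avro.reader=true
```

KafkaAvroSerializer accepts instances of classes generated from the Avro file as well as GenericRecord, so the generated classes can still serve as the abstraction layer on the producer side. With `specific.avro.reader=true`, the deserializer returns instances of the generated class rather than GenericRecord, provided that class is on the consumer's classpath.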