0
votes

I am using the Kafka Connect framework from Confluent to produce messages from my application servers into a Kafka cluster (ZooKeeper + brokers + Schema Registry for Avro support).

The data I am sending through Connect is defined by an Avro schema. My schema represents a structured object containing enums, which Apache Avro supports as a type. I don't have to register my schema with the registry myself because the Kafka Connect API does it automatically.

My problem is that Kafka Connect seems to convert enums into strings. When I try to consume, I see that the schema registered by Connect is not correct: all of my enums have been converted to strings. As a result, I cannot consume my data without implementing conversion logic from string back to enum.

I want to keep my logical information as an enum and still use Kafka Connect. I looked through the Kafka Connect code, and it seems to handle only basic types, not enumeration types.

My current alternative is to build my own producer framework that preserves enums by imitating the Connect framework, but this is time-consuming, and I cannot avoid using enums.

Have you managed to produce and consume records containing enums to Kafka using Kafka Connect?

Any help or feedback from experience is welcome. Thanks!

1
Where I work, our policy prohibits Avro enums because adding a symbol to an enum defines a new schema that cannot be read by readers using the old schema. The Avro specification says that if the writer's symbol is not present in the reader's enum, then an error is signaled. For this reason, we represent them as Avro strings. - Chin Huang
Hi Chin Huang, thanks for this clear answer. Are you implementing logic to transform the string back into an enum at the consumer level, for all consumers? - user3677404
Yes, producers and consumers implement logic to transform between enum and string. - Chin Huang
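
For illustration, the enum/string conversion described in these comments could look roughly like the sketch below. It is only a sketch under assumptions: `EnumStrings`, `OrderStatus`, and the `UNKNOWN` fallback symbol are hypothetical names, not part of Kafka Connect or Avro. Falling back to a default instead of throwing is one possible way for old consumers to tolerate symbols added later.

```java
// Hypothetical helper, not part of Kafka Connect or Avro.
final class EnumStrings {
    private EnumStrings() {}

    // Producer side: write the enum symbol as a plain string.
    static <E extends Enum<E>> String toAvroString(E value) {
        return value.name();
    }

    // Consumer side: map the string back to an enum constant.
    // A null or unrecognized symbol yields the fallback instead of an exception.
    static <E extends Enum<E>> E fromAvroString(Class<E> type, String symbol, E fallback) {
        if (symbol == null) {
            return fallback;
        }
        try {
            return Enum.valueOf(type, symbol);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }
}

// Hypothetical enum used only for this example.
enum OrderStatus { UNKNOWN, CREATED, SHIPPED }

class EnumStringsDemo {
    public static void main(String[] args) {
        String wire = EnumStrings.toAvroString(OrderStatus.SHIPPED);
        OrderStatus back = EnumStrings.fromAvroString(OrderStatus.class, wire, OrderStatus.UNKNOWN);
        System.out.println(wire + " -> " + back);
        // A symbol this consumer does not know about maps to the fallback.
        System.out.println(EnumStrings.fromAvroString(OrderStatus.class, "CANCELLED", OrderStatus.UNKNOWN));
    }
}
```

Each consumer would call something like `fromAvroString` right after deserializing the record, so the rest of its code can keep working with the enum type.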

1 Answer

0
votes

In more recent versions of Connect (maybe around 4.2+, I'm not sure of the exact version), there is the property shown below. I personally haven't seen it in the documentation, but I found it in the source code after running into the same thing as you.

As you can see, the default is false, and I've been told that in newer releases it will be set to true.

public static final String ENHANCED_AVRO_SCHEMA_SUPPORT_CONFIG = "enhanced.avro.schema.support";
public static final boolean ENHANCED_AVRO_SCHEMA_SUPPORT_DEFAULT = false;
public static final String ENHANCED_AVRO_SCHEMA_SUPPORT_DOC =
  "Enable enhanced avro schema support in AvroConverter: Enum symbol preservation and Package"
      + " Name awareness";

For now, you'll need to set this at the worker or connector level to have the enums preserved, assuming you're running a version of Connect that includes it.
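
As a rough sketch, assuming the worker uses Confluent's AvroConverter for values and your version includes the property, the worker properties might look like this. Connect passes any setting prefixed with `value.converter.` through to the converter; the Schema Registry URL here is a placeholder, not something from your setup.

```properties
# Worker config (e.g. connect-distributed.properties); the URL is a placeholder
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
# Passed through to AvroConverter; only has an effect on versions that define it
value.converter.enhanced.avro.schema.support=true
```

If your keys are Avro too, the same setting would go under the `key.converter.` prefix.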