0
votes

Can I use the Confluent Schema Registry to generate schemaless Avro records and send them to Kafka? If so, could somebody please share some resources? I could not find an example on the Confluent website or through Google.

I have a plain delimited file and a separate schema for it. Currently I use an Avro GenericRecord with that schema to serialize the records and send them through Kafka. This way the schema is still attached to every record, which makes each message bulkier. My reasoning is that if I strip the schema from the records before sending them to Kafka, I should get higher throughput.

2
Why would you want to use a schema registry to send schemaless records? I am confused. - Fabien
I am currently using an Avro GenericRecord with a schema to generate Avro records from CSV. My understanding is that the schema is appended to each Avro binary record when it is sent to Kafka, which makes my Kafka load bulkier. - Explorer
I am not aware that you can natively dissociate Avro from the schema incorporated in the data... But it seems there are specific Kafka serializers for Avro that strip off the Avro schema for transfer: github.com/confluentinc/schema-registry/blob/master/… - Fabien

2 Answers

1
votes

With the Confluent Schema Registry, Avro messages are serialized without the full Avro schema in the message. I think this is what you mean by "schema less" messages.

The Confluent Schema Registry stores the Avro schemas, and only a short schema ID is included in each message on the wire.
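To make the "short ID" concrete, here is a minimal sketch of the framing the Confluent serializers put around the Avro body: one magic byte (0), a 4-byte big-endian schema ID, then the plain Avro binary encoding. The names `WireFormat`, `Framed`, `frame` and `unframe` are hypothetical helpers written for illustration, not part of any Confluent library; in practice the serializer and deserializer do this for you.

```scala
import java.nio.ByteBuffer

// Result of splitting a message value: the registry's schema ID plus the raw Avro body.
final case class Framed(schemaId: Int, payload: Array[Byte])

object WireFormat {
  private val MagicByte: Byte = 0

  // Prepend the 5-byte header (magic byte + schema ID) to an already-encoded Avro body.
  def frame(schemaId: Int, avroBody: Array[Byte]): Array[Byte] = {
    val buf = ByteBuffer.allocate(1 + 4 + avroBody.length) // ByteBuffer is big-endian by default
    buf.put(MagicByte).putInt(schemaId).put(avroBody)
    buf.array()
  }

  // Split a message value back into its schema ID and Avro body.
  def unframe(message: Array[Byte]): Framed = {
    require(message.length >= 5, "message too short for the 5-byte header")
    val buf = ByteBuffer.wrap(message)
    require(buf.get() == MagicByte, "unknown magic byte")
    val id = buf.getInt()
    val body = new Array[Byte](buf.remaining())
    buf.get(body)
    Framed(id, body)
  }
}
```

So the per-message overhead is 5 bytes regardless of how large the schema is. For example, the Avro string "foo" encodes to 4 bytes, and `WireFormat.frame(42, ...)` turns that into a 9-byte message value.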

The full docs, including a quickstart guide for testing the Confluent Schema Registry, are here:

http://docs.confluent.io/current/schema-registry/docs/index.html

0
votes

You can register your Avro schema the first time with the following command from the command line:

curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json" \
        --data '{"schema": "{\"type\": \"string\"}"}' \
        http://localhost:8081/subjects/topic/versions

You can list all schema versions registered under that subject using

curl -X GET -i http://localhost:8081/subjects/topic/versions

To see the complete Avro schema for version 1 of that subject in the Confluent Schema Registry, use the command below; it returns the schema in JSON format.

curl -X GET -i http://localhost:8081/subjects/topic/versions/1

Registering the Avro schema is the job of the Kafka producer.

Once the schema is in the Confluent Schema Registry, you just need to publish Avro generic records to the specific Kafka topic, in our case 'topic'.
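As a rough sketch of that publishing step, the producer below uses the Confluent `KafkaAvroSerializer` for the message value. The broker address `localhost:9092`, the two-field `Row` schema, the sample CSV line and the helper names `producerProps` and `toRecord` are all assumptions made for illustration. The code also assumes the `kafka-clients`, `avro` and `kafka-avro-serializer` libraries are on the classpath.

```scala
import java.util.Properties
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object CsvAvroProducer {
  // Hypothetical schema standing in for the real schema of the delimited file.
  val schema: Schema = new Schema.Parser().parse(
    """{"type":"record","name":"Row","fields":[
      |  {"name":"id","type":"string"},
      |  {"name":"amount","type":"string"}
      |]}""".stripMargin)

  // Producer settings; the value serializer looks up or registers the schema in the registry.
  def producerProps(bootstrap: String, registryUrl: String): Properties = {
    val props = new Properties()
    props.put("bootstrap.servers", bootstrap)
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
    props.put("schema.registry.url", registryUrl)
    props
  }

  // Turn one comma-delimited line into a GenericRecord for the schema above.
  def toRecord(line: String): GenericRecord = {
    val cols = line.split(",", -1)
    val rec = new GenericData.Record(schema)
    rec.put("id", cols(0))
    rec.put("amount", cols(1))
    rec
  }

  def main(args: Array[String]): Unit = {
    val producer = new KafkaProducer[String, GenericRecord](
      producerProps("localhost:9092", "http://localhost:8081"))
    try {
      // Only the schema ID and the Avro-encoded fields travel with the message.
      producer.send(new ProducerRecord[String, GenericRecord]("topic", toRecord("42,19.99")))
      producer.flush()
    } finally producer.close()
  }
}
```

Running `main` needs a reachable broker and Schema Registry. The two helper functions can be exercised without either.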

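Kafka consumer: use the code below to fetch the latest schema for a specific Kafka topic. The subject name is the topic name plus "-value", which is the default naming the Confluent serializers use for message values.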

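import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient
import org.apache.avro.Schema

// The second argument (100) is the capacity of the client-side schema cache
val schemaReg = new CachedSchemaRegistryClient(kafkaAvroSchemaRegistryUrl, 100)
val schemaMeta = schemaReg.getLatestSchemaMetadata(kafkaTopic + "-value")
val schemaJson = schemaMeta.getSchema // the schema as a JSON string
val schema = new Schema.Parser().parse(schemaJson)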

The code above fetches the schema, which can then be used with the Confluent deserializer to decode records from the Kafka topic.
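For the decoding side, one possible sketch is to let the Confluent `KafkaAvroDeserializer` resolve the schema from the ID in each message, so the consumer gets `GenericRecord` values directly. The broker address, the group id `csv-reader` and the helper name `consumerProps` are assumptions for illustration. The code assumes the `kafka-clients`, `avro` and `kafka-avro-serializer` libraries are on the classpath.

```scala
import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.avro.generic.GenericRecord
import org.apache.kafka.clients.consumer.KafkaConsumer

object AvroTopicReader {
  // Consumer settings; the value deserializer fetches schemas by ID from the registry.
  def consumerProps(bootstrap: String, registryUrl: String, groupId: String): Properties = {
    val props = new Properties()
    props.put("bootstrap.servers", bootstrap)
    props.put("group.id", groupId)
    props.put("auto.offset.reset", "earliest")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer")
    props.put("schema.registry.url", registryUrl)
    props
  }

  def main(args: Array[String]): Unit = {
    val consumer = new KafkaConsumer[String, GenericRecord](
      consumerProps("localhost:9092", "http://localhost:8081", "csv-reader"))
    try {
      consumer.subscribe(Collections.singletonList("topic"))
      // Poll once and print whatever arrived; a real consumer would loop.
      val it = consumer.poll(Duration.ofSeconds(5)).iterator()
      while (it.hasNext) {
        val rec = it.next()
        println(s"${rec.key}: ${rec.value}")
      }
    } finally consumer.close()
  }
}
```

With this setup the explicit schema lookup is optional; it remains useful when you need the schema itself, for example to inspect field names.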