You're talking about two different ways to work with Avro schemas:
- Having Schema Registry store the schemas for you.
- Generating an .avsc file and making that available to downstream consumers.
In the first method, your producer has an .avsc file that it uses to serialize messages before sending them to Kafka. Because you're using Schema Registry, consumers don't need their own copy of the Avro definition: the whole schema can be fetched from Schema Registry using the schema ID carried with each message. You won't have the generated classes, true, but you can still "walk" the entire message and extract your data from it.
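To make the schema ID lookup concrete: Confluent's serializers prepend a small header to each message, a magic byte of 0 followed by the schema ID as a 4-byte big-endian integer, and the Avro payload comes after that. Here is a minimal sketch of reading that header. `split_confluent_message` is a hypothetical helper name, not part of any client library, and the example bytes are made up:

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of a message in the Confluent wire format


def split_confluent_message(raw: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_payload) for a Confluent-framed message."""
    if len(raw) < 5:
        raise ValueError("message too short to carry a schema ID header")
    # ">bI" = big-endian: 1 signed byte (magic) + 4-byte unsigned int (schema ID)
    magic, schema_id = struct.unpack(">bI", raw[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, raw[5:]


# Made-up message: schema ID 42 followed by three payload bytes.
example = bytes([0]) + (42).to_bytes(4, "big") + b"\x02\x06\x08"
schema_id, payload = split_confluent_message(example)
print(schema_id, payload)
```

In practice you would not write this yourself; the Avro deserializer that ships with the Confluent clients reads the header and fetches the schema by ID for you.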
In the second method, without a schema registry, the producer uses an .avsc file to serialize the data sent to Kafka as a byte array, and that file is then made available to consumer/downstream applications, usually through source control. Of course, this means your producer and consumers have to stay in sync whenever you make schema changes, or else your consumers won't be able to read the fields the producer has added or modified.
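The reason the two sides have to stay in sync is that Avro's binary encoding contains no field names or type tags, only the values in schema order. The sketch below hand-rolls a tiny subset of that encoding (longs and strings only) purely for illustration. It is not an Avro library, and the record layout and helper names are assumptions made up for this example:

```python
def encode_long(n: int) -> bytes:
    """Avro long: zigzag-encode, then write as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int):
    """Read one Avro long starting at pos; return (value, next_pos)."""
    z = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return (z >> 1) ^ -(z & 1), pos


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def decode_string(buf: bytes, pos: int):
    length, pos = decode_long(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


DECODERS = {"long": decode_long, "string": decode_string}


def decode_record(buf: bytes, fields):
    """Decode buf using an ordered list of (name, type) pairs as the schema."""
    pos, record = 0, {}
    for name, typ in fields:
        record[name], pos = DECODERS[typ](buf, pos)
    return record, buf[pos:]  # leftover bytes mean the schema did not match


# The producer has added a "name" field; the consumer still has the old schema.
producer_fields = [("id", "long"), ("name", "string")]
consumer_fields = [("id", "long")]

message = encode_long(7) + encode_string("ada")
print(decode_record(message, producer_fields))
print(decode_record(message, consumer_fields))
```

With the producer's field list the record decodes completely. With the stale consumer list the `name` field is never seen and its bytes are left over, which is exactly the situation you end up in when the shared .avsc file is out of date.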
So, if you're using Schema Registry, properly configured Kafka consumers will automatically pull the schema that each message requires, and you can then extract the data you need. Separately, you can also get the latest value schema for any topic (assuming the default <topic>-value subject naming) with something like this:
curl -X GET "http://schema-registry.company.com:8081/subjects/your_topic-value/versions/latest/schema"
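The same lookup can be done from code. Here is a minimal sketch using only the Python standard library, reusing the example host from the curl command and assuming the default `<topic>-value` / `<topic>-key` subject naming. `latest_schema_url` and `fetch_latest_schema` are hypothetical helper names:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def latest_schema_url(registry_url: str, topic: str, is_key: bool = False) -> str:
    """Build the URL of the latest registered schema for a topic's key or value."""
    suffix = "key" if is_key else "value"
    subject = quote(f"{topic}-{suffix}", safe="")
    base = registry_url.rstrip("/")
    return f"{base}/subjects/{subject}/versions/latest/schema"


def fetch_latest_schema(registry_url: str, topic: str):
    """Fetch and parse the latest value schema (performs an HTTP GET).

    For a record schema the parsed result is a dict with "name", "fields", etc.
    """
    with urlopen(latest_schema_url(registry_url, topic), timeout=10) as resp:
        return json.load(resp)


# Only builds the URL; call fetch_latest_schema(...) against a real registry.
print(latest_schema_url("http://schema-registry.company.com:8081", "your_topic"))
```

Keeping the URL construction separate from the HTTP call makes it easy to check the subject name you are about to request before pointing it at a real registry.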
If, however, you are not using Schema Registry, the only way to get the full schema is to have access to the .avsc file used to serialize the message, usually through source control, as mentioned above. You can then also share the auto-generated classes, if available, to deserialize your messages directly into classes.
For more information on how to interact with Schema Registry, here's a link to the documentation: https://docs.confluent.io/current/schema-registry/schema_registry_tutorial.html#using-curl-to-interact-with-schema-registry
And some reading on general schema compatibility and how it's handled/configured in Schema Registry - https://docs.confluent.io/current/schema-registry/avro.html