
I am learning about Confluent's Schema Registry for all Schema management needs.

I do not quite understand its approach to versioning. There is the notion of a subject, which I see as a namespace. As far as I understand, a subject must be unique across the Schema Registry.

Then there is schema id, or just id, which is also unique.

And, finally, there is a version.

Here is a snippet from the documentation:

version: the schema version for this subject, which starts at 1 for each subject

id: the globally unique schema version id, unique across all schemas in all subjects

So, once I want to modify a schema under a particular subject, what happens to id and version fields? Does id change? Does the version get incremented?

Another quote:

When schemas evolve, they are still associated to the same subject but get a new schema ID and version

Does every change warrant a new id and a new version?


1 Answer


Every subject has a list of versions. You can verify this in the source code, if you wish.

If two subjects share the same schema, the schema ID is the same, although the version within two different subjects could be different. See example below for this case.

Every unique schema (as defined by its textual representation) has a unique (possibly incremental) ID. Schemas are MD5-hashed, or "fingerprinted", for uniqueness, and the hashes are compared globally across a Schema Registry cluster. This is done with the equivalent of a ConcurrentHashMap<String, Schema>, where the key is the hash of the Schema object stored as the value.


Example, using sub, v, and s for subject, version, and schema:

  1. Create sub1, that makes v1:s1
  2. Update it to create v2:s2
  3. Take that same schema and use it to create sub2

sub1 : [ v1:s1, v2:s2 ]
sub2 : [ v1:s2 ]
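
If it helps, here is a toy in-memory model of the bookkeeping described above. It is only a sketch and not the actual Schema Registry code: `ToyRegistry`, `register`, and the two sample schemas are names I made up for illustration, and I am assuming sequential IDs and that re-registering an identical schema under the same subject simply returns the existing version.

```python
import hashlib


class ToyRegistry:
    """In-memory toy model; NOT the real Schema Registry implementation."""

    def __init__(self):
        # MD5 hex digest of the schema text -> global schema ID (assumed sequential)
        self._id_by_fingerprint = {}
        # subject -> list of schema IDs; list position + 1 is the version
        self._subjects = {}

    def register(self, subject, schema_text):
        """Register schema_text under subject and return (version, schema_id)."""
        fingerprint = hashlib.md5(schema_text.encode("utf-8")).hexdigest()
        # Reuse the global ID if this exact schema text was seen before,
        # otherwise hand out the next ID.
        schema_id = self._id_by_fingerprint.setdefault(
            fingerprint, len(self._id_by_fingerprint) + 1
        )
        versions = self._subjects.setdefault(subject, [])
        # Only a schema that is new to this subject creates a new version.
        if schema_id not in versions:
            versions.append(schema_id)
        return versions.index(schema_id) + 1, schema_id


registry = ToyRegistry()
s1 = '{"type": "string"}'  # hypothetical sample schema
s2 = '{"type": "long"}'    # hypothetical sample schema

first = registry.register("sub1", s1)   # step 1: sub1 gets v1:s1
second = registry.register("sub1", s2)  # step 2: sub1 gets v2:s2
third = registry.register("sub2", s2)   # step 3: sub2 gets v1:s2
repeat = registry.register("sub1", s2)  # same schema again: nothing new

print(first, second, third, repeat)
# prints: (1, 1) (2, 2) (1, 2) (2, 2)
```

Each tuple is (version, id). The id follows the schema text across subjects, while the version only counts positions within one subject, which is why s2 is version 2 in sub1 but version 1 in sub2.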