Avro ships with a tool called avro-tools that can convert between JSON, Avro schema (.avsc), and Avro binary formats. However, I cannot get it to work with circular references.
We have two files:
circular.avsc (generated by Avro)
circular.json (generated by Jackson, because the object has a circular reference and Avro cannot serialize it).
circular.avsc
{
"type":"record",
"name":"Parent",
"namespace":"bigdata.example.avro",
"fields":[
{
"name":"name",
"type":[
"null",
"string"
],
"default":null
},
{
"name":"child",
"type":[
"null",
{
"type":"record",
"name":"Child",
"fields":[
{
"name":"name",
"type":[
"null",
"string"
],
"default":null
},
{
"name":"parent",
"type":[
"null",
"Parent"
],
"default":null
}
]
}
],
"default":null
}
]
}
circular.json
{
"@class":"bigdata.example.avro.Parent",
"@circle_ref_id":1,
"name":"parent",
"child":{
"@class":"bigdata.example.avro.DerivedChild",
"@circle_ref_id":2,
"name":"hello",
"parent":1
}
}
Command used to run avro-tools on these files:
java -jar avro-tools-1.7.6.jar fromjson --schema-file circular.avsc circular.json
Output
2014-06-09 14:29:17.759 java[55860:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema? {"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"child","type":["null",{"type":"record","name":"Child","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"parent","type":["null","Parent"],"default":null}]}],"default":null}]}?'???K?jH!??Ė?
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
Other JSON values I tried with the same schema, none of which worked:
JSON 1
{
"name":"parent",
"child":{
"name":"hello",
"parent":null
}
}
JSON 2
{
"name":"parent",
"child":{
"name":"hello"
}
}
JSON 3
{
"@class":"bigdata.example.avro.Parent",
"@circle_ref_id":1,
"name":"parent",
"child":{
"@class":"bigdata.example.avro.DerivedChild",
"@circle_ref_id":2,
"name":"hello",
"parent":null
}
}
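A possible reading of the "Expected start-union. Got VALUE_STRING" error: every JSON variant above passes "name" as a bare string, while the Avro specification's JSON encoding writes a non-null union value as a one-entry object keyed by the branch type name (null stays a plain null). The sketch below is plain Java with no Avro dependency; it only builds the text that encoding would expect for the first schema. The class and method names are hypothetical, the "@class" and "@circle_ref_id" keys are left out because the schema does not declare them, and the back-reference is set to null because I am assuming the schema has no way to express "parent":1. I have not run the result through avro-tools 1.7.6.

```java
// Sketch only: builds JSON text in the shape Avro's JSON encoding uses for
// union-typed fields. No Avro classes are involved; all names are hypothetical.
final class UnionJsonSketch {

    // A null union value is written as plain null; any other branch is
    // written as a one-entry object: {"<branch type name>": <value>}.
    static String union(String branchName, String jsonValue) {
        if (jsonValue == null) {
            return "null";
        }
        return "{\"" + branchName + "\":" + jsonValue + "}";
    }

    // Naive quoting without escaping; sufficient for the sample values here.
    static String quote(String s) {
        return "\"" + s + "\"";
    }

    // Builds a Parent value for the first circular.avsc, where every field
    // is a ["null", T] union. Record branches are keyed by their full name
    // (assumption: Child inherits the namespace bigdata.example.avro).
    static String parentJson(String parentName, String childName) {
        String child = "{\"name\":" + union("string", quote(childName))
                + ",\"parent\":" + union("bigdata.example.avro.Parent", null)
                + "}";
        return "{\"name\":" + union("string", quote(parentName))
                + ",\"child\":" + union("bigdata.example.avro.Child", child)
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(parentJson("parent", "hello"));
    }
}
```

Running main prints {"name":{"string":"parent"},"child":{"bigdata.example.avro.Child":{"name":{"string":"hello"},"parent":null}}}, which could be saved as circular.json and tried with the same fromjson command.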
I then made the fields non-optional by replacing each ["null", ...] union with the plain type:
circular.avsc
{
"type":"record",
"name":"Parent",
"namespace":"bigdata.example.avro",
"fields":[
{
"name":"name",
"type":
"string",
"default":null
},
{
"name":"child",
"type":
{
"type":"record",
"name":"Child",
"fields":[
{
"name":"name",
"type":
"string",
"default":null
},
{
"name":"parent",
"type":
"Parent",
"default":null
}
]
},
"default":null
}
]
}
circular.json
{
"@class":"bigdata.example.avro.Parent",
"@circle_ref_id":1,
"name":"parent",
"child":{
"@class":"bigdata.example.avro.DerivedChild",
"@circle_ref_id":2,
"name":"hello",
"parent":1
}
}
Output
2014-06-09 15:30:53.716 java[56261:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":"string","default":null},{"name":"child","type":{"type":"record","name":"Child","fields":[{"name":"name","type":"string","default":null},{"name":"parent","type":"Parent","default":null}]},"default":null}]}?x?N??O"?M?`Ab
Exception in thread "main" java.lang.StackOverflowError
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:212)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
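I have not traced Symbol.flattenedSize, so I cannot say exactly why it recurses. One thing that can be checked independently of Avro is that the second schema has no finite value at all: a Parent must contain a Child, which must contain a Parent, and so on. The toy model below is not Avro code and every name in it is hypothetical. It marks a record as "finite" once each of its fields is a primitive, is nullable, or refers to a record already known to be finite.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model, not Avro code: checks whether mutually recursive record types
// can have a finite value at all.
final class FiniteInstanceSketch {

    // One field: the record type it refers to (null for a primitive such as
    // string) and whether the field is a union with "null".
    static final class Field {
        final String recordType;
        final boolean nullable;

        Field(String recordType, boolean nullable) {
            this.recordType = recordType;
            this.nullable = nullable;
        }
    }

    // Iterates to a fixed point, adding records whose fields can all be
    // satisfied by a primitive, by null, or by an already-finite record.
    static Set<String> finiteRecords(Map<String, List<Field>> schema) {
        Set<String> finite = new HashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<Field>> entry : schema.entrySet()) {
                if (finite.contains(entry.getKey())) {
                    continue;
                }
                boolean satisfiable = true;
                for (Field f : entry.getValue()) {
                    if (f.recordType != null && !f.nullable
                            && !finite.contains(f.recordType)) {
                        satisfiable = false;
                        break;
                    }
                }
                if (satisfiable) {
                    finite.add(entry.getKey());
                    changed = true;
                }
            }
        }
        return finite;
    }

    // Models the Parent/Child pair from the question, with every field
    // either nullable (first schema) or required (second schema).
    static Map<String, List<Field>> parentChild(boolean nullable) {
        Map<String, List<Field>> schema = new HashMap<>();
        schema.put("Parent", Arrays.asList(
                new Field(null, nullable), new Field("Child", nullable)));
        schema.put("Child", Arrays.asList(
                new Field(null, nullable), new Field("Parent", nullable)));
        return schema;
    }

    public static void main(String[] args) {
        // First schema: both records can terminate with a null reference.
        System.out.println(finiteRecords(parentChild(true)).contains("Parent"));  // true
        // Second schema: neither record can ever terminate.
        System.out.println(finiteRecords(parentChild(false)).contains("Parent")); // false
    }
}
```

If this model is a fair picture of the two schemas, the non-nullable version cannot describe any finite record, so at least one link in the cycle has to stay a union with "null".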
Does anyone know how I can make circular references work with Avro?