0
votes

I have a set of Avro data files in HDFS, and I need to generate Avro schema files for them. I tried researching an approach using Spark (https://github.com/databricks/spark-avro/blob/master/src/main/scala/com/databricks/spark/avro/SchemaConverters.scala).

Is there any way to do this other than copying the Avro data file to the local machine and then doing an HDFS PUT?

Any suggestions are welcome. Thanks!


1 Answer

1
votes

Every Avro file embeds the Avro schema it was written with. You can extract this schema using avro-tools.jar (downloadable from Maven). You only need to download one part file (assuming all the other files were written with the same schema) and run avro-tools (java -jar ~/workspace/avro-tools-1.7.7.jar getschema xxx.avro) to extract it.
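Because the schema sits in the file header, it can also be read from just the first bytes of a file. The following is a minimal sketch in Python, not what avro-tools does internally: it assumes the standard Avro object container layout (the magic bytes `Obj\x01` followed by a metadata map containing an `avro.schema` entry), and the function name `read_avro_schema` is made up for this illustration.

```python
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def _read_long(stream):
    """Decode one Avro long (variable-length, zigzag encoded)."""
    shift = 0
    acc = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of Avro header")
        byte = chunk[0]
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this long
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo zigzag encoding


def _read_bytes(stream):
    """Read a length-prefixed byte string (Avro 'bytes' or 'string')."""
    length = _read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_schema(stream):
    """Return the writer schema (parsed JSON) from a binary file-like object."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = _read_long(stream)
        if count == 0:  # a zero-length block ends the metadata map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            _read_long(stream)  # size is not needed here, skip it
        for _ in range(count):
            key = _read_bytes(stream).decode("utf-8")
            meta[key] = _read_bytes(stream)
    return json.loads(meta["avro.schema"].decode("utf-8"))
```

One possible use, assuming the `hdfs` client is on the PATH: pipe `hdfs dfs -cat` of a single part file into a small script that calls `read_avro_schema(sys.stdin.buffer)` and prints the result with `json.dumps`. Only the header is consumed, so the data file does not need to be copied to local disk first.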