I understand the SerDe that Hive provides for Avro schemas, and I am comfortable using Avro with Hive.
For instance, I found this issue filed against Presto: https://github.com/prestodb/presto/issues/5009
I need to choose components that give a fast execution cycle. Presto and Impala both offer much shorter execution cycles. Could anyone clarify which of the two performs better with different data formats? Primarily, I am looking for Avro support in Presto right now.
However, let's consider the following data formats stored on HDFS:
- Avro format
- Parquet format
- ORC format
Which one (Presto or Impala) gives the best performance across these data formats? Any suggestions are welcome.