22
votes

Is there any way to convert a Hive query result to JSON format?

5
Seems like I am not the only one looking for this solution. Thanks for sharing the project. Will see how it can be useful. - divinedragon
@Steve I checked out hive-json-serde. It seems to be for reading and processing JSON data. I was looking for something where my data is tab-delimited, but when I query it from Hive, the query output is in JSON format. - divinedragon
Hi divinedragon, have you found a solution already? I am looking for one as well; can you share it if you have, please? Thanks! - Ming
Actually, I have put that on hold due to some higher-priority work. Will start looking into it again. - divinedragon

5 Answers

12
votes

This seems to come up quite often. Use the to_json UDF from Brickhouse (http://github.com/klout/brickhouse). If you convert your results to a named_struct, it will interpret it as a JSON map and output accordingly.

SELECT to_json( named_struct( "field1", field1,
                              "field2", field2,
                              "field3", field3 ) )
  FROM mytable;

The to_json will also interpret arrays and maps accordingly.

1
votes

I used a tool called Apache NiFi. It has an AvroToJSON processor, so Hive output in Avro format can easily be converted to JSON. The link below may be helpful: https://nifi.apache.org/

0
votes

My approach would be to use the Jackson library (http://jackson.codehaus.org/): you create a POJO that maps to the JSON format. Once you get the ResultSet from your Hive query, you iterate through it and create objects of the POJO using Jackson.

/**-- Jackson POJO --**/
// Jackson 1.x (codehaus); for Jackson 2.x use com.fasterxml.jackson.annotation.JsonProperty
import org.codehaus.jackson.annotate.JsonProperty;

public class Item {
    @JsonProperty
    private String att1;
    @JsonProperty
    private String att2;

    public Item(String att1, String att2) {
        this.att1 = att1;
        this.att2 = att2;
    }
}

/**-- the class where you run your query --**/
List<Item> list = new ArrayList<Item>();
ResultSet rs = executeQuery(queryStr); // run your Hive query here
while (rs.next()) {                    // next() is a method call
    String att1 = rs.getString("att1"); // ResultSet uses typed getters
    String att2 = rs.getString("att2");
    list.add(new Item(att1, att2));
}

Then you can return the List of Item as the result; Jackson lets you write it out in JSON format very easily:

  1. Create an ObjectMapper: ObjectMapper mapper = new ObjectMapper();. The mapper gives you many options for writing an object as JSON to different targets, e.g. an OutputStream or ByteBuffer.

  2. Iterate through the list.

  3. Use the mapper to write each value in JSON format, e.g. mapper.writeValue(out, item), where out is an OutputStream.
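The steps above can be sketched end to end like this (a minimal, self-contained example using Jackson 2.x; the Item class mirrors the POJO above, and writeValueAsString is used instead of an OutputStream for brevity):

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.ArrayList;
import java.util.List;

public class HiveToJson {
    // Same shape as the Item POJO above
    static class Item {
        @JsonProperty
        private String att1;
        @JsonProperty
        private String att2;

        Item(String att1, String att2) {
            this.att1 = att1;
            this.att2 = att2;
        }
    }

    public static String toJson(List<Item> list) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // writeValueAsString serializes the whole list as one JSON array;
        // mapper.writeValue(outputStream, list) does the same for streams
        return mapper.writeValueAsString(list);
    }

    public static void main(String[] args) throws Exception {
        List<Item> list = new ArrayList<>();
        list.add(new Item("a", "b"));
        list.add(new Item("c", "d"));
        System.out.println(toJson(list));
        // [{"att1":"a","att2":"b"},{"att1":"c","att2":"d"}]
    }
}
```

Note that you can hand the whole list to the mapper at once rather than iterating yourself; Jackson serializes any Collection as a JSON array.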

0
votes

The easiest way would be to first store the result as TSV and then use a library like spray-json to convert it to JSON.

Hive works fine as long as you don't try to customize too much. If you do have access to Spark on your cluster, use Spark SQL instead of Hive. The result of your query will be stored in a SchemaRDD, and from there you can easily map it to JSON with .map(_.toJson), assuming you have spray-json imported in scope.
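To make the TSV-to-JSON step concrete, here is a stdlib-only Java sketch (not spray; the tsvRowToJson helper and its escaping are my own, for illustration only):

```java
public class TsvToJson {
    // Escape characters that are special inside a JSON string
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Convert one tab-delimited row into a flat JSON object,
    // pairing each value with the corresponding column name
    public static String tsvRowToJson(String[] columns, String row) {
        String[] values = row.split("\t", -1); // -1 keeps trailing empty fields
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) sb.append(",");
            sb.append("\"").append(esc(columns[i])).append("\":\"")
              .append(esc(i < values.length ? values[i] : "")).append("\"");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(tsvRowToJson(new String[]{"field1", "field2"}, "a\tb"));
        // {"field1":"a","field2":"b"}
    }
}
```

This treats every value as a string; a real converter would also need to handle Hive's NULL marker (\N) and non-string column types.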

0
votes

List of JSON objects:

If you want to convert the output to JSON format and then make a collection out of it, i.e. [{json1},{json2},{json3}], you need to know that the collect() function treats the to_json output as a plain string, which is not what you want.

So, to get a collection of JSON-formatted output, wrap the values with the from_json function too:

create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';
create temporary function to_json as 'brickhouse.udf.json.ToJsonUDF';
create temporary function from_json as 'brickhouse.udf.json.FromJsonUDF';
collect(from_json(to_json(named_struct("locale", locale, "createdtime", created_time)), 'map<string,string>')) as list_json_object,