3
votes

I am using Apache Drill to run SQL queries on a HBase table. The value in one of the columns is:

0: jdbc:drill:schema:hbase:zk=localhost> select cast(address['street'] as varchar(20)) from hbase.students;
+------------+
|   EXPR$0   |
+------------+
| {"id": 123} |
+------------+
1 row selected (0.507 seconds)

I would like to access the id field using a query. Something like:

0: jdbc:drill:schema:hbase:zk=localhost> select tbl.address['street']['id'] from hbase.students as tbl;
+------------+
|   EXPR$0   |
+------------+
| null       |
+------------+

As you can see, this does not work. I am run to similar queries on JSON data in a file. My question is can I query JSON data in HBase.

2

2 Answers

1
votes

OK. I found the answer to this question, in case someone else has the same requirement.

The first step is to convert the HBase data to JSON using the built-in convert_from() function. A view can be created against which the queries can be run.

> create or replace view Street as select convert_from(Students.address.street, 'JSON') json from hbase.Customer;

Then, run query against the view

> select * from Street;
> select Street.json.id from Street;
1
votes

You can also use a subquery to convert the data in your HBase column into JSON:

select t.json.id 
from   (select convert_from(Students.address.street, 'JSON') json 
        from   hbase.Customer) t;