
We have JSON data stored under a single column family in HBase, made up of several name/value pairs. We query this data with different name/value combinations, and the queries do not favor any particular name/value pairs, which makes it difficult to split the data into column families.

  1. What would be the best way to improve the performance of these queries? Would something like secondary indexes, Impala, or Phoenix help?
  2. Would it help to divide the data into multiple column families? Since HBase works best with 2 or 3 column families, I am not sure this is the right thing to do.
  3. What would be a good system for storing nested or JSON data with good query performance? Would something like Apache Drill help?

1 Answer


If you have JSON-type data, I recommend looking into MongoDB. It stores JSON-like documents and supports secondary indexes on fields at any depth of nesting.
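
To make the idea of a secondary index on nested fields concrete, here is a minimal sketch in plain Python. It is not how MongoDB or HBase implement indexing; it only illustrates the concept of mapping each (field path, value) pair back to the rows that contain it, so that arbitrary name/value combinations can be answered by set intersection instead of a full scan. The helper names (`flatten`, `build_index`, `query`) and the sample rows are hypothetical, and the sketch assumes every leaf value is a hashable scalar (no arrays).

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def build_index(rows):
    """Map each (path, value) pair to the set of row keys containing it."""
    index = defaultdict(set)
    for row_key, doc in rows.items():
        for path, value in flatten(doc):
            index[(path, value)].add(row_key)
    return index


def query(index, conditions):
    """Return row keys matching ALL given {path: value} conditions."""
    matches = [index.get((path, value), set()) for path, value in conditions.items()]
    return set.intersection(*matches) if matches else set()


# Hypothetical sample data: row key -> JSON document.
rows = {
    "row1": {"user": {"name": "ann", "city": "Oslo"}, "status": "active"},
    "row2": {"user": {"name": "bob", "city": "Oslo"}, "status": "inactive"},
    "row3": {"user": {"name": "cat", "city": "Rome"}, "status": "active"},
}

index = build_index(rows)
result = query(index, {"user.city": "Oslo", "status": "active"})
```

The trade-off is the usual one for secondary indexes: every write has to update the index as well, and indexing every path costs extra storage, in exchange for lookups that no longer depend on how the data is laid out in column families.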