I need to retrieve 10 million rows from Hive.
String sql = "select * from table_name";
List<Map<String, Object>> resultSet = jdbctemplate.queryForList(sql);
The above approach works well for retrieving 1 million rows at once (a single query) with 2 GB of heap memory. It takes only 3-4 minutes to select all records from a 30 MB table (1 million rows).
But for more than 1 million records, I run into memory issues and the query takes much longer.
I need to query Hive with OFFSET values, but Hive 1.2.1 does not seem to support an OFFSET clause.
Is there any other way to select records from Hive in batches, for example the first 10K records, then the next 10K, and so on?
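To make the kind of batching I have in mind concrete, here is a minimal sketch, not a tested solution. It emulates OFFSET with the ROW_NUMBER() window function and only builds the SQL strings. The helper name pagedQuery and the ordering column id are hypothetical; it assumes the table has some column that gives a stable ordering.

```java
public class HiveBatchQuery {

    // Hypothetical helper: builds a query that returns rows
    // offset+1 .. offset+batchSize, numbered by ROW_NUMBER().
    // orderColumn is assumed to give a stable, unique ordering.
    public static String pagedQuery(String table, String orderColumn,
                                    long offset, int batchSize) {
        long first = offset + 1;
        long last = offset + batchSize;
        return "SELECT * FROM ("
             + "SELECT t.*, ROW_NUMBER() OVER (ORDER BY " + orderColumn + ") AS rn "
             + "FROM " + table + " t) numbered "
             + "WHERE rn BETWEEN " + first + " AND " + last;
    }

    public static void main(String[] args) {
        int batchSize = 10_000;
        // Print the queries for the first three batches; in the real code
        // each string would be passed to jdbctemplate.queryForList(...).
        for (long offset = 0; offset < 30_000; offset += batchSize) {
            System.out.println(pagedQuery("table_name", "id", offset, batchSize));
        }
    }
}
```

Each batch would hold only 10K rows in memory, but every query presumably re-sorts the whole table to assign row numbers, so I am not sure this is any faster overall.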