I have a huge Hive table that a MapReduce job fails to process because of insufficient Java heap size on a single-node local installation. I cannot increase the YARN heap size because the node lacks physical memory. As a workaround, I was thinking of splitting this huge table into several smaller tables of approximately equal size and the same structure (schema): say, 20,000,000 records into 5 tables of 4,000,000 records each.
What SQL statement would split a Hive table this way?
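To make the idea concrete, here is a sketch of the kind of split I have in mind (the table name `big_table` and the key column `id` are placeholders for my actual schema):

```sql
-- Hypothetical source table `big_table` with a key column `id`.
-- Assign each row to one of 5 buckets by hashing the key, then
-- materialize one bucket per target table.
CREATE TABLE big_table_part0 AS
SELECT * FROM big_table WHERE pmod(hash(id), 5) = 0;

-- Repeat with ... = 1, 2, 3, 4 for the other four tables.
```

I am not sure whether this hash-based approach, or something like `ROW_NUMBER()` with a modulo, is the idiomatic way to do this in Hive, so corrections are welcome.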