I need to read large excel files and import their data to my application.
Since POI takes up a large amount of heap to work, often throwing OutOfMemory
errors, I found out that there is a Streaming API for handling excel data in a serial fashion (rather than loading the file completely into memory)
I created a xlsx
workbook, with a single worksheet, and typed in several values in cells and came up with the following code to attempt reading it:
public static void main(String[] args) throws Throwable {
// keep 100 rows in memory, exceeding rows will be flushed to disk
SXSSFWorkbook wb = new SXSSFWorkbook(new XSSFWorkbook(new FileInputStream("C:\\test\\tst.xlsx")));
SXSSFSheet sheet = (SXSSFSheet) wb.getSheetAt(0);
Row row = sheet.getRow(0);
//row is always null
while(row.iterator().hasNext()){ //-> NullPointerException
System.out.println(row.getCell(0).getStringCellValue());
}
}
However, despite being able to get its worksheets properly, it always comes with empty (null
) rows.
I have researched and found out several examples of the Streaming API in the internet, but none of them are about reading existing files, they're all about generating excel files.
Is it actually possible to read data from existing .xlsx
files in a stream?
-Xmx
to increase heapspace, I've already increased it to 2GB, and POI couldn't read a 20MB file, so this makes it not feasible. but thanks anyway! – bruno_cw