
I have a huge Excel file with many columns, which looks like this:

Column1 Column2 Column3 Column4 Column5
abc             def             ghi
        mno             pqr
......

When I print all the values in the Excel file, my code produces this output:

abc;def;ghi;null;null

mno;pqr;null;null;null

Looking at the output above, the cells I left blank were not picked up by the POI library: the values are shifted to the left and the nulls only appear at the end of each row. Is there a way to get these blank cells as null? Or a way to recognize that blank cells were skipped between the values?

Please note: I am not using the usermodel (org.apache.poi.ss.usermodel) but the Event API to process xls and xlsx files.

I am implementing HSSFListener and overriding its processRecord(Record record) method for xls files. For xlsx files I am using javax.xml.parsers.SAXParser and org.xml.sax.XMLReader.

I am using JDK7 with Apache POI 3.7. Can someone please help?

I have already seen the possible duplicate "How to get an Excel Blank Cell Value in Apache POI?", but it doesn't answer my question because I am using the Event API.

What Event API are you using? - Buhake Sindi
It's the Event API of Apache POI, which I use to address a memory footprint issue. - ParagJ

1 Answer


Yes, it can be done, and several examples that ship with Apache POI show how. They all do event-based xls / xlsx -> CSV conversion, which looks very close to what you're doing. That makes me worry you may be re-inventing the wheel...

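For HSSF event model processing, the example you want to look at is XLS2CSVmra. It is powered by MissingRecordAwareHSSFListener, which wraps your own listener and passes it extra dummy records for the cells and rows that are missing from the file.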
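
The gap-filling idea behind that listener can be sketched without POI on the classpath. This is only a rough model, not POI's implementation: CellEvent and BlankCellFiller are hypothetical names, and the sketch assumes that each cell event carries a zero-based row and column (as POI's cell records do) and that events arrive in row order. Because every row starts as an array of nulls, any column that never receives an event stays null in its correct position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a cell record; only row, column and value matter here. */
final class CellEvent {
    final int row;
    final int column;
    final String value;

    CellEvent(int row, int column, String value) {
        this.row = row;
        this.column = column;
        this.value = value;
    }
}

/** Hypothetical helper: rebuilds fixed-width rows, leaving null where no cell was reported. */
final class BlankCellFiller {

    /** Assumes events are grouped by row. Rows with no cells at all are not emitted. */
    static List<String> toLines(List<CellEvent> events, int width) {
        List<String> lines = new ArrayList<String>();
        String[] cells = null;
        int currentRow = -1;
        for (CellEvent e : events) {
            if (cells == null || e.row != currentRow) {
                if (cells != null) {
                    lines.add(join(cells));
                }
                cells = new String[width]; // every slot starts out as null
                currentRow = e.row;
            }
            if (e.column >= 0 && e.column < width) {
                cells[e.column] = e.value; // place the value by column, not by arrival order
            }
        }
        if (cells != null) {
            lines.add(join(cells));
        }
        return lines;
    }

    private static String join(String[] cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                sb.append(';');
            }
            sb.append(cells[i]); // a null String is appended as the text "null"
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<CellEvent> events = new ArrayList<CellEvent>();
        events.add(new CellEvent(0, 0, "abc"));
        events.add(new CellEvent(0, 2, "def"));
        events.add(new CellEvent(0, 4, "ghi"));
        events.add(new CellEvent(1, 1, "mno"));
        events.add(new CellEvent(1, 3, "pqr"));
        for (String line : toLines(events, 5)) {
            System.out.println(line);
        }
        // prints:
        // abc;null;def;null;ghi
        // null;mno;null;pqr;null
    }
}
```

With the real listener you would not need this bookkeeping yourself: the dummy records it emits tell you where the gaps are.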

For the XSSF event model, the example you need is XLSX2CSV.
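
For the xlsx side, blank cells can be detected from the cell reference: each `c` element in the sheet XML normally carries an `r` attribute such as "C1", so the column can be computed instead of counted. Below is a minimal sketch of that idea using only the JDK's SAX parser. It is not the code of XLSX2CSV; SheetXmlBlankCells is a hypothetical name, the row width is assumed to be known up front, and the sample uses inline strings so it stays self-contained (real files usually store strings in the shared strings table, where `v` holds an index that you must look up).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: places each cell by its reference so skipped cells stay null. */
final class SheetXmlBlankCells extends DefaultHandler {
    private final int width;
    private final List<String> lines = new ArrayList<String>();
    private String[] cells;
    private int column = -1;
    private StringBuilder text; // non-null only while inside <v> or <t>

    SheetXmlBlankCells(int width) {
        this.width = width;
    }

    /** Converts the letters of a cell reference to a zero-based column: "A1" -> 0, "C7" -> 2. */
    static int columnIndex(String cellRef) {
        int col = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = cellRef.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                break; // reached the row number
            }
            col = col * 26 + (ch - 'A' + 1);
        }
        return col - 1;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("row".equals(qName)) {
            cells = new String[width]; // every slot starts out as null
            column = -1;
        } else if ("c".equals(qName)) {
            String ref = attrs.getValue("r");
            // Without a reference, fall back to "next column after the previous cell".
            column = (ref != null) ? columnIndex(ref) : column + 1;
        } else if ("v".equals(qName) || "t".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("v".equals(qName) || "t".equals(qName)) {
            if (cells != null && column >= 0 && column < width) {
                cells[column] = text.toString();
            }
            text = null;
        } else if ("row".equals(qName)) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < cells.length; i++) {
                if (i > 0) {
                    sb.append(';');
                }
                sb.append(cells[i]); // a null String is appended as the text "null"
            }
            lines.add(sb.toString());
        }
    }

    /** Parses sheet XML held in a string; assumes unprefixed element names (row, c, v, t). */
    static List<String> parse(String sheetXml, int width) {
        SheetXmlBlankCells handler = new SheetXmlBlankCells(width);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(sheetXml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
        return handler.lines;
    }

    private static String cell(String ref, String value) {
        return "<c r='" + ref + "' t='inlineStr'><is><t>" + value + "</t></is></c>";
    }

    public static void main(String[] args) {
        String xml = "<worksheet><sheetData>"
                + "<row r='1'>" + cell("A1", "abc") + cell("C1", "def") + cell("E1", "ghi") + "</row>"
                + "<row r='2'>" + cell("B2", "mno") + cell("D2", "pqr") + "</row>"
                + "</sheetData></worksheet>";
        for (String line : parse(xml, 5)) {
            System.out.println(line);
        }
        // prints:
        // abc;null;def;null;ghi
        // null;mno;null;pqr;null
    }
}
```

In a real workbook you would feed the handler the sheet stream obtained through POI rather than a string, and resolve shared-string indexes before storing the value.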