1
votes

I am trying to use Apache POI to read in old (pre-2007 and XLS) Excel files. My program goes to the end of the rows and iterates back up until it finds something that's not either null or empty. Then it iterates back up a few times and grabs those cells. This program works just fine reading in XLSX and XLS files made in Office 2010.

I get the following error message:

Exception in thread "main" java.lang.NumberFormatException: empty String
    at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
    at java.lang.Double.parseDouble(Unknown Source)

at the line:

num = Double.parseDouble(str);

from the code:

str = cell.toString();

if (str != "" || str != null) {
    System.out.println("Cell is a string");
    num = Double.parseDouble(str);
} else {
    System.out.println("Cell is numeric.");
    num = cell.getNumericCellValue();
}

where the cell is the last cell in the document that's not empty or null. When I try to print the first cell that's not empty or null, it prints nothing, so I think I'm not accessing it correctly.

3
Why aren't you using the cell type to identify if it's a string or a number or empty? - Gagravarr
I tried that, when i do it returns an answer, and when i treat it as the returned answer the cell is still showing up as "blank". I've open the excel file its referencing and the numbers are correct and the cells that its trying to read are not blank. - Nick
Are you sure you're not off-by-one? Excel starts at A1, POI starts at Row 0, Column 0. (POI has utility methods to convert excel references to row/column to avoid this sort of mistake) - Gagravarr
I actually checked the off-by-one idea by over-stepping it a few spaces. The data starts at row 54 or something like that and runs all the way back up to 10, and i rolled it back to the 40's and it was still returning "" on every one. - Nick

3 Answers

3
votes

Perhaps the reason why you're reading in blank cells is due to the usage of the right subcomponent of Apache POI to read Excel files. Use HSSF (Horrible SpreadSheet Format) for XLS formats and XSSF (XML SpreadSheet Format) for XLSX formats.


As for the code itself, you may want to refine your boolean expression. The way you have it now, since you're using the or operator (||),

  • the first part of your if statement will execute if str != null, and
  • the else part of your if statement will execute if str == null.

The first part of the if statement will throw a NumberFormatException in calling Double.parseDouble if str cannot be parsed as a number.

Maybe the following code snippet will help you:

if (str == null || str.trim().isEmpty()) {
    // handle null and empty strings
} else if (cell.getType() == Cell.CELL_TYPE_NUMERIC) {
    System.out.println("Cell is numeric.");
    num = cell.getNumericCellValue();
} else {
    // If the cell is not numeric, Double.parseDouble(str) 
    // will most likely throw a NumberFormatException
}

To find out more about Cell, read its Javadoc.

2
votes

It would be better to evaluate the cell type and then do what you need. I use this code to handle the cell data (check that I even handle blank cells):

switch (cell.getCellType()) {
    case Cell.CELL_TYPE_STRING:
        str = cell.toString().trim();
        break;
    case Cell.CELL_TYPE_NUMERIC:
        if (DateUtil.isCellDateFormatted(cell)) {
            //you should change this to your application date format
            objSimpleDateFormat = new SimpleDateFormat("dd/MM/yyyy");
            str = objSimpleDateFormat.format(cell.getDateCellValue());
        } else {
            num = cell.getNumericCellValue();
            str = String.valueOf(cell.getNumericCellValue());
        }
        break;
    case Cell.CELL_TYPE_BLANK:
        str = "";
        break;
    case Cell.CELL_TYPE_ERROR:
        str = "";
        break;
    case Cell.CELL_TYPE_BOOLEAN:
        str = String.valueOf(cell.getBooleanCellValue());
    break;
}
0
votes

It would be nice if we all know what code line number caused the exception.

I suspect your first line of code is the cause. Object cell could be null, and a null address can not be converted to a String type. You can check by code.

Note: It's good that the code works with Office 2010 but I think this type of problem can happen in any Excel version.