7
votes

Does Apache POI provide any function for us to count the number of "not-null" rows in a sheet of an Excel file?
At the first time, I have an Excel sheet with 10 data rows, the function worksheet.getPhysicalNumberOfRows() returns the exact number (10). But after that, I delete 3 rows, then that function still gets 10 rows. Maybe the total number of rows was cached anywhere by POI. What does getPhysicalNumberOfRows() mean? As its API described: "Returns the number of physically defined rows (NOT the number of rows in the sheet)", but I do not understand what "physically defined" mean. Can you help me on this issue?
Thank you so much!

5

5 Answers

3
votes

If you delete the rows via worksheet.removeRow(Row row), then the physical row count should be 7.

POI uses a map to store the rows of a sheet. This map is the physical part. See http://www.google.com/codesearch/p?hl=de#WXzbfAF-tQc/trunk/src/java/org/apache/poi/hssf/usermodel/HSSFSheet.java

As to the logically null rows, try

int notNullCount = 0;
Sheet sheet = wb.getSheetAt(0);
for (Row row : sheet) {
    for (Cell cell : row) {
        if (cell.getCellType() != Cell.CELL_TYPE_BLANK) {
            if (cell.getCellType() != Cell.CELL_TYPE_STRING ||
                cell.getStringCellValue().length > 0) {
                notNullCount++;
                break;
            }
        }
    }
}
0
votes

If you look at the code of the POI 3.8 beta 3, you'll see that removing a row should also remove its record. As such, the number of physical rows in the file should go down too

I'd suggest you try with a newer version of POI

To count the number of non empty rows in a file, do something like Amadeus suggests and loop over the rows on the sheet and check if they have cells.

0
votes

If you are deleting manually, make sure you use delete row and not just delete the data in cells then it will return the correct value.

0
votes

I had the same problem. If I delete the rows manually, still the rowcount did not decrease, when I checked using sheet.getPhysicalNumberOfRows().

When I deeply went into that issue I found the exact problem. I had an email column in my row, and when I entered an email address MS Office automatically detects that as an email address. And when I delete that entire row manually, the cell that carried the email address still had "" as its value(it will not be visible, but I found that value getting initialised when I read it through java). So since this cell has a not null value(""), this entire row is getting declared(sort of) and the row count is increased.

The funny part is that when I dont enter an email address, and just enter some string and then delete the row, the cell is not getting initialised and the ROWCOUNT GOT DECREASED actually. This is what I found as a result of my problem.

Finally I solved that, by not only addding null check for the cells, but also

if(cell != "")

Hope this might be useful for you

0
votes

We can write a custom method that will ignore blank rows to give row count. Probably we can make some assumptions based on out requirement. For example in my case, a row can be considered blank if its first column value is empty and count is needed only till first blank row.

So following snippet could be useful:

    public int getNonBlankRowCount(String sheetName){
    int rowCount = 0;
    int index = workbook.getSheetIndex(sheetName);
    if(index==-1){
        rowCount = -1;
        return rowCount;
    }else{
        sheet = workbook.getSheetAt(index);
        Iterator<Row> rowIterator = sheet.rowIterator();
        rowCount = 0;
        while (rowIterator.hasNext()) {             
            Row row = (Row) rowIterator.next();
            cell = (HSSFCell) row.getCell(0);
            String cellValue = cell.getStringCellValue();
            if (cellValue.isEmpty()) {
                break;
            }
            rowCount++;
        }
        return rowCount;
    }
}