3
votes

My application creates a spreadsheet, this can be rather large (upto 500,000 rows, each row spread over six sheets, each with about 20 columns).

I use Apache Poi and moved to using SXSSFWorkBook as writes data to temporary files so that memory used is not proportional to size of the final spreadsheet and this works wells.

But the spreadsheet represents metadata changed, and when metadata has changed I want to add a comment to the cell showing the old value. I got this working for small spreadsheets but it always fails with heap memory errors when attempted for large files.

I am not sure if the problem is due to limitation of Poi storing all comments in memory or if I am doing it wrong. Worksheet is my own wrapper class, I only create one DrawingPatriach class per sheet, but it looks like I have to create an anchor for each comment I need.

private void addCellComment(Row r, Cell c, Worksheet sheet, String value)
{
    String formattedValue    =  value.replace('\u0000', '\n');
    int    rowCount         =  value.split("\\\\u000").length;
    ClientAnchor anchor = factory.createClientAnchor();
    anchor.setCol1(c.getColumnIndex());
    anchor.setCol2(c.getColumnIndex()+2);
    anchor.setRow1(r.getRowNum());
    anchor.setRow2(r.getRowNum()+rowCount);

    Drawing drawing = sheet.getDrawing();
    Comment comment = drawing.createCellComment(anchor);
    RichTextString str = factory.createRichTextString(formattedValue);
    comment.setString(str);
    c.setCellComment(comment);
} 
1
I am not sure how we could help with this. In such situations, you might have to turn to a really good profiler and well, profile your application to understand exactly where and how and why memory is consumed.GhostCat
How much memory it takes? what are your java memory settings? Which java do you use? How much memory your workstation has?Krzysztof Cichocki
Im looking for someone with Poi expertise rather than memory profiling expertise. I allocate 1GB of memory, if I remove the create comment part I can create a spreadsheet of 20,000 rows and memory never hits 300mb, but add comments back in and it always fails on heap memory. Im using jre1.8.0_111 on Windows 10Paul Taylor
Please can you add more code i'm wondering how you initialize SXSSFWorkbook what is the window size? Have you turn off auto-flushing and using manually control how rows are flushed to disk?Saulius Next
@SauliusNext i just do workbook = new SXSSFWorkbook(1000) - I havent turned off flushing, if I don't add commentcells then it works fine using substantialy less memory than if I just used HSSFWorkbook.the issue is specific to comment cellsPaul Taylor

1 Answers

3
votes

Well, from Apache POI documentation (emphasis mine):

Please note that there are still things that still may consume a large amount of memory based on which features you are using, e.g. merged regions, hyperlinks, comments, ... are still only stored in memory and thus may require a lot of memory if used extensively.

So I guess it is a limitation of Apache POI, and you will have to work around that.