
Open XML is generating .xlsx files that can be read by Open Office, but not by Excel itself.

With this as my starting point (Export DataTable to Excel with Open Xml SDK in c#), I have added code to create a .xlsx file. When I try to open it with Excel, I'm asked if I want to repair the file. Saying yes gets "The workbook cannot be opened or repaired by Microsoft Excel because it's corrupt." After many hours of trying to jiggle the data from my table to make this work, I finally threw up my hands in despair and made a spreadsheet with a single number in the first cell.

Still corrupt.

Renaming it to .zip and exploring shows intact .xml files. On a whim, I took a legit .xlsx file created by Excel, unzipped it, rezipped it without changing the contents and renamed it back to .xlsx. Excel declared it corrupt. So this is clearly not a content issue, but a file format issue. Giving up on Friday, I sent some of the sample files home and opened them there with Libre Office. There were no issues at all. File content was correct and Calc had no problem. I'm using Excel for Office 365, 32-bit.

// ignore the bits (var list) that get data from the database. I've reduced this to just the output of a single header line
List<ReportFilingHistoryModel> list = DB.Reports.Report.GetReportClientsFullHistoryFiltered<ReportFilingHistoryModel>(search, client, report, signature);

MemoryStream memStream = new MemoryStream();
using (SpreadsheetDocument workbook = SpreadsheetDocument.Create(memStream, SpreadsheetDocumentType.Workbook))
{
    var workbookPart = workbook.AddWorkbookPart();

    workbook.WorkbookPart.Workbook = new Workbook();

    workbook.WorkbookPart.Workbook.Sheets = new Sheets();

    var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
    var sheetData = new SheetData();
    sheetPart.Worksheet = new Worksheet(sheetData);

    Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<Sheets>();
    string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);

    uint sheetId = 1;
    if (sheets.Elements<Sheet>().Count() > 0)
    {
        sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
    }

    Sheet sheet = new Sheet() { Id = relationshipId, SheetId = sheetId, Name = "History" };
    sheets.Append(sheet);

    Row headerRow = new Row();

    foreach( var s in "Foo|Bar".Split('|'))
    {
        var cell = new Cell();
        cell.DataType = CellValues.Number;
        cell.CellValue = new CellValue("5");
        headerRow.AppendChild(cell);
    }
    sheetData.AppendChild(headerRow);
}
memStream.Seek(0, SeekOrigin.Begin);
Guid result = DB.Reports.Report.AddClientHistoryList( "test.xlsx", memStream.GetBuffer(), "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
return Ok(result);

This should just work. I've noticed other Stack Overflow discussions that direct back to the first link I mentioned above. I seem to be doing it right (and Calc concurs). There have been discussions of shared strings and whatnot, but since I'm using plain numbers those shouldn't be an issue. What am I missing here?

1
Followup: I bumped into this (stackoverflow.com/questions/11082278/…) and tried one of the suggestions for re-assembling the .xlsx, and manually re-zipping the content I created did indeed work. So, apparently the content I'm putting in the sheets/cells is fine, but OpenXML is creating a zip file that Explorer, 7-Zip and Libre Office Calc can read, but Excel can't. – Argle

1 Answer


In working on this, I went with the notion that some extraneous junk on the end of a .zip file is harmless. 7-Zip, Windows Explorer and Libre Office all seem to agree (as does some other zip program I used at home whose name escapes me). Excel, however, does not. memStream.GetBuffer() returns the stream's whole internal buffer, which is usually larger than the data actually written to it, so using the pointer it returns was fine, but using its length was not. (The preceding Seek() was unnecessary.) Limiting the write of the data to a length equal to the current output position keeps Excel from going off the rails.
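
As a rough illustration of the fix described above, here is a minimal sketch of two ways to get only the written bytes out of a MemoryStream. The class and method names (MemoryStreamBytes, WrittenBytes, WrittenBytesFromBuffer) are made up for this example and are not part of the original code. It relies only on the standard behaviour that ToArray() copies exactly Length bytes, while GetBuffer() exposes the full internal buffer.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class MemoryStreamBytes
{
    // Simplest option: ToArray() copies exactly stream.Length bytes,
    // regardless of the current Position, so no Seek() is needed.
    public static byte[] WrittenBytes(MemoryStream stream)
    {
        return stream.ToArray();
    }

    // Equivalent result starting from GetBuffer(): the returned array is the
    // whole internal buffer, so it must be trimmed to stream.Length.
    public static byte[] WrittenBytesFromBuffer(MemoryStream stream)
    {
        byte[] buffer = stream.GetBuffer();
        byte[] exact = new byte[stream.Length];
        Array.Copy(buffer, exact, exact.Length);
        return exact;
    }
}
```

In the question's code, this would presumably mean passing memStream.ToArray() instead of memStream.GetBuffer() to AddClientHistoryList, assuming that method stores the whole array it is given.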