4
votes

I have got column index for an excel column in a spreadsheet and need to delete the entire column using this column index. I am confined to use Open XML SDK 2.0.

using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

namespace openXMLDemo
{
    public class Program
    {
        public static void Main(string[] args)
        {
            string fileFullPath = @"path to the excel file here";
            string sheetName = "excelsheet name here";

            using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileFullPath, true))
            {
                Sheet sheet = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName).FirstOrDefault();
                if (sheet != null)
                {
                    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id.Value);

                    // This is where I am struggling.... 
                    // finding the reference to entire column with the use of column index
                    Column columnToDelete = sheet.GetFirstChild<SheetData>().Elements<Column>()
                }
            }
        }
    }    
}
2
Do you want to delete the column data, or do you want to delete the columns layout / style information? (Column elements only contain layout / styling information for a [range of] columns in an spreadsheet, the actual data is stored in Cell elements, contained in individual Row elements within the SheetData element)bassfader

2 Answers

7
votes

Open XML has no way of selecting columns unfortunately, so you need to iterate through each row and cells within to remove the data:

static void Main(string[] args)
{
    string fileFullPath = @"C:\Book1.xlsx";
    string sheetName = "Sheet1";

    // Specify your column index and then convert to letter format
    int columnIndex = 3;
    string columnName = GetExcelColumnName(columnIndex);

    using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileFullPath, true)) {
        Sheet sheet = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().Where(s => s.Name == sheetName).FirstOrDefault();

        if (sheet != null) {
            WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id.Value);

            // Get all the rows in the workbook
            IEnumerable<Row> rows = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>();

            // Ensure that there are actually rows in the workbook
            if (rows.Count() == 0){
                return;
            }

            // Select all the cells from each row where the column letter is equal to index
            foreach (Row row in rows) {
                var cellsToRemove = row.Elements<Cell>().Where(x => new String(x.CellReference.Value.Where(Char.IsLetter).ToArray()) == columnName);

                foreach (var cell in cellsToRemove)
                    cell.Remove();
            }
        }
    }
}

Helper function courtesy of https://stackoverflow.com/a/182924/5309534:

static string GetExcelColumnName(int columnNumber)
{
    int dividend = columnNumber;
    string columnName = String.Empty;
    int modulo;

    while (dividend > 0) {
        modulo = (dividend - 1) % 26;
        columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
        dividend = (int)((dividend - modulo) / 26);
    }

    return columnName;
}
1
votes

@Shimmy Weitzhandler. OpenXML does not emulate Excel. The code proposed by @Chawin would indeed remove the cells. If you open the workbook as a zip archive and open the sheet from which cells were removed, you will no longer see those cells. But, if you have cells to the right from the removed cells, they will remain where they were. To move these cells left, you will need to adjust their Cell.CellReference property. Depending upon the contents of the sheet, you might need to also adjust other things, like formulas, formatting, data validation, ignored errors, and so on.