0
votes

I have code like this for reading an Excel file:

string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;" +
    "Data Source=" + path + ";Extended Properties=\"Excel 12.0;HDR=YES\";";

using (OleDbConnection conn = new OleDbConnection(connStr))
{
    conn.Open();
    DataTable dtSchema = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
    string sheetName = dtSchema.Rows[0].Field("TABLE_NAME");
    OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "]", conn);
    sheetAdapter.Fill(sheetData);

    DataTable dtColumns = conn.GetSchema("Columns", new string[] { null, null, sheetName, null });

    ...
}

My code needs to use/look at the column headers. The above only works if the column headers are the first row. Sometimes the Excel files that we receive from clients have a couple rows above the column headers with some metadata about the data in the excel. When this happens the column headers will be on something like row 10.

I can open the Excel file and manually delete the extra rows above the column headers and this solves the issue. But we want to remove this manual step.

Is there any easy way to remove/ignore these extra starting rows above the column headers? Or do I have to come up with custom code? The best way I can think of is to turn off HDR and then the first row that has a value in every column is the column header row. Is there an easier way?

2

2 Answers

1
votes

I have code that reads from Excel, needs to ignore the first 11 rows in the worksheet, and read from columns A through P for up to 64000 rows.

// Read columns A - P after skipping 11 rows to read the header row
string ExcelDataQuery = string.Concat("SELECT * FROM [", sheetname, "A12:P64012]");
0
votes

As far as i know (checked that issue in the past) there is no way to select a table with System.Data.OleDb from excel file using SQL query if headers are not placed in row 1.
the solution for me (like you do) is to delete all the rows above the header row before querying the worksheet - just opening the workbook with Microsoft.Office.Interop deleting the extra rows, closing it and than querying it.

Excel is a very powerful tool but was never designed to behave like database (SQL \ access file for example).