1
votes

I am using MS SQL 2012 to read an Excel sheet (excel 2010). The excel sheet may have variable number of columns and variable schema which may range from a hundred to over a thousand. I need to read each record into a temp table and then normalize it. I started out by using OpenRowset for this

> Insert into #temp FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
>     'Excel 12.0;HDR=YES;Database=c:\dir\test.xlsx',
>     'SELECT * FROM [sheet1$]');

I quickly realized that the maximum number of columns that this approach allows me to read at one go is 255. Some quick research reviles that this is the limitation of the driver 'Microsoft.ACE.OLEDB.12.0'. So when I have excel files with large number of columns (> 255) this approach fails. Besides loading so may columns in a single table would pose another problem of exceeding the 8060 page size limit for a table. So I altered my approach and decided to do multiple reads and loop the openrowset command such that it would read 200 columns at a time and load it to a temp table. I could specify the range by appending the excel column range to the sheet name [sheet1$A:GR] the command would look like

> Insert into #temp01 FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
>     'Excel 12.0;HDR=YES;Database=c:\dir\test.xlsx',
>     'SELECT * FROM [sheet1$A:GR]');

Once I put it in the proper loop I am able to read the entire excel sheet and create several temp tables which when put together will give me all the columns. To put these temp tables together to get one complete record I intend to use a Join and perhaps a view and here is where I have stumbled into a problem.

  • Will OpenRowset maintain the read order such that record one in excel will be record 1 in all temp tables I created and so on for all other records?
  • If the order is maintained I can simply join on the SQL row numbers, If not how should I go about this?

Appreciate any help / suggestions in this area.

1

1 Answers

1
votes

There is no real concept of the 'order' of rows in a table. There can be no guarantees of the order any select will present the results in, unless you add an ORDER BY clause to the select.

That said, if there is no ORDER BY, a simple select will generally bring the rows back in the order that they were added but there are so many things going into that that it is in no way guaranteed.

The best thing is to add an int column to each of your temp tables called e.g. Row. Put in it the Excel row number of the source data, and then you can identify the row each bit of data came from and match data across temp tables with the source excel row.