1
votes

I have a very large excel spreadsheet about 200 MB in size with 12 columns and a million rows.

There are 5 different types of cells that can identified by looking for a specific string pattern in each.

I want each of these 5 buckets of cells to have different interior colors. Also the ranges for each cell type are not contiguous.

Here is what I have tried and it works but is painfully slow.

Dump the used range to a variant array, loop through the array, identify each cell bucket type and set interior color index once per each cell depending on type of bucket it belongs to.

Dump the used range to a variant array, loop through the array, identify each cell bucket type and create a new range using union, set interior color index only once per each type of range.

Both these approaches take about 90 minutes to color the cells.

Any other ways to optimize this?

1
Does the same process run faster if you do it in VBA (since it's running in the same process it may be)? If Yes then you could load an add-in into your Excel instance from C# - the add-in does the update using VBA. Why would you need to color 12 million cells though? - Tim Williams

1 Answers

1
votes

Every single Interop call to Excel from outside is very slow. So you can get quite fast the values of a large range by a single call getting a variant range. But I know nothing like this for getting or setting formats. My ideas are:

  • Do the calls from inside: use VBA
  • Export your workbook in Excel 2003 XML format, modify the formats in XML (ok, it's a little bit more complicated) and reload it in Excel
  • Build unions as you mentioned it but don't add single cells to that union but bundle the calls by building subranges (Range("A1,A2,A3,A4,A5.....")). Unfortunately you have to regard that these single strings don't get longer than 255 characters.
  • Conditional formatting possible??