1
votes

I want to scan an excel sheet, and replace any occurrences of social security numbers with zeros... I would like to do this using Excel.Interop if at all possible, but I'm open to anything at this point... here's some of my code... I'm banging my head on the desk for the past few months...

 // Get range and convert it to a string variable 
            Excel.Range _range =(Excel.Range)_excelApp.get_Range("A1:K1",Type.Missing);

            // convert the value of our cells in our range
            // to the string variable
            string myString = _range.Cells.Value2.ToString();          

            // match any SSN e.g. 1236780909, 123-33-2445
            if (Regex.IsMatch(myString, @"\b\d{3}\b\d{2}\b\d{4}"));
            {                
                _range.Cells.Value2 = replaceSSN;
            }

            // save our workbook with a new name and create a backup
            _excelWorkbook.SaveAs("Results.xls", Type.Missing, Type.Missing, Type.Missing, Type.Missing, true, Excel.XlSaveAsAccessMode.xlNoChange, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

            // close workbook
            _excelWorkbook.Close(false, Type.Missing, Type.Missing);

            // send quit signal to app
            _excelApp.Quit();

            // report success            
            MessageBox.Show("File masked successfully.", "Mask Data", MessageBoxButtons.OK);

            // release memory
            //System.Runtime.InteropServices.Marshal.ReleaseComObject(_excelApp);

            // change label back to being blank
            lblActivity.Text = "";
        }      
2

2 Answers

7
votes

The problem is here

        string myString = _range.Cells.Value2.ToString();          

And here

        // match any SSN e.g. 1236780909, 123-33-2445
        if (Regex.IsMatch(myString, @"\b\d{3}\b\d{2}\b\d{4}"));
        {                
            _range.Cells.Value2 = replaceSSN;
        }

I think you are misunderstanding what Value2 is, Its the array reprentation of the range you have defined above.

Excel.Range _range =(Excel.Range)_excelApp.get_Range("A1:K1",Type.Missing);

Value2 Property of the Range class,returns an array of values. what you need to do is perhaps declare a empty array and get the Value2 of the range, loop each item in the array and run the regex, if it finds a match replace the item in the array. Then you could set the array back to the Value2 of the Range, which will update the cells values.

EDIT: Please find some sample code below

                var excelApp = new Application();
                excelApp.Workbooks.Open("c:\\Test.xls",Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing);
                var ws = excelApp.Worksheets;
                var worksheet =(Worksheet) ws.get_Item("Sheet1");
                Range range = worksheet.UsedRange;
                // In the following cases Value2 returns different types
                // 1. the range variable points to a single cell
                // Value2 returns a object
                // 2. the range variable points to many cells
                // Value2 returns object[,]

                object[,] values = (object[,])range.Value2;

                for (int row = 1; row <= values.GetUpperBound(0); row++)
                    for (int col = 1; col <= values.GetUpperBound(1); col++)
                {
                    string value = Convert.ToString(values[row, col]);
                    //Also used a different regex, found yours not to match on your given criteria
                    if (Regex.IsMatch(value, @"^\d{3}-\d{2}-\d{4}$"))
                    {
                        range.Cells.set_Item(row,col,"0");
                    }
                }

            excelApp.Save("C:\\Out.xls");
            excelApp.Quit();

            Marshal.ReleaseComObject(worksheet);
            Marshal.ReleaseComObject(ws);
            Marshal.ReleaseComObject(excelApp);
4
votes

Here you can have a small understanding of code snippet but I have not tried it so it may contain syntax errors.

Excel.WorkSheet currentSheet = ApplicationInstance.ActiveSheet as Excel.Worksheet;

foreach(Exce.Range r1 in currentSheet.UsedRange)
{
     if(Regex.IsMatch(r1.Value2.ToString(), @"\b\d{3}\b\d{2}\b\d{4}")) // check r1.Value2 should not be null
     {
         r1.Value2 = Your_New_Value;
     }
}