1
votes

I have a database with a lot of records and each record has 5 fields namely INVOICE NUMBER, DATE, AMOUNT, SHIPMENT NUMBER and QUANTITY.

How do i write a VBA script that searches for duplicates. The way to searching duplicates is that if any three fields of the record match then the record is highlighted in a separate colour. The fields that are suppose to be matched are not specified, they can be any three fields.

For example, if INVOICE NUMBER, SHIPMENT NUMBER and QUANTITY match in some records then its shown as duplicates and highlighted. Similarly if INVOICE NUMBER, QUANTITY and DATE matches in a few records then they are listed as duplicates as well.

So in the database, out of the 5 fields if any 3 match then they are listed as duplicates and highlighted in a different colour.

Can anyone please help me write a VBA script that does that?

1
Are you sure that is what you want? If any 3 columns match, then call it a duplicate? I would be fearful that you would falsely identify matches that have the same Date,Amount, and Quantity, but are seperate orders. Shouldn't your shipment Number or Invoice number be unique enough to group orders by? - user2140261
Alright, here is the deal. I have an existing database of records. Each recored has the fields Invoice Number, Vendor Code, Currency, Amount and Date. My job is to extract new data records from the SAP System and add them to the database. Now, when i add records i need to check if any three fields match between the existing database and the new records. When i add them on my excel sheet, i have to make sure that no 3 fields of the new records match with the existing ones. If any there do then i highlight them as duplicates. - abhijeet_chib
Suppose i need to add 3 records today with the same fields Invoice Number, Vendor Code, Currency, Amount and Date. Suppose for the first record Vendor Code, Invoice Number and Amount match with any record in the excel file then its highlighted as a duplicate and i forward it to be checked. For the next record suppose Vendor Code, Amount and Currency match then it shows as a potential duplicate as well and is highlighted and i forward it again to be checked before the final payments are made. - abhijeet_chib

1 Answers

0
votes

There isn't really an easy way to do this. With five fields, there are 10 possible ways that any 3 of those fields can be evaluated. So, you could create a macro that basically cycles through all of the columns, searching for duplicates, but it won't be very easy.

I have done something like this in the past, but relied instead on a formula in a cell to do the work. I used the SUMPRODUCT function to look up values. Here is what it would look like for you.

=SUMPRODUCT((A$2:A2=A3)*(B$2:B2=B3)*(C$2:C2=C3)) + SUMPRODUCT((A$2:A2=A3)*(B$2:B2=B3)*(D$2:D2=D3)) + + SUMPRODUCT((A$2:A2=A3)*(B$2:B2=B3)*(E$2:E2=E3)) + SUMPRODUCT((A$2:A2=A3)*(C$2:C2=C3)*(D$2:D2=D3)) + SUMPRODUCT((A$2:A2=A3)*(C$2:C2=C3)*(E$2:E2=E3)) + SUMPRODUCT((A$2:A2=A3)*(D$2:D2=D3)*(E$2:E2=E3)) + SUMPRODUCT((B$2:B2=B3)*(C$2:C2=C3)*(D$2:D2=D3)) + SUMPRODUCT((B$2:B2=B3)*(C$2:C2=C3)*(E$2:E2=E3)) + SUMPRODUCT((B$2:B2=B3)*(D$2:D2=D3)*(E$2:E2=E3)) + SUMPRODUCT((C$2:C2=C3)*(D$2:D2=D3)*(E$2:E2=E3))

Note that this assumes that your 5 fields are in columns A to E, and never change. The above formula is also designed to be put in row 2 of whatever column you want it in. Then just copy down the formula to have it auto-adjust values for the current row. Row 2 doesn't need it, since that should be your first record (assuming there are headers of course).

Oh, since I didn't mention it, if this formula returns a 1, then that indicates duplicate data. A 0 indicates that it is unique (so far).

Here is some psuedocode to assist with creating a macro.

For each row from 2 to currentRow - 1
   Dim numMatches as integer
   numMatches = 0
   if activeSheet.Range("A" & row) = activeSheet.Range("A" & currentRow) then
      nuMatches = numMatches + 1
   Endif
   'do the above if statement for each compare
   if numMatches >= 3 then
      'format the currentRow to indicate duplicate
   endif
Next Row