4 votes

I am working on a module that will de-duplicate Contact records on insert/update, but I am hitting the "Apex CPU time limit exceeded" error. I understand that to overcome it we need to optimize the code a little, but in the following code block there seems to be very little scope for optimization. Any help would be appreciated.

On Account we have a multipicklist from which we can select the fields that define the uniqueness of the Contacts under that Account. It can differ from account to account. The following code is part of a handler class for a trigger. We have a map of the old contact lists keyed by Account Id (mapOfAccountIdWithItsContact) and a map of the new contact lists keyed by Account Id (newContactWithAccountMap), and we iterate over these maps using the set of Account Ids whose Contacts arrived in the trigger. We also have a map that stores the list of fields to be used for contact uniqueness for each account (mapOfAccountWithFilters).

Here is the code snippet:

for(String accountId : accountIdSet){
    if(newContactWithAccountMap.get(accountId) != null){
        for(Contact newContact : newContactWithAccountMap.get(accountId)){
            for(Contact oldContact : mapOfAccountIdWithItsContact.get(accountId)){
                //Check for duplication only in the respective account; also, this should not apply on insertion of Office contacts
                matchingContactFound = false;
                if(oldContact.Id != newContact.Id){ //on insert, newContact's Id will be null; on update, this verifies the contact is not matched against its own old record
                    for(String filterFieldName : mapOfAccountWithFilters.get(accountId)){
                        if(oldContact.get(filterFieldName) == newContact.get(filterFieldName)){
                            matchingContactFound = true;
                            //If a match is found, update the last de-duplication date to today on the old contact
                            oldContact.Last_De_Duplication_Date__c = System.today();
                            oldContactsToUpdateSet.add(oldContact);
                        }else{
                            matchingContactFound = false;
                            break; //get another "old contact"
                        }
                    }
                }
                if(matchingContactFound){
                    //stop it from being inserted
                    duplicateContactSet.add(newContact.Id);
                    //newContact.addError('Contact cannot be inserted because a contact is already present based on the Master Target Identifier at client level.');
                    break; //get another "new contact"
                }
            }
        }
    }
}

Any help in avoiding the four nested loops, or an alternate approach, would be greatly appreciated. Thanks in advance.


2 Answers

4 votes

Great question!

But it's hard to say something without seeing more of your code...

  1. How did you narrow it down to this particular snippet? Have you made any timing tests (debug logs? setting breakpoints in the Developer Console)?
  2. How do you populate your variables, and what types are they? (I'd rather see Map<Id, List<Contact>> etc. than try to figure it out from the description.)
  3. Maybe you're querying too much data; maybe some smarter filtering upfront could greatly reduce the execution time. For example, there's a comment saying this should not apply on insertion of Office contacts — do you already filter those out before you even enter this method?

Another thing to consider is how many fields are in the multipicklist. Have you thought about doing a bit of preparation upfront? For example, on every insert and every update, save the values from the "important fields" to a helper Text(255) field on Contact; call it "duplicate category / segment / tag" or something like that. Mark that field as an external id (it doesn't have to be unique; external id is enough to get it indexed).

Then you should be able to pretty quickly select Contacts where the Account Id and that "category" match. If the field values are identical, or the length is 255 (meaning the value may have been truncated), you have to run an exact field-by-field comparison. But if there are no Contacts with an identical "category", you should be safe to say there are no duplicates.

Such a thing would have to be carefully designed (a trigger on Contact, but also one on Account to "recalculate" this field every time the field definitions change... and if it's possible to point from Contact to other records, say User, then you'd need to protect yourself against user changes too), but I'd say it's definitely worth trying out. You don't even have to support all fields — say just First & Last Name, Email, Phone, and one of the addresses. This should already be a massive help in "bucketing" the Contacts so you query only from the right buckets.
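To make the idea concrete, here's a rough sketch of how such a helper field could be maintained. The field name (Dedupe_Key__c) and the DedupeKeyHelper class are hypothetical; loadFilters() stands in for however you already build mapOfAccountWithFilters:

```apex
// Sketch only: Dedupe_Key__c is an assumed Text(255) external-id field.
trigger ContactDedupeKey on Contact (before insert, before update) {
    // resolve the per-account filter fields the same way your handler does
    Map<Id, List<String>> filtersByAccount = DedupeKeyHelper.loadFilters(Trigger.new);
    for(Contact c : Trigger.new){
        List<String> fields = filtersByAccount.get(c.AccountId);
        if(fields == null){ continue; }
        List<String> parts = new List<String>();
        for(String f : fields){
            parts.add(String.valueOf(c.get(f)));
        }
        // truncate to the field length; a stored value of exactly 255 chars
        // means "possibly truncated, fall back to field-by-field comparison"
        c.Dedupe_Key__c = String.join(parts, '\n').left(255);
    }
}
```

On Account changes you'd run a similar recalculation over that Account's Contacts.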


Code review ;)

Cache your calls

You have a lot of map get calls, and you call System.today() in a loop (I would kind of understand System.now()... but today()?). Check this interesting short video and do what you can to have more local variables and fewer script statements. The list of fields doesn't change for a given account — so why painfully fetch it from the map on every iteration?

Check the logic

if(newContactWithAccountMap.get(accountId) != null) — this should never happen. If there are no Contacts being inserted for "that" account, what are you doing in this function anyway? Are you being overly protective here, or is this check actually required (which would suggest bigger problems)? Even for private contacts (with AccountId = null), you should not have fetched Contacts and built map entries for Accounts that are unaffected.

So as a first iteration I'd do something like this (it's a minor improvement, but who knows, it might help):

for(String accountId : accountIdSet){
    List<String> fields = mapOfAccountWithFilters.get(accountId); // it's always the same for that Account, isn't it?
    for(Contact newContact : newContactWithAccountMap.get(accountId)){
        for(Contact oldContact : mapOfAccountIdWithItsContact.get(accountId)){
            if(oldContact.Id != newContact.Id){
                Boolean allFieldsMatch = true;
                for(String fieldName : fields){
                    if(oldContact.get(fieldName) != newContact.get(fieldName)){
                        allFieldsMatch = false;
                        break;
                    }
                }
                if(allFieldsMatch){
                    // note: only touch the old contact once we know every field matched;
                    // your original added it (and stamped the date) even on a partial match
                    oldContactsToUpdateSet.add(oldContact);
                    duplicateContactSet.add(newContact.Id);
                    break;
                }
            }
        }
    }
}
Date today = System.today();
for(Contact c : oldContactsToUpdateSet){
    c.Last_De_Duplication_Date__c = today;
}

2nd pass

If that doesn't help, you might try to incorporate bits of my idea from above. Let's say you're inserting 3 contacts into an account that already has 10. That's 3 * 10 = 30 comparisons in the inner loop (the 3 are brand new, so the trick of comparing the old and new contact's Id doesn't help).

But if you prepare some kind of composite key, like this "bucket" I've talked about, you really flatten it out.

for(String accountId : accountIdSet){
    List<String> fields =  mapOfAccountWithFilters.get(accountId);
    Set<String> oldContactBuckets = new Set<String>();
    /* I'm cheating here a bit.
        All I want to know is whether there was a match; I don't care with which Contact.
        If you care, you'd have to convert this Set<String> to a Map<String, Set<Id>>, for example.
        It looks like you do care, because you're setting Last_De_Duplication_Date__c,
        but I'll leave that as an exercise for the reader :P
    */
    for(Contact oldContact : mapOfAccountIdWithItsContact.get(accountId)){
        oldContactBuckets.add(buildKey(oldContact, fields));
    }
    for(Contact newContact : newContactWithAccountMap.get(accountId)){
        String key = buildKey(newContact, fields);
        if(oldContactBuckets.contains(key)){
            duplicateContactSet.add(newContact.Id);
        }
    }
}

private String buildKey(Contact c, List<String> fields){
    List<String> temp = new List<String>();
    for(String fieldName : fields){
        temp.add(String.valueOf(c.get(fieldName)));
    }
    return String.join(temp, '\n'); // pick a field separator that's unlikely to appear in your real data. Tab maybe?
}

This would build the 10 keys first, then compare with 3 keys for incoming data. Meaning only 13 equivalents of your "innermost" loop.

If this doesn't help in your situation, you can always consider rewriting it as batch Apex, I guess...

0 votes

Part of the answer may be to make this a batch process. When batch code runs, each separate batch of records processed gets its own CPU time limit. There's more on batch Apex here.
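A minimal skeleton of what that could look like (the class name and query are illustrative only; the duplicate-detection logic from the question would move into execute()):

```apex
global class ContactDedupeBatch implements Database.Batchable<SObject> {
    global Database.QueryLocator start(Database.BatchableContext bc){
        // select whatever fields the de-duplication comparison needs
        return Database.getQueryLocator(
            'SELECT Id, AccountId, FirstName, LastName, Email FROM Contact');
    }
    global void execute(Database.BatchableContext bc, List<Contact> scope){
        // run the duplicate detection for this chunk only;
        // each call to execute() gets a fresh set of governor limits
    }
    global void finish(Database.BatchableContext bc){}
}
```

You'd launch it with something like Database.executeBatch(new ContactDedupeBatch(), 200), tuning the scope size to how heavy the per-record work is.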

You don't show the SOQL queries or give an idea of how many Contacts an Account might have; overall numbers of Accounts and Contacts would be useful too. Without these, it's hard to pin down where the problem might be. You might need to look at SOQL optimization. For example, operators like "not equals" (!=) are inefficient because they can't use an index. There's much more that could be said about this topic, and you can read about it here.

Another possibility is to replace multiple queries with one query. In a similar situation, I could have written code that did one query per site (a custom object), but that would have limited me to 100 sites. Instead, I did one query for all sites (we have far fewer than the 50,000-row limit) and then built a map to store the results of the query, keyed by a unique field.
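As a sketch of that pattern (the object and field names Site__c and Site_Key__c are made up for illustration):

```apex
// One query for all sites, then O(1) lookups via a map keyed by a unique field
Map<String, Site__c> sitesByKey = new Map<String, Site__c>();
for(Site__c s : [SELECT Id, Site_Key__c FROM Site__c]){
    sitesByKey.put(s.Site_Key__c, s);
}
// Later, instead of issuing a query per key:
Site__c match = sitesByKey.get(someKey);
```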