0
votes

I'm taking emails from gmail, parsing them using jquery, then I want to send the scraped data from each email into its own row in the google sheet. I have the code working up to the point where I have the necessary data scraped, and I push it into an array. Then I want to check for duplicates if there are any matches in the existing google sheet rows and if not then put the data into rows in the sheet.

I used the following post as a model for the array part but can't get what he did to work for my data: Google script is reprocessing emails

I'd appreciate any insights into getting the array to dump into the sheet. The unique value I'll be using to check for duplicates would be the person's name (var name) which is column 3 in the spreadsheet

function appendLead() {
    var url = 'https://docs.google.com/spreadsheets/d/1zZDKMISf8Hbw7ERfP6h-Q0pztQSMqsN-mHfeM3E_Ywg/edit#gid=0';
    var ss = SpreadsheetApp.openByUrl(url);
    var leadsData = ss.getSheetByName('Leads Data');
    var myLabel = "Buyer Inquiries/1. To Process"; // Name of current label being processed, this label is set via a filter in Gmail
    var newLabel = "Buyer Inquiries/2. Processed"; // Name of "New" filter, this label needs to be created in Gmail first
    var label = GmailApp.getUserLabelByName(myLabel);
    var label2 = GmailApp.getUserLabelByName(newLabel);
    var threads = label.getThreads();
    var data2 = [];
    var newData = [];
  
    for (var i = 0; i< threads.length; i++) {
      var message = GmailApp.getMessagesForThread(threads[i])[0];
      
    
      // move thread from label to label2
      label2.addToThread(threads[i]);
      label.removeFromThread(threads[i]);
      
      var messageBodyhtml = message.getBody()
      
      var $ = Cheerio.load(messageBodyhtml);
      var date = $('span:contains("Date:")').siblings('span').text().trim();
      var property = $('span:contains("Regarding:")').siblings('span').text().trim();
      var name = $('span:contains("From:")').first().siblings('span').text().trim();
      var phone = $('span:contains("Contact Information:")').siblings('span').children().first().text().trim();
      var email = $('span:contains("Contact Information:")').siblings('span').children().last().text().trim();
      var comments = $('span:contains("Comments:")').siblings('span').text().trim();
      var source = $('span:contains("Lead From:")').siblings('span').text().trim();
  
      var array = [date,property,name,phone,email,comments,source]
      Logger.log(array);
      data2.push([array[i]]);
    }
    // Compare the information in the data2 array to existing information in the sheet
    var data = leadsData.getRange("A2:G").getValues(); //Change this to fit your data ranges
    for(i in data2){
      var row = data2[i];
      var duplicate = false;
      for(j in data){
        if(row.join() == data[j].join()){
          duplicate = true;
        }
      }
      if(!duplicate){
        newData.push(row);
      }
    }
    if (newData.length){  // runs the below code only if there is newData, this stops an error when newData is empty
      leadsData.getRange(leadsData.getLastRow()+1, 1, newData.length, newData[0].length).setValues(newData); //writes newData to end of sheet
    }
}

This is the result on the sheet after running - only the first array segment pastes on the sheet - and it's not pasted in as a row - it's pasted in down one column

10/19/2020 9:51:16 AM
address
Sandra

[email protected]
ACTION: Tour Request 10/03/2020 10:00AM   
Source website

I can't get the duplicate checker portion to work (e.g. if the row already exists, and I process the same email again as a test, the info gets appended as a row in the spreadsheet even if it's a duplicate)

var data = leadsData.getRange("A2:G").getValues(); //Change this to fit your data ranges
    data.forEach(function(row) {
      var duplicate = false;
      data2.forEach(function(row2) {
        if (row.join() == row2.join()) {
          duplicate = true;
        }
      });
      if (!duplicate) {
        newData.push(row);
      }
    });
1
Define but can't get what he did to work for my data - what is the issue? - Rafa Guillermo
I can see in the console log that up to line data2.push([array[i]]) everything is going to the array correctly. The part after that I want to check for duplicates by comparing the array of new data to the already existing data to see if there are duplicates. If a row of data is not a duplicate, push it from array data2 to newArray. Then append the newArray rows to the sheet (using setValues not using appendRow). It's this bottom part to check duplicates and paste the array to the sheet that's not pasting any data. Only the first array pastes - and not as a row - see edited question - 823g4n8901

1 Answers

1
votes

Answer:

The array structure for setting values to a Spreadsheet range is Array[rows][columns].

Structural Information:

Let's say you want to insert two rows of data to a sheet using .setValues().

The data for row 1 is 8 cells in length, each cell being a value from v1-v8. The data for row 2 is also 8 cells in length, but only columns A, B, G and H have data.

The array structure should be as follows:

var arr = [
            ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8"], // row 1
            ["A-v", "B-v", "", "", "", "", "G-v", "H-v"]      // row 2
          ];

And this will need to be set with:

SpreadsheetApp.getActiveSpredsheet()
              .getSheetByName("Sheet-name")
              .getRange("A1:B8")
              .setValues(arr);

This will appear in the Sheet as:

         |    A   |    B   |    C   |    D   |    E   |    F   |    G   |    H   |    
=========+========+========+========+========+========+========+========+========+
    1    |   v1   |   v2   |   v3   |   v4   |   v5   |   v6   |   v7   |   v8   |
---------+--------+--------+--------+--------+--------+--------+--------+--------+
    2    |   A-v  |  B-v   |        |        |        |        |  G-v   |  H-v   |
---------+--------+--------+--------+--------+--------+--------+--------+--------+
    3    |        |        |        |        |        |        |        |        |
---------+--------+--------+--------+--------+--------+--------+--------+--------+
    4    |        |        |        |        |        |        |        |        |
---------+--------+--------+--------+--------+--------+--------+--------+--------+

Code-specific Fix:

Use a nested forEach() instead of for... in:

data.forEach(function(row) {
  var duplicate = false;
  data2.forEach(function(row2) {
    if (row.join() == row2.join()) {
      duplicate = true;
    }
  });
  if (!duplicate) {
    newData.push(row);
  }
});

I hope this is helpful to you!