0
votes

I used to be quite proficient in VBA with excel, but I'm currently trying to do something with Google Scripts and I am well and truly stuck.

Basically, I am trying like to extract data out of a standardised email from Gmail into a Google sheet. There are a couple of other threads on the subject which I have consulted so far, and I can get the body of the email into the sheet but cannot parse it.

I am new to regex, but it tests OK on regex101

I am also brand new to Google Script, and even the debugger seems to have stopped working now (it did before, so would be grateful if anyone can suggest why this is).

Here is my basic function:

function processInboxToSheet() {
  var label = GmailApp.getUserLabelByName("NEWNOPS");
  var threads = label.getThreads();
  // Set destination sheet
  var sheet = SpreadsheetApp.getActiveSheet();
  // Get all emails labelled NEWNOPS 
  for (var i = 0; i < threads.length; i++) {
    var tmp,
      message = threads[i].getMessages()[1],  // second message in thread
      content = message.getPlainBody();  // remove html markup
    if (content) {

      // search email for 'of:' and capure next line of text as address
      // tests OK at regex101.com
      property = content.match(/of:[\n]([^\r\n]*)[\r\n]/); 

      // if no match, display error
      var property = (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
      sheet.appendRow([property]);
    } // End if
  // remove label to avoid duplication 
  threads[i].removeLabel(label)
  } // End for loop
}

I can append 'content' to the sheet Ok, but cannot extract the address text required by the regex. Content displays as follows:

NOPS for the purchase of:
123 Any Street, Anytown, AN1 1AN

DATE: 05/05/2017
PRICE: £241,000

Seller’s Details
NAME: Mrs Seller

Thanks for reading :)

2

2 Answers

1
votes

The return value of .match() is an array. The first captured group, containing the address, will be at index 1.

Based on the following line after your call to .match(), it looks like the tmp variable should have been assigned that array, not the property variable.

var property = (tmp && tmp[1]) ? tmp[1].trim() : 'No property';

That line says, if .match() returned something that isn't null and has a value at index 1, then trim that value and assign to property, otherwise assign it the string 'No property'.

So, try changing this line:

property = content.match(/of:[\n]([^\r\n]*)[\r\n]/);  

To this:

tmp = content.match(/of:[\n]([^\r\n]*)[\r\n]/);
0
votes

Thanks Kevin, I think I must have changed it while debugging.

The problem was with my regexp in the end. After a bit of trial and error the following worked:

tmp = content.match(/of:[\r\n]+([^\r\n]+)/);