1
votes


I'm using a Google spreadsheet to store contact info I collect from a website (GDPR compliant, don't worry) and I synchronize this sheet with Mailchimp using a custom App script, which runs in the background.

More in detail, I'm using a temporary sheet as a buffer which is filled with the data users send via the site's forms (I'm using Contact Form 7 Google Sheets Connector). My script runs each time the event INSERT_ROW is triggered in the "buffer" sheet. It analyzes the data, stores them in the final sheet and, eventually, it emptys the "buffer".
Now, aware of potential concurrency issues I've written the code in the following form:

function onChange(e) {
  // Script initializaztion

  if (!formNames.includes(formName) || e.changeType != "INSERT_ROW") return;

  let lock = LockService.getScriptLock();

  try {
    lock.waitLock(60000);

    // Get new stuff from Form sheet
    .
    .
    .

    // Get existing data and compare with new
    compareData();

    // New entries
    insertNewDataIntoFinalSheet();

    // Updates
    updatesExistingDataInFinalSheet();

    // Clean up the buffer sheet
    deleteRowsFromSiteForms();

    console.log(formName);

  } catch(e) {
    console.log(e);
  }

  lock.releaseLock();
}

But I'm not sure the Lock is doing what I'm expecting it to do. And Google Docs are not so clear about its use.
If I'm not understanding Google docs wrong, any Lock is

preventing sections of code from running concurrently. This service allows scripts to prevents concurrent access to sections of code. This can be useful when you have multiple users or processes modifying a shared resource and want to prevent collisions (Google docs).

Now Imagine the following scenario: User A in his browser fills in the form. At the very same time (well, almost) User B is doing the same in another browser somewhere else in the world. The CF7 connector of User A's browser sends data to my "buffer" sheet and it gets the lock so that my script begins its work. A nanosecond later, users B's form inserts other data into the "buffer" triggering thus a script execution. This new script instance tries to acquire the lock but is blocked by user A's lock.

In the mean time, User A's script instance is finishing its job and cleans up everything, deleting also the data input by User B! When user A's script instance releases the lock, user B's script kicks in, but it finds its data swept away by user A's script instance.

Ideally, what I need is a way to "freeze" any modification to the "buffer" sheet (including insertion of new rows) until each script instance has finished its job.

But as far as I understand, and also judging by some strange errors I have in my logs, this is not what getScriptlock() is doing (and getUserlock() or getDocumentlock() neither).

Can anyone help me understanding how I should use Google LockService for that?

Thank you in advance

EDIT

Those who have commented my post (@TheMaster in particular) have caught the very core of the problem, so let me add some further information/thought. This is the the error I'm getting (sometimes) in my Stackdriver Log

Error log from Stackdrive

Apparently, my script is trying to access data when it is not supposed to. I suspect it is the scenario I've depicted: the second instance of the script finds the "buffer" emptied by the first instance.

As @TheMaster says, the problem is that there's no way to really "lock" a document if the document is accessed via some external resource, like Zapier, for example, or, as in my case, a Wordpress connector, or any API request (Google developers, shouldn't you address this issue?).

1
@Tanaike OP probably means userB logs are found in between userA logs or something like that. Just like OP, I was never able to get a lock when 1. Data comes from outside through sheets api or addon/Zapier(non -human trigger) 2. The function trigger is a installed onChange. To reproduce, those two conditions must be satisfied. It's like several separate instances of the script are running at the same time. In any case, @Raff Clarify those errors as requested by Tanaike preferably with proof/screenshots. - TheMaster
@TheMaster Thank you for your support. When users are accessed to Spreadsheet using each browser, the lock service can be used. But when the Spreadsheet is directly edited by Sheets API, the lock service cannot be used. In this case, when the script using Sheets API is run, it is required to protect the Spreadsheet using the script. And the script is required to include the process for not accessing to the protected Spreadsheet. I'm not sure whether this is the direction OP wants. - Tanaike
@Tanaike "when the Spreadsheet is directly edited from outside by Sheets API, OnChange event trigger is not run" - Not true. OnChange event is captured, even if the edit is made by sheets api and LockService.getScriptLock() is run, but it doesn't actually lock. This is the problem for op. - TheMaster
@TheMaster About this, it might be required to test it at the actual situation whether this can be used. - Tanaike
@AndresDuarte The idea of "protecting" temporarily the buffer sheet is something I've thought of, but eventually I've discarded the idea. If some external API tries to insert data into the buffer while the script is running (i.e. the buffer sheet is protected) the insertion will fail. A well conceived lock mechanism is meant to temporarily freeze and properly enqueue write request, not to let them fail. - user4096537

1 Answers

1
votes

Google developers, shouldn't you address this issue

I would've reported it, but, you see, the restrictions of a installable trigger are clear:

Script executions and API requests do not cause triggers to run.

So, I didn't want them to "fix" it by removing this functionality altogether. Then you'd have to poll at regular intervals, which would be a pain.

If two INSERT_ROWs are done by the api at say,

  • rows 21 to 25 and
  • at rows 26 to 30.

Even though the external api inserts the rows in order, sometimes, the latter is triggered first(say at 23:00) and the former is triggered last(at 23:01). The former's sheet.getLastRow() may still be 30(even though it was triggered for row insertion 21 to 25, because it was triggered late). One solution would be to let everything run async: Inside the change event trigger function, The active range refers to the range currently added by the api. The dimensions of the "INSERT_ROW" can be obtained by:

  const rng = ss.getActiveRange(); //Currently Active Range
  const row = rng.getRow(); //Active Row number
  const col = rng.getColumn(); //Active Column number
  const lrow = rng.getLastRow(); //Last Row in Active range

Notice that the last row is inferred by rng.getLastRow() and not sheet.getLastRow(). So, even if the latter is triggered first, We only get the currently added rows in that instance of script execution(i.e., rows 26 to 30). You maybe able to leverage that to get a true async workflow.