0
votes

The following code adapted from elsewhere:

function importCSVFromWeb() {
  var csvUrl = "https://drive.google.com/uc?export=download&id=xxxxxxxxxxxxxxxxxxxxxxxxxx";
  var csvContent = UrlFetchApp.fetch(csvUrl).getContentText();
  var csvData = Utilities.parseCsv(csvContent);
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName('data');
  sheet.clear();
  sheet.getRange(1, 1, csvData.length, csvData[0].length).setValues(csvData);
}

executes quickly until the SetValues() in the last line:

[19-12-07 13:45:09:093 PST] Starting execution
[19-12-07 13:45:10:463 PST] UrlFetchApp.fetch([https://drive.google.com/uc?export=download&id=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]) [1.36 seconds]
[19-12-07 13:45:10:538 PST] UrlFetchApp.HTTPResponse.getContentText() [0.074 seconds]
[19-12-07 13:45:10:976 PST] Utilities.parseCsv([ColAHeader,ColBHeader,ColCHeader,ColDHeader,ColEHeader,ColFHeader,ColGHeader,ColHHeader,ColIHeader,ColJHeader,ColKHeader,ColLHeader,ColMHeader
aaa,bbb,ccc,ddd,eee,fff,ggg,hhh,...) [0.426 seconds]
[19-12-07 13:45:11:504 PST] SpreadsheetApp.getActiveSpreadsheet() [0 seconds]
[19-12-07 13:45:11:715 PST] SpreadsheetApp.Spreadsheet.getSheetByName([data]) [0.21 seconds]
[19-12-07 13:45:11:716 PST] SpreadsheetApp.Sheet.clear() [0 seconds]
[19-12-07 13:45:11:716 PST] SpreadsheetApp.Sheet.getRange([1, 1, 106974, 13]) [0 seconds]
[19-12-07 13:51:23:063 PST] SpreadsheetApp.Range.setValues([[[ColAHeader,ColBHeader,ColCHeader,ColDHeader,ColEHeader,ColFHeader,ColGHeader,ColHHeader,ColIHeader,ColJHeader,ColKHeader,ColLHeader,ColMHeader], [aaa, bbb, ccc, ddd...) [370.621 seconds]
[19-12-07 13:51:23:241 PST] Execution failed: Exceeded maximum execution time [373.963 seconds total runtime]

Other info:

  • The array comes from a parsed CSV. This array has 106974 rows and 13 columns.
  • SetValues() writes approximately 35000 rows to the sheet 'data' before the script reaches the execution limit.

The timeout is the reason for wanting to split the array. I'd like to process, say, 20,000 rows and then exit. Then create a new script that would process rows 20,001-40,000.

I have tried manually changing the third parameter of the getRange to 20,000 but I apparently don't understand the command well enough as it gives me an error:

The number of rows in the data does not match the number of rows in the range. The data has 106974 but the range has 20000.

Is there a way to split the array so I can process it in batches?

2
Can I ask you about your question? 1. I cannot understand about Approximately 35000 rows are successfully although you say This array has 106974 rows and 13 columns. 2. If your goal is to put all CSV values to Google Spreadsheet, although I'm not sure whether this is the direction you want, in your case, how about directly converting the CSV file to Google Spreadsheet using Drive API? If I misunderstood your question, I apologize. - Tanaike
Hi Tanaike - thank you for your comment and apologies for my difficult-to-understand question. Regarding 1), I have updated the wording. Basically, if I view my sheet in Google Sheets the last row is incomplete, and is around row 35000. Every time it is different, but it is approximately row 35,000. There should be 106,974 rows. Regarding 2), the data in the original CSV will change weekly, so this script will be run weekly. Directly converting, I think you are referring to a "once-off" case, right? Thank you for replying. Please let me know if you require further clarification. - dbros
What Tanaike is saying is rather than opening the CSV and copying and pasting the values into a Google Sheet is to use the Drive API to make a copy of the CSV itself in Google Sheets format. - James D
Ahh, yes.. That would break the formulas as it would have a different file ID each time. No, not what I was talking about. - James D
Use API file import to create the new Google sheet from CSV directly, and then use return value (i.e. metadata of the created sheet) to copy the range values from the created file to the desired sheet. - tehhowch

2 Answers

2
votes
  • You want to put the CSV values to the active Spreadsheet.
    • In your situation, the CSV values are large which is 106,974 rows and 13 columns.
  • You want to achieve this using Google Apps Script.

If my understanding is correct, how about this answer? In this answer, I would like to propose 2 patterns. Please think of this as just one of several possible answers.

Pattern 1:

In this pattern, the CSV data is directly put to the active Spreadsheet using Sheets API. When the benchmark for CSV Data to Spreadsheet using Google Apps Script is measured, it was found that the process of Sheets API is faster than that of Spreadsheet service which is using in your script. So as one of pattern, I proposed this way.

Sample script:

When your script is modified for using Sheets API, it becomes as follows. Before you run the script, please enable Sheets API at Advanced Google services.

function importCSVFromWeb() {
  var csvUrl = "https://drive.google.com/uc?export=download&id=xxxxxxxxxxxxxxxxxxxxxxxxxx";
  var csvContent = UrlFetchApp.fetch(csvUrl).getContentText();
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheetId = ss.getSheetByName('data').getSheetId();
  var resource = {requests: [
    {updateCells: {range: {sheetId: sheetId}, fields: "*"}},
    {pasteData: {data: csvContent, coordinate: {sheetId: sheetId}, delimiter: ","}}
  ]};
  Sheets.Spreadsheets.batchUpdate(resource, ss.getId());
}
  • Please set the file ID of the CSV file to the variable of csvUrl.

Pattern 2:

In this pattern, the CSV data is converted to Spreadsheet using the method of Files: copy of Drive API and copy the sheet to the active Spreadsheet.

Sample script:

When your script is modified, it becomes as follows. Before you run the script, please enable Drive API at Advanced Google services.

function importCSVFromWeb() {
  var fileIdofCSVFile = "xxxxxxxxxxxxxxxxxxxxxxxxxx";
  var id = Drive.Files.copy({title: "temp", mimeType: MimeType.GOOGLE_SHEETS}, fileIdofCSVFile).id;
  var sheetName = "data";
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName(sheetName).setName(sheetName + "_temp");
  var tempSheet = SpreadsheetApp.openById(id).getSheets()[0];
  var copiedSheet = tempSheet.copyTo(ss).setName(sheetName);
  ss.deleteSheet(sheet);
  DriveApp.getFileById(id).setTrashed(true);
}
  • Please set the file ID of the CSV file to the variable of fileIdofCSVFile.

Note:

  • In your script, it supposes that the CSV file of https://drive.google.com/uc?export=download&id=xxxxxxxxxxxxxxxxxxxxxxxxxx is publicly shared. Please be careful this.

References:

Added: Pattern 3

In this sample script, the script of pattern 2 was modified. In this case, Drive API and Sheets API are used. So please enable both APIs at Advanced Google services.

Sample script:

function importCSVFromWeb() {
  var fileIdofCSVFile = "xxxxxxxxxxxxxxxxxxxxxxxxxx";
  var id = Drive.Files.copy({title: "temp", mimeType: MimeType.GOOGLE_SHEETS}, fileIdofCSVFile).id;
  var sheetName = "data";
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName(sheetName);
  sheet.clear();
  SpreadsheetApp.flush();
  var tempSheet = SpreadsheetApp.openById(id).getSheets()[0];
  var copiedSheet = tempSheet.copyTo(ss).getSheetId();
  var resource = {requests: [
    {copyPaste: {
      source: {sheetId: copiedSheet, startRowIndex: 0, startColumnIndex: 0},
      destination: {sheetId: sheet.getSheetId(), startRowIndex: 0, startColumnIndex: 0}
    }},
    {deleteSheet: {sheetId: copiedSheet}}
  ]};
  Sheets.Spreadsheets.batchUpdate(resource, ss.getId());
  DriveApp.getFileById(id).setTrashed(true);
}
0
votes

This function allows you to break up your array as you requested. You need to run initializeForAppCSVData first and setup the page length. It will setup the timebased trigger for you and you can adjust the time if necessary. Once the function gets to array.length it automatically deletes the trigger.

If you wish you can setup a periodic trigger to run the initializeForAppCSVData() function.

function initializeForAppCSVData() {
  var csvUrl = "https://drive.google.com/uc?export=download&id=xxxxxxxxxxxxxxxxxxxxxxxxxx";
  var csvContent = UrlFetchApp.fetch(csvUrl).getContentText();
  var array = Utilities.parseCsv(csvContent);
  /*var array=Utilities.parseCsv(DriveApp.getFileById("id").getBlob().getDataAsString());*/
  PropertiesService.getScriptProperties().setProperty("StartIndex", 0);
  PropertiesService.getScriptProperties().setProperty("PageLength", 1000);
  PropertiesService.getScriptProperties().setProperty("ArrayLength", array.length);
  var ss=SpreadsheetApp.getActive();
  var sh=ss.getSheetByName('Data');
  sh.clearContents();
  setupTimeBasedTrigger();
}

function appendArrayData() {
  var ss=SpreadsheetApp.getActive();
  var sh=ss.getSheetByName('Data');
  var csvUrl = "https://drive.google.com/uc?export=download&id=xxxxxxxxxxxxxxxxxxxxxxxxxx";
  var csvContent = UrlFetchApp.fetch(csvUrl).getContentText();
  var array = Utilities.parseCsv(csvContent);
   /*var array=Utilities.parseCsv(DriveApp.getFileById("id").getBlob().getDataAsString());*/
  var startIndex=Number(PropertiesService.getScriptProperties().getProperty("StartIndex"));
  var pageLength=Number(PropertiesService.getScriptProperties().getProperty("PageLength"));
  var arrayLength=Number(PropertiesService.getScriptProperties().getProperty("ArrayLength"));
  if((startIndex + pageLength)<arrayLength) {
    var start=startIndex;
    var end=startIndex+pageLength;
  }else{
    var start=startIndex
    var end=arrayLength;
    deleteTrigger('appendArrayData');
  }
  var vA=[];
  for(var i=start;i<end;i++) {
    vA.push(array[i]);
  }
  sh.getRange(startIndex+1,1,vA.length,vA[0].length).setValues(vA);
  PropertiesService.getScriptProperties().setProperty("StartIndex", startIndex+pageLength);
}

function setupTimeBasedTrigger() {
  if(!isTrigger('appendArrayData')) {
    ScriptApp.newTrigger('appendArrayData').timeBased().everyMinutes(5).create();
  }
}

function deleteTrigger(triggerName){
  var triggers=ScriptApp.getProjectTriggers();
  for (var i=0;i<triggers.length;i++){
    if (triggerName==triggers[i].getHandlerFunction()){
      ScriptApp.deleteTrigger(triggers[i]);
    }
  }
}

function isTrigger(funcName){
  var r=false;
  if(funcName){
    var allTriggers=ScriptApp.getProjectTriggers();
    for(var i=0;i<allTriggers.length;i++){
      if(funcName==allTriggers[i].getHandlerFunction()){
        r=true;
        break;
      }
    }
  }
  return r;
}