
Introduction

Let me first introduce the goal of what I am trying to do.

  • I had a file split into two parts earlier

  • The combined size of these files may eventually exceed 50 MB (as a long-term goal). Since UrlFetchApp.fetch() restricts the size of a request, I want to upload them separately, with each file smaller than 50 MB, and then merge them. For now (to try the Drive API), I am using small files.

  • The first file is 524288 bytes (previously 640000 bytes, a multiple of 256). I realise I made a mistake earlier: I was using a file size that is a multiple of 256, but it should be a multiple of 256*1024.

  • The second file is 163339 bytes (previously 47626 bytes).

  • I had split the files using curl and uploaded them to my drive (normal web upload).

  • My intention is to upload the partial files one by one to Google Drive using a resumable upload through the Google Drive API from Google Apps Script, so that they may be merged into one file.
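
The size rule mentioned in the list above (a multiple of 256*1024 rather than 256) can be checked before any upload is attempted. This is only a minimal sketch in plain JavaScript, which also runs in Apps Script; `CHUNK_UNIT` and `isValidChunkSize` are hypothetical names chosen for illustration.

```javascript
// Every part except the last one must be a multiple of 256 KiB for a resumable upload.
const CHUNK_UNIT = 256 * 1024; // 262144 bytes

// Returns true when `size` (in bytes) is a positive multiple of 256 KiB.
// (Hypothetical helper name, not part of any Google API.)
function isValidChunkSize(size) {
  return size > 0 && size % CHUNK_UNIT === 0;
}
```

With the sizes from this question, `isValidChunkSize(640000)` is false and `isValidChunkSize(524288)` is true. The last part (163339 bytes here) does not need to pass this check.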

What I have tried so far?

  • Yesterday, I asked a question here. I was trying to perform a resumable upload using Drive.Files.insert, and a user pointed out that this is not possible with Drive.Files.insert, as quoted below.

Unfortunately, in the current stage, the resumable upload cannot be achieved using Drive.Files.insert. It seems that this is the current specification at Google side

  • What I am trying now is using Google Drive API. Enclosed below is the code.
function myFunction() {
    var token = ScriptApp.getOAuthToken();

    var f1_id = '1HkBDHV1oXXXXXXXXXXXXXXXXXXXXXXXX';
    var f2_id = '1twuaKTCFTXXXXXXXXXXXXXXXXXXXX';
    
    var putUrl = 'https://www.googleapis.com/drive/v3/files?uploadType=resumable';
  
    var fileData = {
        name : 'Merged-file-from-GAS',
        file : DriveApp.getFileById(f1_id).getBlob()
    }
    
    var options = {
      method : 'put',
      contentType:"application/json",
      headers : {
        Authorization: 'Bearer ' + token,
        'X-Upload-Content-Type' : 'application/octet-stream',
        'Content-Type' : 'application/json; charset=UTF-8'
      },
      muteHttpExceptions: true,
      payload : fileData
    };
  
    var response = UrlFetchApp.fetch(putUrl, options);
    Logger.log(response.getResponseCode());
    Logger.log(response.getAllHeaders()); 
}

  • I also tried changing the method to patch.

  • I added Content-Length : 640000 inside headers, and in that case I received the error provided below.

Exception: Attribute provided with invalid value: Header:Content-Length

  • I tried to create a file using Drive.Files.insert(resource) with a blank resource. Then I tried to update it using UrlFetchApp.fetch(patchUrl, options) with var patchUrl = 'https://www.googleapis.com/upload/drive/v3/files/' + fileId + '?uploadType=resumable';

Result

  • It does not create any file.
  • Logger logs for the result of the attached code (initial code) are provided below:

[20-05-12 21:05:37:726 IST] 404.0

[20-05-12 21:05:37:736 IST] {X-Frame-Options=SAMEORIGIN, Content-Security-Policy=frame-ancestors 'self', Transfer-Encoding=chunked, alt-svc=h3-27=":443"; ma=2592000,h3-25=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q049=":443"; ma=2592000,h3-Q048=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43", X-Content-Type-Options=nosniff, Date=Tue, 12 May 2020 15:35:37 GMT, Expires=Mon, 01 Jan 1990 00:00:00 GMT, X-XSS-Protection=1; mode=block, Content-Encoding=gzip, Pragma=no-cache, Cache-Control=no-cache, no-store, max-age=0, must-revalidate, Vary=[Origin, X-Origin], Server=GSE, Content-Type=text/html; charset=UTF-8}

Question

  • What is the proper way of initiating an upload of a file from Drive to Drive using the Drive API from Apps Script while keeping the upload type as resumable?

  • What should the subsequent requests look like, so that files above 50 MB can subsequently be uploaded to the merged file?

Edit 1

Tried it again using corrected file chunk sizes. The same problem persists.

Edit 2

To understand the code in the answer, I ran only the part marked // 2 of Tanaike's code to see how Location is retrieved.

function understanding() {
  var token = ScriptApp.getOAuthToken();
  const filename = 'understanding.pdf';
  const mimeType = MimeType.PDF;

  const url = 'https://www.googleapis.com/drive/v3/files?uploadType=resumable';
  
  const res1 = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify({name: filename, mimeType: mimeType}),
    headers: {authorization: "Bearer " + ScriptApp.getOAuthToken()
  }});
  const location = res1.getHeaders().Location;
  Logger.log(location);
}

This creates a file understanding.pdf of size 0 bytes. However, Logger.log(location) logs null.

Why is it so?

The mistake was in the endpoint. Setting it to https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable works: the location is retrieved.
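
A small guard can make this kind of endpoint mistake visible immediately instead of logging null. The sketch below is only an illustration: `UPLOAD_URL`, `startResumableSession`, and the `fetchFn` parameter are hypothetical names, and `fetchFn` stands in for UrlFetchApp.fetch so that the function can also be tried outside Apps Script.

```javascript
// Note the /upload/ segment; without it the response carries no Location header.
const UPLOAD_URL = "https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable";

// Initiates a resumable session and returns the session URI.
// `fetchFn(url, params)` is assumed to behave like UrlFetchApp.fetch,
// i.e. to return an object that has a getHeaders() method.
function startResumableSession(filename, mimeType, token, fetchFn) {
  const res = fetchFn(UPLOAD_URL, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify({ name: filename, mimeType: mimeType }),
    headers: { authorization: "Bearer " + token },
  });
  const location = res.getHeaders().Location;
  if (!location) {
    throw new Error("No Location header returned; check that the URL contains /upload/.");
  }
  return location;
}

// In Apps Script this could be called as:
// startResumableSession("understanding.pdf", MimeType.PDF, ScriptApp.getOAuthToken(),
//                       (url, params) => UrlFetchApp.fetch(url, params));
```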

In your situation, can you adjust the size of each initial file? Regarding "First file is of 640000 bytes (a multiple of 256)", such a file cannot be used directly for the resumable upload, because each chunk is required to be a multiple of 256 KB (256 x 1024 bytes). How about this? If the size of each file (except for the last file) is a multiple of 262,144 bytes and every file is smaller than 52,428,800 bytes, the script for the resumable upload can be simpler. - Tanaike
@Tanaike I tried it again using a file of 256x1024x2 = 524288 bytes. It still didn't upload. I have edited the question with the new file size. - user13355752
Thank you for replying. I would like to confirm your current situation. 1. In your sample files, "file A" and "file B" are 524,288 bytes and 163,339 bytes, respectively. 2. The script you tested is the one shown in your question. 3. You want to merge the files using the resumable upload. Is my understanding correct? By the way, what is the mimeType of the merged file? - Tanaike
@Tanaike 1. Yes 2. Yes 3. Yes. "By the way, what is the mimeType of the merged file?" PDF - user13355752

2 Answers

7
votes

From your question and replies, I understand your situation and goal as follows.

  • In your sample files, "file A" and "file B" are 524,288 bytes and 163,339 bytes, respectively.
  • The script you tested is the one shown in your question.
  • You want to merge the files using the resumable upload.
  • The mimeType of the merged file is PDF.

For this, how about this answer?

Modification points:

  • Unfortunately, your script is incomplete for achieving the resumable upload. The flow of the resumable upload with the Google Drive API is as follows.

    1. Send a request to retrieve the location, which is used as the endpoint for uploading the data.
      • In your case, a new file is created, so the POST method is required.
    2. Send requests to the retrieved location, including the data (in your case, each file).
      • In this case, the data must be uploaded in a loop, using the PUT method.
      • Here, the size of each file is the most important point. If the size of any file except the last one is not a multiple of 262,144 bytes, the resumable upload fails with an error. Please be careful about this.
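
Each PUT request in step 2 carries a Content-Range header of the form "bytes start-end/total". As a minimal sketch, the header values can be derived from the file sizes alone; `buildContentRanges` is a hypothetical helper name used only for illustration.

```javascript
// Builds the Content-Range header value for each part,
// given the part sizes (in bytes) in upload order.
function buildContentRanges(sizes) {
  const total = sizes.reduce((sum, size) => sum + size, 0);
  let offset = 0; // index of the first byte of the current part
  return sizes.map((size) => {
    const range = `bytes ${offset}-${offset + size - 1}/${total}`;
    offset += size;
    return range;
  });
}
```

For the sample files, `buildContentRanges([524288, 163339])` returns "bytes 0-524287/687627" and "bytes 524288-687626/687627".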

When a sample script is prepared for the flow above, it becomes as follows.

Usage:

1. Enable Drive API.

In this case, the Drive API is used, so please enable Drive API in Advanced Google Services. Doing so automatically enables the Drive API in the API console.

The flow of the sample script is as follows.

  1. Create an object for using at the resumable upload.
  2. Retrieve "location" for starting the resumable upload.
  3. Upload each file and merge them.

2. Sample script.

Please copy and paste the following script, and set the file IDs in the order in which the files should be merged. Please be careful about this.

function myFunction() {
  const fileIds = ["###", "###"];  // Please set the file IDs of the file "A" and "B" in order.
  const filename = "sample.pdf";
  const mimeType = MimeType.PDF;

  // 1. Create an object for using at the resumable upload.
  const unitSize = 262144;
  const fileObj = fileIds.reduce((o, id, i, a) => {
    const file = DriveApp.getFileById(id);
    const size = file.getSize();
    if (i != a.length - 1 && (size % unitSize != 0 || size > 52428800)) {
      throw new Error("Size of each file is required to be the multiples of 262,144 bytes and less than 52,428,800 bytes.");
    }
    o.files.push({data: file.getBlob().getBytes(), range: `bytes ${o.size}-${o.size + size - 1}\/`, size: size.toString()});
    o.size += size;
    return o;
  }, {size: 0, files: []});

  // 2. Retrieve "location" for starting the resumable upload.
  const url = "https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable";
  const res1 = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify({name: filename, mimeType: mimeType}),
    headers: {authorization: "Bearer " + ScriptApp.getOAuthToken()
  }});
  const location = res1.getHeaders().Location;

  // 3. Upload each file and merge them.
  fileObj.files.forEach((e, i) => {
    const params = {
      method: "put",
      headers: {"Content-Range": e.range + fileObj.size},
      payload: e.data,
      muteHttpExceptions: true,
    };
    const res = UrlFetchApp.fetch(location, params);
    const status = res.getResponseCode();
    if (status != 308 && status != 200) {
      throw new Error(res.getContentText());
    }
    if (status == 200) {
      console.log(res.getContentText())
    }
  });

  // DriveApp.createFile()  // This comment line is used for automatically detecting the scope of "https://www.googleapis.com/auth/drive" by the script editor. So please don't remove this line.
}

Result:

When the resumable upload is finished, the following result can be seen in the log, and you can see the merged file in the root folder.

{
 "kind": "drive#file",
 "id": "###",
 "name": "sample.pdf",
 "mimeType": "application/pdf"
}

Note:

  • This is a simple sample script, so please modify it for your actual situation.
  • I tested the script above only with your sample files, where "file A" and "file B" are 524,288 bytes and 163,339 bytes. When several files of about 50 MB each are merged using this script, an error occurs.
  • If a memory error occurs when large files are used, it seems that, at the current stage, this is the specification on Google's side. Please be careful about this.


1
votes

Tanaike's answer is more than perfect. It's elegant and has even helped me learn about the array.reduce function. Before I asked this question, I had minimal knowledge of JavaScript and almost zero knowledge of the Google Drive API.

My intention was to learn the whole process of a resumable upload step by step, using Google Apps Script as the language. Using Tanaike's code as a reference, I wrote a script which, instead of being productive, manageable, and elegant, would give me (at least) an idea of how a resumable upload works step by step. I have used no loops, no objects, and not even arrays.

Step 1 ( Declare the necessary variables )

  var fileId1 = "XXXXXXXXXXX"; //id of the first file
  var fileId2 = "YYYYYYYYYYY"; //id of the second file
  var filename = "merged.pdf"; //name of the final merged file
  var mimeType = MimeType.PDF; //Mime type of the merged file

Step 2 ( Initiate the resumable upload )

//declare the end point
const url = "https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable";

//Send the request
//Method to be used is Post during initiation
//No file is to be sent during initiation
//The file name and the mime type are sent
const res1 = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify({name: filename, mimeType: mimeType}),
    headers: {authorization: "Bearer " + ScriptApp.getOAuthToken()
  }});

Step 3 ( Save the resumable session URI )

const location = res1.getHeaders().Location;

Step 4 (a) ( Upload file 1 )

Note: Steps 4 (a) and (b) can be performed using a loop. In my case, I performed the step twice without a loop.

  var file = DriveApp.getFileById(fileId1); //get the first file
  var data = file.getBlob().getBytes(); //get its contents in bytes array

//Method used is PUT not POST
//Content-Range will contain the range from starting byte to ending byte, then a slash
//and then file size
//bytes array of file's blob is put in data
  var params = {
    method : "put",
    headers : {
      'Content-Range' : `bytes 0-524287/687627`
    },
    payload : data,
    muteHttpExceptions: true
  }; 

//Request using Resumable session URI, and above params as parameter

  var result = UrlFetchApp.fetch(location,params);

Step 4 (b) ( Upload the second file )

  //Almost same as Step 4 (a)
  //The thing that changes is Content Range
  file = DriveApp.getFileById(fileId2);
  data = file.getBlob().getBytes();

  params = {
    method : "put",
    headers : {
      'Content-Range' : `bytes 524288-687626/687627`
    },
    payload : data,
    muteHttpExceptions : true
  };

  result = UrlFetchApp.fetch(location, params);

Instead of repeating step 4 n times, it's better to use a loop.

Also, this code doesn't check for errors that might occur during the process.
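
As a rough sketch of both points, step 4 could be written as a loop with a basic status check. `uploadParts` and `fetchFn` are hypothetical names: `fetchFn` stands in for UrlFetchApp.fetch so the loop can be tried outside Apps Script, and each part is assumed to be a non-empty bytes array such as the one returned by getBlob().getBytes().

```javascript
// Uploads the parts one by one to the resumable session URI `location`.
// `fetchFn(url, params)` is assumed to behave like UrlFetchApp.fetch and to return
// an object with getResponseCode() and getContentText().
// Returns the status code of the last request.
function uploadParts(location, parts, fetchFn) {
  const total = parts.reduce((sum, bytes) => sum + bytes.length, 0);
  let offset = 0;
  let status = null;
  parts.forEach((bytes, i) => {
    const params = {
      method: "put",
      headers: { "Content-Range": `bytes ${offset}-${offset + bytes.length - 1}/${total}` },
      payload: bytes,
      muteHttpExceptions: true,
    };
    const res = fetchFn(location, params);
    status = res.getResponseCode();
    // 308 means more data is expected; 200 or 201 means the upload is complete.
    if (status !== 308 && status !== 200 && status !== 201) {
      throw new Error(`Part ${i + 1} failed with status ${status}: ${res.getContentText()}`);
    }
    offset += bytes.length;
  });
  return status;
}

// In Apps Script this could be called with the variables from the steps above:
// const data1 = DriveApp.getFileById(fileId1).getBlob().getBytes();
// const data2 = DriveApp.getFileById(fileId2).getBlob().getBytes();
// uploadParts(location, [data1, data2], (url, params) => UrlFetchApp.fetch(url, params));
```

Passing the fetch function in as a parameter is only a convenience for trying the loop with a stub; inside Apps Script the call to UrlFetchApp.fetch could just as well be written directly in the loop.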

Hope this code helps someone, even though it was more of a self-teaching experiment. :)