I have a Node.js app running on a Google Compute VM instance that receives file uploads directly from POST requests (not via the browser) and streams the incoming data to Google Cloud Storage (GCS).

I'm using Restify because I don't need the extra functionality of Express and because it makes it easy to stream the incoming data.

I create a random filename for the file, take the incoming req and toss it to a neat little Node wrapper for GCS (found here: https://github.com/bsphere/node-gcs) which makes a PUT request to GCS. The documentation for GCS using PUT can be found here: https://developers.google.com/storage/docs/reference-methods#putobject ... it says Content-Length is not necessary if using chunked transfer encoding.

Good news: the file is being created inside the appropriate GCS storage "bucket"!

Bad News:

  1. I haven't figured out how to get the incoming file's extension from Restify (notice that I'm setting both '.jpg' and the Content-Type manually).

  2. The file is experiencing slight corruption (almost certainly due to something I'm doing wrong with the PUT request). If I download the POSTed file from Google, OS X tells me it's damaged ... BUT, if I use Photoshop, it opens and looks just fine.

Update / Solution

As pointed out by vkurchatkin, I needed to parse the request object instead of just piping the whole thing to GCS. After trying out the lighter busboy module, I decided it was just a lot easier to use multiparty. For dynamically setting the Content-Type, I simply used Mimer (https://github.com/heldr/mimer), referencing the file extension of the incoming file. It's important to note that since we're piping the part object, the part.headers must be cleared out. Otherwise, unintended info, specifically content-type, will be passed along and can/will conflict with the content-type we're trying to set explicitly.

Here's the applicable, modified code:

var restify = require('restify'),
    server = restify.createServer(),
    GAPI = require('node-gcs').gapitoken,
    GCS = require('node-gcs'),
    multiparty = require('multiparty'),
    Mimer = require('mimer');

server.post('/upload', function(req, res) {

    var form = new multiparty.Form();

    form.on('part', function(part){
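        // non-file fields have no filename; drain and skip them
        if (!part.filename) { part.resume(); return; }

        var fileType = '.' + part.filename.split('.').pop().toLowerCase();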
        var fileName = Math.random().toString(36).slice(2) + fileType;

        // clear out the part's headers to prevent conflicting data being passed to GCS
        part.headers = null;

        var gapi = new GAPI({
            iss: '-- your -- @developer.gserviceaccount.com',
            scope: 'https://www.googleapis.com/auth/devstorage.full_control',
            keyFile: './key.pem'
        }, 
        function(err) {
            if (err) {
                console.log('google cloud authorization error: ' + err);
                // drain the part so the form can finish, then stop
                part.resume();
                return;
            }

            var headers = {
                'Content-Type': Mimer(fileType),
                'Transfer-Encoding': 'Chunked',
                'x-goog-acl': 'public-read'
            };

            var gcs = new GCS(gapi);

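            // myBucket is the name of the target GCS bucket, defined elsewhere
            gcs.putStream(part, myBucket, '/' + fileName, headers, function(gerr, gres){
                if (gerr) { console.log('google cloud storage error: ' + gerr); return; }
                console.log('file should be there!');
            });
        });
    });

    // start parsing the request; without this, no 'part' events are emitted
    form.parse(req);
});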
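
The extension and Content-Type handling above can also be pulled out into small helpers. This is only a minimal sketch: the function names and the lookup table are hypothetical, and the table merely stands in for Mimer so the example has no dependencies. It uses path.extname instead of split('.').pop(), because the latter turns a name with no dot, such as README, into the bogus extension '.readme'.

```javascript
var path = require('path');

// Hypothetical lookup table standing in for Mimer; extend as needed.
var CONTENT_TYPES = {
    '.jpg': 'image/jpeg',
    '.jpeg': 'image/jpeg',
    '.png': 'image/png',
    '.gif': 'image/gif',
    '.pdf': 'application/pdf'
};

// Lower-cased extension including the dot, or '' when the name has none.
function extensionOf(filename) {
    return path.extname(filename || '').toLowerCase();
}

// Content-Type for the upload, falling back to a generic binary type.
function contentTypeFor(filename) {
    return CONTENT_TYPES[extensionOf(filename)] || 'application/octet-stream';
}

// Random object name that keeps the original extension.
function randomName(filename) {
    return Math.random().toString(36).slice(2) + extensionOf(filename);
}
```

Inside the 'part' handler, randomName(part.filename) and contentTypeFor(part.filename) would then replace the fileType and fileName lines and the Mimer call.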
So this works and that's great, but I'm still a little suspicious of the GCS module (only because my level of knowledge isn't too strong) because of its use of .pause() and .resume() ... can't this negatively impact the file stream? - Dan
Since Node's (v0.10.24) stream API is smart enough to wait for .pipe() before it begins streaming data, I removed .pause() and .resume() and haven't seen any problems. - Dan
Probably a silly question on an old topic, but when you are streaming the file, are the chunks stored in the app's memory and destroyed as they stream out to GCS? I'm just trying to imagine how this will affect memory requirements for my app. Thanks! - Askdesigners

1 Answer

You can't use the raw req stream, since it yields the whole request body, which is multipart. You need to parse the request with something like multiparty, which gives you a readable stream and all the metadata you need.
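
To illustrate the point, here is a rough sketch of what a multipart body looks like on the wire. The boundary string and the four-byte "file" are made up for the example. The stored object would begin with the boundary and part headers rather than the JPEG signature, which would be consistent with a strict viewer calling the file damaged while a more lenient one still finds the image data.

```javascript
// Build a multipart/form-data body by hand, the way an HTTP client would.
var boundary = '----demo-boundary';                      // hypothetical boundary
var fileBytes = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);   // first bytes of a JPEG

var rawBody = Buffer.concat([
    Buffer.from(
        '--' + boundary + '\r\n' +
        'Content-Disposition: form-data; name="file"; filename="photo.jpg"\r\n' +
        'Content-Type: image/jpeg\r\n\r\n'
    ),
    fileBytes,
    Buffer.from('\r\n--' + boundary + '--\r\n')
]);

// Piping req straight to storage saves rawBody, not fileBytes.
// The saved object does not start with the JPEG signature (0xff 0xd8) ...
var startsWithJpegSignature = rawBody[0] === 0xff && rawBody[1] === 0xd8;

// ... because the real image data sits after the part headers.
var jpegOffset = rawBody.indexOf(fileBytes);

console.log('starts with JPEG signature:', startsWithJpegSignature);
console.log('image data begins at byte offset:', jpegOffset);
```

A multipart parser strips that wrapper and hands back only the bytes between the part headers and the closing boundary, along with the filename and the part's own Content-Type.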