0
votes

I'm following the example here https://github.com/Azure/azure-storage-js/blob/master/blob/samples/basic.sample.js about reading a Blob from Azure Blob Storage to a string using the Node.js SDK.

The blob I'm trying to read is an Append Blob.

First of all, reading the stream into a string takes a really long time, and in the end I get an HTTP 412 error.

I also asked this question here: https://github.com/Azure/azure-storage-js/issues/51

I'm doing this with Node.js v10.14.1 and the SDK I'm using is @azure/[email protected].

My code is here:

const {
  Aborter,
  BlobURL,
  ContainerURL,
  SharedKeyCredential,
  ServiceURL,
  StorageURL,
} = require('@azure/storage-blob');
const format = require('date-fns/format');

/**
 * Drains a readable stream and resolves with its full contents as a UTF-8 string.
 *
 * Chunks are accumulated as Buffers and concatenated once at the end, so a
 * multi-byte UTF-8 character that happens to be split across two chunks is
 * decoded correctly. (Calling `data.toString()` on each chunk, as the original
 * did, can corrupt such characters at chunk boundaries.)
 *
 * @param {NodeJS.ReadableStream} readableStream - the stream to read to completion
 * @returns {Promise<string>} the stream's contents decoded as UTF-8
 */
async function streamToString(readableStream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readableStream.on('data', (data) => {
      // Normalize string chunks to Buffers so Buffer.concat handles both cases.
      chunks.push(Buffer.isBuffer(data) ? data : Buffer.from(data));
    });
    readableStream.on('end', () => {
      // Decode once over the whole byte sequence, not per chunk.
      resolve(Buffer.concat(chunks).toString('utf8'));
    });
    readableStream.on('error', reject);
  });
}

/**
 * Downloads today's append-blob request log from Azure Blob Storage and
 * prints its length.
 *
 * Fix: the original line `streamToString(response.)` was a syntax error —
 * in Node.js the download response exposes the body stream as
 * `readableStreamBody` (in browsers it is `blobBody` instead).
 *
 * @returns {Promise<void>}
 * @throws {RestError} when the storage request fails (e.g. the 412 seen when
 *   a retried ranged GET's If-Match ETag no longer matches a changed blob).
 */
async function run() {
  const accountName = 'xxxstor';
  const accountKey = 'omitted';
  // Authenticate with the storage account's shared key.
  const credential = new SharedKeyCredential(accountName, accountKey);
  const pipeline = StorageURL.newPipeline(credential);
  const serviceURL = new ServiceURL(
    `https://${accountName}.blob.core.windows.net`,
    pipeline
  );
  const containerName = 'request-logs';
  const containerURL = ContainerURL.fromServiceURL(serviceURL, containerName);
  // The log blob is named after today's date, e.g. "2019-01-04.txt".
  const blobName = `${format(new Date(), 'YYYY-MM-DD[.txt]')}`;
  const blobURL = BlobURL.fromContainerURL(containerURL, blobName);
  console.log('Downloading blob...');
  // Download from offset 0; Aborter.none applies no timeout or cancellation.
  const response = await blobURL.download(Aborter.none, 0);
  console.log('Reading response to string...');
  const body = await streamToString(response.readableStreamBody);
  console.log(body.length);
}

// Entry point: start the download and report any rejection on stderr.
run().catch(console.error);

The error that I'm getting is this:

{ Error: Unexpected status code: 412
    at new RestError (C:\projects\xxx\RequestLogViewer\node_modules\@azure\ms-rest-js\dist\msRest.node.js:1397:28)
    at C:\projects\xxx\RequestLogViewer\node_modules\@azure\ms-rest-js\dist\msRest.node.js:1849:37
    at process._tickCallback (internal/process/next_tick.js:68:7)
  code: undefined,
  statusCode: 412,
  request:
  WebResource {
    streamResponseBody: true,
    url:
      'https://xxxstor.blob.core.windows.net/request-logs/2019-01-04.txt',
    method: 'GET',
    headers: HttpHeaders { _headersMap: [Object] },
    body: undefined,
    query: undefined,
    formData: undefined,
    withCredentials: false,
    abortSignal:
      a {
        _aborted: false,
        children: [],
        abortEventListeners: [Array],
        parent: undefined,
        key: undefined,
        value: undefined },
    timeout: 0,
    onUploadProgress: undefined,
    onDownloadProgress: undefined,
    operationSpec:
      { httpMethod: 'GET',
        path: '{containerName}/{blob}',
        urlParameters: [Array],
        queryParameters: [Array],
        headerParameters: [Array],
        responses: [Object],
        isXML: true,
        serializer: [Serializer] } },
  response:
  { body: undefined,
    headers: HttpHeaders { _headersMap: [Object] },
    status: 412 },
  body: undefined }
1

1 Answer

0
votes

This question has been resolved in GitHub issue https://github.com/Azure/azure-storage-js/issues/51. I'm copying the solution from the GitHub issue here.

blobURL.download() will try to download a blob with an HTTP GET request into a stream. When the stream ends unexpectedly — for example because of a network interruption — a retry will resume the stream read from the broken point with a new HTTP GET request.

The second HTTP request will use the conditional header If-Match with the blob's ETag returned by the first request, to make sure the blob hasn't changed when the 2nd (retry) request happens. Otherwise, a 412 "conditional header doesn't match" error will be returned. This strict strategy is used to avoid data-integrity issues, such as the blob having been completely overwritten by someone else. However, this strategy prevents you from reading a constantly updated log file whenever a retry happens.

While I don't think it's a bug, we need to make this scenario work for you. Please try the following solution: snapshot the append blob first, then read from the snapshot blob.

// Workaround for the 412: snapshot the constantly-appended blob first, then
// download from the immutable snapshot so any retried GET sees an unchanged ETag.
// (Relies on `containerURL`, `blobName`, `Aborter`, and `streamToString` from
// the question's code above.)
const blobURL = BlobURL.fromContainerURL(containerURL, blobName);
console.log('Downloading blob...');
// Create a point-in-time snapshot of the append blob.
const snapshotResponse = await blobURL.createSnapshot(Aborter.none);
// Build a URL that targets the snapshot instead of the live blob.
const snapshotURL = blobURL.withSnapshot(snapshotResponse.snapshot);
const response = await snapshotURL.download(Aborter.none, 0);
// NOTE(review): `snapshotURL.blobContext.length` looks dubious — blobContext is
// an internal generated client, not a size; presumably the content length was
// intended (e.g. response.contentLength). TODO confirm against the SDK docs.
console.log('Reading response to string...', snapshotURL.blobContext.length);
const body = await streamToString(response.readableStreamBody);