
I get a timeout in arangosh and the ArangoDB service becomes unresponsive when I try to truncate a large collection of ~40 million documents. Message:

arangosh [database_xxx]> db.[collection_yyy].truncate() ;
JavaScript exception in file '/usr/share/arangodb/js/client/modules/org/arangodb/arangosh.js' at 104,13:
[ArangoError 2001: Error reading from: 'tcp://127.0.0.1:8529' 'timeout during read']
!    throw new ArangoError(requestResult);
!          ^
stacktrace: Error
  at Object.exports.checkRequestResult (/usr/share/arangodb/js/client/modules/org/arangodb/arangosh.js:104:13)
  at ArangoCollection.truncate (/usr/share/arangodb/js/client/modules/org/arangodb/arango-collection.js:468:12)
  at :1:11

ArangoDB 2.6.9 on Debian Jessie, AWS ec2 m4.xlarge, 16 GB RAM, SSD. The service becomes unresponsive. I suspect it is stuck (not just busy), because it does not recover until I stop the service, delete the database in /var/lib/arangodb/databases/, and start it again.

I know I may be pushing the limits of performance given the collection size, but I would expect the operation not to fail regardless of size.

However, on a non-cloud Windows 10 machine (16 GB RAM, SSD) the same action succeeded, after a while.

Is this a bug? I have some Python code that loads dummy data into a collection, if that helps. Please let me know if I should provide more information. Would it help to fiddle with --server.request-timeout?

Thanks in advance, Søren

An update, referring to the tests in my initial post: I repeated the truncate on the AWS ec2 m4.xlarge, but this time on ArangoDB version 2.7.0. The action succeeded without going into a dead state. Something got fixed :-) It still took longer than inserting the same data, though. Cheers, sdy7

1 Answer


Increasing --server.request-timeout for the ArangoShell will only increase the timeout that the shell will use before it closes an idle connection. The arangod server will also shut down lingering keep-alive connections, and that may happen earlier. This is controlled via the server's --server.keep-alive-timeout setting.
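
For reference, a minimal sketch of how the two options could be passed on the command line (the 600-second values are placeholders, not recommendations; arangod would normally also receive its usual options or be configured via its config file):

# client side: request timeout used by the ArangoShell, in seconds
arangosh --server.request-timeout 600

# server side: keep-alive timeout for lingering connections, in seconds
arangod --server.keep-alive-timeout 600 <other server options>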

However, increasing both won't help much. The actual problem seems to be the truncate() operation itself. And yes, it may be very expensive. truncate() is a transactional operation, so it will write a deletion marker for each document it removes into the server's write-ahead log. It will also buffer each deletion in memory so the operation can be rolled back if it fails.

A much less intrusive operation than truncate() is to instead drop the collection and re-create it. This should be very fast. However, indexes and special settings of the collection need to be recreated / restored manually if they existed before dropping it.

This can be done with a helper function like the following (it handles both document and edge collections):

function dropAndRecreateCollection (collectionName) {
  // save the collection's current state ('db' is available in arangosh by default)
  var c          = db._collection(collectionName);
  var properties = c.properties();
  var type       = c.type();
  var indexes    = c.getIndexes();

  // drop the existing collection
  db._drop(collectionName);

  // re-create the collection with the saved properties
  var i;
  if (type === 2) {
    // document collection: skip the primary index (index 0)
    c = db._create(collectionName, properties);
    i = 1;
  }
  else {
    // edge collection: skip the primary and edge indexes (indexes 0 and 1)
    c = db._createEdgeCollection(collectionName, properties);
    i = 2;
  }

  // restore the remaining (secondary) indexes
  for (; i < indexes.length; ++i) {
    c.ensureIndex(indexes[i]);
  }
}
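
Usage would then simply be, using the collection name from the question as a stand-in:

dropAndRecreateCollection("collection_yyy");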