24
votes

What's the query or some other quick way to delete all the documents matching the where condition in a collection?
I want something like DELETE * FROM c WHERE c.DocumentType = 'EULA' but, apparently, it doesn't work.

Note: I'm not looking for any C# implementation for this.

5
As Gaurav Mantri mentioned, it is not supported currently. There is a feedback under review.Tom Sun - MSFT
Interestingly this is possible using the Graph APIs as you can do .drop() on the result of a traversal to remove all selected docs....Jesse Carter
@JesseCarter could u describe in a bit detailGorvGoyl
@JerryGoyal Unfortunately unless you're fully embracing graph I don't think it will work for your usecase as Cosmos expects a specific document format. But for example you could do something like g.V().has('DocumentType', eq('EULA')).drop() and clear out all docs that matchedJesse Carter
where do i run this queryGorvGoyl

5 Answers

18
votes

I want something like DELETE * FROM c WHERE c.DocumentType = 'EULA' but, apparently, it doesn't work.

Deleting documents this way is not supported. You would need to first select the documents using a SELECT query and then delete them separately. If you want, you can write the code for fetching & deleting in a stored procedure and then execute that stored procedure.

18
votes

This is a bit old but just had the same requirement and found a concrete example of what @Gaurav Mantri wrote about.

The stored procedure script is here:

https://social.msdn.microsoft.com/Forums/azure/en-US/ec9aa862-0516-47af-badd-dad8a4789dd8/delete-multiple-docdb-documents-within-the-azure-portal?forum=AzureDocumentDB

Go to the Azure portal, grab the script from above and make a new stored procedure in the database->collection you need to delete from.

Then right at the bottom of the stored procedure pane, underneath the script textarea is a place to put in the parameter. In my case I just want to delete all so I used:

SELECT c._self FROM c

I guess yours would be:

SELECT c._self FROM c WHERE c.DocumentType = 'EULA'

Then hit 'Save and Execute'. Viola, some documents get deleted. After I got it working in the Azure Portal I switched over the Azure DocumentDB Studio and got a better view of what was happening. I.e. I could see I was throttled to deleting 18 a time (returned in the results). For some reason I couldn't see this in the Azure Portal.

Anyway, pretty handy even if limited to a certain amount of deletes per execution. Executing the sp is also throttled so you can't just mash the keyboard. I think I would just delete and recreate the Collection unless I had a manageable number of documents to delete (thinking <500).

Props to Mimi Gentz @Microsoft for sharing the script in the link above.

HTH

8
votes

I wrote a script to list all the documents and delete all the documents, it can be modified to delete the selected documents as well.

var docdb = require("documentdb");
var async = require("async");

var config = {
  host: "https://xxxx.documents.azure.com:443/",
  auth: {
    masterKey: "xxxx"
  }
};

var client = new docdb.DocumentClient(config.host, config.auth);

var messagesLink = docdb.UriFactory.createDocumentCollectionUri("xxxx", "xxxx");

var listAll = function(callback) {
  var spec = {
    query: "SELECT * FROM c",
    parameters: []
  };

  client.queryDocuments(messagesLink, spec).toArray((err, results) => {
    callback(err, results);
  });
};

var deleteAll = function() {
  listAll((err, results) => {
    if (err) {
      console.log(err);
    } else {
      async.forEach(results, (message, next) => {
        client.deleteDocument(message._self, err => {
          if (err) {
            console.log(err);
            next(err);
          } else {
            next();
          }
        });
      });
    }
  });
};

var task = process.argv[2];
switch (task) {
  case "listAll":
    listAll((err, results) => {
      if (err) {
        console.error(err);
      } else {
        console.log(results);
      }
    });
    break;
  case "deleteAll":
    deleteAll();
    break;

  default:
    console.log("Commands:");
    console.log("listAll deleteAll");
    break;
}
1
votes

And if you want to do it in C#/Dotnet Core, this project may help: https://github.com/lokijota/CosmosDbDeleteDocumentsByQuery. It's a simple Visual Studio project where you specify a SELECT query, and all the matches will be a) backed up to file; b) deleted, based on a set of flags.

-1
votes
##### Here is the python script which can be used to delete data from Partitioned Cosmos Collection #### This will delete documents Id by Id based on the result set data.

Identify the data that needs to be deleted before below step

res_list = "select id from id_del"
res_id = [{id:x["id"]} 
             for x in sqlContext.sql(res_list).rdd.collect()]
config = {
   "Endpoint" : "Use EndPoint"
  "Masterkey" : "UseKey", 
      "WritingBatchSize" : "5000",
    'DOCUMENTDB_DATABASE': 'Database',
    'DOCUMENTDB_COLLECTION': 'collection-core'
}; 

for row in res_id:
# Initialize the Python DocumentDB client
  client = document_client.DocumentClient(config['Endpoint'], {'masterKey': config['Masterkey']})

# use a SQL based query to get   documents

## Looping thru partition to delete

  query = { 'query': "SELECT c.id FROM c where c.id = "+ "'" +row[id]+"'"   }
  print(query)
  options = {}
  options['enableCrossPartitionQuery'] = True
  options['maxItemCount'] = 1000
  result_iterable = client.QueryDocuments('dbs/Database/colls/collection-core', query, options)
  results = list(result_iterable)
  print('DOCS TO BE DELETED : ' + str(len(results)))
  if len(results) > 0 :
      for i in range(0,len(results)):
      #  print(results[i]['id'])
          docID = results[i]['id']
          print("docID :" + docID)
          options = {}
          options['enableCrossPartitionQuery'] = True
          options['maxItemCount'] = 1000
          options['partitionKey'] = docID
          client.DeleteDocument('dbs/Database/colls/collection-core/docs/'+docID,options=options)
          print ('deleted Partition:' +  docID)