How to treat Cloudant (CouchDB) as a document stack?

Question

We are using a CloudantDB as a document store, containing a list of data that we want to process.

At runtime, we basically want to get one document, process it, and if processed successfully remove it from the DB.

The only mechanisms I see are either to get the entire list of documents (which might not work for us, since the list is likely to be very large), or to get an individual document if we have the ID (which we won't have to start with). If I were dealing with a conventional SQL database, I might use a cursor that I only advance when I want to process a document.

I am familiar with views, but I am not sure that helps here either.

Am I missing some option?

See my answer below. If there are other constraints that prevent you from doing it this way please update your question and I will be more than happy to take a look. — markwatsonatx

markwatsonatx · Accepted Answer · 2016-04-15T15:38:08

There are several ways to retrieve documents from Cloudant. Views are the underlying technology that allows you to query, sort, and aggregate documents. In your example, it sounds like you just want the most (or least) recent document. You can do this with a view, or in Cloudant you can simply create an index.

Suppose you have a date field called create_date. In Cloudant you can create an index like so (go to Query then click edit next to "Your available indexes"):

{
  "index": {
    "fields": [
      "create_date"
    ]
  },
  "type": "json"
}

This will create a view and you will see it listed under "Design Documents". You can query that view in the dashboard as follows:

{
  "selector": {
    "create_date": {
      "$gt": 0
    }
  },
  "fields": [
    "_id",
    "_rev"
  ],
  "sort": [
    {
      "create_date": "desc"
    }
  ],
  "limit": 1
}

Note that I have limited the query to one document. This will return the most recent document added to Cloudant. To retrieve the earliest document added to Cloudant, change the sort to "create_date": "asc".

You can run this query outside of the dashboard by sending an HTTP POST request to /db/_find/ (where db is the name of your database). See this link for more information:

https://docs.cloudant.com/cloudant_query.html#finding-documents-using-an-index
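
As a rough illustration of the HTTP call described above, here is a minimal Python sketch that builds the POST request using only the standard library. The function name build_find_request, the base URL, and the database name are hypothetical, and authentication is left out; the query body is the same one shown for the dashboard.

```python
import json
import urllib.request


def build_find_request(base_url, db, direction="desc"):
    """Build (but do not send) a POST to /{db}/_find for a single document.

    base_url and db are placeholders for your own account URL and database.
    direction is "desc" for the newest document, "asc" for the oldest.
    """
    query = {
        "selector": {"create_date": {"$gt": 0}},
        "fields": ["_id", "_rev"],
        "sort": [{"create_date": direction}],
        "limit": 1,
    }
    return urllib.request.Request(
        url=f"{base_url}/{db}/_find",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send it, you could pass the returned object to urllib.request.urlopen and parse the response with json.load; the matching documents should be under the "docs" key, as in the dashboard output.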

UPDATE: Using text indexes and bookmarks

The above approach assumes you delete each document after processing it and re-run the query every time. With an ascending sort you always process the oldest remaining document first (queue-like behavior); with a descending sort you always pick up the newest document, including any inserted since the last query (stack-like behavior).
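
The fetch, process, delete cycle described above might look something like the following Python sketch. The names drain, find_one, process, and delete are hypothetical; the three callables are passed in so that the loop itself makes no assumptions about how you talk to Cloudant (find_one would run the limit-1 query, and delete would issue a DELETE for the document ID with its _rev).

```python
def drain(find_one, process, delete):
    """Repeatedly fetch one document, process it, and delete it on success.

    find_one() -> dict with "_id" and "_rev", or None when nothing is left
    process(doc) -> True if the document was handled successfully
    delete(doc_id, rev) -> removes the document from the database
    Returns the IDs of the documents that were processed and deleted.
    """
    handled = []
    while True:
        doc = find_one()
        if doc is None:
            break  # nothing left to process
        if not process(doc):
            # Stop rather than loop forever on a document that keeps failing,
            # since the same document would be returned by the next query.
            break
        delete(doc["_id"], doc["_rev"])
        handled.append(doc["_id"])
    return handled
```

Stopping on the first failure is only one possible design; you might instead mark the failed document somehow so the query skips it on the next pass.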

An alternative approach would be to use bookmarks (as suggested by the OP in the comments below). To do so, first create a text index in Cloudant:

{
  "index": {},
  "type": "text"
}

Run the same query as above. The results will now include a bookmark field similar to the following:

{
  "docs":[{
    "_id":"aa279ae2835f51d8ea13ee3e6ae3a210",
    "_rev":"1-e90f3814f49b3e89158f8d2337de89cb"}
  ],
  "bookmark": "g1AAAAD4eJzLYWBgYM5gTmHQSElKzi9KdUhJMtRLytVNSczRLS5JzEtJLEox1EvOyS9NScwr0ctLLckB6mBKUgCSSfb____PAvPdHK_uzd_TwMCQKJ1Fuml5LECSYQGQAhq4H2HiAWEHoIkKaCaaE23iAYiJ9xEmHhY7AHZjFgAnFk_X"
}

In subsequent queries you can pass the bookmark to traverse the documents in order:

{
  "selector": {
    "create_date": {
      "$gt": 0
    }
  },
  "fields": [
    "_id",
    "_rev"
  ],
  "sort": [
    {
      "create_date": "desc"
    }
  ],
  "limit": 1,
  "bookmark" : "g1AAAAD4eJzLYWBgYM5gTmHQSElKzi9KdUhJMtRLytVNSczRLS5JzEtJLEox1EvOyS9NScwr0ctLLckB6mBKUgCSSfb____PAvPdHK_uzd_TwMCQKJ1Fuml5LECSYQGQAhq4H2HiAWEHoIkKaCaaE23iAYiJ9xEmHhY7AHZjFgAnFk_X"
}

More information about bookmarks can be found here:

https://docs.cloudant.com/cloudant_query.html#working-with-indexes
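
The bookmark traversal described above could be sketched in Python as follows. The names iter_with_bookmark and run_query are hypothetical; run_query stands for whatever function posts a query body to _find and returns the parsed JSON response. The sketch assumes that an empty "docs" list means there is nothing more to read.

```python
def iter_with_bookmark(run_query, query):
    """Yield documents page by page, passing each bookmark to the next query.

    run_query(body) -> parsed _find response with "docs" and "bookmark" keys
    query -> the base query body (selector, fields, sort, limit)
    """
    bookmark = None
    while True:
        body = dict(query)  # copy so the caller's query is not modified
        if bookmark is not None:
            body["bookmark"] = bookmark
        response = run_query(body)
        docs = response.get("docs", [])
        if not docs:
            return  # assumed to mean the end of the result set
        yield from docs
        bookmark = response.get("bookmark")
        if bookmark is None:
            return  # no bookmark returned, so there is no way to continue
```

Because this is a generator, each document is only fetched when the caller asks for the next one, which is close to the cursor behavior the question describes.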
