1
votes

Im using couchbase for quite some time, but I never really experienced couchbase to be real fast. Its rather exceptionally slow. I wonder what setting am I missing.

I have a root Server with following specs

  • Intel® Xeon® E5-2680V4 (4 Cores)
  • 12 GB DDR4 ECC
  • 60 GB SSD

Im running Couchbase 4.5.1-2844 Community Edition (build-2844) with 7.05GB RAM Allocated

The bucket has 1 Data Node and uses 4.93 GB with 3,093,889 Documents. The bucket Type is "Couchbase" with Cache Metadata set to "Value Ejection". Replicas are disabled. Disk I/O Optimization is set to Low. Flushing is not enabled.

All 3 million documents look smiliar to this one:

{
  "discovered_by": 0,
  "color": "FFBA00",
  "updated_at": "2018-01-18T21:40:17.361Z",
  "replier": 0,
  "message": "Irgendwas los hier in Luckenwalde?????",
  "children": "",
  "view_count": 0,
  "post_own": "FFBA00",
  "user_handle": "oj",
  "vote_count": [
    {
      "timestamp": "2018-01-19 09:48:48",
      "votes": 0
    }
  ],
  "child_count": 3,
  "share_count": 0,
  "oj_replied": false,
  "location": {
    "loc_coordinates": {
      "lat": 0,
      "lng": 0
    },
    "loc_accuracy": 0,
    "country": "",
    "name": "Luckenwalde",
    "city": ""
  },
  "tags": [],
  "post_id": "59aef043f087270016dc5836",
  "got_thanks": false,
  "image_headers": "",
  "cities": [
    "Luckenwalde"
  ],
  "pin_count": 0,
  "distance": "friend",
  "image_approved": false,
  "created_at": "2017-09-05T18:43:15.904Z",
  "image_url": ""
}

And a query could look like this

select COUNT(*) from sauger where color = 'FFBA00'

without an Index it fails to execute (timeout) via the couchbase-webapp, but with an index

CREATE INDEX color ON sauger(color)

the result takes up to 16 seconds, but after some tries it takes 2 to 3 seconds each time.

There are 6 different "Color-Strings" (like "FFBA00") and the result of the query is 466920 (which is a 6th of the total documents)

An Explain of above said query gives me this:

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "IndexCountScan",
          "covers": [
            "cover ((`sauger`.`color`))",
            "cover ((meta(`sauger`).`id`))"
          ],
          "index": "color",
          "index_id": "cc3524c6d5a8ef94",
          "keyspace": "sauger",
          "namespace": "default",
          "spans": [
            {
              "Range": {
                "High": [
                  "\"FFBA00\""
                ],
                "Inclusion": 3,
                "Low": [
                  "\"FFBA00\""
                ]
              }
            }
          ],
          "using": "gsi"
        },
        {
          "#operator": "IndexCountProject",
          "result_terms": [
            {
              "expr": "count(*)"
            }
          ]
        }
      ]
    },
    "text": "select COUNT(*) from sauger where color = 'FFBA00'"
  }
]

Everything is set up correctly, but still such simple queries take awfully long (and there is nothing else writing or reading from the database, and the server its running on is totally idle)

1

1 Answers

3
votes

Make sure you don't have a primary index. That will consume a lot of the index service's memory. Your statement saying the query times out without the index makes me think there's a primary index, otherwise the query would fail immediately.

Edit: Adding more details on Primary Indexes from the Indexing Best Practices blog post

  1. Avoid Primary Keys in Production

Unexpected full primary scans are a possible and any possibility of such occurrences should be removed by avoiding primary indexes altogether in Production. N1QL Index Selection is a rule based system for now that checks for a possible index that will satisfy the query, and if there is no such, then it resorts to using the Primary Index. Primary index has all the keys of the documents, and hence query will fetch all keys from the primary index and then hop to Data Service to fetch the documents and then apply filters. As you can see, this is a very expensive operation and should be avoided at all costs.

If there are no Primary Indexes created, and the query is not able to find a matching index to serve the query, then the Query Service errors with the following message. This is helpful and should help you in creating the required Secondary index suitably:

“No index available on keyspace travel-sample that matches your query. Use CREATE INDEX or CREATE PRIMARY INDEX to create an index, or check that your expected index is online.”