I want to update the Solr suggester dictionary after uploading some initial data (in Solr 5.3.1). The dictionary is updated when I delete all the data and upload the modified data again. But when I delete a particular record (e.g. id=123), the suggester still returns the deleted record in its result set.

For example,

  1. I initially upload the following JSON data to mycollection:

json_data.json

[
    {
        "id": 1,
        "name": "New York"
    },
    {
        "id": 2,
        "name": "New Jersey"
    },
    {
        "id": 3,
        "name": "California"
    }
]

with this command:

curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/mycollection/update?commit=true' --data-binary @json_data.json
  2. The select query returns the following:

localhost:8983/solr/mycollection/select?q=*%3A*&wt=json&indent=true

{
    "responseHeader": {
        "status": 0,
        "QTime": 0,
        "params": {
            "q": "*:*",
            "indent": "true",
            "wt": "json",
            "_": "1448271522315"
        }
    },
    "response": {
        "numFound": 3,
        "start": 0,
        "docs": [
            {
                "id": "1",
                "name": [
                    "New York"
                ],
                "_version_": 1518622755181822000
            },
            {
                "id": "2",
                "name": [
                    "New Jersey"
                ],
                "_version_": 1518622755186016300
            },
            {
                "id": "3",
                "name": [
                    "California"
                ],
                "_version_": 1518622755187064800
            }
        ]
    }
}
  3. The suggester returns the following for the word "New":

localhost:8983/solr/mycollection/suggest?q=New&wt=json&indent=true

{
    "responseHeader": {
        "status": 0,
        "QTime": 0
    },
    "spellcheck": {
        "suggestions": [
            "New",
            {
                "numFound": 2,
                "startOffset": 0,
                "endOffset": 3,
                "suggestion": [
                    "New Jersey",
                    "New York"
                ]
            }
        ],
        "collations": [
            "collation",
            "(New Jersey)"
        ]
    }
}
  4. Now I delete "New Jersey" from mycollection with the following command:

curl 'http://localhost:8983/solr/mycollection/update?commit=true' -H "Content-Type: text/xml" --data-binary '<delete><query>(id:2)</query></delete>'

  5. Now the select query returns only the remaining 2 records ("New Jersey" is gone):

localhost:8983/solr/mycollection/select?q=*%3A*&wt=json&indent=true

{
    "responseHeader": {
        "status": 0,
        "QTime": 0,
        "params": {
            "q": "*:*",
            "indent": "true",
            "wt": "json",
            "_": "1448272061874"
        }
    },
    "response": {
        "numFound": 2,
        "start": 0,
        "docs": [
            {
                "id": "1",
                "name": [
                    "New York"
                ],
                "_version_": 1518622755181822000
            },
            {
                "id": "3",
                "name": [
                    "California"
                ],
                "_version_": 1518622755187064800
            }
        ]
    }
}
  6. But the suggest query still returns "New Jersey" in the result set:

localhost:8983/solr/mycollection/suggest?q=New&wt=json&indent=true

{
    "responseHeader": {
        "status": 0,
        "QTime": 0
    },
    "spellcheck": {
        "suggestions": [
            "New",
            {
                "numFound": 2,
                "startOffset": 0,
                "endOffset": 3,
                "suggestion": [
                    "New Jersey",
                    "New York"
                ]
            }
        ],
        "collations": [
            "collation",
            "(New Jersey)"
        ]
    }
}

I tried reloading the core and restarting the Solr server, but the result remains unchanged.

What could be the issue? Does the suggester component (solr.SpellCheckComponent) use a cache? If so, how can it be cleared?

Any help would be appreciated.

1 Answer

Most spellcheckers and suggesters do not work directly off the index. They build a parallel structure (a dictionary) from it, so a delete in the index does not show up in suggestions until that structure is rebuilt. A configuration setting (buildOnCommit for solr.SpellCheckComponent) rebuilds it on every commit, and if that setting is false, there is a flag you can pass in the request to trigger the rebuild. For the spellchecker, this flag is spellcheck.build or spellcheck.reload.

So, check what your configuration has and try calling rebuild/reload explicitly.
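
As a rough sketch of the explicit rebuild, the script below appends spellcheck.build=true to the suggest request from the question. It assumes the /suggest handler already enables the spellcheck component in its defaults (which the responses above suggest) and reuses the host and collection name from the question. The rebuild_url helper and the RUN_CURL switch are made up for this illustration, and the query term is not URL-encoded here.

```shell
#!/bin/sh
# Assumed values, copied from the question; adjust for your setup.
SOLR_BASE="http://localhost:8983/solr"
COLLECTION="mycollection"

# Hypothetical helper: print the /suggest URL with spellcheck.build=true,
# which asks the SpellCheckComponent to rebuild its dictionary before
# answering. The term in $1 is inserted as-is (no URL-encoding).
rebuild_url() {
    printf '%s/%s/suggest?q=%s&spellcheck.build=true&wt=json&indent=true\n' \
        "$SOLR_BASE" "$COLLECTION" "$1"
}

# Only contact Solr when explicitly asked, so the script can be dry-run.
if [ "${RUN_CURL:-0}" = "1" ]; then
    curl "$(rebuild_url New)"
else
    rebuild_url New
fi
```

Running it with RUN_CURL=1 sends the request once; after that, the plain suggest query for "New" should no longer list the deleted record if a stale dictionary was the cause. Rebuilding on every request can be expensive on a large index, so the flag is usually sent only after changes.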