Case Introduction
My use case is to store some words in an Elasticsearch index, where every word gets its own ID. The query data is a message. When the message contains punctuation marks, Elasticsearch returns the wrong result.
Example:
For instance, I stored the keywords "banana", "apple", and "pen" in the index, using the bulk_index API.
Query data1: "is this banana?"
The correct result would be a hit on the keyword "banana", but the query hits nothing.
Query data2: ">> it is a book"
The correct result would be no hits, but the query hits all the keywords in the index.
Without the punctuation, both queries return the correct results.
Code:
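My code for storeToIndex (Python, with pyelasticsearch as the client):
from pyelasticsearch import ElasticSearch
es = ElasticSearch('http://localhost:9200/')
# doc is the list of keyword documents; each one carries its own "id" field
rval = es.bulk_index(index_name, 'json', doc, id_field="id")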
My code for queryIndex():
# query_data is the raw message text, passed to query_string unchanged
query = {"query": {"query_string": {"query": query_data}}}
es = ElasticSearch('http://localhost:9200/')
search_result = es.search(query=query, index=index_name, doc_type='json')
Question:
I could work around this by stripping the punctuation with regular expressions, but is there a solution on the Elasticsearch side, such as a filter, a setting, or an API option?
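For reference, the regular-expression workaround I have in mind would look roughly like the sketch below. It replaces the punctuation that query_string treats as query syntax (for example, "?" is a single-character wildcard) with spaces before the query is built. The character list and the helper names sanitize_query and build_query are my own assumptions for illustration, not part of pyelasticsearch.

```python
import re

# Assumed list of characters that the query_string syntax treats as operators.
# Each one is replaced by a space rather than escaped.
_RESERVED = re.compile(r'[+\-=&|><!(){}\[\]^"~*?:\\/]')


def sanitize_query(text):
    """Replace query-syntax punctuation with spaces and collapse whitespace."""
    cleaned = _RESERVED.sub(" ", text)
    return " ".join(cleaned.split())


def build_query(text):
    """Build the same query_string body as queryIndex(), from sanitized text."""
    return {"query": {"query_string": {"query": sanitize_query(text)}}}
```

With this, "is this banana?" is sent as "is this banana" and ">> it is a book" as "it is a book". It does not handle the uppercase words AND, OR, and NOT, which are also operators in this syntax, and a message made only of punctuation ends up as an empty query string.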
Environment configuration:
Ubuntu 12.04 desktop 64 bit
Elasticsearch server on Ubuntu, version 0.90.7, single node
Client: pyelasticsearch
Programming language: Python
API used: bulk_index API, search API