1
votes

I have a graph database (Neo4j) in which I configured a property to be auto indexed with full-text. Everything is working great except that I have 1 row that is not returned when I execute a particular cypher query.

My property in the graph equals (I've put in bold the words I am using in my cypher query):

1pizzeriadeicomparipourlesamateursdevraiespizzasitaliennescestadireavecpastropdepateetcuitesaufeudeboislaplacenepayepasdeminesalleettablesassezpetitesetilfautsarmerdepatiencelessamedisoirssionnapasreserveenv15minutesdattentemaislespizzassontexcellentesrestaurantmontrealmontrealquebeccanada5148435411

If I execute the following cypher query:

START n1=NODE:node_auto_index('Search_Field:*res* AND Search_Field:*taurant* AND Search_Field:*411*')
RETURN n1.Search_Field

My row is returned! So far no problem!

But when I execute it by putting the word « restaurant » all together like this:

START n1=NODE:node_auto_index('Search_Field:*restaurant* AND Search_Field:*411*')    
RETURN n1.Search_Field

Then no rows are returned.

I tested a lot of stuffs in order to understand and try to find a pattern or something that can explain the problem. It seems like the length of my property value might play a role. I know it sounds strange but if I add 3 or more letters, let say « aaa », after the word restaurant in the property value, like this (look at the bold letters close to the end of the value):

1pizzeriadeicomparipourlesamateursdevraiespizzasitaliennescestadireavecpastropdepateetcuitesaufeudeboislaplacenepayepasdeminesalleettablesassezpetitesetilfautsarmerdepatiencelessamedisoirssionnapasreserveenv15minutesdattentemaislespizzassontexcellentesrestaurantaaamontrealmontrealquebeccanada5148435411

then, if I execute the same cypher query, the row is now returned.

Anyone had encountered similar problems! It's driving me crazy!

I have tested on both Neo4j-enterprise 2.2.1 and the latest Community 3.0.0-M02. Same result with both of them.

Any idea on where or what should I look for ?

1

1 Answers

0
votes

The query term get passed through the lucene analyzer - just like the contents you index. I'm not 100% sure but I think that the default analyzer "eats up" the digits, that's why you don't get the results.

You can supply an analyzer class when the index is created for the first time. Also you can use Java API to query the index - this allows to pass in instances of Lucene Query, see my example at http://blog.armbruster-it.de/2014/10/deep-dive-on-fulltext-indexing-with-neo4j/.