
My Java EE application uses Lucene 4. The Lucene index holds the full names of 10 million people. The same search returns different results depending on the environment: the application works fine in the development environment on Windows and in the test environment on AIX, but on the production server Lucene returns far fewer records. The same query returns 800 results in development and 20 in production. We tried both AIX and Red Hat in production, with no luck.

I copied the Lucene index files from production to my development environment and ran the same query with the same application: in my environment everything works, and I get 800 results. I also started the app in debug mode, copied the Lucene query as text, and ran it in Luke in my environment: 800 results again. Production is under high load, so I tried putting load on the development environment as well, but Lucene stayed stable and always returned 800.

Where should I look for the source of the problem?

This is usually due to different analyzers, especially if you mix Java and .NET. Could you verify that you use identical analyzers in all environments, with the same settings and stopwords? - sisve
The same source code runs in all environments, but with different JRE versions. StandardAnalyzer is used when the index is updated. At query time, no analyzer is specified explicitly in the source code. I tried several different analyzers with the query in Luke, but always got 800 results. - bobzer
Where are you storing your index: in the file system or in memory? - Yogesh
The index is stored in the file system and opened with FSDirectory.open(file). Nothing is copied from the file system into a RAMDirectory. - bobzer

1 Answer


While installing the update, the system administrators specified a relative path to the Lucene index in the server configuration and then started the app server. On first start, our application performs a full indexing of the data in the database, and after that it runs an incremental indexing every two hours. Our servers are restarted every night by a cron job, and after that automatic restart the relative path pointed to a different directory. The next incremental indexing run therefore created new index files in a different folder and saved its changes there. When I asked the sysadmins for the index files, they gave me the large index that was created first, and that is the one I analyzed, but in fact the server was working with different index files.
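A small diagnostic like the following would have exposed the mismatch earlier. It is only a sketch: the class name `IndexPathDiagnostics` and the example path `lucene-index` are hypothetical, and it assumes the relative path is resolved against the JVM working directory (`user.dir`), which is how `java.io.File` treats relative paths. Logging this string at startup in every environment shows which directory the application really opens.

```java
import java.io.File;

// Hypothetical helper, not part of Lucene: reports where a configured
// index path actually points for the current JVM process.
public final class IndexPathDiagnostics {
    private IndexPathDiagnostics() {}

    public static String describe(String configuredPath) {
        File dir = new File(configuredPath);
        // A relative File is resolved against the "user.dir" system property,
        // so the resolved location depends on how the server was started.
        return "configured=" + configuredPath
                + ", absolute=" + dir.isAbsolute()
                + ", resolved=" + dir.getAbsolutePath()
                + ", workingDir=" + System.getProperty("user.dir");
    }

    public static void main(String[] args) {
        // "lucene-index" is an illustrative relative path.
        System.out.println(describe("lucene-index"));
    }
}
```

If the `resolved` value differs between a manual start and the nightly restart, the application is reading and writing two different indexes.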

So, the answer is: specify the full (absolute) path to the Lucene index folder, not a relative one.
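One way to enforce this in code is to reject relative paths before the index is opened. The following is a minimal sketch, not what our application actually contains: `IndexPathGuard` and `requireAbsolute` are hypothetical names, and the check uses only `java.io.File`.

```java
import java.io.File;

// Hypothetical guard, not part of Lucene: fails fast when the configured
// index location is relative and therefore depends on the working directory.
public final class IndexPathGuard {
    private IndexPathGuard() {}

    public static File requireAbsolute(String configuredPath) {
        File dir = new File(configuredPath);
        if (!dir.isAbsolute()) {
            throw new IllegalArgumentException(
                    "Lucene index path must be absolute, got '" + configuredPath
                            + "' (would resolve to " + dir.getAbsolutePath() + ")");
        }
        return dir;
    }
}
```

With Lucene 4.x, where `FSDirectory.open` accepts a `File`, the result can be passed straight through, for example `FSDirectory.open(IndexPathGuard.requireAbsolute(configuredPath))`, so a misconfigured server refuses to start instead of silently building a second index.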