
There is a big issue when running a Solr partial update and a full import in parallel.

Full-import syntax (working):
dataimport?command=full-import&commit=true&optimize=true

Update syntax (working):
curl 'solr/update?softCommit=true' -H 'Content-type:application/json' -d '[{"id":"1870719","column":{"set":11}}]'
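For reference, the same atomic "set" update can be sent from a script. A minimal sketch, assuming a Solr core at the default localhost URL (adjust `SOLR_UPDATE_URL` to your setup); the id and field name are taken from the curl example above:

```python
import json
import urllib.request

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # assumption: adjust host/core

def atomic_update_payload(doc_id, field, value):
    """Build the JSON body for a Solr atomic ('set') update."""
    return json.dumps([{"id": doc_id, field: {"set": value}}])

def send_update(doc_id, field, value, soft_commit=True):
    """POST the update to Solr (requires a running Solr instance)."""
    body = atomic_update_payload(doc_id, field, value).encode()
    url = SOLR_UPDATE_URL + ("?softCommit=true" if soft_commit else "")
    req = urllib.request.Request(
        url, data=body, headers={"Content-type": "application/json"})
    return urllib.request.urlopen(req)

# The payload matches the curl example above:
print(atomic_update_payload("1870719", "column", 11))
```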

Issue: if both are run in parallel, a commit takes place in between.

Example: I have 10k documents indexed in total. I fire a Solr update for 1,000 records, and in between I fire a full import (full indexer). What happens is that a commit takes place in the middle of the rebuild, i.e. until the full indexer finishes, queries return only a limited set of records (the 1,000).
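To see why partial results appear, here is a toy model of the race, assuming the full import first clears the index (as with `clean=true`, the DataImportHandler default). The class and method names are illustrative, not Solr APIs:

```python
class ToyIndex:
    """Illustrative stand-in for a Solr core: queries see only committed docs."""
    def __init__(self, docs):
        self.visible = list(docs)   # what queries return
        self.pending = list(docs)   # uncommitted working copy

    def full_import_start(self):
        # full-import (with clean=true, the default) first deletes everything
        self.pending = []

    def index_batch(self, docs):
        self.pending.extend(docs)

    def commit(self):
        self.visible = list(self.pending)

idx = ToyIndex(range(10_000))    # 10k docs indexed and committed
idx.full_import_start()          # rebuild begins: working copy is now empty
idx.index_batch(range(1_000))    # only 1k docs re-indexed so far
idx.commit()                     # a parallel update triggers a commit here
print(len(idx.visible))          # queries now see only 1,000 of 10,000 docs
```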

How can I solve this?

1 Answer


I faced a similar situation with Solr and solved it as follows:

A) Never run a full load on the live site. Only do a full load if the index is broken and needs to be deleted and rebuilt (with the main site down).

B) If a "refresh" is needed, do it as a background task, i.e. thread(s), re-indexing each record individually and deleting any newly added documents (if appropriate).

The speed of processing bulk volumes can be greatly improved by using the multi-threaded indexer - see http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
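The approach in B), combined with multi-threading, can be sketched as follows. This is only the batching logic; `post_batch` is a hypothetical callback standing in for an HTTP POST of each batch to your `/solr/update` endpoint:

```python
from concurrent.futures import ThreadPoolExecutor

def reindex_in_batches(ids, post_batch, workers=4, batch_size=100):
    """Re-index records batch by batch across several threads, without
    ever issuing a destructive full import. `post_batch` stands in for
    an HTTP POST of one batch to /solr/update (hypothetical callback)."""
    batches = [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() blocks until every batch has been posted
        list(pool.map(post_batch, batches))
    return len(batches)

# Example with a stub in place of the real Solr POST:
sent = []
n_batches = reindex_in_batches(list(range(250)), sent.append, batch_size=100)
print(n_batches)  # 3 batches: 100 + 100 + 50 records
```

Because each batch is committed (or soft-committed) as a normal incremental update, queries never see an empty or half-rebuilt index.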