I'm using Solr with Tomcat as the servlet container. I've set up Solr with a single core and defined a DIH to import documents row by row from MySQL tables. Everything works well: the documents get indexed correctly and I can search among them.
The problem is that when I try to use the suggester module, I have trouble building whatever it needs to build for the first time, using a URL like this:
http://user:pass@localhost:port/solr/corename/suggest?q=whatever&spellcheck.build=true
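For context, a suggester of this style is typically wired up in solrconfig.xml roughly as below; the component name, field, and lookup implementation here are placeholders for illustration, not my exact config:

```xml
<!-- Sketch of a spellcheck-based suggester; names and field are illustrative -->
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <!-- field whose indexed terms feed the suggestion dictionary -->
    <str name="field">title</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```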
I left out one important piece of information: the data being imported is currently about 4.7 million records.
At first it couldn't even build the spellcheck dictionary (if that is what it's building) for 1 million documents, because the JVM would run out of heap memory with the following message:
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.RuntimeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:793)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:434)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
So I gradually increased the heap size; right now it's about 2 GB, which I assumed would be plenty.
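For reference, I'm setting the heap through Tomcat's JVM options; a sketch of the kind of thing I have in bin/setenv.sh (the exact values here are illustrative):

```shell
# bin/setenv.sh -- picked up by catalina.sh on startup (values are illustrative)
# -Xms / -Xmx set the initial and maximum heap size for the Tomcat JVM
export CATALINA_OPTS="$CATALINA_OPTS -Xms512m -Xmx2048m"
```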
Of course the obvious solution is to increase the Java heap yet again, but I'm wondering whether there's any way to divide and conquer the dictionary-building process, or any other solution for that matter.
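For example, I've been looking at whether pruning the dictionary might help: if I understand the docs, the spellcheck-based suggester accepts a threshold (the minimum fraction of documents a term must appear in to enter the dictionary), and there is also an FST-based lookup that is supposed to be much more memory-compact than the TST one. A sketch of the lines that would go inside the suggester's <lst> in solrconfig.xml (the 0.005 value is illustrative, and I haven't confirmed this helps at my scale):

```xml
<!-- keep only terms appearing in at least 0.5% of documents (illustrative value) -->
<float name="threshold">0.005</float>
<!-- FST-based lookup, reportedly far more compact in memory than TSTLookup -->
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
```

Would something along these lines reduce the memory needed during the build, or does the build itself still have to hold everything in the heap?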
Thanks a lot