I'm on Ubuntu 12.04, using Jetty (9_M4) and Solr (4.0.0) through django-haystack (2.0 beta) in a Django 1.4.2 site.
I've had to jump through a number of hoops to get this working, as there is very little documentation on running Solr 4.0 on Ubuntu with django-haystack. But how hard could it be?
My main confusion is between what Jetty is doing, and what Solr is doing.
So, I installed Jetty via this tutorial, making a small adjustment to the init file as I note in a comment on that tutorial. Jetty is now running and I can see it in the browser, even after a reboot.
Great.
I moved on to installing Solr via this tutorial, again with adjustments. Instead of:
cp -R apache-solr-4.0.0/example/solr /opt
I use:
cp -R apache-solr-4.0.0/example/* /opt/solr/
and therefore add the following to /etc/default/jetty:
JAVA_OPTIONS="-Dsolr.solr.home=/opt/solr/solr $JAVA_OPTIONS"
I can't exactly remember why I did that, but there was a reason at the time. I stop using that tutorial at that point, as I don't understand the solr concept of core very well, and I'm already flustered at how annoyingly difficult this is.
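To sanity-check what solr.solr.home points at, a small sketch like the following (Python 3) lists which expected files are missing. The set of files it looks for is an assumption based on the layout of the Solr 4.0 example directory, and missing_solr_files is a hypothetical helper, not part of Solr or Haystack.

```python
import os

# Files a Solr 4.0 home is assumed to contain, relative to the home directory.
# This list mirrors the example directory layout and is an assumption.
REQUIRED_FILES = (
    "solr.xml",
    os.path.join("collection1", "conf", "solrconfig.xml"),
    os.path.join("collection1", "conf", "schema.xml"),
)


def missing_solr_files(solr_home, required=REQUIRED_FILES):
    """Return the required files (relative paths) not found under solr_home."""
    return [
        rel for rel in required
        if not os.path.isfile(os.path.join(solr_home, rel))
    ]
```

For the setup described here, missing_solr_files("/opt/solr/solr") returning an empty list would mean the directory passed to -Dsolr.solr.home has the expected shape.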
(For context, when I set up django-haystack 2.0 with Solr 3.5 about 6 months ago, it was terrifyingly easy and didn't require a separate Jetty installation; all up, it took me about two hours.)
Anyway, I go back to my Django installation, create the schema.xml, make the stopwords-en.txt changes, and copy the files across to /opt/solr/solr/collection1/conf.
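For reference, this is a minimal sketch of the Haystack 2.0 setting that decides which Solr instance Django queries. The URL is an assumption: 8983 is the port the Solr example's start.jar listens on by default, while a separately installed Jetty may listen on a different port, so it should point at whichever instance is meant to answer.

```python
# settings.py (fragment)
# Assumed values: the host and port below match the Solr example's defaults
# and may differ from the Jetty instance installed separately.
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        "URL": "http://127.0.0.1:8983/solr",
    },
}
```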
I edit /opt/solr/solr/collection1/conf/solrconfig.xml to remove the reference to updateLog, since every attempt I made to add the _version_ field to schema.xml failed dismally with some sort of character error. See here (lucene solr-user mailing list) and here (django-haystack GitHub) for more info on this.
Finally, I cd into /opt/solr and run it:
sudo java -jar start.jar
Ba-da-boom! I get some results (when I go to my django site and use the search I've set up). Fantastic. This is really great. Now I just need to make the starting of solr persistent.
I create an /etc/init/solr that looks like this:
description "Solr Search Server"
# Make sure the file system and network devices have started before
# we begin the daemon
start on (filesystem and net-device-up IFACE!=lo)
# Stop the event daemon on system shutdown
stop on shutdown
# Respawn the process on unexpected termination
respawn
# The meat and potatoes
exec /usr/bin/java -jar /opt/solr/start.jar >> /var/log/solr.log 2>&1
I restart the server and nothing - I can see solr running, but I'm not getting any results in my django search.
I remove the init file and try running from the cli again - yep, sweet.
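One way to compare the two situations is to ask Solr directly how many documents it holds, bypassing Django. The sketch below is Python 3 and assumes the example's default port 8983 and the core name collection1 from the paths above; the function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, core="collection1"):
    """Build a match-all query URL that asks only for the count (rows=0)."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return "{}/{}/select?{}".format(base_url.rstrip("/"), core, params)


def num_found(response_text):
    """Extract the total number of matching documents from a JSON response."""
    data = json.loads(response_text)
    return data["response"]["numFound"]


def count_documents(base_url, core="collection1", timeout=5):
    """Query a running Solr instance and return how many documents it holds."""
    with urlopen(build_select_url(base_url, core), timeout=timeout) as resp:
        return num_found(resp.read().decode("utf-8"))
```

Calling count_documents("http://127.0.0.1:8983/solr") once with Solr started from the command line and once with it started by the init job would show whether both are serving the same index.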
So, my questions are:
What the hell have I done wrong?
How do I get Solr to start at boot and respawn if it dies accidentally AND produce results through my Django/haystack interface?
Why do I need Jetty and Solr running simultaneously, and what is the relationship of /opt/jetty/webapps/solr.war to my /opt/solr? Am I causing conflicts?
Why was this so easy with Solr 3.5 and so difficult now? I ask this honestly - I don't want a list of excuses or explanations from Solr developers - I want to know how, with such a limited understanding, I could get Solr 3.5 running in two hours, and why I now need a comprehensively deeper understanding of Jetty/Solr architecture and CLI/shell script hacking to get it to run.