I need to configure thinking sphinx with Spanish stemming and I can't get it to work.
I learned [1] that I needed to compile the sphinx source code with the libstemmer_c library and install it. Additionally, I had to change the configuration of thinking sphinx by adding the libstemmer_es stemmer to morphology.
In detail, this is what I did:
Remove existing sphinx installation with apt-get
apt-get remove sphinxsearch
Download and unpack the source code of sphinx and the libstemmer_c library, then copy the contents of the latter into the libstemmer_c directory of the sphinx source tree
wget http://sphinxsearch.com/files/sphinx-2.2.11-release.tar.gz
tar xvf sphinx-2.2.11-release.tar.gz
wget http://snowball.tartarus.org/dist/libstemmer_c.tgz
tar xvf libstemmer_c.tgz
cp -rf libstemmer_c/* sphinx-2.2.11-release/libstemmer_c/
Configure, compile and install sphinx with the libstemmer_c library
cd sphinx-2.2.11-release
./configure --with-mysql-includes=/usr/include/mysql --with-mysql-libs=/usr/lib/x86_64-linux-gnu --with-libstemmer
make
make install
Add libstemmer_es stemmer to morphology in thinking_sphinx.yml
development:
  mysql41: 3563
  address: <%= ENV['SPHINX_HOST'] || '' %>
  enable_star: true
  charset_type: utf-8
  min_infix_len: 2
  morphology: libstemmer_es
  ...
Reconfigure sphinx and regenerate indices
bundle exec rake ts:configure
bundle exec rake ts:generate
Restart docker containers and rails server
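One check that might help after these steps is to confirm that the morphology setting actually reached the Sphinx configuration file written by ts:configure. The following is a minimal sketch, not part of my setup: the helper name morphology_by_index is hypothetical, and it assumes the usual Sphinx layout of "index NAME { key = value }" blocks.

```ruby
# Extract the `morphology` setting of every index block in a Sphinx config.
# Returns a hash such as { "listing_core" => "libstemmer_es" }; an index
# without a morphology line maps to nil.
# NOTE: hypothetical helper for illustration, not a Thinking Sphinx API.
def morphology_by_index(conf_text)
  result = {}
  current = nil
  conf_text.each_line do |line|
    stripped = line.strip
    if (m = stripped.match(/\Aindex\s+(\w+)/))
      # Start of an index block ("indexer" does not match: no space after "index").
      current = m[1]
      result[current] = nil
    elsif stripped.start_with?('}')
      # End of the current block.
      current = nil
    elsif current && (m = stripped.match(/\Amorphology\s*=\s*(.+)\z/))
      result[current] = m[1].strip
    end
  end
  result
end
```

Calling morphology_by_index(File.read('config/development.sphinx.conf')) would then list the setting per index; the path is an assumption about where the generated file lives in this project. If an index shows nil, the setting in thinking_sphinx.yml did not make it into the generated configuration.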
I'm working on a website with various products that are indexed with sphinx. With stemming enabled searching for "cameras" should yield all products with "cameras" or "camera". Currently, searching "cameras" only returns products with "cameras" in the string, but no products with "camera" only.
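To see whether the stemmer is applied at all, independent of the Rails code, the SphinxQL statement CALL KEYWORDS can be sent to searchd over the mysql41 port (3563 here) with any MySQL client. It reports the tokenized and the normalized form of each keyword. Below is a minimal sketch that only builds the statement; the helper name call_keywords_statement is hypothetical, and the index name listing_core is taken from the searchd log further down.

```ruby
# Build a SphinxQL CALL KEYWORDS statement for a query text and an index.
# NOTE: hypothetical helper for illustration; it only builds the string and
# does not talk to searchd.
def call_keywords_statement(text, index_name)
  unless index_name.match?(/\A\w+\z/)
    raise ArgumentError, "unexpected index name: #{index_name.inspect}"
  end
  # Escape backslashes and single quotes inside the string literal.
  escaped = text.gsub(/[\\']/) { |ch| "\\#{ch}" }
  "CALL KEYWORDS('#{escaped}', '#{index_name}')"
end

# Print the statements for the two forms from the example above.
%w[cameras camera].each do |word|
  puts call_keywords_statement(word, 'listing_core')
end
```

If stemming is active, I would expect the normalized column for "cameras" and "camera" to be identical; if both come back unchanged, the stemmer is presumably not being applied to the index.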
I'm using Rails 3.2, thinking-sphinx 3.2 and sphinx 2.2.11 on Ubuntu 14.04.4 LTS. It may be worth mentioning that I'm using docker containers: searchd runs in its own container, separate from the rails application.
UPDATE 1: I can't run rake ts:regenerate since I'm running searchd in a separate docker container, i.e. my sphinx container. Instead I stop the sphinx container, enter a worker container, run rake ts:clear_rt and rake ts:configure, then restart the sphinx container (which also restarts searchd), enter the sphinx container and finally run rake ts:generate.
UPDATE 2: Content of log/development.searchd.log is
[Thu Mar 16 12:24:59.147 2017] [ 127] listening on all interfaces, port=3563
[Thu Mar 16 12:24:59.161 2017] [ 127] binlog: replaying log .../development/binlog.001
[Thu Mar 16 12:24:59.161 2017] [ 127] binlog: replay stats: 0 rows in 0 commits; 0 updates, 0 reconfigure; 0 indexes
[Thu Mar 16 12:24:59.162 2017] [ 127] binlog: finished replaying /opt/sharetribe/tmp/binlog/development/binlog.001; 0.0 MB in 0.000 sec
[Thu Mar 16 12:24:59.162 2017] [ 127] binlog: finished replaying total 1 in 0.001 sec
[Thu Mar 16 12:24:59.163 2017] [ 127] DEBUG: SaveMeta: Done.
[Thu Mar 16 12:24:59.163 2017] [ 127] accepting connections
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
... /* many more ReadLock and Unlock */
[Thu Mar 16 12:28:50.467 2017] [ 128] listening on all interfaces, port=3563
[Thu Mar 16 12:28:50.478 2017] [ 128] DEBUG: SaveMeta: Done.
[Thu Mar 16 12:28:50.478 2017] [ 128] accepting connections
[Thu Mar 16 12:28:55.503 2017] [ 128] DEBUG: ReadLock 0x1522ef8
[Thu Mar 16 12:28:55.503 2017] [ 128] DEBUG: Unlock 0x1522ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
... /* many more ReadLock and Unlock */
[Thu Mar 16 12:29:09.806 2017] [ 128] caught SIGHUP (seamless=1, in queue=1)
[Thu Mar 16 12:29:09.806 2017] [ 128] DEBUG: CheckRotate invoked
[Thu Mar 16 12:29:09.806 2017] [ 128] DEBUG: /opt/sharetribe/db/sphinx/development/custom_field_value_core.new.sph is not readable. Skipping
[Thu Mar 16 12:29:09.806 2017] [ 128] DEBUG: /opt/sharetribe/db/sphinx/development/listing_core.new.sph is not readable. Skipping
[Thu Mar 16 12:29:09.806 2017] [ 128] WARNING: nothing to rotate after SIGHUP ( in queue=0 )
[Thu Mar 16 12:29:10.541 2017] [ 128] DEBUG: ReadLock 0x1522ef8
[Thu Mar 16 12:29:10.541 2017] [ 128] DEBUG: Unlock 0x1522ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: ReadLock 0xe42ef8
[Thu Mar 16 12:25:04.175 2017] [ 127] DEBUG: Unlock 0xe42ef8
... /* many more ReadLock and Unlock */
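Since most of this log is repeated ReadLock/Unlock output, the remaining lines are easier to read once that noise is filtered out. A minimal sketch, assuming the log format shown above; the names LOCK_NOISE and significant_log_lines are hypothetical.

```ruby
# Matches the repetitive "DEBUG: ReadLock 0x..." / "DEBUG: Unlock 0x..." lines.
LOCK_NOISE = /DEBUG: (ReadLock|Unlock) 0x\h+\s*\z/

# Return the log lines that are not ReadLock/Unlock noise, without newlines.
# NOTE: hypothetical helper for illustration only.
def significant_log_lines(log_text)
  log_text.each_line.map(&:chomp).reject { |line| line.match?(LOCK_NOISE) }
end
```

Applied to File.read('log/development.searchd.log'), this would leave the startup, binlog, SIGHUP and rotation messages, including the "is not readable. Skipping" lines.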
UPDATE 3: I'm defining a real time index on listings of products with attributes such as title, description, author name etc.
ThinkingSphinx::Index.define :listing, :with => :real_time do
  indexes title
  indexes description
  indexes custom_field_values_sphinx
  indexes origin_loc.google_address
  indexes author.given_name
  indexes author.username
  indexes location.province
  ...
This is the underlying model:
class Listing < ActiveRecord::Base
  after_save ThinkingSphinx::RealTime.callback_for(:listing)
  ...
The Listing.search method is called in a public method of the model:
Listing.search(
  escaped_query,
  :select => "*, #{SPHINX_WEIGHT_FUNCTION} as w",
  :sql => {:include => params[:include]},
  :star => true,
  :with => with,
  :with_all => with_all,
  :order => params[:sort],
  :per_page => per_page,
  :page => page
)
[1] http://freelancing-gods.com/thinking-sphinx/advanced_config.html#word-stemming--morphology