Since you've already got Riak up and running, you just need to get search going.
First, make sure search is enabled in the app.config file on every node in your cluster:
{riak_search, [
    {enabled, true}
]},
If you changed that setting, you will need to restart Riak for it to take effect.
Then from the command line, install the search hook on the bucket you want indexed:
# search-cmd install testbucket
:: Installing Riak Search <--> KV hook on bucket 'testbucket'.
The hook only indexes objects as they are written, so any data already in the bucket will not be indexed. You will need to re-put any pre-existing data that you want indexed.
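If it helps, re-putting existing data could be scripted roughly like this. This is only a sketch in Python (chosen for illustration, not something from your setup), assuming the node answers on localhost:8098 as in the curl examples and that the bucket is small enough to list its keys. The helper names (object_url, list_keys, reput, reindex) are made up for this example. It does not handle siblings, and it does not copy custom metadata, links, or secondary-index headers, so check that before pointing it at real data.

```python
import json
import urllib.request
from urllib.parse import quote

BASE = "http://localhost:8098"  # assumption: same node as the curl examples


def object_url(bucket, key, base=BASE):
    """URL of one object, with bucket and key percent-encoded."""
    return f"{base}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def list_keys(bucket, base=BASE):
    """List every key in the bucket.

    This walks the whole keyspace, so it is only reasonable for small
    buckets or one-off maintenance.
    """
    url = f"{base}/buckets/{quote(bucket, safe='')}/keys?keys=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["keys"]


def reput(bucket, key, base=BASE):
    """GET an object and PUT the same body back so the search hook runs."""
    url = object_url(bucket, key, base)
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
        headers = {
            "Content-Type": resp.headers.get(
                "Content-Type", "application/octet-stream"
            )
        }
        vclock = resp.headers.get("X-Riak-Vclock")
    if vclock:
        # Send the vector clock back so the write updates the existing
        # object instead of creating a sibling.
        headers["X-Riak-Vclock"] = vclock
    req = urllib.request.Request(url, data=body, headers=headers, method="PUT")
    urllib.request.urlopen(req).close()


def reindex(bucket, base=BASE):
    """Re-put every object in the bucket, one key at a time."""
    for key in list_keys(bucket, base):
        reput(bucket, key, base)


# Example (needs a running node): reindex("testbucket")
```

Nothing runs until you call reindex yourself, so you can try reput on a single key first.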
For a quick demonstration, I created 3 keys, creatively named 1, 2, and 3, each containing a simple JSON object:
curl localhost:8098/buckets/testbucket/keys/1 -H "content-type: application/json" -XPUT \
-d '{"firstName":"Tom", "color":"red"}'
curl localhost:8098/buckets/testbucket/keys/2 -H "content-type: application/json" -XPUT \
-d '{"firstName":"Dick", "color":"green"}'
curl localhost:8098/buckets/testbucket/keys/3 -H "content-type: application/json" -XPUT \
-d '{"firstName":"Harry", "color":"blue"}'
I can then query search to find the keys:
# curl http://localhost:8098/solr/testbucket/select\?q=firstName:Harry
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">firstName:Harry</str>
<str name="q.op">or</str>
<str name="filter"></str>
<str name="df">value</str>
<str name="wt">standard</str>
<str name="version">1.1</str>
<str name="rows">1</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="0.353553">
<doc>
<str name="id">3
</str>
<str name="color">blue
</str>
<str name="firstName">Harry
</str>
</doc>
</result>
</response>
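When you move from curl to code, the query string needs to be percent-encoded rather than shell-escaped. A minimal Python sketch of building the same select URL follows; solr_select_url is a hypothetical helper name, and the default base address is assumed from the curl examples.

```python
from urllib.parse import quote, urlencode


def solr_select_url(bucket, query, base="http://localhost:8098", **params):
    """Build a /solr/<bucket>/select URL with the query percent-encoded.

    Extra keyword arguments (for example rows=10) are appended as
    additional query parameters.
    """
    # quote_via=quote encodes spaces as %20 rather than '+'.
    qs = urlencode({"q": query, **params}, quote_via=quote)
    return f"{base}/solr/{quote(bucket, safe='')}/select?{qs}"


single = solr_select_url("testbucket", "firstName:Harry")
combined = solr_select_url("testbucket", "color:red or firstName:Harry")
```

The colon comes out as %3A and each space as %20, so the URL can be passed to any HTTP client without further escaping.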
# curl http://localhost:8098/solr/testbucket/select\?q=color:red%20or%20firstName:Harry
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">color:red or firstName:Harry</str>
<str name="q.op">or</str>
<str name="filter"></str>
<str name="df">value</str>
<str name="wt">standard</str>
<str name="version">1.1</str>
<str name="rows">2</str>
</lst>
</lst>
<result name="response" numFound="2" start="0" maxScore="0.143844">
<doc>
<str name="id">1
</str>
<str name="color">red
</str>
<str name="firstName">Tom
</str>
</doc>
<doc>
<str name="id">3
</str>
<str name="color">blue
</str>
<str name="firstName">Harry
</str>
</doc>
</result>
</response>
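To use these results from code, for example as inputs to a mapreduce job, you would pull numFound and the id fields out of the XML. Here is a rough Python sketch using only the standard library. parse_solr_response is a made-up name, and SAMPLE is a trimmed copy of the response above. Note that the field values come back with a trailing newline, so they are stripped.

```python
import xml.etree.ElementTree as ET


def parse_solr_response(xml_text):
    """Return (numFound, docs) from a search response.

    Each doc is a dict mapping field name to its whitespace-stripped value.
    """
    root = ET.fromstring(xml_text.strip())
    result = root.find("result")
    num_found = int(result.get("numFound"))
    docs = [
        {field.get("name"): (field.text or "").strip() for field in doc}
        for doc in result.findall("doc")
    ]
    return num_found, docs


# Trimmed copy of the two-document response shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
<result name="response" numFound="2" start="0" maxScore="0.143844">
<doc>
<str name="id">1
</str>
<str name="color">red
</str>
<str name="firstName">Tom
</str>
</doc>
<doc>
<str name="id">3
</str>
<str name="color">blue
</str>
<str name="firstName">Harry
</str>
</doc>
</result>
</response>
"""

num_found, docs = parse_solr_response(SAMPLE)
keys = [doc["id"] for doc in docs]  # the Riak keys that matched
```

Checking num_found before starting a mapreduce job makes it obvious when the search itself returned nothing.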
I don't have a Scala install handy to whip up an example, but this should get you going in the right direction.
In case you haven't already seen them, the search docs are here:
http://docs.basho.com/riak/latest/dev/using/search/
curl http://localhost:8098/solr/preit-users/select?q=firstName:Scala
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">firstName:Scala</str>
<str name="q.op">or</str>
<str name="filter"></str>
<str name="df">value</str>
<str name="wt">standard</str>
<str name="version">1.1</str>
<str name="rows">0</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0">
</result>
The result shows no matches (numFound="0"): the search query is not finding any results, so the mapreduce job had no input.
- Joe