2
votes

How can I handle transaction management in Solr using SolrJ? There is not much documentation on this online, so I would appreciate any links or information about transaction management with SolrJ.

2
Are you talking about DB-style transaction management? There's not really any point with SolrJ. There is no rollback, and commits are sent to Solr via a web service call, so they are queued there and there is no risk of concurrent access issues (from memory). Are you using embedded Solr? - nickdos
Yes, I am wondering how I could achieve DB-style transaction management using SolrJ when there are multiple reads and writes to both Solr and a database. We are using Hibernate for the DB transaction management. If there is an exception from Solr or from the database, both the database commits and the Solr commits need to be rolled back. From what I understand, this transaction management has to be written entirely in code, using the SolrJ API for commits and rollbacks. Please let me know if there is a better way to do this. Thanks. - Ravi
Go back through the questions you have asked on SO and see if any of them have answers that you consider "correct"; if so, mark them as correct with the green tick icon. Your accept rate now says 75%, which isn't too bad, so maybe you've already done this... - nickdos
This answer might be of interest: stackoverflow.com/a/6737063/249327. Seems I was wrong and SolrJ does have rollbacks. - nickdos

2 Answers

2
votes

The thing you have to keep in mind with Solr and transactions is that there is no isolation. Solr does not support transactions the way that most of us database developers are used to.

Commit makes all pending changes by all clients visible to new queries. Likewise, rollback rolls back all pending changes by all clients. There is zero consideration of which client sent the commit/rollback command.
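
To make the lack of isolation concrete, here is a toy in-memory model of that behaviour. It is not the SolrJ API: `SharedIndexModel` and its methods are invented for illustration, and it only assumes what is stated above, namely that commit and rollback act on every client's pending changes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT SolrJ): one index shared by every client, with no per-client isolation.
public class SharedIndexModel {
    private final List<String> pending = new ArrayList<>(); // uncommitted docs from ALL clients
    private final List<String> visible = new ArrayList<>(); // docs that queries can see

    public synchronized void add(String clientId, String docId) {
        pending.add(clientId + ":" + docId);
    }

    // Publishes everything pending, regardless of which client added it.
    public synchronized void commit() {
        visible.addAll(pending);
        pending.clear();
    }

    // Discards everything pending, regardless of which client added it.
    public synchronized void rollback() {
        pending.clear();
    }

    public synchronized List<String> query() {
        return new ArrayList<>(visible);
    }

    public static void main(String[] args) {
        SharedIndexModel index = new SharedIndexModel();

        index.add("A", "doc1");
        index.add("B", "doc2");
        index.commit();                     // issued by A, but B's doc2 becomes visible too
        System.out.println(index.query());  // [A:doc1, B:doc2]

        index.add("A", "doc3");
        index.add("B", "doc4");
        index.rollback();                   // issued by B after an error, but A's doc3 is lost too
        index.commit();
        System.out.println(index.query());  // [A:doc1, B:doc2]
    }
}
```

In the second half, client A never asked for a rollback, yet its `doc3` is gone. That is the wider impact described in the next paragraph.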

For this reason, error handling shouldn't automatically trigger a rollback: the impact may be much wider than just the data in error, and the cleanup may be much more difficult as a result.

The guidance from the Solr documentation is to use auto-commit. This is especially true for bulk operations. If you are indexing in bulk, perhaps with multiple parallel clients, it's better to let Solr auto-commit at intervals (every so often, or every so many documents) than to have each client issue its own commits. This creates fewer new index segments and leaves the index less fragmented overall.

The details will depend on the mix of query and indexing operations happening on your Solr instance (and your replication approach).

There is a good Lucidworks article on this: "Understanding Transaction Logs, Soft Commit and Commit In SolrCloud".

1
votes

You would have to deal with the transactions programmatically in SolrJ. When dealing with multiple writes:

  1. Use the SolrServer API's add method to add the SolrInputDocuments to the server.
  2. When all the SolrInputDocuments have been added, call the SolrServer API's commit method to commit all the changes.
  3. If Solr throws an exception and you want to roll back the writes to Solr, call the SolrServer API's rollback method.
  4. If you want to roll back the writes to the database as well, rethrow a runtime exception from the catch block.
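
The steps above can be sketched roughly as follows. So that the example runs without a Solr instance, `IndexClient` is a hypothetical stand-in for the add, commit and rollback calls on the SolrJ client, and `FakeClient` is an in-memory fake; neither is part of SolrJ. It also assumes the surrounding database transaction is configured to roll back on a runtime exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexWriteSketch {
    // Hypothetical stand-in for the three client calls named in the steps (not the SolrJ types).
    interface IndexClient {
        void add(String doc) throws Exception;
        void commit() throws Exception;
        void rollback() throws Exception;
    }

    static void indexAll(IndexClient client, List<String> docs) {
        try {
            for (String doc : docs) {
                client.add(doc);               // step 1: add every document
            }
            client.commit();                   // step 2: commit once all adds succeeded
        } catch (Exception e) {
            try {
                client.rollback();             // step 3: discard the pending writes
            } catch (Exception rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            // step 4: a runtime exception lets the caller's DB transaction roll back too
            throw new RuntimeException("Indexing failed; index writes rolled back", e);
        }
    }

    // In-memory fake used only to exercise the control flow.
    static class FakeClient implements IndexClient {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        public void add(String doc) throws Exception {
            if (doc.isEmpty()) throw new Exception("rejected empty document");
            pending.add(doc);
        }
        public void commit() { committed.addAll(pending); pending.clear(); }
        public void rollback() { pending.clear(); }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        indexAll(client, Arrays.asList("a", "b"));
        System.out.println(client.committed);  // [a, b]

        try {
            indexAll(client, Arrays.asList("c", ""));  // second add fails
        } catch (RuntimeException e) {
            System.out.println(client.pending.isEmpty() + " " + client.committed);  // true [a, b]
        }
    }
}
```

Two caveats apply against a real server. A rollback only discards changes that have not been committed yet, so it cannot undo an index commit if the database fails afterwards. And, as the other answer points out, a rollback discards pending changes from every client, not just this one.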

This is how I dealt with the transaction management. If anyone has better answers, please feel free to improve the answer.