Solr transaction management using SolrJ

Question

How do I handle transaction management in Solr using SolrJ? I can't find much documentation on this online, so I would appreciate any links or information about transaction management with SolrJ.

Are you talking about DB style transaction management? There's not really any point with SolrJ. There is no roll-back and commits are sent via a web service call to SOLR, so they are queued there and there is no risk of concurrent access issues (from memory). Are you using embedded SOLR? — nickdos
Yes, I am wondering how I could achieve DB-style transaction management using SolrJ when there are multiple reads and writes to both Solr and a database. We are using Hibernate for the DB transaction management. If there is an exception from Solr or from the database, both the database commits and the Solr commits need to be rolled back. From what I understand, this transaction management has to be written entirely in application code, using the SolrJ API for commits and rollbacks. Please let me know if there are better ways to do this. Thanks. — Ravi
Go back through the question you have asked on SO and see if any of them have answers that you consider "correct", if so mark them as correct with the green tick icon. Your rate now says 75% which isn't too bad, so maybe you've already done this... — nickdos
This answer might be of interest: stackoverflow.com/a/6737063/249327. Seems I was wrong and SolrJ does have rollbacks. — nickdos

Charlie Reitzel · Accepted Answer · 2019-07-15T20:11:39

The thing you have to keep in mind with Solr and transactions is that there is no isolation. Solr does not support transactions the way that most of us database developers are used to.

Commit makes all pending changes by all clients visible to new queries. Likewise, rollback rolls back all pending changes by all clients. There is zero consideration of which client sent the commit/rollback command.
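To make the "no isolation" point concrete, here is a minimal toy model of the behaviour described above. It is not SolrJ and not how Solr is implemented; `ToyIndex` and its methods are hypothetical names used only for illustration. The point it tries to show is that pending changes sit in one shared pool, so a commit or rollback issued by one client affects every client's uncommitted documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy in-memory model of a shared index. Hypothetical; NOT the SolrJ API. */
class ToyIndex {
    private final Set<String> committed = new LinkedHashSet<>();
    private final List<String> pending = new ArrayList<>();

    /** clientName is accepted but deliberately not tracked: pending changes are pooled. */
    void add(String clientName, String docId) {
        pending.add(docId);
    }

    /** Makes every pending document visible, whoever sent it. */
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    /** Discards every pending document, whoever sent it. */
    void rollback() {
        pending.clear();
    }

    /** Returns a copy of the documents that queries can currently see. */
    Set<String> visible() {
        return new LinkedHashSet<>(committed);
    }
}
```

In this model, if client A and client B each add a document and client A then calls `rollback()`, client B's document is discarded as well, which is the situation the answer warns about.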

For this reason, error handling shouldn't automatically trigger a rollback: the impact may reach far beyond the data in error, and the cleanup may be much harder as a result.
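One possible alternative to a global rollback is to have each client undo only what it sent itself. The sketch below is an assumption on my part, not something from the Solr documentation: `DocIndex`, `CompensatingBatch` and `FakeIndex` are hypothetical names, and `DocIndex` merely stands in for a real client (with SolrJ, the add and delete-by-ID operations of `SolrClient` would play this role). Note that deleting by ID only undoes newly added documents; it cannot restore the previous version of a document that was overwritten.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical minimal view of an index client; stands in for a real one. */
interface DocIndex {
    void add(String docId) throws Exception;
    void deleteById(String docId) throws Exception;
}

/** Sends a batch and, on failure, undoes only the documents this caller added. */
class CompensatingBatch {
    static boolean indexAll(DocIndex index, List<String> docIds) {
        List<String> sent = new ArrayList<>();
        try {
            for (String id : docIds) {
                index.add(id);
                sent.add(id); // remember only what this call actually added
            }
            return true;
        } catch (Exception e) {
            for (String id : sent) {
                try {
                    index.deleteById(id); // targeted undo instead of a global rollback
                } catch (Exception ignored) {
                    // a real system would record this ID for a later cleanup pass
                }
            }
            return false;
        }
    }
}

/** In-memory DocIndex used only to exercise the sketch; rejects one configured ID. */
class FakeIndex implements DocIndex {
    final Set<String> docs = new HashSet<>();
    private final String failOn;

    FakeIndex(String failOn) {
        this.failOn = failOn;
    }

    public void add(String docId) throws Exception {
        if (docId.equals(failOn)) {
            throw new Exception("simulated failure for " + docId);
        }
        docs.add(docId);
    }

    public void deleteById(String docId) {
        docs.remove(docId);
    }
}
```

With a `FakeIndex` that already holds another client's document, a failed `indexAll` call removes the caller's own partial batch and leaves the other document in place.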

The guidance from the Solr documentation is to use auto-commit rather than having each client issue its own commits. This is especially true for bulk operations. If you are indexing in bulk, perhaps with multiple parallel clients, it's better to auto-commit at a fixed interval (or after a fixed number of documents). Compared with committing after every update, this creates fewer new index segments and leaves the index less fragmented overall.
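The "every so often, or every so many documents" rule can be pictured as a small policy object. In Solr itself this is configured through the `autoCommit` settings (`maxDocs`, `maxTime`) in solrconfig.xml rather than written in client code, so the class below is only a toy model of the idea; `AutoCommitPolicy` and its method names are hypothetical.

```java
/** Toy model of a "commit every N docs or every T ms" policy. Not Solr code. */
class AutoCommitPolicy {
    private final int maxDocs;
    private final long maxTimeMs;
    private int pendingDocs = 0;
    private long oldestPendingAtMs = -1; // -1 means nothing is pending

    AutoCommitPolicy(int maxDocs, long maxTimeMs) {
        this.maxDocs = maxDocs;
        this.maxTimeMs = maxTimeMs;
    }

    /** Records one added document at time nowMs; returns true if a commit is due. */
    boolean onAdd(long nowMs) {
        if (pendingDocs == 0) {
            oldestPendingAtMs = nowMs; // start the timer at the first pending document
        }
        pendingDocs++;
        return shouldCommit(nowMs);
    }

    /** True once either threshold is reached while something is pending. */
    boolean shouldCommit(long nowMs) {
        if (pendingDocs == 0) {
            return false;
        }
        return pendingDocs >= maxDocs || nowMs - oldestPendingAtMs >= maxTimeMs;
    }

    /** Resets the counters after a commit has been performed. */
    void onCommit() {
        pendingDocs = 0;
        oldestPendingAtMs = -1;
    }
}
```

Because the thresholds apply to the shared pool of pending documents, many parallel indexing clients end up sharing a small number of commits instead of each forcing its own.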

The details will depend on the mix of query and indexing operations happening on your Solr instance (and your replication approach).

Lucidworks has a good article on this: "Understanding Transaction Logs, Soft Commit and Commit in SolrCloud".
