I would like to use Solr atomic updates together with some stored copyField destination fields.
This combination is not recommended, so I want to understand the risks.
The Solr documentation for Atomic Updates says (my emphasis):
The core functionality of atomically updating a document requires that all fields in your schema must be configured as stored (stored="true") or docValues (docValues="true") except for fields which are <copyField/> destinations, which must be configured as stored="false". Atomic updates are applied to the document represented by the existing stored field values. All data in copyField destinations fields must originate from ONLY copyField sources.
However, I have some copyField destinations that I would like to set stored=true so that highlighting works correctly for them (see this question, for example).
I need atomic updates so that an (unrelated) field can be modified by another process, without losing data indexed by my process.
The documentation warns that:
If destinations are configured as stored, then Solr will attempt to index both the current value of the field as well as an additional copy from any source fields. If such fields contain some information that comes from the indexing program and some information that comes from copyField, then the information which originally came from the indexing program will be lost when an atomic update is made.
But what does that mean? Can someone give an example that demonstrates this information-loss problem?
I am unsure what is meant by "some information that comes from the indexing program and some information that comes from copyField", in concrete terms.
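To make the question concrete, here is a minimal Python sketch of a literal reading of the first sentence of that warning. It is a toy model, not Solr code: plain dicts stand in for documents, and the field names (`title`, `text_all`, `popularity`) and function names (`index`, `atomic_update`) are hypothetical. It models only the "current value plus an additional copy" behaviour; it does not attempt to model the information-loss case, which is the part I don't understand.

```python
# Toy model only: plain Python dicts stand in for Solr documents.
# All field and function names here are hypothetical.

COPY_FIELDS = [("title", "text_all")]  # (source, destination) pairs


def index(doc):
    """Apply the copyField rules to an incoming document."""
    indexed = {name: list(values) for name, values in doc.items()}
    for source, dest in COPY_FIELDS:
        if source in indexed:
            # The destination receives a copy of every source value.
            indexed.setdefault(dest, []).extend(indexed[source])
    return indexed


def atomic_update(existing, field, value, dest_stored):
    """Rebuild the document from its stored values, set one field, re-index."""
    destinations = {dest for _, dest in COPY_FIELDS}
    rebuilt = {
        name: list(values)
        for name, values in existing.items()
        # A non-stored destination cannot be read back, so it is absent here.
        if dest_stored or name not in destinations
    }
    rebuilt[field] = [value]
    return index(rebuilt)


original = index({"id": ["1"], "title": ["Hello"], "popularity": ["1"]})

stored_case = atomic_update(original, "popularity", "2", dest_stored=True)
unstored_case = atomic_update(original, "popularity", "2", dest_stored=False)

print(stored_case["text_all"])    # old stored value plus a fresh copy
print(unstored_case["text_all"])  # only the fresh copy
```

In this model, updating the unrelated `popularity` field leaves a stored `text_all` with a duplicated value, while a non-stored `text_all` is simply regenerated from `title`. Whether real Solr behaves this way in my setup is exactly what I have not been able to reproduce.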
Is it safe to make one copyField destination stored, whilst atomically updating other fields, or vice versa? I have tried this out via the Solr Admin console and have not been able to demonstrate any issues, but I would like to be clear about which circumstances would trigger the problem.