I have recently been working with Magnolia CMS, which uses JCR.
One of the problems I have faced is JCR data corruption, and I found that I had very little knowledge of how to troubleshoot it.
My understanding of JCR is as follows:
- JCR is a specification, and there are several implementations of it
- Jackrabbit is one JCR implementation
- Jackrabbit may store the information using the file system directly or using a database like MySQL
My questions are:
- How can a JCR repository be backed up and restored?
- Is there a tool that can check the integrity of a given JCR repository and try to repair it? I have been experimenting a little with Toromiro.
- Is there a resource or tutorial that I should read to gain a proper understanding of JCR?
Update:
I have some other questions:
- If a given JCR implementation stores the content in a database, can I expect ALL the content to be stored in that database, or could some content (e.g. images) be stored directly on the file system instead?
- We currently have a JCR repo that is accessed by three different web servers. It is my understanding that the JCR spec accounts for this situation and protects the repo against inconsistent content caused by concurrent write access. Is this correct?
- To be specific, the problem we experienced was that node A contained a reference to node B, but node B was not accessible. Using a Groovy script, we managed to delete node B (which seemed to be in an inconsistent state). However, how can we find all the references to node B? Node A may not be the only node that referenced it; node C might as well. What could have caused the JCR repo to become corrupt? We also tried the forceConsistencyCheck, autorepair and enableConsistencyCheck flags, but they did not fix the problem.
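To make the last point concrete, here is a toy model of the situation in Python. It is only a sketch and does not use the JCR API: the `repo` dict and the `find_referrers` and `find_dangling` helpers are hypothetical names I made up for illustration. It shows what I mean by "find every node that references B" and by "a reference whose target no longer exists".

```python
# Toy model only: a plain dict standing in for a repository, NOT the JCR API.
# Each node id maps to the list of node ids it references.
repo = {
    "A": ["B"],
    "C": ["B"],
    "D": [],
}  # "B" itself is missing, as in our corrupted repository


def find_referrers(repo, target):
    """Return the ids of all nodes that hold a reference to `target`."""
    return sorted(node for node, refs in repo.items() if target in refs)


def find_dangling(repo):
    """Return (referrer, missing_target) pairs whose target does not exist."""
    return sorted(
        (node, ref)
        for node, refs in repo.items()
        for ref in refs
        if ref not in repo
    )


print(find_referrers(repo, "B"))
print(find_dangling(repo))
```

What I am looking for is the equivalent of these two lookups against a real JCR repository, ideally one that still works when the target node itself is broken or already deleted.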
Thanks