After researching JCR versus an RDBMS and reading other posts, I am still uncertain whether to use JCR rather than JPA for a document management system that has to deal with different document types, very large files, and a lot of concurrent access from many users.
My main reason for considering JCR is that documents look like content to me, and the specification already deals with some of the problems that come with it. I am mostly interested in storage and versioning. I would also like to encapsulate the document handling within a JCR implementation and use JPA for everything else that is application-specific.
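To make the encapsulation idea more concrete, here is a rough sketch of the boundary I have in mind. All names (`DocumentStore`, `InMemoryDocumentStore`) are hypothetical and not part of JCR or JPA. The in-memory class is only a stand-in for whatever JCR-backed implementation would sit behind the interface, and a real version would work with streams rather than `byte[]` because of the very large files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical boundary: the JPA side of the application only sees this interface.
interface DocumentStore {
    // Stores the content as a new version and returns its 1-based version number.
    int save(String documentId, byte[] content);

    // Returns the content of the given 1-based version.
    byte[] load(String documentId, int version);

    // Returns the most recent version number, or 0 if the document is unknown.
    int latestVersion(String documentId);
}

// Stand-in implementation (assumption); a JCR-backed class would replace it.
class InMemoryDocumentStore implements DocumentStore {
    private final Map<String, List<byte[]>> versions = new HashMap<>();

    @Override
    public synchronized int save(String documentId, byte[] content) {
        List<byte[]> history = versions.computeIfAbsent(documentId, id -> new ArrayList<>());
        history.add(content.clone()); // defensive copy, callers cannot alter history
        return history.size();
    }

    @Override
    public synchronized byte[] load(String documentId, int version) {
        List<byte[]> history = versions.get(documentId);
        if (history == null || version < 1 || version > history.size()) {
            throw new IllegalArgumentException("No version " + version + " of " + documentId);
        }
        return history.get(version - 1).clone();
    }

    @Override
    public synchronized int latestVersion(String documentId) {
        List<byte[]> history = versions.get(documentId);
        return history == null ? 0 : history.size();
    }
}
```

The JPA entities would then only hold a document ID and perhaps a version number, so the storage technology behind `DocumentStore` could be swapped without touching the rest of the model.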
Maybe someone can help me with my remaining questions:
- How does the read/query performance of JCR compare to JPA? (I know it depends heavily on the implementation, but there might be some rules of thumb.)
- Does anybody have real-world experience with a similar use case and specific JCR implementations? If so, did you mix it with a relational database (JPA)?
- Is it worth the overhead of introducing JCR, considering its benefits for file storage and versioning? (I will likely use my own custom access control (JPA), and I will not need the extra flexibility of introducing new node properties at runtime.)
- Does anybody have any experience with data integrity and backup solutions?
UPDATE: Even though this question has been answered in detail, somebody might have a more critical view of its use from a practical standpoint. Personally, I am getting more and more concerned about the following non-technical issues:
- Documentation: Jackrabbit has poor documentation. Its guide to OCM contains a dead link in the first paragraph, some example search queries throw exceptions for unknown reasons, there is a TODO in a very basic tutorial, and its standalone server does not work properly under JDK 8, which is not documented at all.
- Maturity: Jackrabbit Oak still seems to be a work in progress, and the other solutions look either abandoned or bleeding edge.
- Community: In contrast to JPA, searching for JCR yields far fewer hits. This could be a real problem when a project team new to the technology gets stuck on (trivial) problems.