I am new to Magnolia CMS and the Apache Jackrabbit content repository concepts.

There is a web application that uses Magnolia CMS. Magnolia uses a SQL Server 2012 database as its persistence manager, through the Apache Jackrabbit content repository implementation. The application uses two separate Magnolia configurations, referred to as the public and author instances.

Now here we are trying to replace the existing Magnolia CMS with a custom ASP.NET MVC 5 application with all the functionalities.

I analysed the tables in the SQL Server database and found that the data is stored as Node_ID and Bundle_Data columns, which are very difficult to interpret.

A new database model (SQL Server 2012) has been developed for the author instance of the custom CMS.

As part of the migration, I am therefore trying to move the old data, which is stored in SQL Server through the Apache Jackrabbit content repository implementation, into a regular SQL Server 2012 database that follows the new database model.

Are there any proven methods or tools available to accomplish this task?

1 Answer

The question is more about Jackrabbit than about Magnolia, especially since you want to replace Magnolia entirely and not just the persistence layer:

Now here we are trying to replace the existing Magnolia CMS with a custom ASP.NET MVC 5 application with all the functionalities.

My question, though, is whether you really want to replace Jackrabbit entirely, or whether you could keep using Jackrabbit with your ASP.NET application, backed by an MS SQL Server data store (which would be my personal suggestion). Otherwise you lose all the benefits that Jackrabbit provides.

Jackrabbit does support SQL Server, and I would suggest using it.

https://wiki.apache.org/jackrabbit/DataStore#Configuration-1:

Currently supported are: db2, derby, h2, mssql, mysql, oracle, sqlserver.

Developing a WebCMS with just ASP.NET and SQL Server and without a content repository layer in between sounds like developing everything that a WebCMS usually comes with from scratch, especially if you want to have all the functionality that Magnolia offers (versioning, history, search, etc.).

You can find details on the Jackrabbit data store here: http://wiki.apache.org/jackrabbit/DataStore. I do wonder, though, why you or your customer would want to change the data store of the content repository to SQL Server. I guess you are not talking about using SQL Server for the persistence of the metadata, but about storing the binary content in it (a mistake that, by the way, OpenCms, another Java-based open source WebCMS, made in its architecture design, imho).

Note that with Magnolia, large files are usually not stored in the database itself but on the file system.

https://wiki.magnolia-cms.com/display/WIKI/Setting+up+a+Jackrabbit+persistence+manager#SettingupaJackrabbitpersistencemanager-Datastorageandbackup:

BLOBs are not by default stored in the database when they exceed a certain threshold defined in your Jackrabbit configuration - instead they are saved on the file system. The default threshold used by a Magnolia installation is 1024 bytes. All files above the defined threshold are put onto the filesystem and not in the database.

If you really want to get rid of Jackrabbit entirely, use only SQL Server as the persistence layer, and store all binary content in it regardless of size (not recommended), I would write a custom export/import script. The script would query the Jackrabbit repo (standard CMIS protocol), take the binary content from the file system by reading it as a FileInputStream, and write it to the SQL Server database (example: http://www.java2s.com/Code/Java/Database-SQL-JDBC/StoreBLOBsdataintodatabase.htm). This would be my suggested method.
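To make the file-system half of such a script a bit more concrete, here is a minimal sketch in plain Java and JDBC. It only covers reading files and streaming them into a BLOB column. The table `MigratedBinary`, its columns `SourcePath` and `Content`, and the class and method names are all hypothetical placeholders for whatever your new data model defines. It also assumes a SQL Server JDBC driver is on the classpath when you run it. Mapping each binary back to the content node it belongs to is not shown; that information has to come from the repository itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class BlobMigrationSketch {

    // Hypothetical target table and columns; adapt to the new data model.
    static final String INSERT_SQL =
        "INSERT INTO MigratedBinary (SourcePath, Content) VALUES (?, ?)";

    /** Lists every regular file below the given directory, in sorted order. */
    static List<Path> listBinaries(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    /** Streams one file into the BLOB column without loading it fully into memory. */
    static void copyToDatabase(Connection conn, Path root, Path file)
            throws IOException, SQLException {
        try (InputStream in = Files.newInputStream(file);
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, root.relativize(file).toString());
            ps.setBinaryStream(2, in, Files.size(file));
            ps.executeUpdate();
        }
    }

    /** Copies all files below root and returns how many were written. */
    static int migrate(Connection conn, Path root) throws IOException, SQLException {
        int count = 0;
        for (Path file : listBinaries(root)) {
            copyToDatabase(conn, root, file);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: BlobMigrationSketch <jdbc-url> <directory>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0])) {
            int copied = migrate(conn, Paths.get(args[1]));
            System.out.println("Copied " + copied + " files");
        }
    }
}
```

For a real migration you would probably wrap the inserts in a transaction and add error handling, and the small binaries that sit below the threshold (and therefore live in the database bundles) would still need to be read through the repository rather than from disk.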

I don't think there are any out-of-the-box tools for that.