How to run mlcp import forest from Application server

Question

I have mlcp (v9.0.4) installed on an application server, connecting to the DB1 database on a database server (MarkLogic v9.0.4).

Consider: DB1's forests are in /data/db_data/Forests/forest1, and DB2 listens on port 1111.

I am trying to run the following on the application server to migrate DB1 forests to DB2 forests (both databases are running on the same host):

./mlcp.sh import -mode local -host DBSERVER -port 1111 -user uname -password xxx -input_file_path file:///DBSERVER/data/db_data/Forests/forest1 -input_file_type forest

mlcp seems to be looking for the path on the application server instead of on DBSERVER, and hence throws an error: Input file path does not exist. What is the right way to do this?

I can do an mlcp copy instead, but first wanted to confirm that:

this option does not work, and
mlcp copy is slower because of the filter conditions given.

The direct question would be: what is the best way to migrate data from one content database to another content database? mlcp copy does that, but it is too slow for me.

Hi Michael, thanks for the reply. There is an option in MLCP that lets me migrate data from an offline forest into another database, and I am just trying to make that option work. Reference: docs.marklogic.com/guide/mlcp/extract#id_49096. I am able to do this on my local system by creating two databases, DB1 and DB2, taking the DB1 forest offline, and using the mlcp import option to migrate the data to DB2. — JayantKYadav
The issue is that if I run the same command from a remote server, it does not work. By remote server I mean that my MLCP is installed on a different server than DB1 or DB2. The direct question would be: what is the best way to migrate data from one content database to another content database? mlcp copy does that, but it is too slow for me. — JayantKYadav

Michael Gardner · Accepted Answer · 2019-03-13T12:59:10

Answering the question from your comment: what is the best way to migrate data from one content database to another content database in the same host/cluster?

I'm assuming this is going to be a one-time or infrequent process. One method would be to create replica forests for DB1. Once the replicas are in sync, remove them as replicas of DB1 and assign them to DB2. This method should be much faster than MLCP.

As for why your MLCP command wasn't working, the primary reason looks like one of the limitations of MLCP's Direct Access. From the docs (Limitations of Direct Access):

When you use mlcp with Direct Access, your forest data must be reachable from the host(s) processing the input. In distributed mode, the forests must be reachable from the nodes in your Hadoop cluster. In local mode, the forests must be reachable from the host on which you execute mlcp.
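In practice this means the import has to run where the forest directory is actually visible, for example on the database host itself or on a host with that directory mounted. A file:/// URI is resolved on the machine running mlcp, so file:///DBSERVER/data/... is read as the local path /DBSERVER/data/... rather than as a path on DBSERVER. Here is a minimal sketch of that idea, not a tested recipe: check_forest_path is a hypothetical helper name, the host, port and credentials are the placeholders from the question, and -username is the option name used in the mlcp guide's examples.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a forest directory is visible
# from the host this script runs on (Direct Access needs that).
check_forest_path() {
    forest_dir="$1"
    if [ -d "$forest_dir" ]; then
        echo "reachable"
    else
        echo "not reachable"
        return 1
    fi
}

# Path taken from the question; override FOREST_DIR if the forest
# is mounted somewhere else on this host (assumption).
FOREST_DIR="${FOREST_DIR:-/data/db_data/Forests/forest1}"

if check_forest_path "$FOREST_DIR" >/dev/null; then
    # Plain local path: mlcp reads the forest files from this host's filesystem.
    ./mlcp.sh import -mode local -host DBSERVER -port 1111 \
        -username uname -password xxx \
        -input_file_type forest \
        -input_file_path "$FOREST_DIR"
else
    echo "Forest directory $FOREST_DIR is not visible from $(uname -n);" \
         "run mlcp on the database host or mount the directory here." >&2
fi
```

Run from the application server without a mount, this takes the else branch, which is the same condition behind the "Input file path does not exist" error in the question.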
