1
votes

I am using MarkLogic Content Pump (mlcp) to ingest XML documents. I would like to transform these XML documents during mlcp ingestion using the -transform_module and -transform_namespace options. I have already created the XSLT for the transformation and loaded it into the MarkLogic "modules" database. But mlcp does not accept the XSLT file and throws an error:

COMMAND:

    mlcp.sh import \
    -username $username -password $passwd \
    -host $host -port $port \
    -input_file_path $inpath \
    -input_compressed true \
    -input_file_type aggregates \
    -aggregate_record_element $splittag \
    -aggregate_uri_id $uriid \
    -aggregate_record_namespace "http://www.fda.gov/cdrh/gudid" \
    -output_collections $collection \
    -output_permissions my-app-role,read,my-app-role,update \
    -output_uri_suffix .xml \
    -transform_module /marklogic.rest.transform/xml-transform-xsl/assets/transform.xsl \
    -transform_namespace "http://marklogic.com/rest-api/transform/xml-transform-xsl" \
    -transform_function transform

ERROR:

15/09/27 15:34:19 WARN mapreduce.ContentWriter: XDMP-MODNOTTEXT: Module /marklogic.rest.transform/fda-transform-xsl/assets/transform.xsl is not a text document

I would like to know whether mlcp accepts an XSLT transformation. If not, what is the alternative?

MarkLogic creates an equivalent ".xqy" file in the modules database. Calling the ".xqy" file below throws a parameter mismatch error. I think this is due to the wrong return type:

    xquery version "1.0-ml";
    module namespace simple-xsl = "http://marklogic.com/rest-api/transform/simple-xsl";
    import module namespace extut = "http://marklogic.com/rest-api/lib/extensions-util"
        at "/MarkLogic/rest-api/lib/extensions-util.xqy";
    declare namespace xsl = "http://www.w3.org/1999/XSL/Transform";
    declare default function namespace "http://www.w3.org/2005/xpath-functions";
    declare option xdmp:mapping "false";
    declare private variable $transform-uri := "/marklogic.rest.transform/fda-transform-xsl/assets/transform.xsl";
    declare function fda-transform-xsl:transform(
        $context as map:map,
        $params  as map:map,
        $content as document-node()
    ) as document-node()?
    {
        extut:execute-transform($transform-uri,$context,$params,$content)
    };
See the response from mflatscher. The "transform_module" option is asking for an XQuery module, not XSLT. However, within that module, you are free to use a stylesheet. - David Ennis
I just checked: whenever I install an "xsl" transform, MarkLogic automatically creates an equivalent "xqy" file. I tried using that ".xqy" file for the transformation, but with no success. It throws a parameter mismatch error. The .xqy accepts "params", and I am not supplying any params when transforming content. - vish.Net
I have never heard of any file being automatically created for this purpose. The link in the other comment shows the actual manual page for custom transformations. Could you please post the content of this XQuery file you say was created? - David Ennis
I have now updated the question. It looks like MarkLogic always executes XSLT in the context of XQuery. - vish.Net

1 Answer

5
votes

I don't think you can point Content Pump's -transform_module directly at an XSLT stylesheet. I think it expects an XQuery module (cf. https://docs.marklogic.com/guide/ingestion/content-pump#id_82518).

You should be able to set up such a custom transform XQuery module and call your XSLT transform from within it via xdmp:xslt-invoke() on the document held in the $content map that Content Pump passes in (cf. http://docs.marklogic.com/xdmp:xslt-invoke). You would then set -transform_module to point to that custom transform XQuery module rather than directly at the XSL transform.
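As a rough sketch of such a wrapper: an mlcp transform function takes two maps, $content and $context, and returns the (possibly modified) $content map, which is why the three-argument REST wrapper shown in the question fails with a parameter mismatch. The module namespace and prefix below are hypothetical placeholders, the stylesheet path is simply the one from the question, and the sketch assumes the stylesheet produces a single document per input.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace and prefix; pick your own and pass the same
   URI to mlcp via -transform_namespace. :)
module namespace xslt-wrapper = "http://example.com/mlcp/xslt-wrapper";

(: Stylesheet path in the modules database, copied from the question.
   Adjust it to wherever your XSLT is actually stored. :)
declare variable $STYLESHEET as xs:string :=
  "/marklogic.rest.transform/fda-transform-xsl/assets/transform.xsl";

(: mlcp calls this once per input document (or per aggregate record).
   $content holds the target URI under "uri" and the document under "value". :)
declare function xslt-wrapper:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  let $doc := map:get($content, "value")
  (: Assumption: the stylesheet returns exactly one document. :)
  let $result := xdmp:xslt-invoke($STYLESHEET, $doc)
  return (
    map:put($content, "value", $result),
    $content
  )
};
```

If this module were installed in the modules database at, say, /ext/xslt-wrapper.xqy (a made-up path), the import command would use -transform_module /ext/xslt-wrapper.xqy, -transform_namespace "http://example.com/mlcp/xslt-wrapper", and -transform_function transform.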

Note that if you use -input_file_type aggregates, as in your example, your custom transform will be applied to each fragment, that is, to each record element named by $splittag. So the incoming $content map will hold the fragment you're splitting (and transforming) on.