I use a WebDAV server to import PDF, DOC, PPTX and XLSX files into my database by drag and drop. The WebDAV server is called "CPF", its root is "/" and its port number is 9999.
I also installed the Content Processing Framework with the default configuration.
Could it be that I have not set up the required security permissions?
For this case, the MarkLogic documentation says:
Set the Needed Permissions on the Root Directory
When you add documents to the database for conversion, the user who adds the documents must have the needed permissions to add and modify documents. If you are using WebDAV server to drag-and-drop documents into the database, the root directory of the WebDAV server must also have the needed permissions.
One simple way to accomplish these security requirements is to do the following:
• Create a URI privilege for the URI that is configured as the root directory of your WebDAV server.
• Create a role that has the URI privilege and has default permissions of read, insert, and update for the role.
• Set the permissions on the WebDAV root directory for the role you created. For example, if the role you created is named webdav, and the root directory has the URI /webdav/root/, run a query (as a privileged user) similar to the following:
xdmp:document-set-permissions("/webdav/root/",
  ( xdmp:permission("webdav", "read"),
    xdmp:permission("webdav", "insert"),
    xdmp:permission("webdav", "update") ) )
You can check the permissions with the following query:
xdmp:document-get-permissions("/webdav/root/")
• Grant the new role (webdav in the example above) to the user who accesses the WebDAV server.
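For reference, the role, privilege and user steps above could be sketched roughly as follows. This is only a sketch under assumptions, not taken from the quoted documentation: it assumes the query is run against the Security database by an admin user, and the names "webdav-root-uri" and "my-webdav-user" are hypothetical placeholders. The role name webdav and the URI /webdav/root/ are taken from the documentation's example. The role is created in its own transaction (note the semicolon) so that the later statements can look it up.

```xquery
xquery version "1.0-ml";
(: Sketch only: run against the Security database as an admin user.
   "webdav-root-uri" and "my-webdav-user" are hypothetical names. :)
import module namespace sec = "http://marklogic.com/xdmp/security"
  at "/MarkLogic/security.xqy";

(: Create the role first, with no inherited roles, permissions or collections :)
sec:create-role("webdav", "Role for WebDAV uploads", (), (), ())
;

xquery version "1.0-ml";
import module namespace sec = "http://marklogic.com/xdmp/security"
  at "/MarkLogic/security.xqy";

(: URI privilege for the WebDAV root directory, granted to the role :)
sec:create-privilege("webdav-root-uri", "/webdav/root/", "uri", "webdav"),

(: Default permissions of read, insert and update for the role :)
sec:role-set-default-permissions("webdav",
  ( xdmp:permission("webdav", "read"),
    xdmp:permission("webdav", "insert"),
    xdmp:permission("webdav", "update") )),

(: Grant the role to the user who accesses the WebDAV server :)
sec:user-add-roles("my-webdav-user", "webdav")
```

With a WebDAV root of "/", as in the setup described at the top, the URI in the privilege and in xdmp:document-set-permissions would presumably be "/" instead of "/webdav/root/".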
In this case I don't understand which "role" and which "root directory" they are talking about.
But what if the error comes from somewhere else? Why are some documents converted into .xml files and others into .xhtml files, while about 50% of my original files are ignored and not converted at all?
As suggested by Dave Cassel, I ran xdmp:document-properties() for one of the documents that had failed to process. Below is the result:
<?xml version="1.0" encoding="UTF-8"?>
<prop:properties xmlns:prop="http://marklogic.com/xdmp/property">
<cpf:processing-status xmlns:cpf="http://marklogic.com/cpf">done</cpf:processing-status>
<cpf:property-hash xmlns:cpf="http://marklogic.com/cpf">93bdf4b50736752e0155c8e16fd42544</cpf:property-hash>
<cpf:last-updated xmlns:cpf="http://marklogic.com/cpf">2016-07-25T11:26:13.006+02:00</cpf:last-updated>
<cpf:state xmlns:cpf="http://marklogic.com/cpf">http://marklogic.com/states/property-updated</cpf:state>
<cpf:self xmlns:cpf="http://marklogic.com/cpf">/XXX/PDFs/XXXXX.pdf</cpf:self>
<Win32CreationTime xmlns="urn:schemas-microsoft-com:">Mon, 25 Jul 2016 08:05:44 GMT</Win32CreationTime>
<Win32LastAccessTime xmlns="urn:schemas-microsoft-com:">Mon, 25 Jul 2016 09:26:12 GMT</Win32LastAccessTime>
<Win32FileAttributes xmlns="urn:schemas-microsoft-com:">00000000</Win32FileAttributes>
<Win32LastModifiedTime xmlns="urn:schemas-microsoft-com:">Mon, 25 Jul 2016 08:05:44 GMT</Win32LastModifiedTime>
</prop:properties>