0
votes

I am using xdmp:document-filter(doc(uri)) to fetch metadata from documents. When I run the following command on one of the documents, I get this result:

xdmp:document-filter(doc("/Vision.doc"))//*:meta[@name eq "Creation_Date"]/@content

<?xml version="1.0" encoding="UTF-8"?>
<results warning="attribute node">
  <warning warning="attributes cannot be root nodes" content="17-05-2012 00:48:00"/>
</results>

When I run the same command on another document, I get this:

<?xml version="1.0" encoding="UTF-8"?>
<results warning="attribute node">
  <warning warning="attributes cannot be root nodes" content="2012-06-03T13:45:00Z"/>
</results>

As you can see, the date format differs between the two outputs, and documents uploaded to MarkLogic Server may use other date formats as well. I want to show the creation date of each document in one fixed format (e.g. May 16, 2012). I also want to compare these dates with a date entered by the user, so that the search query returns only the documents that match the search criteria. So I have two questions:

  1. How do I convert the creation date of a document to a fixed format so that I can display it in the UI?
  2. How do I compare this creation date with the date entered by the user (which is in "mm/dd/yyyy" format) so that I get the correct results?

2 Answers

2
votes

You will have to parse the dateTime value. For example:

let $dt := "17-05-2012 00:48:00"
return
  if ($dt castable as xs:dateTime)
  then xs:dateTime($dt)
  else xdmp:parse-dateTime("[D01]-[M01]-[Y0001] [h01]:[m01]:[s01]", $dt)

This will return an xs:dateTime atomic value, which can be compared and displayed in the UI. If you want to support additional formats, you will need to create additional parse "picture" strings so they can also be converted to xs:dateTime. See the documentation on xdmp:parse-dateTime() for more information.
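To support more than one format, the same idea can be extended to a list of picture strings that are tried in turn. The following is only a sketch: `local:parse-any` is a hypothetical helper name, the two pictures are assumptions based on the sample values in the question (day-first for "17-05-2012"), and it assumes xdmp:parse-dateTime raises an error when the value does not match the picture. The final step uses fn:format-dateTime to produce a display string in the "May 16, 2012" style.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: returns the first successful parse, or the empty
   sequence when no picture matches. :)
declare function local:parse-any($value as xs:string) as xs:dateTime?
{
  if ($value castable as xs:dateTime)
  then xs:dateTime($value)
  else
    (: Assumed picture list; add one entry per format found in your data. :)
    let $pictures := (
      "[D01]-[M01]-[Y0001] [h01]:[m01]:[s01]",
      "[M01]/[D01]/[Y0001] [h01]:[m01]:[s01]"
    )
    let $parsed :=
      for $picture in $pictures
      return
        (: A picture that does not match is skipped instead of failing. :)
        try { xdmp:parse-dateTime($picture, $value) }
        catch ($e) { () }
    return $parsed[1]
};

for $raw in ("17-05-2012 00:48:00", "2012-06-03T13:45:00Z")
let $dt := local:parse-any($raw)
where fn:exists($dt)
return fn:format-dateTime($dt, "[MNn] [D1], [Y0001]")
```

Note that a value such as "03-04-2012" is ambiguous between day-first and month-first; the order of the pictures decides which interpretation wins, so list the most likely format first.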

1
votes

As a part of a larger open source project I cooked up a date parsing library that handles at least 20 different formats in 6 different languages. You can also supply your own formats if one is not already defined. It works by feeding it a date as a string in any of the defined formats and returning an xs:dateTime if it was able to successfully parse it. You can find the library here:

https://github.com/marklogic/Corona/blob/master/corona/lib/date-parser.xqy

To use it:

import module namespace dateparser="http://marklogic.com/dateparser" at "date-parser.xqy";

dateparser:parse($filteredDocument//*:meta[@name eq "Creation_Date"]/@content)

This will allow you to normalize the various date formats that binary documents can have. I will note that different binary formats (Word, PDF, JPEG, etc) will use different names for the creation date. So just looking for the "Creation_Date" metadata could leave some holes depending on what formats you're storing in MarkLogic.

Also note that if you just want the date without the time portion, you can cast the xs:dateTime to an xs:date. Doing so retains the timezone information, which is likely a good thing.
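As a small illustration of that cast, assuming the value has already been normalized to an xs:dateTime (the literal below is just the sample value from the question):

```xquery
xquery version "1.0-ml";

let $dt   := xs:dateTime("2012-06-03T13:45:00Z")
(: Casting drops the time portion but keeps the timezone. :)
let $date := xs:date($dt)
return (
  $date,
  (: Display form in the "Month day, year" style asked for in the question. :)
  fn:format-date($date, "[MNn] [D1], [Y0001]")
)
```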

As for your second question…

There are a number of different ways to do this, and reading some of the MarkLogic documentation is a good place to start. I'd recommend taking a look at:

http://docs.marklogic.com/guide/search-dev/rangequery

Hopefully that will shed a bit of light on what your query needs to look like. In its simplest form, you will probably have to parse the date that the user provided first. This can also be done with the date parsing library, so users can enter many different date formats (e.g. November 13th, 2012, or even Spanish: Noviembre 13th, 2012). Then use that parsed date to construct date range queries in MarkLogic.
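A rough sketch of what that could look like is below. It rests on several assumptions that are not part of the question: the normalized creation date has been stored as an xs:dateTime in a hypothetical element named `created`, a dateTime element range index exists on that element, and the user input is zero-padded "mm/dd/yyyy". The input is split by hand here rather than with the date parsing library, purely to keep the example self-contained.

```xquery
xquery version "1.0-ml";

(: Hypothetical user input in mm/dd/yyyy form. :)
let $input := "05/17/2012"
let $parts := fn:tokenize($input, "/")

(: Rebuild the value as yyyy-mm-dd so it can be cast to xs:date. :)
let $day   := xs:date(fn:concat($parts[3], "-", $parts[1], "-", $parts[2]))

(: Match the whole day: from midnight up to, but excluding, the next midnight. :)
let $start := fn:dateTime($day, xs:time("00:00:00"))
let $end   := $start + xs:dayTimeDuration("P1D")

return
  cts:search(fn:collection(),
    cts:and-query((
      (: "created" is an assumed element name backed by a dateTime range index. :)
      cts:element-range-query(xs:QName("created"), ">=", $start),
      cts:element-range-query(xs:QName("created"), "<",  $end)
    )))
```

Because `$start` carries no explicit timezone, the comparison uses the implicit timezone of the server; if the stored values are in UTC, you may want to attach a timezone to the parsed date before comparing.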

If that doesn't help I'd post another question here with the specifics of where you're getting hung up.