4
votes

I'm trying to use Solr's DataImportHandler to index some documents from an Oracle DB, and everything is working fine with the exception of correctly reading an Oracle Date column into my document.

I have the field in my Solr schema defined as

<field name="release_date" type="date" indexed="true" stored="true" multiValued="false"/>

I first tried doing just a basic select statement of the date column in my DataImportHandler, but all of the dates are being indexed with incorrect time values. For instance, a date that is in the DB as Jan. 12, 2004 09:28 AM (EST) is being indexed as:

<date name="release_date">2004-01-12T05:00:00Z</date>

All of the date values have the correct day, but they all have T05:00:00Z as their time. My best guess as to what is happening is that it's reading the time from the DB as midnight and converting it to UTC. If this is the case, I would expect the correct value to read T14:28:00Z.

Why is it not picking up the time portion of the DB column? I know that there is a transformer for dates that comes with the DIH, but I'm not totally clear on how it's supposed to work. I also tried doing

<field column="RELEASE_DATE" name="release_date" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" /> 

in the DIH, but that didn't seem to change anything.

2
What's the type of the field in Oracle?Jon Skeet
The data type of the field in Oracle is Date.dwc86
What's the nls_date_format set to? Also, what do you get if you just do 'select sysdate from dual';?Mark J. Bobak

2 Answers

2
votes

Here's the full code to go along with the last answer (for extra clarity).

In your data-config.xml file read the date from the DB and cast to a timestamp:

select cast(STRT_DT as timestamp) as STRT_DTTS from DATES

Put into an DataImportHandler entity, that looks like this:

<entity name="startDate" transformer="script:startDateTransform"
        query="select cast(STRT_DT as timestamp) as STRT_DTTS from DATES" >
    <field column="STRT_DTTS" name="STRT_DT" /> 
</entity>

This query will return a oracle.sql.TIMESTAMP, but it won't map directly to date. A script transformer is therefore required. Thus we introduce script:startDateTransform. In the same data-config.xml, you can insert JavaScript like so:

function startDateTransform(row){
    // Get the timestamp and convert it to a date
    var dateVal = row.get("STRT_DTTS").dateValue();

    // Put the correct date object into the original column
    row.put("STRT_DTTS", dateVal);

    return row;
}

Here we convert the timestamp to a date, update the column value and return the row with the new information.

The field STRT_DT:

<field column="STRT_DTTS" name="STRT_DT" />

should now contain the correct date.

0
votes

Oracle JDBC getDate() will return only date part.Only getTimeStamp() return both Date & Time part.

Workaround :

  • Add cast() function to convert date to timestamp,So solr will get oracle.TIMESTAMP object. Ex:CAST(release_date AS TIMESTAMP) AS d_release_date,
  • Add a new date transformer to transform the TIMESTAMP to date.
  • Voila!. Now you have time part in solr.