2 votes

My data has dates in the format yyyy-MM-dd, e.g. "2015-10-12".

My Logstash config, including the date filter, is as follows:

    input {
        file {
            path => "/etc/logstash/immport.csv"
            codec => multiline {
                pattern => "^S*"
                negate => true
                what => "previous"
            }
            start_position => "beginning"
        }
    }

    filter {
        csv {
            separator => ","
            autodetect_column_names => true
            skip_empty_columns => true
        }
        date {
            match => ["start_date", "yyyy-MM-dd"]
            target => "start_date"
        }
        mutate {
            rename => {"start_date" => "[study][startDate]"}
        }
    }

    output {
        elasticsearch {
            action => "index"
            hosts => ["elasticsearch-5-6:9200"]
            index => "immport12"
            document_type => "dataset"
            template => "/etc/logstash/immport-mapping.json"
            template_name => "mapping_template"
            template_overwrite => true
        }
        stdout { codec => rubydebug }
    }

However, my Elasticsearch instance is not able to parse it, and I'm getting the following error:

"error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [study.startDate]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2012-04-17T00:00:00.000Z\" is malformed at \"T00:00:00.000Z\""}}}}}

Sample data row, as logged by the failed indexing attempt:

    ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"immport_2017_12_02", :_type=>"dataset", :_routing=>nil}, 2017-12-20T08:55:45.367Z 878192e51991 SDY816,HEPSV_COHORT: Participants that received Heplisav,,,2012-04-17,,10.0,Systems Biology Analysis of the response to Licensed Hepatitis B Vaccine (HEPLISAV) in specific cell subsets (see companion studies SDY299 and SDY690),Interventional,http://www.immport.org/immport-open/public/study/study/displayStudyDetail/SDY816,,Interventional,Vaccine Response,Homo sapiens,Cell,DNA microarray], :response=>{"index"=>{"_index"=>"immport_2017_12_02", "_type"=>"dataset", "_id"=>"AWBzIsBPov62ZQtaldxQ", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [study.startDate]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2012-04-17T00:00:00.000Z\" is malformed at \"T00:00:00.000Z\""}}}}}

I want Logstash to output the date in the format yyyy-MM-dd, without a timestamp.

Mapping template:

"startDate": {
         "type": "date",
        "format": "yyyy-MM-dd"
 },
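
For reference, here is a minimal sketch of what the full immport-mapping.json template might contain; the index pattern and the surrounding property structure are assumptions inferred from the config above, not the actual file:

    {
        "template": "immport*",
        "mappings": {
            "dataset": {
                "properties": {
                    "study": {
                        "properties": {
                            "startDate": {
                                "type": "date",
                                "format": "yyyy-MM-dd"
                            }
                        }
                    }
                }
            }
        }
    }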
Please paste the sample data row by just printing the output of output { stdout { codec => rubydebug } } (Hatim Stovewala)

@HatimStovewala I have added a sample data row; can you let me know what the issue is? (arjunsv3691)

Paste your entire Logstash config file. (Hatim Stovewala)

@HatimStovewala Added the full Logstash config file. (arjunsv3691)

Everything seems fine. Is the datatype mapping done in the template in the elasticsearch output? If yes, please paste the JSON file as well. You could also try explicitly casting/converting each column rather than allotting types dynamically. (Hatim Stovewala)

2 Answers

2 votes

I tried this on my machine, using your Logstash conf file as a reference, and it worked fine.

My Logstash conf file:

input {
    file {
        path => "D:\testdata\stack.csv"
        codec => multiline {
            pattern => "^S*"
            negate => true
            what => "previous"
        }
        start_position => "beginning"
    }
}

filter {
    csv {
        separator => ","
        autodetect_column_names => true
        skip_empty_columns => true
    }
    date {
        match => ["dob", "yyyy-MM-dd"]
        target => "dob"
    }
    mutate {
        rename => {"dob" => "[study][dob]"}
    }
}

output {
    elasticsearch {
        action => "index"
        hosts => ["localhost:9200"]
        index => "stack"
    }
    stdout { codec => rubydebug }
}

CSV file:

id,name,rollno,dob,age,gender,comments
1,hatim,88,1992-07-30,25,male,qsdsdadasd asdas das dasd asd asd asd as dd sa d
2,hatim,89,1992-07-30,25,male,qsdsdadasd asdas das dasd asd asd asd as dd sa d

Elasticsearch document after indexing:

{
    "_index": "stack",
    "_type": "doc",
    "_id": "wuBTeGABQ7gwBQSQTX1q",
    "_score": 1,
    "_source": {
        "path": """D:\testdata\stack.csv""",
        "study": {
            "dob": "1992-07-29T18:30:00.000Z"
        },
        "@timestamp": "2017-12-21T09:06:52.465Z",
        "comments": "qsdsdadasd asdas das dasd asd asd asd as dd sa d",
        "gender": "male",
        "@version": "1",
        "host": "INMUCHPC03284",
        "name": "hatim",
        "rollno": "88",
        "id": "1",
        "message": "1,hatim,88,1992-07-30,25,male,qsdsdadasd asdas das dasd asd asd asd as dd sa d\r",
        "age": "25"
    }
}

Everything worked perfectly. See if this example helps you narrow down the issue.
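
Note that in the indexed document above, the date filter has stored dob as a full ISO8601 UTC timestamp ("1992-07-29T18:30:00.000Z"); that is what the filter always emits, which is also why the question's error message shows "2012-04-17T00:00:00.000Z". If the mapping must keep a strict yyyy-MM-dd format, one option (my suggestion, not part of the original answer) is to widen the mapping's format string so both representations are accepted:

    "startDate": {
        "type": "date",
        "format": "yyyy-MM-dd||strict_date_optional_time"
    }

Elasticsearch tries each format in order, so plain dates and full timestamps both index cleanly. Alternatively, dropping the date filter entirely leaves the CSV value as the original yyyy-MM-dd string, which matches the stricter mapping as-is.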

1 vote

The issue was that I had renamed the Logstash mapping template but didn't delete the old template, so the index was still picking up the old one.

Once I deleted the old template:

    curl -XDELETE 'http://localhost:9200/_template/test_template'

it worked. So whenever you switch to a new template, you need to delete the old template first and then reprocess the records.
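
To double-check what is installed before deleting, the template API can also be queried; a quick sketch using the standard endpoints:

    # list all installed index templates
    curl -XGET 'http://localhost:9200/_template?pretty'

    # inspect one template by name before removing it
    curl -XGET 'http://localhost:9200/_template/test_template?pretty'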