Logstash jdbc input plugin fetches data from MySQL multiple times and keeps creating documents in Elasticsearch

For 600 rows in MySQL, it creates 8581812 documents in Elasticsearch.

I have created multiple config files, one to fetch data from each MySQL table, and put them in the /etc/logstash/conf.d folder.

1. Start the logstash service: sudo systemctl start logstash
2. Run the following command to execute a config file: /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/spt_audit_event.conf
3. Data is fetched successfully.

input {
  jdbc {
    jdbc_driver_library => "/usr/share/jdbc_driver/mysql-connector-java-5.1.47.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://:3306/"
    jdbc_user => ""
    jdbc_password => ""
    statement => "select * from spt_identity"
  }
}

output {
  elasticsearch {
    "hosts" => "localhost:9200"
    "index" => ""
  }
  stdout {}
}

Actual Results

The number of documents in Elasticsearch keeps increasing and has reached 8581812, but there are only 600 rows in the MySQL table. Is this a bug in the plugin, or am I doing something wrong?

1 Answer

You need to specify a unique ID for the Elasticsearch documents

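To avoid duplicates, give each document a unique ID. Without one, Elasticsearch generates a new ID for every event it receives, so each time the query runs again the same rows are indexed as new documents. With a fixed ID, a repeated row overwrites the existing document instead.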

Modify logstash.conf by adding "document_id" => "%{studentid}" to the elasticsearch output, as below.

output {
  stdout { codec => json_lines }
  elasticsearch {
    "hosts" => "localhost:9200"
    "index" => "test-migrate"
    "document_id" => "%{studentid}"
  }
}

In your case it won't be studentid but whichever column uniquely identifies a row in your table, typically its primary key. Find that column and use it in your configuration.
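Applied to the configuration in the question, the change might look like the sketch below. It assumes the spt_identity table has a unique column named id; that column name and the index name spt_identity are placeholders chosen for illustration and do not come from the question. The connection string and credentials are left blank exactly as posted. Note that the jdbc input lowercases column names by default, so the field reference should be lowercase too.

```
input {
  jdbc {
    jdbc_driver_library => "/usr/share/jdbc_driver/mysql-connector-java-5.1.47.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://:3306/"
    jdbc_user => ""
    jdbc_password => ""
    statement => "select * from spt_identity"
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    # Hypothetical index name; use your own.
    index => "spt_identity"
    # Assumption: "id" is a unique column in spt_identity.
    # Re-fetched rows then overwrite their existing documents
    # instead of being indexed as new ones.
    document_id => "%{id}"
  }
  stdout {}
}
```

With this in place, the document count should stay at the number of distinct id values no matter how many times the query runs. Documents that were already indexed with auto-generated IDs are not removed, so the index would need to be cleared or recreated once.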