I queried my MariaDB database and wrote all the data to a JSON file formatted according to the Elasticsearch bulk API documentation.
JSON sample:
{"index": {"_index": "test", "_type": "test-type", "_id": "5"}}
{"testcase": "testcase_value", "load": "load_value", "detcause": "DETAILED_CAUSE_UNKNOWN", "time": "2017-09-28T08:07:03", "br_proc": "ProcDetCause", "proc_message": "MME_CB_DEF", "cause": null, "count": 3}
{"index": {"_index": "test", "_type": "test-type", "_id": "17"}}
{"testcase": "testcase_value", "load": "load_value", "detcause": "DETAILED_CAUSE_UNKNOWN", "time": "2017-09-28T08:07:03", "br_proc": "BrDetCause", "proc_message": "MME_CB_DEF", "cause": null, "count": 2}
{"index": {"_index": "test", "_type": "test-type", "_id": "20"}}
{"testcase": "testcase_value", "load": "load_value", "detcause": null, "time": "2017-09-28T08:07:03", "br_proc": "BrCause", "proc_message": "MME_CB_DEF", "cause": "CAUSE_UNKNOWN", "count": 2}
{"index": {"_index": "test", "_type": "test-type", "_id": "23"}}
{"testcase": "testcase_value", "load": "load_value", "detcause": null, "time": "2017-09-28T08:07:03", "br_proc": "ProcCause", "proc_message": "MME_CB_DEF", "cause": "CAUSE_UNKNOWN", "count": 1}
{"index": {"_index": "test", "_type": "test-type", "_id": "39"}}
{"testcase": "testcase_value", "load": "load_value", "detcause": null, "time": "2017-09-28T08:07:03", "br_proc": "ProcCause", "proc_message": "MME_CB_DEF", "cause": "CAUSE_UNKNOWN", "count": 2}
...
When I run:
curl -s -H "Content-Type: application/x-ndjson" -XPOST 'localhost:9200/_bulk' --data-binary @data.json
I get no response at all. I tried subsets of the data (e.g. 100 or 1,000 lines) and those worked (I even received a JSON response). But as soon as I went over a million lines, I got no response. Currently, there are only 500 entries in the Elasticsearch database.
I also checked the Elasticsearch logs and they are empty.
The file has 20 million lines and is approximately 2.7 GB.
Why am I not getting any response when I post a larger JSON file? Am I doing something wrong? Is there a better way to handle bulk indexing?
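For illustration, here is a minimal sketch of sending the file in smaller batches instead of one 2.7 GB request. It may be relevant that Elasticsearch's `http.max_content_length` setting defaults to 100mb, which a single request of this size would exceed. The helper names `iter_bulk_chunks` and `post_bulk` and the batch size of 5000 documents are assumptions made for this example, not anything from the Elasticsearch API. The sketch also assumes the file contains no blank lines and that every document occupies exactly two lines (action line, then source line), as in the sample above.

```python
import itertools
import urllib.request


def iter_bulk_chunks(lines, pairs_per_chunk=5000):
    """Yield request bodies holding at most `pairs_per_chunk` action/source pairs.

    `lines` is any iterable of NDJSON lines (for example an open file).
    Each document takes two lines in the bulk format, so every chunk is cut
    after an even number of lines to keep the pairs together.
    """
    it = iter(lines)
    while True:
        chunk = list(itertools.islice(it, 2 * pairs_per_chunk))
        if not chunk:
            return
        # The bulk API expects every line, including the last, to end with \n.
        yield "".join(line.rstrip("\r\n") + "\n" for line in chunk)


def post_bulk(body, url="http://localhost:9200/_bulk"):
    """POST one chunk and return the raw response text (URL is an assumption)."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Stream the file so the whole 2.7 GB never has to sit in memory at once.
    with open("data.json", encoding="utf-8") as f:
        for number, body in enumerate(iter_bulk_chunks(f), start=1):
            post_bulk(body)
            print("sent chunk", number)
```

Each response body is worth inspecting rather than discarding, since the bulk API reports per-item failures inside the JSON response even when the HTTP request itself succeeds. The batch size is a tuning knob; smaller batches keep each request well below the content length limit.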