https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/_indexing_documents.html
Based on the Elasticsearch API documentation, this is how you bulk dump data to Elasticsearch:
$client = Elasticsearch\ClientBuilder::create()->build();

$params = ['body' => []];
// One metadata action line plus one document line per document
for ($i = 0; $i < 100; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type'  => 'my_type',
        ]
    ];
    $params['body'][] = [
        'my_field'     => 'my_value',
        'second_field' => 'some more values'
    ];
}
$responses = $client->bulk($params);
Basically, you loop through the documents, add the same metadata action line for each one, and then call the bulk function to bulk dump the data.
I have data saved in Google Cloud Storage in newline-delimited JSON format. The file contains hundreds of thousands, or even millions, of documents of the same format (i.e. the same index/type metadata for Elasticsearch).
To bulk dump this Google Cloud Storage file to Elasticsearch, I have to read the file, loop through each document in it, assign the same metadata to each document, and then finally bulk dump everything to Elasticsearch.
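For illustration, this is roughly what that workaround looks like (a minimal sketch, assuming the NDJSON file has already been downloaded from Google Cloud Storage to a local path, e.g. with the google/cloud-storage library; the file path, index/type names, and chunk size below are placeholders):

require 'vendor/autoload.php';

$client = Elasticsearch\ClientBuilder::create()->build();

// Hypothetical local copy of the GCS file (newline-delimited JSON)
$handle = fopen('/tmp/documents.ndjson', 'r');

$params    = ['body' => []];
$chunkSize = 1000; // flush every 1000 documents to keep memory bounded
$count     = 0;

while (($line = fgets($handle)) !== false) {
    $line = trim($line);
    if ($line === '') {
        continue;
    }

    // The same metadata line, repeated for every single document
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type'  => 'my_type',
        ]
    ];
    // The document itself, decoded from one NDJSON line
    $params['body'][] = json_decode($line, true);

    if (++$count % $chunkSize === 0) {
        $client->bulk($params);
        $params = ['body' => []];
    }
}

// Send whatever is left in the final partial chunk
if (!empty($params['body'])) {
    $client->bulk($params);
}
fclose($handle);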
It would be nice if I could provide the metadata just once (basically, which index and which type these documents should be indexed into), hand over the whole file of newline-delimited JSON documents, and let the bulk dump do the rest of the work.
I know that the Elasticsearch bulk API does not offer this feature yet.
But I assume that bulk dumping a JSON file saved in S3 or Google Cloud Storage into Elasticsearch is a common demand, so someone else might have already run into this use case and solved it.
Any advice or suggestions from your experience?
Thanks!