I have an import job that runs once a week and inserts all the records from MongoDB into Elasticsearch.
This is what I am doing:

  1. Records already exist in 'main' index
  2. I insert all the new records into 'main-temp' index
  3. I delete the 'main' index
  4. I reindex 'main-temp' to 'main'
  5. I delete the 'main-temp' index

I am running the operation locally on the same data set.
What I am noticing is that the number of records in the new 'main' index does not match the number of records that were imported into the 'main-temp' index.
Here is the code that I am using:

try {
    await client.indices.delete({ index: 'main' })
    Logger.info('Old Index Deleted')
    await client.indices.create({ index: 'main' })
    Logger.info('New Index Created')
    await client.reindex({
        waitForCompletion: true,
        refresh: true,
        body: {
            source: {
                index: 'main-temp'
            },
            dest: {
                index: 'main'
            }
        }
    })
    Logger.info('Temp Index Reindexed/Cloned')
    await client.indices.delete({ index: 'main-temp' })
    Logger.info('Temp Index Deleted')
} catch (e) {
    Logger.error(e)
}

I am using Elasticsearch 6.8.9, so I can't use the Clone API, since it is only available in 7.x.
See the screenshot below for the results. Every time it reindexes, the number of records is different (usually smaller by a few thousand).

https://i.stack.imgur.com/g1u0J.png

UPDATE: Here is the response I get from the reindex call (if I do let result = await).
Sometimes it reports the correct number, sometimes not.

took: 22357,
timed_out: false,
total: 673637,
updated: 0,
created: 673637,
deleted: 0,
batches: 674,
version_conflicts: 0,
noops: 0,
retries: { bulk: 0, search: 0 },
throttled_millis: 0,
requests_per_second: -1,
throttled_until_millis: 0,
failures: []
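
One way to check the mismatch independently of the screenshot is to refresh both indices and compare their document counts right after the reindex call returns. The following is only a sketch: the helper name compareCounts is hypothetical, and it assumes the client exposes indices.refresh() and count() as the official JavaScript clients do. Because some client versions wrap the response in a body property and others return it directly, the sketch accepts both shapes.

```javascript
// Hypothetical helper (not from the original post): compares document counts
// of two indices after forcing a refresh on both.
async function compareCounts(client, source, dest) {
  // Some client versions return { body: {...} }, others return the payload itself.
  const unwrap = (res) => (res && res.body !== undefined ? res.body : res);

  // A refresh makes all documents indexed so far visible to count/search.
  await client.indices.refresh({ index: `${source},${dest}` });

  const sourceCount = unwrap(await client.count({ index: source })).count;
  const destCount = unwrap(await client.count({ index: dest })).count;

  return { source: sourceCount, dest: destCount, match: sourceCount === destCount };
}
```

Calling compareCounts(client, 'main-temp', 'main') just before deleting 'main-temp' would show whether the documents are really missing or simply not yet visible when the count is read.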
And of course you're certain that MongoDB always contains the exact same number of records? What response do you get from the reindex call? – Val
Yes, I am certain. The thing is, I am not refreshing the database; in my local testing I am always reading the same database (which is not changing). – AleksandarT
What about my second question? – Val
If I assign the result of the reindex call, I get this error: [The "event" parameter needs to be a string.]. If I don't do let result = and only await, then all is OK. I am getting the same error on clone index. – AleksandarT
Set waitForCompletion: false and then run GET _tasks?actions=*reindex&detailed in Dev Tools. What do you get? – Val
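
The last comment's suggestion can also be tried from the same Node.js script instead of Dev Tools. This is a rough sketch, assuming the client exposes reindex() and tasks.list() with the actions and detailed parameters; the function name startReindexAndListTasks is made up for illustration.

```javascript
// Hypothetical helper: starts the reindex in the background and lists the
// running reindex tasks, mirroring GET _tasks?actions=*reindex&detailed.
async function startReindexAndListTasks(client, source, dest) {
  // Some client versions return { body: {...} }, others return the payload itself.
  const unwrap = (res) => (res && res.body !== undefined ? res.body : res);

  // With waitForCompletion: false, Elasticsearch answers immediately with a task id.
  const started = unwrap(await client.reindex({
    waitForCompletion: false,
    body: { source: { index: source }, dest: { index: dest } },
  }));

  // List the reindex tasks with their detailed status.
  const tasks = unwrap(await client.tasks.list({ actions: '*reindex', detailed: true }));

  return { taskId: started.task, tasks };
}
```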

1 Answer

I fixed this by adding delays: 10 seconds after deleting and recreating the 'main' index, and 15 seconds after reindexing.
Here is the code:

try {
    await client.indices.delete({ index: 'main' })
    Logger.info('Old Index Deleted')
    await client.indices.create({ index: 'main' })
    Logger.info('New Index Created')
    // Wait 10 seconds before reindexing into the freshly created index
    await new Promise(resolve => setTimeout(resolve, 10000))
    await client.reindex({
        waitForCompletion: true,
        refresh: true,
        body: {
            source: {
                index: 'main-temp'
            },
            dest: {
                index: 'main'
            }
        }
    })
    // Wait 15 seconds before deleting the temp index
    await new Promise(resolve => setTimeout(resolve, 15000))
    Logger.info('Temp Index Reindexed/Cloned')
    await client.indices.delete({ index: 'main-temp' })
    Logger.info('Temp Index Deleted')
} catch (e) {
    Logger.error(e)
}

It seems Elasticsearch needs some time after each of these operations before the index is fully ready.
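
Fixed delays depend on machine speed and data size, so a possible alternative is to poll until the destination index reports the expected number of documents. The sketch below is not what this answer actually ran: the name waitForCount and its options are hypothetical, and it assumes the mismatch really is a timing effect, as the delays suggest. The expected value could come from the created field of the reindex response or from a count of 'main-temp'.

```javascript
// Hypothetical helper: polls the document count of an index until it equals
// `expected`, or throws once `timeoutMs` has elapsed.
async function waitForCount(client, index, expected, { timeoutMs = 60000, intervalMs = 1000 } = {}) {
  // Some client versions return { body: {...} }, others return the payload itself.
  const unwrap = (res) => (res && res.body !== undefined ? res.body : res);
  const deadline = Date.now() + timeoutMs;

  for (;;) {
    // Refresh first so the count reflects everything indexed so far.
    await client.indices.refresh({ index });
    const { count } = unwrap(await client.count({ index }));
    if (count === expected) return count;
    if (Date.now() >= deadline) {
      throw new Error(`Index ${index} has ${count} documents, expected ${expected}`);
    }
    // Sleep briefly before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In the flow above, a call such as await waitForCount(client, 'main', expectedTotal) would take the place of the 15 second delay before 'main-temp' is deleted, where expectedTotal is whatever number the import is known to have produced.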