I have a file in blob storage at folder/new/data.json.

It contains a JSON array:

[   
    {
        "name": "a",
        "data": {
            "1":"something1",
            "2":"something2"

        }
    },
    {
        "name": "b",
        "data": {
            "1":"something1",
            "2":"something2"
        }
    }
]

My data source body:

{
    "name" : "datasource",
    "type" : "azureblob",
    "credentials" : { "connectionString" : "MyStorageConnStrning" },
    "container" : { "name" : "mycontaner", "query" : "folder/new" }
}   

My index body:

{
    "name" : "index",
    "fields": [
       { "name": "id", "type": "Edm.String", "key": true, "searchable": false },
       { "name": "name", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": true},
       { "name": "data", "type": "Edm.String", "searchable": false}
    ]
}

Indexer body:

{
    "name" : "indexer",
    "dataSourceName" : "datasource",
    "targetIndexName" : "index",
    "parameters" : { "configuration" : { "parsingMode" : "jsonArray" } }
}

Once these are created, I can search for a and b and get results.

Now I have modified folder/new/data.json to:

[   
    {
        "name": "aa",
        "data": {
            "1":"something1",
            "2":"something2"

        }
    }
]

Simply running the indexer again only overwrites

{
    "name": "a",
    "data": {
       "1":"something1",
       "2":"something2"

    }
}

but

{
    "name": "b",
    "data": {
        "1":"something1",
        "2":"something2"
    }
}

still remains, meaning b is still searchable.

What can I do so that b gets removed?

More precisely: what should I do when the data source file changes and the index data needs to change accordingly? Data removed from the data source needs to be removed from the index, and new data in the data source needs to be indexed.

1 Answer

Nafis,

You should look into adding a soft delete policy. Removing data from the data source does not delete the documents that are already in the index. If you add an "isDeleted" field to the JSON object, set it to true, and run your indexer again, the record will be deleted:

[   
    {
    "name": "a",
    "data": {
        "1":"something1",
        "2":"something2"
       }
    },
    {
    "name": "b",
    "data": {
        "1":"something1",
        "2":"something2"
      },
    "isDeleted": true
    }
]
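
For the indexer to act on that flag, the data source also needs a deletion detection policy. Below is a minimal sketch of the data source from the question with a soft delete column policy added. It assumes the flag is named "isDeleted" as in the array above and that the marker value is the string "true"; both are illustrative choices. Depending on the service version, the blob indexer may read the soft delete column from blob metadata rather than from the parsed JSON, so it is worth checking this against the current documentation before relying on it.

```json
{
    "name" : "datasource",
    "type" : "azureblob",
    "credentials" : { "connectionString" : "MyStorageConnStrning" },
    "container" : { "name" : "mycontaner", "query" : "folder/new" },
    "dataDeletionDetectionPolicy" : {
        "@odata.type" : "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
        "softDeleteColumnName" : "isDeleted",
        "softDeleteMarkerValue" : "true"
    }
}
```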

Once your indexer has run again, you can safely remove the "b" object from your JSON array. I recommend putting your indexer on a schedule so that deletes are picked up automatically after a period of time.
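
A schedule can be set in the indexer definition itself. Here is a sketch based on the indexer body from the question, assuming an hourly run. The "PT1H" value is only an illustration; the interval is an ISO 8601 duration and can be adjusted to how quickly deletes need to show up.

```json
{
    "name" : "indexer",
    "dataSourceName" : "datasource",
    "targetIndexName" : "index",
    "schedule" : { "interval" : "PT1H" },
    "parameters" : { "configuration" : { "parsingMode" : "jsonArray" } }
}
```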

Please let me know if you have additional questions.

Matt