4
votes

I have a file in blob storage at folder/new/data1.json.

data1.json contains a JSON array:

[   
    {
        "name": "na",
        "data": {
            "1":"something1",
            "2":"something2"
        }
    },
    {
        "name": "ha",
        "data": {
            "1":"something1",
            "2":"something2"
        }
    }
]

My data source body:

{
    "name" : "datasource",
    "type" : "azureblob",
    "credentials" : { "connectionString" : "MyStorageConnStrning" },
    "container" : { "name" : "mycontaner", "query" : "folder/new" }
}   

My index body:

{
    "name" : "index",
    "fields": [
       { "name": "id", "type": "Edm.String", "key": true, "searchable": false },
       { "name": "name", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": true},
       { "name": "data", "type": "Edm.String", "searchable": false}
    ]
}

Indexer body:

{
    "name" : "indexer",
    "dataSourceName" : "datasource",
    "targetIndexName" : "index",
    "parameters" : { "configuration" : { "parsingMode" : "jsonArray" } }
}

Once these are created, I can search for na and ha and get results.

But if I delete folder/new/data1.json from blob storage, run the indexer again, and search for na and ha, I still get results.

I found that if I delete the indexer and recreate it, na and ha disappear from the search results.

Is there any way to remove the previous data without deleting the indexer?

2
I reset it. I could still search for na and ha. - Nafis Islam

2 Answers

5
votes

Deleting documents using indexers is a bit tricky, especially when your blob contains multiple documents. If you delete the blob directly, the indexer never sees it again, so it won't try to delete anything from the index.

To make the indexer delete documents you need to use a soft delete deletion detection policy, for example:

{
  "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
  "softDeleteColumnName": "IsDeleted",
  "softDeleteMarkerValue": "true"
}

When you want to delete a document, add "IsDeleted": true to its JSON object. Only after all documents in a blob have been soft deleted and the indexer has picked up the deletes can you do a hard delete and remove the blob.
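
As a rough sketch of the two steps above, the Python below attaches the policy to the data source body from the question and marks every element of the blob's array as deleted. It assumes the policy goes under the data source's dataDeletionDetectionPolicy property; the helper names with_soft_delete_policy and mark_all_deleted are made up for illustration, and uploading the modified blob and re-running the indexer are left out.

```python
import json


def with_soft_delete_policy(datasource: dict) -> dict:
    """Return a copy of the data source body with the soft-delete policy attached."""
    body = dict(datasource)
    # Assumption: the policy is set via "dataDeletionDetectionPolicy".
    body["dataDeletionDetectionPolicy"] = {
        "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
        "softDeleteColumnName": "IsDeleted",
        "softDeleteMarkerValue": "true",
    }
    return body


def mark_all_deleted(blob_text: str) -> str:
    """Add "IsDeleted": true to every element of the JSON array in a blob.

    The array order is left unchanged.
    """
    items = json.loads(blob_text)
    for item in items:
        item["IsDeleted"] = True
    return json.dumps(items, indent=4)


# Data source body and blob content copied from the question.
datasource = {
    "name": "datasource",
    "type": "azureblob",
    "credentials": {"connectionString": "MyStorageConnStrning"},
    "container": {"name": "mycontaner", "query": "folder/new"},
}
blob_text = """[
    {"name": "na", "data": {"1": "something1", "2": "something2"}},
    {"name": "ha", "data": {"1": "something1", "2": "something2"}}
]"""

print(json.dumps(with_soft_delete_policy(datasource), indent=4))
print(mark_all_deleted(blob_text))
```

The modified array would replace the content of data1.json. The blob itself is removed only after an indexer run has processed it.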

One subtlety here is that you must not add/remove/rearrange elements of the array because you're using the default document id, which depends on the blob path and array index. If you use the name field as the key then you'll have the flexibility to do partial hard deletes inside the blob.
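
If you go the name-as-key route, a sketch like the following could help. It shows an index body where name is the key and checks that the names in a blob would work as keys. The index body is an assumption based on the question's index (changing the key means recreating the index), check_names_as_keys is a hypothetical helper, and whether the indexer also needs an explicit fieldMappings entry for the key is worth confirming in the indexer documentation.

```python
import json
import re

# Document keys may only contain letters, digits, '_', '-' and '='.
VALID_KEY = re.compile(r"[A-Za-z0-9_\-=]+")


def check_names_as_keys(blob_text: str) -> list:
    """Return a list of problems that would stop "name" from serving as the key."""
    names = [item["name"] for item in json.loads(blob_text)]
    problems = []
    for name in names:
        if not VALID_KEY.fullmatch(name):
            problems.append(f"invalid key characters: {name!r}")
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate key: {name!r}")
    return problems


# Assumed index body: the question's index with "name" promoted to the key.
index = {
    "name": "index",
    "fields": [
        {"name": "name", "type": "Edm.String", "key": True, "searchable": True},
        {"name": "data", "type": "Edm.String", "searchable": False},
    ],
}

blob_text = '[{"name": "na", "data": {}}, {"name": "ha", "data": {}}]'
print(json.dumps(index, indent=4))
print(check_names_as_keys(blob_text))
```

Keys have to be unique across every blob the indexer reads, not just within one file, so in practice the check would run over all blobs under folder/new.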

3
votes

I'm afraid you'll need to delete the entries from the index on your own. Take a look at Add, Update or Delete Documents (Azure Search Service REST API) to see how to do it with HTTP requests, using a tool like Postman.
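
For illustration, here is a minimal Python sketch of the kind of request that API expects: a POST to the index's docs/index endpoint with one "@search.action": "delete" entry per document key. The service name, API key, key values and api-version are placeholders I've assumed, and build_delete_request is a made-up helper. The request is only built here, not sent.

```python
import json
import urllib.request


def build_delete_request(service, index, api_key, keys,
                         key_field="id", api_version="2020-06-30"):
    """Build (but do not send) a request that deletes documents by key."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           f"/docs/index?api-version={api_version}")
    # One delete action per document; only the key field is needed.
    body = {"value": [{"@search.action": "delete", key_field: k} for k in keys]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# All values below are placeholders, not real names or keys.
req = build_delete_request("myservice", "index", "<admin-api-key>", ["key1", "key2"])
print(req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

With the default generated document keys, you would probably need to query the index first to read the id values of the documents you want to remove.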

Hope it helps!