1
votes

Short story: I'm using azure search service and it looks like it is not indexing new rows added to my view (datasource points to a view).

Long story: I have a view, created from many tables in my database. I created a datasource that points to this view. The view has many columns, relevant here are 2: Id and ChangeIndicator. The change indicator (int) has a calculated value, result of summing some values that changes in the tables using to buid the view. I defined an index, maped to my datasource and an indexer. The indexer populates the index (first time) correctly, it sees all the rows in my view and index the data (as far as I checked) correctly.

The issue is when I add new rows in view - those rows are not indexed.

I create the datasource, index and indexer using API calls.

The datasource is created like this:

{
    "name": "cs5datasource",
    "description": "Data source for cs search",
    "type": "azuresql",
    "credentials": { "connectionString" : "XXX - Connection string - XXX" },
    "container": {"name": "UserDataAggregatedView"},
    "dataChangeDetectionPolicy": {
        "@odata.type" : "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",  
        "highWaterMarkColumnName" : "ChangeIndicator" 
    }
}

The index is created like this:

{
    "name": "cs5indexer",
    "description": "Indexer service",
    "dataSourceName": "cs5datasource",
    "targetIndexName": "cs5index",
    "schedule": {"interval" : "PT5M", "startTime" : "2017-06-24T22:00:00Z"}
}

After adding the new rows (the one that are not indexed), if I delete the indexer and create it again, the view is indexed correctly. It seems like the issues is to detect the new rows. The added new rows are valid, no errors are generated on indexing process.

I need a way to index new added rows to my view and a way to remove from index the deleted rows. How to do this?

Thank you.

Later edit: The new added rows are not indexed. The new row I added looks like this:

Id |   name   | lang  |    proffesion   | changeIndicator
26 | test_011 |  en   | history teacher |    49536867 
1

1 Answers

3
votes

The changeIndicator column in your view is likely not suited to be the high watermark column. A high water mark column should monotonically increase for each new or updated row; we recommend using a rowversion column for change tracking.

From Capturing new, changed, and deleted rows:

This policy can be used if your table or view contains a column that meets the following criteria:

  • All inserts specify a value for the column.
  • All updates to an item also change the value of the column.
  • The value of this column increases with each insert or update.
  • Queries with the following WHERE and ORDER BY clauses can be executed efficiently: WHERE [High Water Mark Column] > [Current High Water Mark Value] ORDER BY [High Water Mark Column].

For capturing deleted rows, read the Soft Delete Column Deletion Detection policy section in Connecting Azure SQL Database to Azure Search using indexers.