1
votes

I have a MongoDB collection, users, with the following document format:

{
    "name": "abc",
    "email": "abc@xyz.com",
    "address": {
        "city": "Gurgaon",
        "state": "Haryana"
    }
}

Now I'm creating a data source, an index, and an indexer for this collection using the Azure REST APIs.

Datasource

import json
import requests

def create_datasource():
  # `headers` is assumed to be defined elsewhere (Content-Type and api-key)
  request_body = {
      "name": 'users-datasource',
      "description": "",
      "type": "cosmosdb",
      "credentials": {
          "connectionString": "<db connection url>"
      },
      "container": {"name": "users"},
      # Incremental indexing based on the Cosmos DB _ts timestamp
      "dataChangeDetectionPolicy": {
          "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
          "highWaterMarkColumnName": "_ts"
      }
  }
  resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
    headers=headers)

Index for the above datasource

def create_index(config):

  request_body = {
      'name': "users-index",
      'fields': [
          {
              'name': 'name',
              'type': 'Edm.String'
          },
          {
              'name': 'email',
              'type': 'Edm.DateTimeOffset'
          },
          {
              'name': 'address',
              'type': 'Edm.String'
          },
          {
              'name': 'doc_id',
              'type': 'Edm.String',
              'key': True
          }
      ]
  }
  resp = requests.post(url="<azure-create-index-api-url>", data=json.dumps(request_body), 
    headers=config.headers)

Now the indexer for the above data source and index

def create_interviews_indexer(config):
  request_body = {
    "name": "users-indexer",
    "dataSourceName": "users-datasource",
    "targetIndexName": "users-index",
    "schedule": {"interval": "PT5M"},
    "fieldMappings": [
        {"sourceFieldName": "address.city", "targetFieldName": "address"},
    ]
  }
  resp = requests.post("<create-indexer-api-url>", data=json.dumps(request_body),
      headers=config.headers)

This creates the indexer without any exception, but when I check the retrieved data in the Azure portal for users-indexer, the address field is null: it is not getting any value from the address.city field mapping provided when creating the indexer.

I have also tried the following mapping, but it's also not working.

"fieldMappings": [
        {"sourceFieldName": "/address/city", "targetFieldName": "address"},
    ]

The Azure documentation also does not say anything about this kind of mapping. Any help would be very much appreciated.

3 Answers

1
votes

The container element in the data source definition lets you specify a query that flattens your JSON document (ref: https://docs.microsoft.com/en-us/rest/api/searchservice/create-data-source), so instead of mapping fields in the indexer definition, you can write a query that returns the output in the desired shape.

Your code for creating the data source in that case would be:

def create_datasource():
  request_body = {
      "name": 'users-datasource',
      "description": "",
      "type": "cosmosdb",
      "credentials": {
          "connectionString": "<db connection url>",
      },
      "container": {
        "name": "users",
        "query": "SELECT a.name, a.email, a.address.city as address FROM a",
      },
      "dataChangeDetectionPolicy": {
          "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
          "highWaterMarkColumnName": "_ts"
      }
  }
  resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body), 
    headers=headers)

1
votes

Support for the MongoDB API flavor is in public preview: you need to explicitly indicate Mongo in the data source's connection string, as described in this article. Also note that with Mongo data sources, the custom queries suggested by the previous answer are not supported, afaik. Hopefully someone from the team can clarify the current state of this support.
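
For reference, here is a minimal sketch of what "indicating Mongo" in the connection string might look like. It assumes the `ApiKind=MongoDb` key and the `AccountEndpoint`/`AccountKey`/`Database` layout that the Azure Search docs described for Cosmos DB data sources during the preview; the helper name `build_mongo_connection_string` and the placeholder values are hypothetical, so check the current docs before relying on it.

```python
def build_mongo_connection_string(account_name, account_key, database):
    """Build a Cosmos DB connection string flagged for the MongoDB API.

    Assumption: the indexer recognizes the `ApiKind=MongoDb` key, as
    documented for the preview; this is not verified against a live service.
    """
    parts = [
        f"AccountEndpoint=https://{account_name}.documents.azure.com",
        f"AccountKey={account_key}",
        f"Database={database}",
        "ApiKind=MongoDb",  # marks the account as using the MongoDB API
    ]
    return ";".join(parts)


# Data source body reusing the names from the question; note there is no
# "query" under "container", since custom queries may not be supported here.
datasource_body = {
    "name": "users-datasource",
    "type": "cosmosdb",
    "credentials": {
        "connectionString": build_mongo_connection_string(
            "<account>", "<key>", "<database>"
        )
    },
    "container": {"name": "users"},
}
```

The body can then be posted exactly as in the question's create_datasource function.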

0
votes

It's working correctly for me with the field mapping below. The Azure Search query returns values for address properly.

    "fieldMappings": [{"sourceFieldName": "address.city", "targetFieldName": "address"}]

I did make a few changes to what you provided, for example:

  1. When creating the indexer, removed the extra comma at the end of fieldMappings.
  2. When creating the index, kept the email field as Edm.String rather than Edm.DateTimeOffset.

Please make sure you are using the Preview API version, since support for the MongoDB API is in preview with Azure Search. For example: https://{azure search name}.search.windows.net/indexers?api-version=2019-05-06-Preview
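
Putting those changes together, a rough sketch of the request bodies might look like this. The function names `indexer_url`, `build_index_body`, and `build_indexer_body` are hypothetical helpers for illustration; the resource names, field mapping, and API version are taken from the question and from the URL above. Nothing is sent to the service here.

```python
import json

API_VERSION = "2019-05-06-Preview"  # preview version quoted in this answer


def indexer_url(service_name, api_version=API_VERSION):
    """Return the indexers endpoint for a given search service name."""
    return (
        f"https://{service_name}.search.windows.net/indexers"
        f"?api-version={api_version}"
    )


def build_index_body():
    """Index definition with email as a string instead of a date."""
    return {
        "name": "users-index",
        "fields": [
            {"name": "name", "type": "Edm.String"},
            {"name": "email", "type": "Edm.String"},
            {"name": "address", "type": "Edm.String"},
            {"name": "doc_id", "type": "Edm.String", "key": True},
        ],
    }


def build_indexer_body():
    """Indexer definition mapping the nested address.city to address."""
    return {
        "name": "users-indexer",
        "dataSourceName": "users-datasource",
        "targetIndexName": "users-index",
        "schedule": {"interval": "PT5M"},
        "fieldMappings": [
            {"sourceFieldName": "address.city", "targetFieldName": "address"}
        ],
    }


# Serialized payload, ready to pass as `data=` to requests.post together
# with indexer_url("<azure search name>") and the api-key headers.
payload = json.dumps(build_indexer_body())
```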