0
votes

Trying to add a custom skill to the skillset and map it to the index

Here are the details.

I'm using Azure Named Entity Recognition in my skillset as follows:

    {
        "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
        "description": "Merge text content with image tags",
        "insertPreTag": " ",
        "context": "/document",
        "inputs": [
            {
                "name": "text",
                "source": "/document/fullTextAndCaptions"
            },
            {
                "name": "itemsToInsert",
                "source": "/document/normalized_images/*/Tags/*/name"
            }
        ],
        "outputs": [
            {
                "name": "mergedText",
                "targetName": "finalText"
            }
        ]
    }

and in the indexer as follows:

    {
        "sourceFieldName": "/document/finalText/pages/*/entities/*/value",
        "targetFieldName": "entities"
    },
    {
        "sourceFieldName": "/document/finalText/pages/*/locations/*",
        "targetFieldName": "locations"
    },

This works 100%. Now I want to add the Distinct custom skill from https://github.com/Azure-Samples/azure-search-power-skills/tree/master/Text/Distinct. I published the function, and when I test it manually it works as expected. However, it does not work in the skillset. I want it to take the locations, filter them, and output only the distinct values into their own field in the search index. I'm having a really hard time configuring the skillset and indexer to get this to work.

Any help, please?

2
Can you also add the JSON for the distinct custom skill from your skillset? - 8163264128
The JSON for the distinct skill is exactly my question; I don't know how to write it. I have been trying different ways and they all fail. You can see the sample in the link I posted in the question. Before publishing, I changed line 46 in distinct.cs to "JArray wordsParameter = inRecord.Data.TryGetValue("locations", out object wordsParameterObject) ?" so that it accepts a locations field instead of a words field. So I need help creating the skillset. - nerdHunter

2 Answers

1
votes

You'll need to add the distinct custom skill like this, assuming you want to deduplicate across the whole document:

{
    "skills": [
        ...

        {
            "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
            "description": "Distinct skill",
            "uri": "<https://distinct-skill>",
            "context": "/document",
            "inputs": [
                {
                    "name": "locations",
                    "source": "/document/finalText/pages/*/locations/*"
                }
            ],
            "outputs": [
                {
                    "name": "distinct",
                    "targetName": "distinctLocations"
                }
            ]
        }

        ...
    ]
}

Then add an output field mapping to the indexer to put the result into the index:

    {
        "sourceFieldName": "/document/distinctLocations",
        "targetFieldName": "distinctLocations"
    }

See https://docs.microsoft.com/en-us/azure/search/cognitive-search-custom-skill-interface#consuming-custom-skills-from-skillset for adding a custom skill.
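
If the function works when tested manually but not inside the skillset, it can help to compare what the skillset sends with what the function expects. Under the custom skill interface linked above, the request body for this skill would look roughly like the following. The location values are made up for illustration, and the `locations` key assumes the modified function described in the comments.

```json
{
    "values": [
        {
            "recordId": "0",
            "data": {
                "locations": ["Seattle", "Redmond", "Seattle"]
            }
        }
    ]
}
```

The function would then be expected to answer with something like this, where the key under `data` has to match the output `name` in the skill definition (`distinct` here):

```json
{
    "values": [
        {
            "recordId": "0",
            "data": {
                "distinct": ["Seattle", "Redmond"]
            },
            "errors": [],
            "warnings": []
        }
    ]
}
```

If a manual test used a different key than the input `name` in the skillset, that mismatch is a likely reason the skill returns nothing during indexing.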

0
votes

The skill inputs for the custom skill must be configured to point to the data you want to deduplicate. In this case, you didn't really need to modify the code; all you had to do was define an input with name 'words' and source '/document/finalText/pages/*/locations/*'.
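
As a rough sketch, the skill definition for the unmodified function might look like this. The `uri` is a placeholder, the `targetName` `distinctLocations` is borrowed from the other answer, and the output name `distinct` is assumed to be the key the published function returns.

```json
{
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "description": "Distinct skill",
    "uri": "<https://distinct-skill>",
    "context": "/document",
    "inputs": [
        {
            "name": "words",
            "source": "/document/finalText/pages/*/locations/*"
        }
    ],
    "outputs": [
        {
            "name": "distinct",
            "targetName": "distinctLocations"
        }
    ]
}
```

The input `name` is what the function sees as the key in each record's `data`, so it has to match whatever the code reads ('words' in the original sample, 'locations' if you changed it).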