
Thanks for your help in advance.

I am using the Azure Search .NET SDK to build an indexer. I am currently also using a custom analyzer.

Before using the custom analyzer, I was using the EnLucene analyzer, which allowed me to use the wildcard operator *. For example, I used it to let users search with a trailing wildcard: if a user searches for "app", it returns results such as "apple", "application", and "approach". Please do not suggest autocomplete or suggestions, because a suggester cannot be used together with a custom analyzer. I do not want to create an additional 20 search fields just for the suggester (one for the suggester and one for search).

Below is my custom analyzer. It does not allow me to use * for partial matches. I am not looking for an NGram solution for prefix or suffix partial matching; I would like to use the wildcard *. What can I do to enable wildcard search?

var definition = new Index()
{
    Name = indexName,
    Fields = mapFields,
    Analyzers = new[]
    {
        new CustomAnalyzer
        {
            Name = "custom_analyzer",
            Tokenizer = TokenizerName.Whitespace,
            TokenFilters = new[]
            {
                TokenFilterName.AsciiFolding,
                TokenFilterName.Lowercase,
                TokenFilterName.Phonetic
            }
        }
    }
};

1 Answer


Here is how you can do that:

  • Add your custom analyzer to the index definition, as below:

{
  "name":"names",
  "fields":[
    { "name":"id", "type":"Edm.String", "key":true, "searchable":false },
    { "name":"name", "type":"Edm.String", "analyzer":"my_standard" }
  ],
  "analyzers":[
    {
      "name":"my_standard",
      "@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer":"standard",
      "tokenFilters":[ "lowercase", "asciifolding" ]
    }
  ]
}

// The snippet below creates a custom analyzer definition in C#
new CustomAnalyzer
{
    Name = "custom_analyzer",
    Tokenizer = TokenizerName.Standard,
    TokenFilters = new[]
    {
        TokenFilterName.Lowercase,
        TokenFilterName.AsciiFolding,
        TokenFilterName.Phonetic
    }
}

  • Then reference the custom analyzer when creating the document definition, as below:

    [IsSearchable, IsFilterable, IsSortable, Analyzer("custom_analyzer")]
    public string Property { get; set; }
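
To show how the two steps above fit together, here is a minimal sketch. It assumes a version of the Microsoft.Azure.Search SDK that ships FieldBuilder and the field attributes. The Product class, its property names, and the IndexDefinitions helper are hypothetical names used only for illustration. The analyzer name in the attribute has to match the name registered on the index.

```csharp
using System.ComponentModel.DataAnnotations;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

// Hypothetical document model; class and property names are illustrative only.
public class Product
{
    [Key]
    public string Id { get; set; }

    // Refers to the analyzer by the name it is registered under on the index.
    [IsSearchable, IsFilterable, IsSortable, Analyzer("custom_analyzer")]
    public string Name { get; set; }
}

// Hypothetical helper that assembles the index definition.
public static class IndexDefinitions
{
    public static Index Build(string indexName)
    {
        return new Index
        {
            Name = indexName,
            // FieldBuilder reads the attributes on Product and builds the field list.
            Fields = FieldBuilder.BuildForType<Product>(),
            Analyzers = new[]
            {
                new CustomAnalyzer
                {
                    Name = "custom_analyzer", // must match the Analyzer attribute above
                    Tokenizer = TokenizerName.Standard,
                    TokenFilters = new[]
                    {
                        TokenFilterName.Lowercase,
                        TokenFilterName.AsciiFolding,
                        TokenFilterName.Phonetic
                    }
                }
            }
        };
    }
}
```

The resulting Index can then be passed to Indexes.Create on the service client, in the same way as the test method further down does.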

Check this blog for further reference:

https://azure.microsoft.com/en-in/blog/custom-analyzers-in-azure-search/

Here is a sample test method for a custom analyzer:

        [Fact]
        public void CanSearchWithCustomAnalyzer()
        {
            Run(() =>
            {
                const string CustomAnalyzerName = "my_email_analyzer";
                const string CustomCharFilterName = "my_email_filter";

                Index index = new Index()
                {
                    Name = SearchTestUtilities.GenerateName(),
                    Fields = new[]
                    {
                        new Field("id", DataType.String) { IsKey = true },
                        new Field("message", (AnalyzerName)CustomAnalyzerName) { IsSearchable = true }
                    },
                    Analyzers = new[]
                    {
                        new CustomAnalyzer()
                        {
                            Name = CustomAnalyzerName,
                            Tokenizer = TokenizerName.Standard,
                            CharFilters = new[] { (CharFilterName)CustomCharFilterName }
                        }
                    },
                    CharFilters = new[] { new PatternReplaceCharFilter(CustomCharFilterName, "@", "_") }
                };

                Data.GetSearchServiceClient().Indexes.Create(index);

                SearchIndexClient indexClient = Data.GetSearchIndexClient(index.Name);

                var documents = new[]
                {
                    new Document() { { "id", "1" }, { "message", "My email is [email protected]." } },
                    new Document() { { "id", "2" }, { "message", "His email is [email protected]." } },
                };

                indexClient.Documents.Index(IndexBatch.Upload(documents));
                SearchTestUtilities.WaitForIndexing();

                DocumentSearchResult<Document> result = indexClient.Documents.Search("[email protected]");

                Assert.Equal("1", result.Results.Single().Document["id"]);
            });
        }

Feel free to tag me in your conversation. Hope it helps.