I'm migrating from the Azure Search SDK Microsoft.Azure.Search (v10) to Azure.Search.Documents (v11).

With v10, we could create indexes with custom analyzers, tokenizers, token filters, and scoring profiles through the C# SDK, like this:

var index = new Microsoft.Azure.Search.Models.Index(
                name: GetIndexName(),
                defaultScoringProfile: defaultScoringProfile,
                fields: AzureQuestionItemDefinition.GetQuestionItemFieldsDefinition(),
                analyzers: new[] {
                    new CustomAnalyzer
                    {
                        Name = "standardAnalyzer",
                        Tokenizer = TokenizerName.Standard,
                        TokenFilters = new[]
                        {
                            TokenFilterName.Lowercase,
                            TokenFilterName.AsciiFolding,
                            TokenFilterName.Phonetic,
                        }
                    },
                    new CustomAnalyzer
                    {
                        Name = "prefixAnalyzer",
                        Tokenizer = TokenizerName.Standard,
                        TokenFilters = new[]
                        {
                            TokenFilterName.Lowercase,
                            TokenFilterName.AsciiFolding,
                            TokenFilterName.Phonetic,
                            "edgeNgramTokenFilter"
                        }
                    },
                },
                tokenFilters: new[]
                {
                    new EdgeNGramTokenFilterV2("edgeNgramTokenFilter", minGram: 2, maxGram: 10, EdgeNGramTokenFilterSide.Front),
                },
                scoringProfiles: new[]
                {
                    new ScoringProfile(defaultScoringProfile)
                    {
                        TextWeights = new TextWeights()
                        {
                            Weights = new Dictionary<string, double>() {
                                { nameof(QuestionItem.Text), 5.0 },
                                { nameof(QuestionItem.Context), 5.0 },
                                { $"{nameof(QuestionItem.Asker)}/{nameof(QuestionItem.Asker.Name)}", 3.0 },
                                { $"{nameof(QuestionItem.Answers)}/{nameof(AnswerItem.Text)}", 2.0 },
                                { $"{nameof(QuestionItem.Answers)}/{nameof(AnswerItem.AnswererName)}", 2.0 }
                            }
                        }
                    }
                });

While migrating to Azure.Search.Documents v11, I couldn't find a way to create my index the same way using the C# SDK.

The collection properties of SearchIndex (Analyzers, TokenFilters, Tokenizers, ScoringProfiles...) are read-only; they only expose a getter:

//
    // Summary:
    //     Represents a search index definition, which describes the fields and search behavior
    //     of an index.
    public class SearchIndex : IUtf8JsonSerializable
    {
        //
        // Summary:
        //     Initializes a new instance of the Azure.Search.Documents.Indexes.Models.SearchIndex
        //     class.
        //
        // Parameters:
        //   name:
        //     The name of the index.
        //
        // Exceptions:
        //   T:System.ArgumentException:
        //     name is an empty string.
        //
        //   T:System.ArgumentNullException:
        //     name is null.
        public SearchIndex(string name);
        //
        // Summary:
        //     Initializes a new instance of the Azure.Search.Documents.Indexes.Models.SearchIndex
        //     class.
        //
        // Parameters:
        //   name:
        //     The name of the index.
        //
        //   fields:
        //     Fields to add to the index.
        //
        // Exceptions:
        //   T:System.ArgumentException:
        //     name is an empty string.
        //
        //   T:System.ArgumentNullException:
        //     name or fields is null.
        public SearchIndex(string name, IEnumerable<SearchField> fields);

        //
        // Summary:
        //     The name of the scoring profile to use if none is specified in the query. If
        //     this property is not set and no scoring profile is specified in the query, then
        //     default scoring (tf-idf) will be used.
        public string DefaultScoringProfile { get; set; }
        //
        // Summary:
        //     Options to control Cross-Origin Resource Sharing (CORS) for the index.
        public CorsOptions CorsOptions { get; set; }
        //
        // Summary:
        //     A description of an encryption key that you create in Azure Key Vault. This key
        //     is used to provide an additional level of encryption-at-rest for your data when
        //     you want full assurance that no one, not even Microsoft, can decrypt your data
        //     in Azure Cognitive Search. Once you have encrypted your data, it will always
        //     remain encrypted. Azure Cognitive Search will ignore attempts to set this property
        //     to null. You can change this property as needed if you want to rotate your encryption
        //     key; Your data will be unaffected. Encryption with customer-managed keys is not
        //     available for free search services, and is only available for paid services created
        //     on or after January 1, 2019.
        public SearchResourceEncryptionKey EncryptionKey { get; set; }
        //
        // Summary:
        //     The type of similarity algorithm to be used when scoring and ranking the documents
        //     matching a search query. The similarity algorithm can only be defined at index
        //     creation time and cannot be modified on existing indexes. If null, the ClassicSimilarity
        //     algorithm is used.
        public SimilarityAlgorithm Similarity { get; set; }
        //
        // Summary:
        //     Gets the name of the index.
        [CodeGenMemberAttribute("name")]
        public string Name { get; }
        //
        // Summary:
        //     Gets the analyzers for the index.
        public IList<LexicalAnalyzer> Analyzers { get; }
        //
        // Summary:
        //     Gets the character filters for the index.
        public IList<CharFilter> CharFilters { get; }
        //
        // Summary:
        //     Gets or sets the fields in the index. Use Azure.Search.Documents.Indexes.FieldBuilder
        //     to define fields based on a model class, or Azure.Search.Documents.Indexes.Models.SimpleField,
        //     Azure.Search.Documents.Indexes.Models.SearchableField, and Azure.Search.Documents.Indexes.Models.ComplexField
        //     to manually define fields. Index fields have many constraints that are not validated
        //     with Azure.Search.Documents.Indexes.Models.SearchField until the index is created
        //     on the server.
        public IList<SearchField> Fields { get; set; }
        //
        // Summary:
        //     Gets the scoring profiles for the index.
        public IList<ScoringProfile> ScoringProfiles { get; }
        //
        // Summary:
        //     Gets the suggesters for the index.
        public IList<SearchSuggester> Suggesters { get; }
        //
        // Summary:
        //     Gets the token filters for the index.
        public IList<TokenFilter> TokenFilters { get; }
        //
        // Summary:
        //     Gets the tokenizers for the index.
        public IList<LexicalTokenizer> Tokenizers { get; }
        //
        // Summary:
        //     The Azure.ETag of the Azure.Search.Documents.Indexes.Models.SearchIndex.
        public ETag? ETag { get; set; }
    }

My question is: how do I set custom tokenizers, token filters, scoring profiles, and so on in v11?

1 Answer

Collection properties are initialized by default in the new Azure .NET client libraries. Although you can't set the properties, you can still call Add on each one:

var index = new SearchIndex("myindex");
index.ScoringProfiles.Add(new ScoringProfile(...));
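Applied to the index from the question, the same pattern might look roughly like the sketch below. The `IndexFactory` class, the `Build` method, and the `"defaultScoringProfile"` name are hypothetical, the field list is passed in by the caller, and the weights use plain string keys in place of the question's `nameof` expressions. The v11 type and member names used here (`CustomAnalyzer`, `LexicalTokenizerName`, `EdgeNGramTokenFilter`, `TextWeights`) are written from memory of the v11 package, so check them against the version you have installed.

```csharp
using System.Collections.Generic;
using Azure.Search.Documents.Indexes.Models;

public static class IndexFactory // hypothetical helper, not part of the SDK
{
    public static SearchIndex Build(string indexName, IEnumerable<SearchField> fields)
    {
        const string profileName = "defaultScoringProfile"; // assumed name

        // Settable properties can still go in an object initializer.
        var index = new SearchIndex(indexName, fields)
        {
            DefaultScoringProfile = profileName
        };

        // Getter-only collections are already initialized, so Add works.
        index.TokenFilters.Add(new EdgeNGramTokenFilter("edgeNgramTokenFilter")
        {
            MinGram = 2,
            MaxGram = 10,
            Side = EdgeNGramTokenFilterSide.Front
        });

        var standard = new CustomAnalyzer("standardAnalyzer", LexicalTokenizerName.Standard);
        standard.TokenFilters.Add(TokenFilterName.Lowercase);
        standard.TokenFilters.Add(TokenFilterName.AsciiFolding);
        standard.TokenFilters.Add(TokenFilterName.Phonetic);
        index.Analyzers.Add(standard);

        var prefix = new CustomAnalyzer("prefixAnalyzer", LexicalTokenizerName.Standard);
        prefix.TokenFilters.Add(TokenFilterName.Lowercase);
        prefix.TokenFilters.Add(TokenFilterName.AsciiFolding);
        prefix.TokenFilters.Add(TokenFilterName.Phonetic);
        // Refer to the custom token filter defined above by its name.
        prefix.TokenFilters.Add(new TokenFilterName("edgeNgramTokenFilter"));
        index.Analyzers.Add(prefix);

        index.ScoringProfiles.Add(new ScoringProfile(profileName)
        {
            TextWeights = new TextWeights(new Dictionary<string, double>
            {
                ["Text"] = 5.0,
                ["Context"] = 5.0,
                ["Asker/Name"] = 3.0,
                ["Answers/Text"] = 2.0,
                ["Answers/AnswererName"] = 2.0
            })
        });

        return index;
    }
}
```

The returned `SearchIndex` can then be passed to the index client's create call as usual.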

I personally find this less convenient since I like to write expression-based code, so I've already passed along this feedback to the Azure SDK team.
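One option that stays close to expression-based code is a C# collection initializer. The language lets you write `Property = { item1, item2 }` on a getter-only collection property, and the compiler turns that into `Add` calls on the existing collection. Here is a minimal sketch, assuming the same v11 model types as above; `ExpressionStyleIndex` and `Build` are hypothetical names:

```csharp
using System.Collections.Generic;
using Azure.Search.Documents.Indexes.Models;

public static class ExpressionStyleIndex // hypothetical helper
{
    public static SearchIndex Build(string indexName, IEnumerable<SearchField> fields) =>
        new SearchIndex(indexName, fields)
        {
            // No "new[]" here: the braces add items to the collection
            // that the SearchIndex constructor already created.
            TokenFilters =
            {
                new EdgeNGramTokenFilter("edgeNgramTokenFilter")
                {
                    MinGram = 2,
                    MaxGram = 10,
                    Side = EdgeNGramTokenFilterSide.Front
                }
            },
            Analyzers =
            {
                new CustomAnalyzer("prefixAnalyzer", LexicalTokenizerName.Standard)
                {
                    TokenFilters =
                    {
                        TokenFilterName.Lowercase,
                        TokenFilterName.AsciiFolding,
                        TokenFilterName.Phonetic,
                        new TokenFilterName("edgeNgramTokenFilter")
                    }
                }
            }
        };
}
```

The difference from v10 is that you write `Analyzers = { ... }` and not `Analyzers = new[] { ... }`; the second form tries to assign the property and does not compile against a getter-only member.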