I'm trying to search a dictionary field using the Search method in RavenDB 4. Strangely, if the search term is the word "in" or "it", I get random results back. I'm absolutely sure that none of the records contain those words. The same thing happens when I execute the equivalent Lucene query in the Studio. The search works as expected when I enter a valid search term such as the employee's name, number, etc.

I've managed to create this simple scenario based on the real one.

Here's the index:

public class Search : AbstractIndexCreationTask<Employee, Page>
{
    public Search()
    {
        Map = employees => from employee in employees
                           select new
                           {
                               Id = employee.Id,
                               Details = employee.Details
                           };

        Reduce = results => from result in results
                            group result by new
                            {
                                result.Id,
                                result.Details
                            }
                            into g
                            select new
                            {
                                g.Key.Id,
                                g.Key.Details
                            };

        Index("Details", FieldIndexing.Search);
    }
}

Employee class:

public class Employee 
{
    public string Id { get; set; }
    public Dictionary<string, object> Details { get; set; }
}

Adding employees:

var details = new Dictionary<string, object>();
details.Add("EmployeeNo", 25);
details.Add("FirstNames", "Yuri");
details.Add("Surname", "Cardoso");
details.Add("PositionCode", "XYZ");
details.Add("PositionTitle", "Developer");

var employee = new Employee
{
    Details = details
};

session.Store(employee);
session.SaveChanges();

Search method:

var searchTerm = "in";

var result = session
    .Query<Page, Search>()
    .Search(i => i.Details, $"EmployeeNo:({searchTerm})")
    .Search(i => i.Details, $"FirstNames:({searchTerm})", options: SearchOptions.Or)
    .Search(i => i.Details, $"Surname:({searchTerm})", options: SearchOptions.Or)
    .Search(i => i.Details, $"PositionCode:({searchTerm})", options: SearchOptions.Or)
    .Search(i => i.Details, $"PositionTitle:({searchTerm})", options: SearchOptions.Or)
    .ToList();

Generated query (RQL):

from index 'Search' where search(Details, "EmployeeNo:(it)") 
or search(Details, "FirstNames:(it)") 
or search(Details, "Surname:(it)") 
or search(Details, "PositionCode:(it)") 
or search(Details, "PositionTitle:(it)")

Any idea why random results are returned when those specific words are entered?

2 Answers

1
votes

The issue is stop words. Certain terms, such as "is", "it", "they", and "are", are so common that they are meaningless for full-text search, so the query analyzer removes them. See the discussion here: https://ravendb.net/docs/article-page/4.2/Csharp/indexes/using-analyzers
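To make the effect concrete, here is a toy sketch of what stop-word removal does to a search term. It is not RavenDB or Lucene code: StopWordDemo and Tokenize are hypothetical names, and the word list is, to the best of my knowledge, the default English stop-word set that Lucene's StandardAnalyzer ships with.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopWordDemo
{
    // Assumed to mirror Lucene's default English stop-word set.
    private static readonly HashSet<string> StopWords = new HashSet<string>(StringComparer.Ordinal)
    {
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
        "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
        "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"
    };

    // Toy stand-in for an analyzer: lower-case the text, split on anything
    // that is not a letter or digit, then drop the stop words.
    public static List<string> Tokenize(string text)
    {
        var normalized = new string(
            text.ToLowerInvariant()
                .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
                .ToArray());

        return normalized
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => !StopWords.Contains(token))
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(Tokenize("in").Count);                // prints 0
        Console.WriteLine(Tokenize("it").Count);                // prints 0
        Console.WriteLine(string.Join(",", Tokenize("Yuri")));  // prints yuri
    }
}
```

Both "in" and "it" analyze down to zero tokens, so the search clause ends up with nothing to match on, whereas a name like "Yuri" survives as a single term.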

You can use the WhitespaceAnalyzer instead of the StandardAnalyzer, since the former doesn't eliminate stop words.
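A minimal sketch of how that might look on the index from the question, assuming the Analyzers.Add call shown in the linked documentation page. The Reduce step is left out for brevity, and the Page class here is a guess at the shape the question implies. Keep in mind that WhitespaceAnalyzer only splits on whitespace and does not lower-case terms, so matching becomes case-sensitive.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Indexes;

public class Employee
{
    public string Id { get; set; }
    public Dictionary<string, object> Details { get; set; }
}

// Assumed shape of the index result type; the question does not show it.
public class Page
{
    public string Id { get; set; }
    public Dictionary<string, object> Details { get; set; }
}

public class Search : AbstractIndexCreationTask<Employee, Page>
{
    public Search()
    {
        Map = employees => from employee in employees
                           select new
                           {
                               Id = employee.Id,
                               Details = employee.Details
                           };

        // Keep full-text indexing on the field, as in the question.
        Index(x => x.Details, FieldIndexing.Search);

        // Swap the default analyzer for one that keeps stop words.
        Analyzers.Add(x => x.Details, "WhitespaceAnalyzer");
    }
}
```

The index has to be redeployed (for example with new Search().Execute(store)) before the analyzer change takes effect, and the existing documents will be re-indexed.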

0
votes

After getting help from the RavenDB group guys, we've managed to find a solution for my scenario.

Employee:

public class Employee
{
    public string Id { get; set; }
    public string DepartmentId { get; set; }
    public Dictionary<string, object> Details { get; set; }
}

Department:

public class Department
{
    public string Id { get; set; }
    public string Name { get; set; }
}

Page:

public class Page
{
    public string Id { get; set; }
    public string Department { get; set; }
    public Dictionary<string, object> Details { get; set; }
}

Index (with dynamic fields):

public class Search : AbstractIndexCreationTask<Employee, Page>
{
    public Search()
    {
        Map = employees => from employee in employees
                           let dept = LoadDocument<Department>(employee.DepartmentId)
                           select new
                           {
                               employee.Id,
                               Department = dept.Name,
                               _ = employee.Details.Select(x => CreateField(x.Key, x.Value))
                           };

        Store(x => x.Department, FieldStorage.Yes);
        Index(Constants.Documents.Indexing.Fields.AllFields, FieldIndexing.Search);
    }
}

Query:

using (var session = DocumentStoreHolder.Store.OpenAsyncSession())
{
    var searchTerm = "*yu* *dev*";

    var result = await session
        .Advanced
        .AsyncDocumentQuery<Page, Search>()
        .Search("Department", searchTerm)
        .Search("EmployeeNo", searchTerm)
        .Search("FirstNames", searchTerm)
        .Search("Surname", searchTerm)
        .Search("PositionCode", searchTerm)
        .Search("PositionTitle", searchTerm)
        .SelectFields<Page>()
        .ToListAsync();
}

Everything seems to be working fine this way, no more random results. Big thanks to Ayende and Egor.