I have an Elasticsearch document class with these fields:
[Keyword]
public List<string> Tags { get; set; }

[Text]
public string Title { get; set; }
Previously, I got the top tags across all documents with this code:
var Match = Driver.Search<Metadata>(_ => _
    .Query(Q => Q
        .Term(P => P.Category, (int)Category)
        && Q.Term(P => P.Type, (int)Type))
    .FielddataFields(F => F.Fields(F1 => F1.Tags, F2 => F2.Title))
    .Aggregations(A => A
        .Terms("Tags", T => T
            .Field(F => F.Tags)
            .Size(Limit))));
But with Elasticsearch 5.1, I get a 400 error with this hint:
Fielddata is disabled on text fields by default. Set fielddata=true on [Tags] in order to load fielddata in memory by uninverting the inverted index.
The ES documentation on mapping parameters then tells you that "It usually doesn’t make sense to do so" and that you should "have a text field for full text searches, and an unanalyzed keyword field with doc_values enabled for aggregations".
But the only page I can find that says this is for 5.0; the same page for 5.1 doesn't seem to exist.
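As far as I can tell, the mapping that page has in mind looks something like the sketch below, sent as the body of the index-creation request. The type name `metadata` and the sub-field name `keyword` are placeholders I picked for illustration, not something taken from my real index:

```json
{
  "mappings": {
    "metadata": {
      "properties": {
        "Title": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword" }
          }
        }
      }
    }
  }
}
```

With a mapping like this, full-text queries would presumably go against `Title` and a terms aggregation against `Title.keyword`. That sub-field holds the whole value as a single term, though, which is what I want for Tags but not for per-word counts on the Title.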
Now, 5.1 has a page about the Terms Aggregation that seems to cover what I need, but I can't find anything for C# / NEST that I can use.
So I'm trying to figure out how, in C#, to get the top words across all documents from the Tags (where each tag is a single term; for example, "New York" is not split into "New" and "York") and from the Title (where each word is its own term).
Edit: there seems to be a deeper problem, so I wrote some test code that illustrates the issue.
Let's create a simple object:
public class MyObject
{
    [Keyword]
    public string Id { get; set; }

    [Text]
    public string Category { get; set; }

    [Text(Fielddata = true)]
    public string Keywords { get; set; }
}
Create the client, with "test" as the default index:
var Uri = new Uri(Constants.ELASTIC_CONNECTIONSTRING);
var Settings = new ConnectionSettings(Uri)
    .DefaultIndex("test")
    .DefaultFieldNameInferrer(_ => _)
    .InferMappingFor<MyObject>(_ => _.IdProperty(P => P.Id));
var D = new ElasticClient(Settings);
Fill the index with some test data:
for (var i = 0; i < 10; i++)
{
    var O = new MyObject
    {
        Id = i.ToString(),
        Category = (i % 2) == 0 ? "a" : "b",
        Keywords = (i % 3).ToString()
    };
    D.Index(O);
}
And run the query:
var m = D.Search<MyObject>(s => s
    .Query(q => q.Term(P => P.Category, "a"))
    .Source(f => f.Includes(si => si.Fields(ff => ff.Keywords)))
    .Aggregations(a => a
        .Terms("Keywords", t => t
            .Field(f => f.Keywords)
            .Size(Limit)
        )
    )
);
It fails the same way as before, with a 400 and:
Fielddata is disabled on text fields by default. Set fielddata=true on [Keywords] in order to load fielddata in memory by uninverting the inverted index.
Fielddata is already set to true on [Keywords], yet it keeps complaining about it.
So, let's get crazy and modify the class this way:
public class MyObject
{
    [Text(Fielddata = true)]
    public string Id { get; set; }

    [Text(Fielddata = true)]
    public string Category { get; set; }

    [Text(Fielddata = true)]
    public string Keywords { get; set; }
}
That way everything is a Text field and everything has Fielddata = true. Well, same result.
So either I'm not understanding something simple, or it's broken or undocumented :)