
I want to search on the content field and return the content and the file name. The query below is taken from the NEST GitHub page.

Connection string:

var node = new Uri("http://localhost:9200");
var settings = new ConnectionSettings(node);
var client = new ElasticClient(settings);

My class:

The class used as the search type is below (I suspect the problem is here):

public class myclass
{
    public string Content { get; set; }
    public string filename { get; set; }
}

So I need only content and filename, where the filename lives in file.filename. My search returns null for file.filename, even though content is returned by the same query.

NEST API CALL:

var request = new SearchRequest
{
    From = 0,
    Size = 10,
    Query = new TermQuery { Name="Web", Field = "content", Value = "findthis" }
};

var response = client.Search<myclass>(request);
var twet = response.Documents.Select(t=>t.Content).ToList();

I am new to Elasticsearch, so I don't understand this. I don't even know why I am using a term query to search a document, when in Kibana I use different and quite understandable queries such as match and match_phrase. Please help me get file.filename.

Edit: I have tried to include this too (later removed):

Source = new SourceFilter { Includes = ("file.filename") }

KIBANA Call:

This is the call from the Kibana console:

GET /extract/_search
{
   "from" : 0, "size" : 1
    , "query": {
            "match": {
                        "content": "findthis"
                     }
               }
}

The call returns the following result; I have limited it to 1 hit to show here:

Document in the Elasticsearch index:

{
  "took": 322,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3330,
    "max_score": 4.693223,
    "hits": [
      {
        "_index": "extract",
        "_type": "doc",
        "_id": "29ebfd23bd7b276d7f3afc2bfad146d",
        "_score": 4.693223,
        "_source": {
          "content": """
                        my title
                        A looooong document text to findthis.
                    """,
          "meta": {
            "title": "my title",
            "raw": {
              "X-Parsed-By": "org.apache.tika.parser.DefaultParser",
              "Originator": "Microsoft Word 11",
              "dc:title": "my title",
              "Content-Encoding": "windows-1252",
              "Content-Type-Hint": "text/html; charset=windows-1252",
              "resourceName": "filename.htm",
              "ProgId": "Word.Document",
              "title": "my title",
              "Content-Type": "text/html; charset=windows-1252",
              "Generator": "Microsoft Word 11"
            }
          },
          "file": {
            "extension": "htm",
            "content_type": "text/html; charset=windows-1252",
            "last_modified": "2015-10-27T15:44:07.093+0000",
            "indexing_date": "2018-02-10T08:16:23.329+0000",
            "filesize": 32048,
            "filename": "filename.htm",
            "url": """file://D:\tmp\path\to\filename.htm"""
          },
          "path": {
            "root": "e1a38f7da342f641e3eefad1ed1ca0f2",
            "virtual": "/path/to/filename.htm",
            "real": """D:\tmp\path\to\filename.htm"""
          }
        }
      }
    ]
  }
}

I am using the NEST API to get the document's file.filename from Elasticsearch 6 on the same server.

ISSUE:

As mentioned above, the problem is that filename is returned as null by the NEST API, while content is returned.


SOLUTION 1: Using settings.DisableDirectStreaming(); I retrieved the JSON result and created the following classes:

New Class:

public class Rootobject
    {
        public int took { get; set; }
        public bool timed_out { get; set; }
        public _Shards _shards { get; set; }
        public Hits hits { get; set; }
    }

    public class _Shards
    {
        public int total { get; set; }
        public int successful { get; set; }
        public int skipped { get; set; }
        public int failed { get; set; }
    }

    public class Hits
    {
        public int total { get; set; }
        public float max_score { get; set; }
        public Hit[] hits { get; set; }
    }

    public class Hit
    {
        public string _index { get; set; }
        public string _type { get; set; }
        public string _id { get; set; }
        public float _score { get; set; }
        public _Source _source { get; set; }
    }

    public class _Source
    {
        public string content { get; set; }
        public Meta meta { get; set; }
        public File file { get; set; }
        public Path path { get; set; }
    }

    public class Meta
    {
        public string title { get; set; }
        public Raw raw { get; set; }
    }

    public class Raw
    {
        public string XParsedBy { get; set; }
        public string Originator { get; set; }
        public string dctitle { get; set; }
        public string ContentEncoding { get; set; }
        public string ContentTypeHint { get; set; }
        public string resourceName { get; set; }
        public string ProgId { get; set; }
        public string title { get; set; }
        public string ContentType { get; set; }
        public string Generator { get; set; }
    }

    public class File
    {
        public string extension { get; set; }
        public string content_type { get; set; }
        public DateTime last_modified { get; set; }
        public DateTime indexing_date { get; set; }
        public int filesize { get; set; }
        public string filename { get; set; }
        public string url { get; set; }
    }

    public class Path
    {
        public string root { get; set; }
        public string _virtual { get; set; }
        public string real { get; set; }
    }

Query: Instead of TermQuery I used MatchQuery. Here is my query; the connection string is as above:

 var request = new SearchRequest
 {
     From = 0,
     Size = 1,
     Query = new MatchQuery { Field = "content", Query = txtsearch.Text }
 };

New Problem: I have tried a lot. The response does contain the whole JSON result, but it is not being mapped properly.

I tried using the Rootobject, Hits and Hit classes, but results were only returned when I used _source as the type:

var response = client.Search<_Source>(request);
var twet = response.Documents.Select(t => t.file.filename).ToList();

Now I can retrieve content and file name, but if I try using the previous classes, Hits and hits.total are returned as null.

I tried following queries:

var twet = response.Documents.SelectMany(t => t.hits.hits.Select(k => k._source.content)).ToList();

and

var twet1 = response.Hits.SelectMany(t => t.Source.hits.hits.Select(k => k._source.content)).ToList();

and

var twet1 = response.Documents.Select(t => t.Filename.fileName).ToList();

and

var twet = response.HitsMetadata.Hits.Select(t => t.Source.filename).ToList();

using the Rootobject, Hits and Hit classes, even though the response does contain the data.

So how can I use the Rootobject class instead, so that I can get whatever I want?

Can you provide a complete, reproducible, succinct example? For example, showing the ConnectionSettings you're using, indexing a document, then the search you are performing? - Russ Cam
@RussCam I have added the connection string, an example document and the problem above. I have also formatted the document for easy reading. - abdul qayyum

1 Answer


The Elasticsearch server returns the response as a JSON string, and NEST then deserializes the _source of each hit into the class you specify.

In your case filename is a nested property inside the file property, so a top-level filename property on your class never gets a value. Your class has to mirror that nesting. For deserializing nested JSON properties in general, see "How to access nested object from JSON with Json.NET in C#".
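To see why the flat class comes back with a null filename, here is a minimal sketch that deserializes a trimmed copy of the _source from the question into a flat class and into a nested one. It uses System.Text.Json with case-insensitive matching purely for illustration; that is an assumption, not the serializer NEST uses internally. The class names FlatDoc, NestedDoc and FileDoc are hypothetical.

```csharp
using System;
using System.Text.Json;

// Flat shape, like the class in the question: there is no top-level
// "filename" key in _source, so this property is never populated.
public class FlatDoc
{
    public string Content { get; set; }
    public string Filename { get; set; }
}

// Nested shape that mirrors the JSON: "file" is an object with "filename" inside.
public class NestedDoc
{
    public string Content { get; set; }
    public FileDoc File { get; set; }
}

public class FileDoc
{
    public string Filename { get; set; }
}

public static class Demo
{
    // Trimmed copy of the _source shown in the question.
    public const string Source =
        @"{ ""content"": ""A looooong document text to findthis."",
            ""file"": { ""extension"": ""htm"", ""filename"": ""filename.htm"" } }";

    public static T Parse<T>(string json)
    {
        // Case-insensitive matching is an assumption made for this sketch.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<T>(json, options);
    }

    public static void Main()
    {
        var flat = Parse<FlatDoc>(Source);
        var nested = Parse<NestedDoc>(Source);

        Console.WriteLine(flat.Filename == null);   // True
        Console.WriteLine(nested.File.Filename);    // filename.htm
    }
}
```

Content is filled in both cases because it sits at the top level of _source. Only the nested shape picks up the file name.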

public class MyClass
{
    public string Content { get; set; }
    public FileClass File { get; set; }
}

public class FileClass
{
    public string Filename { get; set; }
}

You can then read the filename with response.Documents.Select(t => t.File.Filename).ToList();
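Putting it together with the request from the question, a minimal sketch could look like the following. It assumes NEST 6.x, the extract index from the Kibana call, and a cluster at http://localhost:9200; the source filter is optional and only trims the payload to the two fields of interest. There is no need for a Rootobject class: NEST already maps the response envelope, so the total and the per-hit metadata are available on the response object itself.

```csharp
using System;
using Nest;

public class MyClass
{
    public string Content { get; set; }
    public FileClass File { get; set; }
}

public class FileClass
{
    public string Filename { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var node = new Uri("http://localhost:9200");
        var settings = new ConnectionSettings(node);
        var client = new ElasticClient(settings);

        // Same object-initializer style as in the question, using a match query
        // like the Kibana call. "extract" is the index from that call.
        var request = new SearchRequest("extract")
        {
            From = 0,
            Size = 10,
            // Optional: return only the fields we read below.
            Source = new SourceFilter { Includes = new[] { "content", "file.filename" } },
            Query = new MatchQuery { Field = "content", Query = "findthis" }
        };

        var response = client.Search<MyClass>(request);

        // hits.total from the raw JSON.
        Console.WriteLine(response.Total);

        // Each hit carries _id and _score next to the deserialized _source.
        foreach (var hit in response.Hits)
        {
            Console.WriteLine($"{hit.Id} {hit.Score} {hit.Source.File?.Filename}");
        }
    }
}
```

response.Documents is a shortcut for the Source of each entry in response.Hits, which is why only a class shaped like _source works as the type argument.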