12
votes

How do I search within an array field?

I am using Solr 4.2 with default settings. I indexed a few HTML and PDF documents using SolrNet. Here is a sample result for one such document when I search for *:* in the admin UI:

<doc>
<str name="id">2</str>
<date name="last_modified">2011-12-19T17:33:25Z</date>
<str name="author">name</str>
<str name="author_s">name</str>
<arr name="title">
  <str>CALIFORNIA CODES</str>
</arr>
<arr name="content_type">
  <str>application/pdf</str>
</arr>
<str name="resourcename">T01041.pdf</str>
<arr name="content">
  <str> PDF text here </str>
</arr>
<long name="_version_">1431314431195742208</long>
</doc>

The search using content:* returns 0 results.

3 Answers

15
votes

Instead of content:*, try content:[* TO *]. That fetches all documents in which the content field is non-empty.

For querying arrays/multi-valued fields, it depends on what you want to do. If you have a multi-valued field like:

<arr name="tag_names">
    <str>death</str>
    <str>history</str>
    <str>people</str>
    <str>historical figures</str>
    <str>assassinations</str>
</arr>

and you want to find documents that have both death and history in tag_names, issue a query like

q=tag_names:(death AND history)

To do an OR, use

q=tag_names:(death OR history)
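
If you are building these queries from code rather than typing them into the admin UI, the query string has to be URL-encoded. Here is a minimal sketch in Python, assuming a Solr instance at a hypothetical URL (the host, port, and core name below are placeholders, and `build_select_url` is just an illustrative helper, not part of Solr or SolrNet):

```python
from urllib.parse import urlencode

# Hypothetical base URL: adjust host, port, and core name for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_select_url(field, terms, operator="AND", base_url=SOLR_SELECT_URL):
    """Build a select URL that matches documents whose multi-valued field
    contains all (AND) or any (OR) of the given terms."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # Quote each term so a multi-word value such as "historical figures"
    # is searched as one phrase instead of two separate words.
    quoted = []
    for term in terms:
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"%s"' % escaped)
    query = "%s:(%s)" % (field, (" %s " % operator).join(quoted))
    # urlencode percent-encodes the colon, parentheses, quotes, and spaces.
    return base_url + "?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_select_url("tag_names", ["death", "history"], "AND"))
    print(build_select_url("tag_names", ["death", "history"], "OR"))
```

The terms are wrapped in double quotes, which differs slightly from the bare form above but keeps multi-word tags intact.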

3
votes

The answer to your question is very simple.

Your schema.xml file declares the field name="content" with indexed="false", i.e. your content field is not searchable. So any search against "content" will return 0 results.

Change your schema.xml file and set indexed="true" on the content field, which makes the field searchable.

Save the file.
Restart Solr.
Clear the index.
Reindex the documents.

Now you will be able to search on content:*
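
For the "clear the index" step, one common approach is to post a delete-by-query for *:* to Solr's update handler with commit=true. A rough sketch in Python, assuming a hypothetical local URL and core name (adjust both for your setup); it only builds the request and does not send it:

```python
from urllib.request import Request

# Hypothetical update URL: adjust host, port, and core name for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def build_clear_index_request(update_url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request that deletes every document."""
    # Delete-by-query matching all documents.
    body = b"<delete><query>*:*</query></delete>"
    return Request(
        update_url + "?commit=true",  # commit so the deletion becomes visible
        data=body,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_clear_index_request()
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending it with urllib.request.urlopen(request) removes ALL documents,
    # so only do that right before reindexing.
```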

Please accept the answer if it resolves your problem...

-1
votes

text:* works. It returns all my docs.

I got this from the schema:

<!-- Main body of document extracted by SolrCell.
     NOTE: This field is not indexed by default, since it is also copied to "text"
     using copyField below. This is to save space. Use this field for returning and
     highlighting document content. Use the "text" field to search the content. -->
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>

<!-- catchall field, containing all other searchable text fields (implemented
     via copyField further on in this schema) -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
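
To see at a glance which fields can be queried directly, you could scan the schema for the indexed attribute. A small sketch in Python using the two field definitions above wrapped in a made-up <fields> root (the `searchable_fields` helper is hypothetical, and treating a missing indexed attribute as "not indexed" is a simplification, since in a real schema the default comes from the field type):

```python
import xml.etree.ElementTree as ET

# The two field definitions quoted above, wrapped in a root element
# so the fragment parses on its own.
SCHEMA_FRAGMENT = """
<fields>
  <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
  <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
</fields>
"""


def searchable_fields(schema_xml):
    """Map each field name to True if it is explicitly marked indexed="true"."""
    root = ET.fromstring(schema_xml.strip())
    result = {}
    for field in root.iter("field"):
        # Simplification: a missing attribute counts as not indexed here.
        result[field.get("name")] = field.get("indexed") == "true"
    return result


if __name__ == "__main__":
    for name, indexed in searchable_fields(SCHEMA_FRAGMENT).items():
        print(name, "searchable" if indexed else "not searchable")
```

With the default schema this reports content as not searchable and text as searchable, which matches content:* returning nothing while text:* returns the documents.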