I created a custom Lucene index in a Sitecore 8.1 environment like this:
<?xml version="1.0"?>
<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">
<sitecore>
<contentSearch>
<configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
<indexes hint="list:AddIndex">
<index id="Products_index" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
<param desc="name">$(id)</param>
<param desc="folder">$(id)</param>
<param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
<configuration ref="contentSearch/indexConfigurations/ProductIndexConfiguration" />
<strategies hint="list:AddStrategy">
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
</strategies>
<commitPolicyExecutor type="Sitecore.ContentSearch.CommitPolicyExecutor, Sitecore.ContentSearch">
<policies hint="list:AddCommitPolicy">
<policy type="Sitecore.ContentSearch.TimeIntervalCommitPolicy, Sitecore.ContentSearch" />
</policies>
</commitPolicyExecutor>
<locations hint="list:AddCrawler">
<crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
<Database>master</Database>
<Root>/sitecore/content/General/Product Repository</Root>
</crawler>
</locations>
<enableItemLanguageFallback>false</enableItemLanguageFallback>
<enableFieldLanguageFallback>false</enableFieldLanguageFallback>
</index>
</indexes>
</configuration>
<indexConfigurations>
<ProductIndexConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">
<initializeOnAdd>true</initializeOnAdd>
<analyzer ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/analyzer" />
<documentBuilderType>Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder, Sitecore.ContentSearch.LuceneProvider</documentBuilderType>
<fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">
<fieldNames hint="raw:AddFieldByFieldName">
<field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
<analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>
<field fieldName="key" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>
</fieldNames>
</fieldMap>
<documentOptions type="Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilderOptions, Sitecore.ContentSearch.LuceneProvider">
<indexAllFields>true</indexAllFields>
<include hint="list:AddIncludedTemplate">
<Product>{843B9598-318D-4AFA-B8C8-07E3DF5C6738}</Product>
</include>
</documentOptions>
<fieldReaders ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/fieldReaders"/>
<indexFieldStorageValueFormatter ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexFieldStorageValueFormatter"/>
<indexDocumentPropertyMapper ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexDocumentPropertyMapper"/>
</ProductIndexConfiguration>
</indexConfigurations>
</contentSearch>
</sitecore>
</configuration>
Actually a quite straightforward index that points to a root, includes a template and stores a field. It all works. But I need something extra:
- Limit the entries in the index to the latest version
- Limit the entries in the index to one language (English)
I know this can be easily triggered in the query as well, but I want to get my indexes smaller so they will (re)build faster.
The first one could be solved by indexing web instead of master, but I actually also want the unpublished ones in this case. I'm quite sure I will need a custom crawler for this.
The second one (limit the languages) is actually more important as I will also need this in other indexes. The easy answer is probably also to use a custom crawler but I was hoping there would be a way to configure this without custom code. Is it possible to configure an custom index to only include a defined set of languages (one or more) instead of all?