The example on Solr's wiki page shows hierarchy nodes indexed with a depth prefix:
Doc#1: 0/NonFic, 1/NonFic/Law
Doc#2: 0/NonFic, 1/NonFic/Sci
Doc#3: 0/NonFic, 1/NonFic/Hist
How do I index my paths to achieve this? Do I have to split my paths manually, count the nodes, generate these terms myself, and store them in a multiValued field in Solr? Or can Solr's path hierarchy tokenizer be configured to add the depth prefixes itself?
For reference, I thought about generating the terms like this:
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.StringJoiner;
    import java.util.stream.Collectors;

    public class DocumentPathBuilder {

        private final List<String> nodes = new ArrayList<>();

        public static DocumentPathBuilder newInstance() {
            return new DocumentPathBuilder();
        }

        // Strips the separator from a node name, then upper-cases and trims it.
        public static String escapeText(String input) {
            if (input == null)
                throw new NullPointerException("Cannot escape null input!");
            // replace (not replaceAll) treats the separator literally, not as a regex
            return input.replace(ESearchDocumentPath.HIERARCHY_SEPERATOR, "").toUpperCase().trim();
        }

        public DocumentPathBuilder add(String node) {
            nodes.add(escapeText(node));
            return this;
        }

        public DocumentPathBuilder add(Collection<String> nodes) {
            this.nodes.addAll(nodes.stream()
                    .map(DocumentPathBuilder::escapeText)
                    .collect(Collectors.toList()));
            return this;
        }

        // Emits one term per depth: "<depth><sep><node0><sep>...<nodeDepth><sep>"
        public List<String> build() {
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.size(); i++) {
                StringJoiner joiner = new StringJoiner(ESearchDocumentPath.HIERARCHY_SEPERATOR);
                joiner.add(String.valueOf(i));
                for (int j = 0; j <= i; j++) {
                    joiner.add(nodes.get(j));
                }
                result.add(joiner.toString() + ESearchDocumentPath.HIERARCHY_SEPERATOR);
            }
            return result;
        }
    }
Example input:
    List<String> build = DocumentPathBuilder.newInstance()
            .add("A")
            .add("350")
            .add(Arrays.asList("350-01", "FIGUTZRg"))
            .build();
Output entries:
0 = "0>A>"
1 = "1>A>350>"
2 = "2>A>350>350-01>"
3 = "3>A>350>350-01>FIGUTZRG>"
Also, what is the difference? If I store my generated values in a multiValued field, do I get the same result as if Solr had generated them with the path hierarchy tokenizer?
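To make the comparison concrete, here is a minimal sketch of the two term shapes as I understand them. It does not call Solr or Lucene; it only emulates in plain Java the ancestor-prefix tokens I believe the path hierarchy tokenizer emits (no depth number), next to the depth-prefixed terms from the wiki example. The class and method names (`PathTermsSketch`, `prefixTokens`, `depthPrefixedTerms`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class PathTermsSketch {

    // Emulation (assumption, not the Lucene implementation): one token per
    // ancestor path, without any depth prefix.
    public static List<String> prefixTokens(String path, String delimiter) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // Pattern.quote so the delimiter is split on literally, not as a regex
        for (String node : path.split(Pattern.quote(delimiter))) {
            if (current.length() > 0) {
                current.append(delimiter);
            }
            current.append(node);
            tokens.add(current.toString());
        }
        return tokens;
    }

    // The wiki-style terms: the same ancestor paths, each prefixed with its depth.
    public static List<String> depthPrefixedTerms(String path, String delimiter) {
        List<String> prefixes = prefixTokens(path, delimiter);
        List<String> terms = new ArrayList<>();
        for (int depth = 0; depth < prefixes.size(); depth++) {
            terms.add(depth + delimiter + prefixes.get(depth));
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(prefixTokens("NonFic/Law", "/"));       // [NonFic, NonFic/Law]
        System.out.println(depthPrefixedTerms("NonFic/Law", "/")); // [0/NonFic, 1/NonFic/Law]
    }
}
```

If this emulation is right, the only difference between the two outputs is the leading depth number, which is the part I do not know how to get from the tokenizer alone.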