I have an array of search terms, and I want to use Lucene to essentially Ctrl+F through a stack of documents, finding and storing the location of every occurrence of each term. For example:
Terms: "A", "B", "C"
Doc1: "CREATION"
Doc2: "A BIG CAR"
Doc3: "DOUBLE TROUBLE"
If I query the letter "A", I want to be able to say that there are 3 "A"s:
- Doc1 at position 4
- Doc2 at position 1
- Doc2 at position 8
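To make the expected output concrete, here is a brute-force sketch in plain C# with no Lucene involved. The names `TermLocator` and `FindAll` are made up for illustration, the match is case-sensitive, and positions are 1-based character offsets to match the list above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Lucene: a naive scan that shows the desired result shape.
public static class TermLocator
{
    // Returns every (document id, 1-based character position) at which `term` occurs.
    public static List<(string DocId, int Position)> FindAll(
        IEnumerable<(string DocId, string Text)> docs, string term)
    {
        var hits = new List<(string DocId, int Position)>();
        if (string.IsNullOrEmpty(term))
        {
            return hits;
        }

        foreach (var (docId, text) in docs)
        {
            // Ordinal comparison: exact, case-sensitive character match.
            int index = text.IndexOf(term, StringComparison.Ordinal);
            while (index >= 0)
            {
                hits.Add((docId, index + 1)); // convert 0-based index to 1-based position
                index = text.IndexOf(term, index + 1, StringComparison.Ordinal);
            }
        }

        return hits;
    }
}
```

Calling `TermLocator.FindAll` with the three example documents and the term "A" yields (Doc1, 4), (Doc2, 1) and (Doc2, 8). I would like to get the same information out of the Lucene index instead of rescanning every document.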
That is the kind of result I'm after. How can I do this with Lucene? So far, I'm only indexing with a StandardAnalyzer, like so:
    public Analyzer _analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

    // for some directory defined here
    using (var indexWriter = new IndexWriter(directory, _analyzer, true, new IndexWriter.MaxFieldLength(IndexWriter.DEFAULT_MAX_FIELD_LENGTH)))
    {
        using (var textReader = new StreamReader(blobStream))
        {
            // read the whole blob, analyze it, and add it to the index as one document
            var text = await textReader.ReadToEndAsync();
            var document = new Document();
            // the text is not stored, but term vectors with positions and offsets are kept
            document.Add(new Field("Text", text, Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
            document.Add(new Field("DocId", docId.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            document.Add(new Field("FamilyId", familyId.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            indexWriter.AddDocument(document);
        }
    }
While indexing, Lucene initially creates a lot of files in the directory, but then deletes all of them except the .cfs file. How do I keep the other files so that I can run my queries?