I searched Google but found nothing about implementing an analyzer in Xapian; Xapian may not even support analyzers the way Lucene does. In other words, I can't extend it to support Chinese. Am I right?
I searched the Xapian C++ API and only found TermGenerator, which seems to be related to extracting words. It has a flag named FLAG_CJK_NGRAM that splits UTF-8 CJK text into n-grams: for example, ABCD is split into AB, BC, CD and A, B, C, D. That is simple and straightforward, but I suppose I need a more accurate solution, so it seems I need to implement one myself or port a mature solution (like jieba) to Xapian. Am I right?
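
To make clear what I mean by the n-gram splitting, here is a small sketch in plain Python of the behaviour as I described it above. It is not Xapian's code and does not call Xapian at all; the function name `cjk_ngrams` is made up for illustration, and the real flag may differ in details such as term ordering or positions.

```python
def cjk_ngrams(text, n=2):
    """Split text into overlapping n-grams plus single characters.

    Illustrative only: mimics the splitting described above,
    not Xapian's actual implementation.
    """
    chars = list(text)  # one entry per Unicode code point
    # Overlapping n-grams: AB, BC, CD for "ABCD" with n=2
    ngrams = ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]
    # Followed by the individual characters: A, B, C, D
    return ngrams + chars


print(cjk_ngrams("ABCD"))
print(cjk_ngrams("中文分词"))
```

Every adjacent pair of characters becomes a term whether or not it is a real word, which is why I think a dictionary-based segmenter would be more accurate.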