I have a set of documents (multi-line texts) that I would like to cluster using Carrot2. According to the XML file format specified in the documentation, the file has to contain a query, plus documents that each have a snippet, a URL, and a title.
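To make the layout concrete, here is a minimal sketch of how such a file could be built from plain texts. The element names (`searchresult`, `query`, `document`, `title`, `url`, `snippet`) are my reading of the documented format and should be checked against the documentation for the Carrot2 version in use. The function name `build_carrot2_xml`, the placeholder titles, and the `file:///documents/N` URLs are hypothetical and only there to fill fields I have no real values for.

```python
import xml.etree.ElementTree as ET


def build_carrot2_xml(texts, query):
    """Build a Carrot2-style XML string from a list of plain-text documents.

    Element names are assumed from the documented input format;
    titles and URLs are made-up placeholders.
    """
    root = ET.Element("searchresult")
    ET.SubElement(root, "query").text = query

    for i, text in enumerate(texts):
        doc = ET.SubElement(root, "document", id=str(i))
        # Placeholder title and URL, since the texts have neither.
        ET.SubElement(doc, "title").text = f"Document {i}"
        ET.SubElement(doc, "url").text = f"file:///documents/{i}"
        # Collapse line breaks so a multi-line text becomes one snippet.
        ET.SubElement(doc, "snippet").text = " ".join(text.split())

    # ElementTree escapes &, < and > in text content when serializing.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = ["first line\nsecond line", "another document"]
    print(build_carrot2_xml(sample, query="placeholder query"))
```

Building the file through an XML library rather than by string concatenation means special characters in the texts get escaped, which rules out one possible source of a malformed input file.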
My questions are the following:
- What should be written in the query component of the XML file?
- What should be given as the URL and title for the documents, since I have neither? I just have documents (multi-line texts) which I extracted from a dataset.
I think the answer to the first question is *:*. Is that correct?
Please help!
Edit:
The carrot2-workbench throws a java.lang.NullPointerException
after I specify the XML file and press Process.
I am confident that the error is caused by the XML file given as input.
Does anyone know what could be wrong with the XML that would cause the program to throw this exception?
I have not been able to figure this out for a long time.