I am using the HtmlCleaner library to parse and convert HTML files in Java.
It does not seem to handle Spanish characters such as 'ÁáÉéÍíÑñÓóÚúÜü' correctly.
Is there a property I can set in HtmlCleaner to handle this, or is there another solution? Here is the code I use to invoke it:
CleanerProperties props = new CleanerProperties();
props.setRecognizeUnicodeChars(true);
java.io.File file = new java.io.File("C:\\example.html");
TagNode tagNode = new HtmlCleaner(props).clean(file);
new PrettyHtmlSerializer(props).writeToFile(tagNode, filePath, "utf-8");
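One possible cause, which I have not confirmed, is a charset mismatch when the file is read: the output is written as "utf-8", but no charset is given when the input is parsed. The sketch below uses only the JDK and does not call HtmlCleaner at all. It just shows how the same bytes decode differently under two charsets. The class name `CharsetMismatchDemo` and the method `decode` are hypothetical, and it assumes the input file is UTF-8 encoded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Decode raw bytes with the given charset, as any parser must do
    // when it turns a file's bytes into text. (Hypothetical helper.)
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 'ÁáÉéÍíÑñÓóÚúÜü' written with Unicode escapes so the source
        // file's own encoding cannot affect the result.
        String original = "\u00C1\u00E1\u00C9\u00E9\u00CD\u00ED\u00D1\u00F1"
                + "\u00D3\u00F3\u00DA\u00FA\u00DC\u00FC";

        // Assumption: the HTML file on disk is stored as UTF-8.
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the text.
        String matching = decode(utf8Bytes, StandardCharsets.UTF_8);
        // Decoding with a different single-byte charset garbles it.
        String mismatched = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decode matches:      " + matching.equals(original));
        System.out.println("ISO-8859-1 decode matches: " + mismatched.equals(original));
    }
}
```

If the garbled output looks like the mismatched case here, the charset used for reading the file would be the first thing to check.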
– choop