My website lets users enter search terms that contain accented characters. Because users come from various countries and operating systems, the accented characters they submit may be encoded in windows-1252, iso-8859-1, or another iso-8859-X or windows-125X charset.
I am using Perl, and my index server is Solr 8, with all data stored in UTF-8. If the source charset is known, I can convert the input with decode + encode, but how can I convert accented input of unknown encoding to UTF-8? How can I detect the charset of the input in Perl?
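use Encode qw(decode encode);

# $input holds the raw bytes of the search term; here they are assumed to be cp1252.
# Decode the bytes to Perl characters, then re-encode them as strict UTF-8.
my $utf8 = encode("UTF-8", decode("cp1252", $input));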
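For reference, one common fallback can be sketched as follows: try a strict UTF-8 decode first, and only if that fails treat the bytes as a single-byte charset. This is a minimal sketch, not a real detector. The helper name to_utf8_bytes is hypothetical, and the choice of cp1252 as the fallback is an assumption; the sketch cannot tell the iso-8859-X and windows-125X families apart, because nearly any byte sequence is valid in each of them.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Hypothetical helper: take raw bytes of unknown encoding, return UTF-8 bytes.
sub to_utf8_bytes {
    my ($bytes) = @_;

    # Strict UTF-8 decode; FB_CROAK makes decode() die on malformed input,
    # and LEAVE_SRC keeps $bytes intact for the fallback below.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };

    # Assumption: input that is not valid UTF-8 is treated as cp1252.
    $text = decode('cp1252', $bytes) unless defined $text;

    return encode('UTF-8', $text);
}
```

Input that is already valid UTF-8 passes through unchanged, while a lone cp1252 byte such as 0xE9 is re-encoded as the two-byte UTF-8 sequence for the same character. The core Encode::Guess module offers guess_encoding() with a list of suspect encodings, though it may not be able to choose between several single-byte charsets when the input is valid in more than one of them.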