I'm executing a Windows command and need to parse its output, comparing part of the text with a string stored in my Java code.

But apparently a charset mismatch prevents equals from returning true.

This is my code:

    ProcessBuilder pb = new ProcessBuilder();
    pb.command("systeminfo");
    Process shell = pb.start();
    InputStream shellIn = shell.getInputStream();


    InputStreamReader reader = new InputStreamReader(shellIn, "Cp1252");
    BufferedReader br = new BufferedReader(reader);

    String sCurrentLine;
    while((sCurrentLine = br.readLine()) != null) {

        // ... omitting parse of sCurrentLine for brevity
        System.out.println("Windows String: " + sCurrentLine);
        System.out.println("JAVA String: " + "Versão");
        System.out.println("Versão".equals(sCurrentLine));
    }

My output in the command-line window is:

    Windows String: Versão
    JAVA String: VersÒo
    false

And when written to a text file:

    Windows String: VersÒo
    JAVA String: Versão
    false

I've found a couple of similar questions here on Stack Overflow, but none of the answers worked for me:

converting String from Windows charset to UTF 8 in Java

Converting from Windows 1252 to UTF8 in Java: null characters with CharsetDecoder/Encoder

How to parse a string that is in a different encoding from java

Setting the default Java character encoding?

How to Find the Default Charset/Encoding in Java?

How is your Java source code file encoded? UTF-8? Windows ANSI/CP1252? - FrankPl
According to stackoverflow.com/questions/1259084/…, you can use chcp to show the code page that a console window is using. On my computer that is not something like 1252, but 850 (an old DOS code page). Maybe you should use that in your InputStreamReader. - FrankPl
@Holger, that error came from another user's edit; anyway, this detail doesn't matter. - Heitor
@FrankPl this solved the issue. Thanks a lot! Now, how do we 'convert' a comment into an answer? - Heitor
I added the content as an answer. - FrankPl

1 Answer

The command line in most cases does not use a standard Windows code page, but an old DOS one. According to What encoding/code page is cmd.exe using?, you can run the command chcp on the command line to find out which one it uses in your environment. On my computer, this command shows 850. Thus, I would assume this is the code page in use, and hence you should pass it in your call to new InputStreamReader.

I am not sure, however, whether this applies to all versions of Windows in all locales. I have never used a Japanese, Arabic, Chinese, or Korean Windows.
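
To illustrate the effect, here is a minimal sketch that decodes the same bytes with two different charsets. It assumes the console uses code page 850 (as chcp reports on my machine), where ã is the single byte 0xC6. The class name ConsoleCharsetDemo and the helper firstLine are made up for this example, and the byte array stands in for the process output so the snippet runs without calling systeminfo. The expected string is written with a \u escape so the result does not depend on how the source file is encoded.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

final class ConsoleCharsetDemo {

    // Hypothetical helper: reads the first line of the raw bytes,
    // decoding them with the named charset (same pattern as the question's reader).
    static String firstLine(byte[] raw, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(raw), cs))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Versão" as a console on code page 850 would emit it: ã is byte 0xC6.
        // Assumption: the real bytes come from shell.getInputStream().
        byte[] raw = {'V', 'e', 'r', 's', (byte) 0xC6, 'o'};
        String expected = "Vers\u00e3o";

        // Decoding with the console's code page gives a matching string.
        System.out.println(expected.equals(firstLine(raw, "IBM850")));       // prints "true"

        // Decoding with Cp1252 turns 0xC6 into a different character.
        System.out.println(expected.equals(firstLine(raw, "windows-1252"))); // prints "false"
    }
}
```

In the code from the question, the equivalent change would be to pass the code page reported by chcp (for example "Cp850") to new InputStreamReader instead of "Cp1252".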