How to make PrintStream recognize carriage returns (unicode) in a String?

Question

I'm trying to use PrintStream to write a single String that contains line breaks to a file. In this case I think the line break is a carriage return (CR), '\u000D'. I don't know in advance where the line breaks will fall, so I have to put them into the String myself instead of having PrintStream add them.

Here is where I add the carriage return to the string (the variable line):

if(useNLTranslator && !isNumber(section))
    line = nlt.translate(line) + System.getProperty("line.separator");
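
A side note on the snippet above: line.separator is not necessarily a lone carriage return. On Windows it is normally CR followed by LF, and on Linux and macOS it is LF alone. The following minimal sketch prints the code points of whatever separator the current JVM uses. The class and method names are made up for illustration and are not part of the original code.

```java
class SeparatorInspector {

    // Formats each char of s as a code point, e.g. "U+000D U+000A".
    static String describe(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // System.lineSeparator() returns the same value as
        // System.getProperty("line.separator") at JVM startup.
        System.out.println("line.separator = " + describe(System.lineSeparator()));
        System.out.println("CR only        = " + describe("\r"));
    }
}
```

If the first line of output shows more than one code point, a search for a single CR character and a search for line.separator are looking for different things.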

Here is where I print the string to the text file using PrintStream:

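// needs java.io.File, java.io.FileNotFoundException, java.io.PrintStream
// result is the same as the line string above once it's all put together
File file = new File(answer);
try (PrintStream print = new PrintStream(file))   // closed and flushed automatically
{
    print.println(result);
}
catch (FileNotFoundException e)
{
    e.printStackTrace();
}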
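
For what it's worth, PrintStream itself does not alter line breaks that are already inside the String: print() writes them as they are, and println() only appends the platform separator at the end. A minimal sketch that checks this against an in-memory stream rather than a file (PrintStreamLineBreaks and written are hypothetical names used only here):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class PrintStreamLineBreaks {

    // Writes text through a PrintStream and returns exactly what was written.
    static String written(String text, boolean usePrintln) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream print = new PrintStream(buffer, true, "UTF-8")) {
            if (usePrintln) {
                print.println(text);   // text followed by the platform line separator
            } else {
                print.print(text);     // text only, embedded line breaks untouched
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should not happen.
            throw new IllegalStateException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "first" + "\r" + "second";
        System.out.println(written(text, false).equals(text));
        System.out.println(written(text, true).equals(text + System.lineSeparator()));
    }
}
```

So if a line break goes missing or is not recognized later, the cause is more likely in how the String was built or searched than in PrintStream.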

I'm also scanning the String for carriage return characters and replacing them with something else. I won't go into the reason, as it would be a very long explanation. I'm using the following to get the sequence to search for:

String cr = System.getProperty("line.separator");
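
Text that originated in another program may use a line break that differs from the current platform's line.separator, in which case searching for that one value can miss it. A minimal sketch, assuming the goal is simply to replace every CR LF pair, lone CR, or lone LF with some other text (LineBreakReplacer and replaceLineBreaks are hypothetical names, not part of the original program):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineBreakReplacer {

    // CR LF is listed first so that a Windows line break counts as one match.
    private static final Pattern LINE_BREAK = Pattern.compile("\r\n|\r|\n");

    // Replaces every line break in text with the given replacement string.
    static String replaceLineBreaks(String text, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return LINE_BREAK.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String mixed = "one\r\ntwo\rthree\nfour";
        System.out.println(replaceLineBreaks(mixed, " | "));
    }
}
```

Matching all three forms explicitly avoids depending on which operating system the program happens to run on.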

The problem is that the search does not recognize the carriage return in the text. The text is taken fairly directly from a Microsoft Word document, which might be part of the issue. Here is the code that catches the case where a character is not recognized:

//@@DEBUG -- KEEP THIS
String charValue = Character.toString(text.charAt(index));

try{
    current[i] = alphaBits[Character.getNumericValue(text.charAt(index)) - 10][i];
}catch(ArrayIndexOutOfBoundsException e){

    //@@DEBUG -- KEEP THIS
    System.out.println("Unrecognized character: " + charValue);
    Character whatIsThis = charValue.charAt(0);
    String name = Character.getName(whatIsThis.charValue());
    System.out.println("Unrecognized character name: " + name);
    System.out.print("You may want to consider adding this character");
    System.out.println(" to the list of recognized characters");

    return "Unrecognized character found.";
}
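
To see why a carriage return lands in the catch block above: Character.getNumericValue returns 10 through 35 for the letters a through z (upper or lower case) and -1 for characters with no numeric value, such as CR. Subtracting 10 therefore gives a negative index for CR. A minimal sketch, assuming alphaBits has one row per letter (the class and method names here are invented for illustration):

```java
class NumericValueDemo {

    // Mirrors the index expression from the question:
    // Character.getNumericValue(c) - 10
    static int alphaIndex(char c) {
        return Character.getNumericValue(c) - 10;
    }

    // True if the index would fall inside a table with the given number of rows.
    static boolean isRecognized(char c, int rows) {
        int index = alphaIndex(c);
        return index >= 0 && index < rows;
    }

    public static void main(String[] args) {
        char[] samples = {'a', 'z', 'A', '7', '\r', '\n'};
        for (char c : samples) {
            // Character.getName gives a readable name even for control characters.
            System.out.println(Character.getName(c)
                    + " -> index " + alphaIndex(c)
                    + ", recognized: " + isRecognized(c, 26));
        }
    }
}
```

Checking the index before using it, as isRecognized does, is one alternative to relying on ArrayIndexOutOfBoundsException for control flow.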

(1) How are you getting your text from Microsoft Word? (2) are you looking for line breaks or "hard returns"/paragraph marks? Word doesn't mark up line breaks resulting from word wrapping internally, so there would be no detectable character codes in that case in a .doc or .docx (I think). — user1379931
1) I guess I should have said that I actually am taking what's in a Microsoft Word file and converting it to a text file (.txt). 2) I was hoping to catch hard returns and soft returns, I didn't think about soft returns for this though. Is there a way to catch soft returns? — user3499639

user3499639 · Accepted Answer · 2014-05-18T19:24:01

I just figured out the issue I was having. It would have been hard for anyone else to figure out, since I didn't explain what the translate() method does. Oops.

if(useNLTranslator && !isNumber(section))
    line = nlt.translate(line) + nlt.translate(System.getProperty("line.separator"));

Before, I wasn't translating the carriage return / line separator, so it wasn't recognized later: the rest of the line had been passed through translate(), but the separator was still in the untranslated format. Thanks for helping me out with this problem though!
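
The general pattern can be sketched as follows. The real translate() method is not shown in this post, so the encoding below (four hex digits per character) is purely a hypothetical stand-in, as are the class and method names. The point is only that anything appended after translation must go through the same translation, or later code that expects translated text will trip over it.

```java
class TranslatorSketch {

    // Hypothetical stand-in for nlt.translate(): each char becomes 4 hex digits.
    static String translate(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            out.append(String.format("%04x", (int) s.charAt(i)));
        }
        return out.toString();
    }

    // Hypothetical later step: accepts only characters of the translated alphabet.
    static boolean isFullyTranslated(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;   // found a raw, untranslated character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String separator = System.lineSeparator();
        String before = translate("some line") + separator;             // raw separator
        String after  = translate("some line") + translate(separator);  // translated too
        System.out.println("raw separator accepted:        " + isFullyTranslated(before));
        System.out.println("translated separator accepted: " + isFullyTranslated(after));
    }
}
```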
