1
votes

The method appendReplacement(StringBuffer sb, String replacement) in Matcher seems to ignore the backslash I escaped with a double backslash. I want to join each pair of lines into one line, separated by a literal \N. Here's my code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;



public class Test {

    public static void main(String[] args) {
        String str = "Lorem ipsum dolor sit amet,\n"
                + "consectetur adipiscing elit\n"
                + "\n"
                + "sed do eiusmod tempor incididunt ut\n"
                + "labore et dolore magna aliqua\n";
        StringBuffer sb = new StringBuffer();
        Pattern p = Pattern.compile("(.+)\n(.+)\n");
        Matcher m = p.matcher(str);
        while(m.find()) {
            String xyz = m.group(1) + "\\N" + m.group(2);
            System.out.println(xyz);
            m.appendReplacement(sb, xyz);
        }                    
        m.appendTail(sb);
        System.out.println("\n" + sb);
    }

}  

OUTPUT:

Lorem ipsum dolor sit amet,\Nconsectetur adipiscing elit
sed do eiusmod tempor incididunt ut\Nlabore et dolore magna aliqua

Lorem ipsum dolor sit amet,Nconsectetur adipiscing elit
sed do eiusmod tempor incididunt utNlabore et dolore magna aliqua

1
Your replacement text does not contain a double backslash. It contains one backslash character. In a Java String literal, "\\" is one character. - VGR
String xyz = m.group(1) + "\\N" + m.group(2); - Jimmy
Yes, I saw that in your question. "\\N" is two characters. Regex patterns and replacement text have a syntax that is independent of Java String escapes. A literal backslash must appear as two U+005C REVERSE SOLIDUS characters in a regex pattern or replacement. In Java that would be "\\\\". - VGR

1 Answer

0
votes

From the Matcher documentation:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

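You are putting \N (one backslash followed by N) into your xyz String. appendReplacement does not treat the replacement as literal text: a backslash escapes the character that follows it, so \N is appended as a plain N. To get a literal \N through appendReplacement, the replacement itself must contain two backslashes, which means escaping twice in Java source: "\\\\N"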
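Writing "\\\\N" fixes the separator, but the captured groups are still interpreted as replacement syntax, so a line containing $ or \ could still misbehave. A minimal sketch of one way around that, assuming you want the whole joined string treated literally, is to pass it through Matcher.quoteReplacement. The class name ReplacementDemo and the helper joinPairs are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplacementDemo {

    // Joins each pair of consecutive non-empty lines with a literal \N.
    // (Hypothetical helper, not part of the original question.)
    static String joinPairs(String input) {
        Pattern p = Pattern.compile("(.+)\n(.+)\n");
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // "\\N" in Java source is a two-character string: backslash, N.
            String joined = m.group(1) + "\\N" + m.group(2);
            // quoteReplacement escapes every backslash and dollar sign,
            // so appendReplacement inserts the text exactly as given.
            m.appendReplacement(sb, Matcher.quoteReplacement(joined));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "Lorem ipsum dolor sit amet,\n"
                + "consectetur adipiscing elit\n"
                + "\n"
                + "sed do eiusmod tempor incididunt ut\n"
                + "labore et dolore magna aliqua\n";
        System.out.println(joinPairs(str));
    }
}
```

With the input from the question, each joined line should keep its \N separator. If you only need to fix the separator and know the lines never contain $ or \, replacing "\\N" with "\\\\N" in the original code is enough.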