45
votes

What are the rules for the escape character \ in string literals? Is there a list of all the characters that are escaped?

In particular, when I use \ in a string literal in gedit, and follow it by any three numbers, it colors them differently.

I was trying to create a std::string constructed from a literal with the character 0 followed by the null character (\0), followed by the character 0. However, the syntax highlighting alerted me that maybe this would create something like the character 0 followed by the null character (\00, aka \0), which is to say, only two characters.

For the solution to just this one problem, is this the best way to do it:

std::string ("0\0" "0", 3)  // String concatenation 

And is there some reference for what the escape character does in string literals in general? What is '\a', for instance?

6
Related, on how to escape an escape sequence. The best solution is to use concatenation as you had. - MPelletier
If you need a single ` just use \`. - MPelletier
It looks like I can also use the initializer list syntax: std::string { '0', 0, '0' }; - David Stone
Not only can I use the initializer list syntax, I now highly recommend it over any other method of constructing a string that requires you to specify a size or uses escaped characters. Consider the subtle undefined behavior outlined in stackoverflow.com/questions/164168/… - David Stone
I realize now my comment at 1:32 is completely obfuscated... I have no idea what I meant... - MPelletier

6 Answers

77
votes

Control characters:

(Hex codes assume an ASCII-compatible character encoding.)

  • \a = \x07 = alert (bell)
  • \b = \x08 = backspace
  • \t = \x09 = horizonal tab
  • \n = \x0A = newline (or line feed)
  • \v = \x0B = vertical tab
  • \f = \x0C = form feed
  • \r = \x0D = carriage return
  • \e = \x1B = escape (non-standard GCC extension)

Punctuation characters:

  • \" = quotation mark (backslash not required for '"')
  • \' = apostrophe (backslash not required for "'")
  • \? = question mark (used to avoid trigraphs)
  • \\ = backslash

Numeric character references:

  • \ + up to 3 octal digits
  • \x + any number of hex digits
  • \u + 4 hex digits (Unicode BMP, new in C++11)
  • \U + 8 hex digits (Unicode astral planes, new in C++11)

\0 = \00 = \000 = octal ecape for null character

If you do want an actual digit character after a \0, then yes, I recommend string concatenation. Note that the whitespace between the parts of the literal is optional, so you can write "\0""0".

4
votes

\0 will be interpreted as an octal escape sequence if it is followed by other digits, so \00 will be interpreted as a single character. (\0 is technically an octal escape sequence as well, at least in C).

The way you're doing it:

std::string ("0\0" "0", 3)  // String concatenation 

works because this version of the constructor takes a char array; if you try to just pass "0\0" "0" as a const char*, it will treat it as a C string and only copy everything up until the null character.

Here is a list of escape sequences.

4
votes

\a is the bell/alert character, which on some systems triggers a sound. \nnn, represents an arbitrary ASCII character in octal base. However, \0 is special in that it represents the null character no matter what.

To answer your original question, you could escape your '0' characters as well, as:

std::string ("\060\000\060", 3);

(since an ASCII '0' is 60 in octal)

The MSDN documentation has a pretty detailed article on this, as well cppreference

1
votes

I left something like this as a comment, but I feel it probably needs more visibility as none of the answers mention this method:

The method I now prefer for initializing a std::string with non-printing characters in general (and embedded null characters in particular) is to use the C++11 feature of initializer lists.

std::string const str({'\0', '6', '\a', 'H', '\t'});

I am not required to perform error-prone manual counting of the number of characters that I am using, so that if later on I want to insert a '\013' in the middle somewhere, I can and all of my code will still work. It also completely sidesteps any issues of using the wrong escape sequence by accident.

The only downside is all of those extra ' and , characters.

0
votes

With the magic of user-defined literals, we have yet another solution to this. C++14 added a std::string literal operator.

using namespace std::string_literals;
auto const x = "\0" "0"s;

Constructs a string of length 2, with a '\0' character (null) followed by a '0' character (the digit zero). I am not sure if it is more or less clear than the initializer_list<char> constructor approach, but it at least gets rid of the ' and , characters.

0
votes

ascii is a package on linux you could download. for example sudo apt-get install ascii ascii

Usage: ascii [-dxohv] [-t] [char-alias...]
-t = one-line output  -d = Decimal table  -o = octal table  -x = hex table
-h = This help screen -v = version information
Prints all aliases of an ASCII character. Args may be chars, C \-escapes,
English names, ^-escapes, ASCII mnemonics, or numerics in decimal/octal/hex.`

This code can help you with C/C++ escape codes like \x0A