56
votes

There are lots of posts about regexes to match a potentially empty string, but I couldn't readily find any that provided a regex which matches only an empty string.

I know that ^ will match the beginning of any line and $ will match the end of any line as well as the end of the string. As such, /^$/ matches far more than the empty string, for example "\n", "foobar\n\n", etc.

I would have thought, though, that /\A\Z/ would match just the empty string, since \A matches the beginning of the string and \Z matches the end of the string. However, my testing shows that /\A\Z/ will also match "\n". Why is that?

11
There are many SO posts about regex to match an empty string, so at a cursory glance it seemed like it may be a duplicate. Consider changing your title to more specifically address your issue of ignoring line breaks. - Scott Solmer
That's a post about a regex which doesn't match the empty string with a set of answers as to why. I really tried and couldn't find a post about a regex which only matched an empty string, let alone one which dealt with that and the difference between \z and \Z. I don't want to clutter up SO. If you can find a question this is a dup of, I'll gladly delete this one. That said, I added emphasis to the word ONLY in this title. - Peter Alfvin
Remove the multiline flag and ^$ should work. - Clay Risser
@JamRisser I understand the interaction with multi-line mode. I should have been explicit, but I'm asking about a regex to match only an empty string in multiline mode. Note, in particular, the last paragraph. - Peter Alfvin

11 Answers

58
votes

I would use a negative lookahead for any character:

^(?![\s\S])

This can only match if the input is totally empty, because the character class will match any character, including any of the various newline characters.
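
As a rough check of that claim, here is a minimal sketch in Python's re module, assuming no MULTILINE flag is set. The names ONLY_EMPTY and is_empty are made up for illustration and are not part of the answer.

```python
import re

# The answer's pattern: start of input, then assert that no character follows.
# [\s\S] matches any character at all, including every kind of newline.
ONLY_EMPTY = re.compile(r"^(?![\s\S])")


def is_empty(text: str) -> bool:
    """Return True only when the pattern finds a match in `text`."""
    return ONLY_EMPTY.search(text) is not None


for sample in ["", "\n", " ", "foobar\n\n"]:
    print(repr(sample), is_empty(sample))
```

Only the empty string should report True here; a lone newline or space already fails the lookahead.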

46
votes

Wow, y'all are overthinking it. It's as simple as the following. Besides, many of those answers aren't understood by the RE2 dialect used by the RE2 C++ library and Go's regexp package.

^$

12
votes

As explained at http://www.regular-expressions.info/anchors.html under the section "Strings Ending with a Line Break", \Z will generally match before the final newline in strings that end in a newline, as well as at the very end of the string. If you want to match only the very end of the string, you need to use \z. The exception to this rule is Python, where \Z matches only at the very end of the string.

In other words, to exclusively match an empty string, you need to use /\A\z/.
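
The same distinction can be sketched in Python, with one caveat: in Python's re module it is `$` (without MULTILINE) that tolerates a single trailing newline, while Python's `\Z` behaves like `\z` in other engines. The variable names below are illustrative only.

```python
import re

# Python's `$` without re.MULTILINE matches at the very end of the string
# or just before a single trailing newline (like \Z in Perl, Ruby, or PCRE).
loose = re.compile(r"\A$")

# Python's `\Z` matches only at the very end (like \z in those engines).
strict = re.compile(r"\A\Z")

samples = ["", "\n", "foobar\n\n"]
loose_hits = [s for s in samples if loose.search(s)]
strict_hits = [s for s in samples if strict.search(s)]

print(loose_hits)   # ['', '\n']
print(strict_hits)  # ['']
```

The loose pattern accepts "\n" for the same reason /\A\Z/ does in the question: the end anchor is allowed to sit before the final newline.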

5
votes

^$ is a regex that accepts the empty string, and it won't match "\n" or "foobar\n" as you mentioned. You could test this regex on https://www.regextester.com/1924.

If you already have a regex, use alternation (|) to make it also match the empty string. For example: /^[A-Za-z0-9&._ ]+$|^$/
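
A possible Python rendering of the same idea is sketched below. It deliberately swaps the ^...$ anchors for re.fullmatch with an empty alternative, since Python's `$` would otherwise also accept a trailing newline. ALLOWED_OR_EMPTY and is_valid are hypothetical names.

```python
import re

# Either one or more allowed characters, or nothing at all.
# re.fullmatch requires the whole string to match, so no anchors are needed.
ALLOWED_OR_EMPTY = re.compile(r"[A-Za-z0-9&._ ]+|")


def is_valid(text: str) -> bool:
    """Accept the empty string or a string made only of allowed characters."""
    return ALLOWED_OR_EMPTY.fullmatch(text) is not None


for sample in ["", "AT&T Inc.", "\n", "foo!"]:
    print(repr(sample), is_valid(sample))
```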

4
votes

I believe Python is the only widely used language that doesn't support \z in this way (yet). There are Python bindings for Russ Cox / Google's super fast re2 C++ library that can be "dropped in" as a replacement for the bundled re.

There's an excellent discussion (with workarounds) for this at Perl Compatible Regular Expression (PCRE) in Python, here on SO.

python
Python 2.7.11 (default, Jan 16 2016, 01:14:05) 
[GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 on freebsd10
Type "help", "copyright", "credits" or "license" for more information.
>>> import re2 as re
>>> 
>>> re.match(r'\A\z', "")
<re2.Match object at 0x805d97170>

@tchrist's answer is worth the read.

2
votes

Try looking here: https://docs.python.org/2/library/re.html

I ran into the same problem you had, though. The best I could build was a regex that matches both the empty string and "\n". Try trimming the newline characters from the string, or replacing them with another character, first.

I was using http://pythex.org/ and trying weird regexes like these:

()

(?:)

^$

^(?:^\n){0}$

and so on.

2
votes

The answer may be language dependent, but since you don't mention one, here is what I just came up with in js:

 var a = ['1','','2','','3'].join('\n');

 console.log(a.match(/^.{0}$/gm)); // ["", ""]

 // the "." is for readability. it doesn't really matter
 a.match(/^[you can put whatever the hell you want and this will also work just the same]{0}$/gm)

You could also do a.match(/^(.{10,}|.{0})$/gm) to match empty lines OR lines that meet a criterion. (This is what I was looking for to end up here.)

I know that ^ will match the beginning of any line and $ will match the end of any line

This is only true if you have the multiline flag turned on, otherwise it will only match the beginning/end of the string. I'm assuming you know this and are implying that, but wanted to note it here for learners.
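
For learners, the effect of the flag can be sketched in Python on the same input as the JavaScript above. This assumes Python's re.MULTILINE corresponds to the m flag in JS; the variable names are illustrative.

```python
import re

# Same input as the JavaScript example: three values separated by empty lines.
text = "\n".join(["1", "", "2", "", "3"])

# Without MULTILINE, ^ only matches at the start of the whole string.
without_flag = re.findall(r"^$", text)

# With MULTILINE, ^ and $ also match at the start and end of each line.
with_flag = re.findall(r"^$", text, flags=re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['', '']
```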

0
votes

Based on the most-approved answer, here is yet another way:

var result = !/[\d\D]/.test(string);  // [\d\D] will match any character

0
votes

As @Bohemian and @mbomb007 mentioned before, this works AND has the additional advantage of being more readable:

console.log(/^(?!.)/s.test("")); //true

0
votes

Another possible answer, if an "empty" string may also contain whitespace characters such as spaces, tabs, or line breaks, is the following pattern.

pattern = r"^(\s*)$"

This pattern matches if the string consists entirely of zero or more whitespace characters.

It was tested in Python 3
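
A small sketch of how that pattern behaves, which also shows that it accepts more than the strictly empty string. The helper name is_blank is an assumption made for this example.

```python
import re

pattern = r"^(\s*)$"


def is_blank(text: str) -> bool:
    """True when `text` is empty or contains only whitespace characters."""
    return re.match(pattern, text) is not None


for sample in ["", " \t\n", "\n", " a "]:
    print(repr(sample), is_blank(sample))
```

Note that "\n" and " \t\n" are accepted here, so this pattern suits a "blank string" check, not the "only the empty string" check the question asks about.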

-1
votes

You are not asking about the empty string. A string in regex is not a grouping of letters, numbers, and punctuation. It is a grouping of ASCII characters. So a "\n" is not an empty string. It has an ASCII character "\n" in it. link