
Today I was putting together a regex for a fail2ban filter, and I noticed that fail2ban seems to have issues with nested ORs in regex patterns.

Input string: 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a

Pattern: ^<HOST> -.*\"(c|b)|a

Here's an example:

$ fail2ban-regex "127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] \"a" '^<HOST> -.*\"(c|b)|a'

Running tests
=============

Use   failregex line : ^<HOST> -.*\"(c|b)|a
Use      single line : 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a


Results
=======

Failregex: 0 total

Ignoreregex: 0 total

Date template hits:
|- [# of hits] date format
|  [1] Day/MONTH/Year:Hour:Minute:Second
`-

Lines: 1 lines, 0 ignored, 0 matched, 1 missed
|- Missed line(s):
|  127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a
`-

I've noticed that this succeeds and reports a match if the end of the pattern is written as a|(c|b) instead. However, I need to be able to check both sides of the first OR: if the first condition matches (for example, an HTTP request type that is not POST or GET), the rest of the pattern should be ignored; otherwise the remaining pattern after the first OR should be evaluated. Grouping doesn't seem to matter either, as fail2ban only ever seems to match on the first branch of the outermost OR.

Here we get a match:

$ fail2ban-regex "127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] \"a" '^<HOST> -.*\"a|(c|b)'

Running tests
=============

Use   failregex line : ^<HOST> -.*\"a|(c|b)
Use      single line : 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a


Results
=======

Failregex: 1 total
|-  #) [# of hits] regular expression
|   1) [1] ^<HOST> -.*\"a|(c|b)
`-

Ignoreregex: 0 total

Date template hits:
|- [# of hits] date format
|  [1] Day/MONTH/Year:Hour:Minute:Second
`-

Lines: 1 lines, 0 ignored, 1 matched, 0 missed

I suspect this may be a bug because testers such as regex101.com and debuggex.com report a match for both of these patterns.

Your escaping on \"a" looks incomplete or incorrect; what is the purpose of that? - l'L'l
This is a bash string. For example, I couldn't run echo """; it would have to be echo "\"". - cellsheet
Maybe add the original (input) string to your question; that might help. - l'L'l
String added (:. For now I've split the filter into two so I could go ahead and implement it, but I'm still curious whether this is a bug in the regex engine fail2ban uses. - cellsheet
Normally I don't think you need to put the regex in quotes for fail2ban; anyway, another question: why the capture group (c|b) with a outside of it? Why not just use [abc]? - l'L'l

1 Answer


This regex:

^<HOST> -.*\"(c|b)|a

...is the same as this:

a|^<HOST> -.*\"(c|b)

The only difference is the order in which the alternatives are tried. Alternation has the lowest precedence in a regex, so the top-level pipe splits the whole pattern into two independent regexes: ^<HOST> -.*\"(c|b) and a. If the regex were all that mattered, the line would match either way, because the bare a matches the final character of the line. However, a quick look at the fail2ban docs tells me every failregex must match the host name/IP associated with the request, and the second of those two regexes doesn't contain <HOST>.
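
To see this outside fail2ban, here is a minimal sketch in plain Python. It assumes a simplified stand-in for the <HOST> tag: the HOST expression and the probe helper are hypothetical and are not fail2ban's actual expansion or API. It only shows that the first pattern matches through the bare a branch without capturing any host.

```python
import re

# Assumption: simplified stand-in for fail2ban's <HOST> tag. The real
# expansion is more elaborate; this only captures a dotted IPv4 address.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

line = '127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a'

def probe(pattern):
    """Return (matched, captured host or None) for a fail2ban-style pattern."""
    m = re.search(pattern.replace("<HOST>", HOST), line)
    return (m is not None, m.group("host") if m else None)

# Top-level pipe: the match comes from the bare `a` branch, so no host.
first = probe(r'^<HOST> -.*\"(c|b)|a')
# Here the left branch (which contains <HOST>) matches the line.
second = probe(r'^<HOST> -.*\"a|(c|b)')

print(first)
print(second)
```

In both cases the regex engine finds a match, which is consistent with what online testers report; only the second one yields a host that could be banned.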

I'm not sure what you're trying to accomplish, but if you can't do it by putting the pipe inside parens ((a|b|c)), you probably need to use two separate regexes.
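
Both options can be sketched in plain Python under the same assumption of a simplified stand-in for <HOST>. The first_host helper is hypothetical; it tries a list of patterns in order, which only approximates having several failregex entries in one filter.

```python
import re

# Assumption: simplified stand-in for fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

def first_host(patterns, line):
    """Try each pattern in order; return the first captured host, if any."""
    for pattern in patterns:
        m = re.search(pattern.replace("<HOST>", HOST), line)
        if m and m.group("host"):
            return m.group("host")
    return None

line = '127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a'

# Option 1: keep the pipe inside the parentheses.
grouped = first_host([r'^<HOST> -.*\"(a|b|c)'], line)
# Option 2: two separate regexes, each with its own <HOST>.
split = first_host([r'^<HOST> -.*\"(c|b)', r'^<HOST> -.*\"a'], line)

print(grouped)
print(split)
```

Either way, every alternative that can match also captures the host.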