I have the following Grok patterns defined in a pattern file:
HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
EMAILLOCALPART [a-zA-Z][a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILLOCALPART}@%{HOSTNAME}
For some reason this doesn't compile when run against http://grokdebug.herokuapp.com/ with the following input; it simply returns "Compile error":
Node1\Spam.log.2016-05-03 171 1540699703 03/May/2016 00:00:01 +0000 INFO [http-bio-0.0.0.0-8001-exec-20429] EngagementServiceImpl logDefault 192.168.1.122 77777777777777777 [email protected] > initiated Stuff: 8675309, provider: 8675309, member: 8675309
Is there a reason I'm getting a compile error, and will this even match the email in that log line?
Thanks,
Try `(?<email>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)`. No idea why that fails, but `+-=` is definitely a wrong pattern: the `-` should not be a range operator (it must be escaped or put at the end of the char class). Also, you need no `\b` in the resulting regex, since `@` is a non-word char and `[0-9A-Za-z]` matches a word char. – Wiktor Stribiżew

`(?<email>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)` (and `(?<email>[\w.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:[.](?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)`) works at grokdebug.herokuapp.com. BTW, github.com/rgevaert/grok-patterns/blob/master/grok.d/… defines the email pattern differently: `EMAILADDRESS %{EMAILADDRESSPART:local}@%{EMAILADDRESSPART:remote}` – Wiktor Stribiżew
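
The character-class point can be checked outside Grok. The following is a minimal sketch in Python, not Grok itself. It assumes Python's `re` treats these character classes the same way as the engine behind Grok, which should hold for these simple ASCII classes but is untested here. Python spells the named group `(?P<email>...)` rather than `(?<email>...)`. The address in the question's log line is redacted, so `jane.doe@example.com` and the shortened `SAMPLE` line are hypothetical stand-ins.

```python
import re

# Original local-part class: "+-=" is parsed as a range from '+' (0x2B) to '=' (0x3D).
ORIGINAL_CLASS = re.compile(r"[a-zA-Z0-9_.+-=:]")
# Suggested class: the hyphen comes last, so it is a literal '-'.
FIXED_CLASS = re.compile(r"[a-zA-Z0-9_.+=:-]")


def accepted(char_class, candidates):
    """Return the candidate characters that the class matches."""
    return [c for c in candidates if char_class.fullmatch(c)]


# Characters that sit between '+' and '=' in ASCII and were probably never intended.
suspects = [",", "/", ";", "<"]
range_leak = accepted(ORIGINAL_CLASS, suspects)
fixed_leak = accepted(FIXED_CLASS, suspects)

# Python spelling of the suggested pattern: (?P<email>...) instead of (?<email>...).
EMAIL = re.compile(
    r"(?P<email>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}"
    r"(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)"
)

# Hypothetical log line: the real address is redacted in the question,
# so jane.doe@example.com is a stand-in.
SAMPLE = (
    "03/May/2016 00:00:01 +0000 INFO [http-bio-0.0.0.0-8001-exec-20429] "
    "EngagementServiceImpl logDefault 192.168.1.122 77777777777777777 "
    "jane.doe@example.com > initiated Stuff: 8675309"
)

match = EMAIL.search(SAMPLE)
email = match.group("email") if match else None

if __name__ == "__main__":
    print(range_leak)
    print(fixed_leak)
    print(email)
```

With the original class, all four suspect characters are accepted. With the hyphen moved to the end, none are, and the search still picks the address out of the sample line without any `\b` anchors.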