I would like to capture the word unknown or, when a domain prefix (abcd\ or abcd.com\) is present, everything after that prefix.

unknown                     
abcd\svc-backup
abcd\swt034         
abcd\svc-app-login  
abcd.com\chi572 
abcd\daj144 
abcd\smi556
abcd\mki317
abcd\aiw014
abcd\joh488
abcd\ymc965 
abcd\jet041
abcd\rjo220 
abcd\mst790
abcd.com\sre590

The regex captures these values correctly on regex101:

https://regex101.com/r/c9vdia/2/

But when I use it in a Splunk search, it returns only the domain:

index="paloalto"
| table user 
| rex field=user "(?P<user_name>((?:abcd\([A-Za-z0-9-]+|\w+)))" 

For users with a domain I only get the domain name (abcd); users without a domain are extracted correctly.

What should happen to firstname.lastname? - revo
It's extracted as well. Only abcd and abcd.com have the issue. - spectrum
But you are not using the same regex: you are trying to match a ( by escaping it. See regex101.com/r/v08cz7/2. Try (?P<user_name>(?:abcd.*\\)([A-Za-z0-9-]+)|\w+.?) Note that if you want to match the dot literally you have to escape it. - The fourth bird
To get a slightly more precise match, you might try (?P<user_name>(?:abcd[^\\]*\\)([A-Za-z0-9]+(?:-[A-Za-z0-9]+)*)|\w+\.?) See regex101.com/r/oW65Fr/1 - The fourth bird
Thanks for your answer, but it is still extracted as abcd\ :( and not the actual user name. - spectrum
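
For what it's worth, the symptom can be reproduced outside Splunk. The sketch below uses Python's re module rather than Splunk's PCRE engine, so it only shows how the pattern itself behaves. Splunk's quoted search strings may need extra backslash escaping, which this does not model. The VARIANT pattern and the extract helper are hypothetical names for illustration and do not come from the thread.

```python
import re

# The pattern as posted: "\(" is an escaped literal parenthesis, so the first
# alternative only matches text such as "abcd(name", never "abcd\name".
AS_POSTED = re.compile(r"(?P<user_name>((?:abcd\([A-Za-z0-9-]+|\w+)))")

# Hypothetical variant (not from the thread): "\\" matches one literal
# backslash, and the optional domain prefix sits outside the named group.
VARIANT = re.compile(r"(?:abcd[^\\]*\\)?(?P<user_name>[A-Za-z0-9-]+)")


def extract(pattern, value):
    """Return the user_name group of the first match, or None."""
    match = pattern.search(value)
    return match.group("user_name") if match else None


for value in ["unknown", r"abcd\svc-backup", r"abcd.com\chi572"]:
    print(value, "|", extract(AS_POSTED, value), "|", extract(VARIANT, value))
```

With the posted pattern the first alternative never matches the sample values, so \w+ takes over and stops at the backslash, which leaves only abcd. The variant keeps the character class from the original pattern, so it would cut a name like firstname.lastname at the dot.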

1 Answer

This will do it, based on your sample data:

(?<username>[a-z0-9A-Z\-\_\.\w]+)(\s?)+$

There are probably more efficient formulations. If you don't have spaces at the end of the line, you can use this:

(?<username>[a-z0-9A-Z\-\_\.]+)$
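
To check the idea against the sample data without a Splunk instance, here is a minimal Python sketch. It assumes Python's (?P<name>...) group syntax in place of (?<name>...), and it shortens the character class to [\w.-], since \w already covers letters, digits and underscore. The name username_of is hypothetical, and Python's re is not the same engine as Splunk's PCRE, so treat it as a rough check only.

```python
import re

# Python spelling of the end-anchored idea: take the last run of name
# characters before the end of the value, allowing trailing whitespace.
# Assumption: [\w.-] stands in for the longer class used in the answer.
TAIL = re.compile(r"(?P<username>[\w.-]+)\s*$")


def username_of(value):
    """Return the trailing user name, or None when nothing matches."""
    match = TAIL.search(value)
    return match.group("username") if match else None


samples = ["unknown      ", "abcd\\svc-backup", "abcd.com\\chi572 ", "abcd\\daj144 "]
for value in samples:
    print(repr(value), "->", username_of(value))
```

Because the backslash is not in the character class, the match can only start after the last backslash, which drops the abcd\ or abcd.com\ prefix. A value such as firstname.lastname is kept whole, since the dot is allowed.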