0
votes

If this question has been asked and answered before, my apologies. I searched but couldn't find anything.

How can I use Linux grep / regex to find unknown characters in an email address? For example, let's say we had this list:

userone:[email protected]

usertwo:[email protected]

userthree:[email protected]

How could I grep the list to find emails matching ***@example.com? (The only email that should be found is [email protected].)

I'm aware that grep -e '...@example\.com' would work, but a period matches any character in grep, so this would also find :[email protected]. Also, most email addresses don't contain just any character; they are typically confined to letters, numbers, periods, and underscores (many email providers don't allow anything else).

I need to use something else besides a period symbol in grep, something like [a-Z0-9._] so that letters, numbers, periods, and underscores are included but nothing else. I'm unsure of how to go about this. Thanks

EDIT: What I've tried so far: grep -e '[a-zA-Z0-9_.]{3}@example.com' *. This doesn't work, so it comes down to just me getting the regex wrong.

2
Are you looking for a regular expression to match any upper or lowercase character, digit, period, or underscore? - alexanderbird
If you need to match a literal period, you can escape it with a backslash: \.. Your question seems to have to do with regex basics. I would suggest looking at an informational site such as regular-expressions.info or experimenting on regex101.com. - CAustin
Yes, but in this example only for 3 characters, so I can use something else instead of "..." - BotHam
Real email addresses can, in fact, contain any character; don't confuse that with what most places "typically" do. I commonly have + in email addresses. See regular-expressions.info/email.html - Stephen P
How about just grep '[email protected]' then? You do not need any regular expression. - Serge

2 Answers

1
votes

If the email addresses are always preceded by a username, which is then followed by a colon or a space and then the email address, you can use that knowledge to restrict your matches.

What does a username look like? You need to know if you're going to use it to find matches. Let's say for now it is letters, numbers, dash, and underscore, it always starts with a letter, and is from 2 to 12 characters long. We also know it's got a colon or space after it. The regex for that is

[A-Za-z][A-Za-z0-9_-]{1,11}[: ]

That would be followed by your email address which, it sounds like, is something you decide on and input because that's what you're looking for at the moment.

Your example of test*****@example.com would be

[A-Za-z][A-Za-z0-9_-]{1,11}[: ]test.\[email protected]

or, if exactly 5 chars after "test"

[A-Za-z][A-Za-z0-9_-]{1,11}[: ][email protected]

Your original sample ***@example.com is "any 3-char address at example.com" and would be

[A-Za-z][A-Za-z0-9_-]{1,11}[: ][email protected]
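
As a rough sketch of how such a pattern could be run: the `{1,11}` and `{3}` intervals need grep's extended mode (`-E`); in basic mode the braces would have to be written `\{3\}`. The sample file and its addresses below are made up for illustration, and the sketch swaps the "any character" dots for the character class `[A-Za-z0-9._]` that the question asks about. The `^` and `$` anchors assume one `username:email` entry per line.

```shell
# Hypothetical sample data; these addresses are invented for illustration.
sample=$(mktemp)
printf '%s\n' \
  'userone:abc@example.com' \
  'usertwo:abcd@example.com' \
  'userthree:a.c@example.org' > "$sample"

# -E turns on extended regex, so {1,11} and {3} are repetition counts.
# [A-Za-z0-9._]{3} allows exactly three letters, digits, dots, or underscores.
# \. matches a literal period in the domain.
matches=$(grep -E '^[A-Za-z][A-Za-z0-9_-]{1,11}[: ][A-Za-z0-9._]{3}@example\.com$' "$sample")
echo "$matches"

rm -f "$sample"
```

Only the `userone` line should be printed here: `usertwo` has four characters before the `@`, and `userthree` is at a different domain.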

It would be a pain to retype that prefix every time, so you'd want to wrap it in a script that uses prefix + what_i_typed as the pattern.
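
One possible shape for such a wrapper, as a minimal sketch: the function name `find_email` and its argument order are hypothetical, and it assumes each line holds exactly one `username` + separator + email entry, which is why the pattern is anchored at both ends.

```shell
# Hypothetical helper: prepends the username prefix to the typed email pattern.
# Usage: find_email EMAIL_PATTERN FILE...
find_email() {
  # First argument is the email part of the pattern (extended regex syntax).
  local email_pattern=$1
  shift
  # Username rule from the answer: a letter, then 1 to 11 more of
  # letters, digits, underscore, or dash, followed by a colon or space.
  local prefix='[A-Za-z][A-Za-z0-9_-]{1,11}[: ]'
  # -e passes the pattern explicitly; -- ends option parsing before file names.
  grep -E -e "^${prefix}${email_pattern}\$" -- "$@"
}
```

It could then be called as `find_email '[A-Za-z0-9._]{3}@example\.com' users.txt`, where `users.txt` stands in for whatever file holds the list.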

0
votes

Try this command, which I use to find anything in any file:

 grep -r -i '@example\.com' ./