95
votes

What are you using to validate users' email addresses, and why?

I had been using validates_email_veracity_of which actually queries the MX servers. But that is full of fail for various reasons, mostly related to network traffic and reliability.

I looked around and I couldn't find anything obvious that a lot of people are using to perform a sanity check on an email address. Is there a maintained, reasonably accurate plugin or gem for this?

P.S.: Please don't tell me to send an email with a link to see if the email works. I'm developing a "send to a friend" feature, so this isn't practical.

13
Here is a super-easy way, without dealing with regex: detecting-a-valid-email-addressZabba
Could you give a more detailed reason why querying MX server is fail? I would like to know so I can see if these are fixable.lulalala

13 Answers

106
votes

Don't make this harder than it needs to be. Your feature is non-critical; validation's just a basic sanity step to catch typos. I would do it with a simple regex, and not waste the CPU cycles on anything too complicated:

/\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+\z/

That was adapted from http://www.regular-expressions.info/email.html -- which you should read if you really want to know all the tradeoffs. If you want a more correct and much more complicated fully RFC822-compliant regex, that's on that page too. But the thing is this: you don't have to get it totally right.

If the address passes validation, you're going to send an email. If the email fails, you're going to get an error message. At which point you can tell the user "Sorry, your friend didn't receive that, would you like to try again?" or flag it for manual review, or just ignore it, or whatever.

These are the same options you'd have to deal with if the address did pass validation. Because even if your validation is perfect and you acquire absolute proof that the address exists, sending could still fail.

The cost of a false positive on validation is low. The benefit of better validation is also low. Validate generously, and worry about errors when they happen.

67
votes

With Rails 3.0 you can use a email validation without regexp using the Mail gem.

Here is my implementation (packaged as a gem).

12
votes

I created a gem for email validation in Rails 3. I'm kinda surprised that Rails doesn't include something like this by default.

http://github.com/balexand/email_validator

10
votes

This project seems to have the most watchers on github at the moment (for email validation in rails):

https://github.com/alexdunae/validates_email_format_of

7
votes

From the Rails 4 docs:

class EmailValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    unless value =~ /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
      record.errors[attribute] << (options[:message] || "is not an email")
    end
  end
end

class Person < ActiveRecord::Base
  validates :email, presence: true, email: true
end
5
votes

In Rails 4 simply add validates :email, email:true (assuming your field is called email) to your model and then write a simple (or complex†) EmailValidator to suit your needs.

eg: - your model:

class TestUser
  include Mongoid::Document
  field :email,     type: String
  validates :email, email: true
end

Your validator (goes in app/validators/email_validator.rb)

class EmailValidator < ActiveModel::EachValidator
  EMAIL_ADDRESS_QTEXT           = Regexp.new '[^\\x0d\\x22\\x5c\\x80-\\xff]', nil, 'n'
  EMAIL_ADDRESS_DTEXT           = Regexp.new '[^\\x0d\\x5b-\\x5d\\x80-\\xff]', nil, 'n'
  EMAIL_ADDRESS_ATOM            = Regexp.new '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+', nil, 'n'
  EMAIL_ADDRESS_QUOTED_PAIR     = Regexp.new '\\x5c[\\x00-\\x7f]', nil, 'n'
  EMAIL_ADDRESS_DOMAIN_LITERAL  = Regexp.new "\\x5b(?:#{EMAIL_ADDRESS_DTEXT}|#{EMAIL_ADDRESS_QUOTED_PAIR})*\\x5d", nil, 'n'
  EMAIL_ADDRESS_QUOTED_STRING   = Regexp.new "\\x22(?:#{EMAIL_ADDRESS_QTEXT}|#{EMAIL_ADDRESS_QUOTED_PAIR})*\\x22", nil, 'n'
  EMAIL_ADDRESS_DOMAIN_REF      = EMAIL_ADDRESS_ATOM
  EMAIL_ADDRESS_SUB_DOMAIN      = "(?:#{EMAIL_ADDRESS_DOMAIN_REF}|#{EMAIL_ADDRESS_DOMAIN_LITERAL})"
  EMAIL_ADDRESS_WORD            = "(?:#{EMAIL_ADDRESS_ATOM}|#{EMAIL_ADDRESS_QUOTED_STRING})"
  EMAIL_ADDRESS_DOMAIN          = "#{EMAIL_ADDRESS_SUB_DOMAIN}(?:\\x2e#{EMAIL_ADDRESS_SUB_DOMAIN})*"
  EMAIL_ADDRESS_LOCAL_PART      = "#{EMAIL_ADDRESS_WORD}(?:\\x2e#{EMAIL_ADDRESS_WORD})*"
  EMAIL_ADDRESS_SPEC            = "#{EMAIL_ADDRESS_LOCAL_PART}\\x40#{EMAIL_ADDRESS_DOMAIN}"
  EMAIL_ADDRESS_PATTERN         = Regexp.new "#{EMAIL_ADDRESS_SPEC}", nil, 'n'
  EMAIL_ADDRESS_EXACT_PATTERN   = Regexp.new "\\A#{EMAIL_ADDRESS_SPEC}\\z", nil, 'n'

  def validate_each(record, attribute, value)
    unless value =~ EMAIL_ADDRESS_EXACT_PATTERN
      record.errors[attribute] << (options[:message] || 'is not a valid email')
    end
  end
end

This will allow all sorts of valid emails, including tagged emails like "[email protected]" and so on.

To test this with rspec in your spec/validators/email_validator_spec.rb

require 'spec_helper'

describe "EmailValidator" do
  let(:validator) { EmailValidator.new({attributes: [:email]}) }
  let(:model) { double('model') }

  before :each do
    model.stub("errors").and_return([])
    model.errors.stub('[]').and_return({})  
    model.errors[].stub('<<')
  end

  context "given an invalid email address" do
    let(:invalid_email) { 'test test tes' }
    it "is rejected as invalid" do
      model.errors[].should_receive('<<')
      validator.validate_each(model, "email", invalid_email)
    end  
  end

  context "given a simple valid address" do
    let(:valid_simple_email) { '[email protected]' }
    it "is accepted as valid" do
      model.errors[].should_not_receive('<<')    
      validator.validate_each(model, "email", valid_simple_email)
    end
  end

  context "given a valid tagged address" do
    let(:valid_tagged_email) { '[email protected]' }
    it "is accepted as valid" do
      model.errors[].should_not_receive('<<')    
      validator.validate_each(model, "email", valid_tagged_email)
    end
  end
end

This is how I've done it anyway. YMMV

†Regular expressions are like violence; if they don't work you are not using enough of them.

4
votes

As Hallelujah suggests I think using the Mail gem is a good approach. However, I dislike some of the hoops there.

I use:

def self.is_valid?(email) 

  parser = Mail::RFC2822Parser.new
  parser.root = :addr_spec
  result = parser.parse(email)

  # Don't allow for a TLD by itself list (sam@localhost)
  # The Grammar is: (local_part "@" domain) / local_part ... discard latter
  result && 
     result.respond_to?(:domain) && 
     result.domain.dot_atom_text.elements.size > 1
end

You could be stricter by demanding that the TLDs (top level domains) are in this list, however you would be forced to update that list as new TLDs pop up (like the 2012 addition .mobi and .tel)

The advantage of hooking the parser direct is that the rules in Mail grammar are fairly wide for the portions the Mail gem uses, it is designed to allow it to parse an address like user<[email protected]> which is common for SMTP. By consuming it from the Mail::Address you are forced to do a bunch of extra checks.

Another note regarding the Mail gem, even though the class is called RFC2822, the grammar has some elements of RFC5322, for example this test.

4
votes

In Rails 3 it's possible to write a reusable validator, as this great post explains:

http://archives.ryandaigle.com/articles/2009/8/11/what-s-new-in-edge-rails-independent-model-validators

class EmailValidator < ActiveRecord::Validator   
  def validate()
    record.errors[:email] << "is not valid" unless
    record.email =~ /^([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i   
  end
end

and use it with validates_with:

class User < ActiveRecord::Base   
  validates_with EmailValidator
end
3
votes

Noting the other answers, the question still remains - why bother being clever about it?

The actual volume of edge cases that many regex may deny or miss seems problematic.

I think the question is 'what am I trying to acheive?', even if you 'validate' the email address, you're not actually validating that it is a working email address.

If you go for regexp, just check for the presence of @ on the client side.

As for the incorrect email scenario, have a 'message failed to send' branch to your code.

1
votes

There are basically 3 most common options:

  1. Regexp (there is no works-for-all e-mail address regexp, so roll your own)
  2. MX query (that is what you use)
  3. Generating an activation token and mailing it (restful_authentication way)

If you don't want to use both validates_email_veracity_of and token generation, I'd go with old school regexp checking.

1
votes

The Mail gem has a built in address parser.

begin
  Mail::Address.new(email)
  #valid
rescue Mail::Field::ParseError => e
  #invalid
end
1
votes

This solution is based on answers by @SFEley and @Alessandro DS, with a refactor, and usage clarification.

You can use this validator class in your model like so:

class MyModel < ActiveRecord::Base
  # ...
  validates :colum, :email => { :allow_nil => true, :message => 'O hai Mark!' }
  # ...
end

Given you have the following in your app/validators folder (Rails 3):

class EmailValidator < ActiveModel::EachValidator

  def validate_each(record, attribute, value)
    return options[:allow_nil] == true if value.nil?

    unless matches?(value)
      record.errors[attribute] << (options[:message] || 'must be a valid email address')
    end
  end

  def matches?(value)
    return false unless value

    if /\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+\z/.match(value).nil?
      false
    else
      true
    end

  end
end
1
votes

For Mailing Lists Validation. (I use Rails 4.1.6)

I got my regexp from here. It seems to be a very complete one, and it's been tested against a great number of combinations. You can see the results on that page.

I slightly changed it to a Ruby regexp, and put it in my lib/validators/email_list_validator.rb

Here's the code:

require 'mail'

class EmailListValidator < ActiveModel::EachValidator

  # Regexp source: https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php
  EMAIL_VALIDATION_REGEXP   = Regexp.new('\A(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))\z', true)

  def validate_each(record, attribute, value)
    begin
      invalid_emails = Mail::AddressList.new(value).addresses.map do |mail_address|
        # check if domain is present and if it passes validation through the regex
        (mail_address.domain.present? && mail_address.address =~ EMAIL_VALIDATION_REGEXP) ? nil : mail_address.address
      end

      invalid_emails.uniq!
      invalid_emails.compact!
      record.errors.add(attribute, :invalid_emails, :emails => invalid_emails.to_sentence) if invalid_emails.present?
    rescue Mail::Field::ParseError => e

      # Parse error on email field.
      # exception attributes are:
      #   e.element : Kind of element that was wrong (in case of invalid addres it is Mail::AddressListParser)
      #   e.value: mail adresses passed to parser (string)
      #   e.reason: Description of the problem. A message that is not very user friendly
      if e.reason.include?('Expected one of')
        record.errors.add(attribute, :invalid_email_list_characters)
      else
        record.errors.add(attribute, :invalid_emails_generic)
      end
    end
  end

end

And I use it like this in the model:

validates :emails, :presence => true, :email_list => true

It will validate mailing lists like this one, with different separators and synthax:

mail_list = 'John Doe <[email protected]>, [email protected]; David G. <[email protected]>'

Before using this regexp, I used Devise.email_regexp, but that is a very simple regexp and didn't get all the cases I needed. Some emails bumped.

I tried other regexps from the web, but this one's got the best results till now. Hope it helps in your case.