How to visit a link inside an email using capybara

Question

I am new to Cucumber with Capybara. The application I need to test has the following flow: after a form is submitted, an email is sent to the user containing a link to another app. To access that app, the user has to open the email and click the link, which redirects to the app. I don't have access to the email account. Is there any way to extract that link and continue with the flow? Please suggest a possible way to do it.

Regards, Abhisek Das

thisisbrians · Accepted Answer · 2014-08-12T14:47:28

In your test, use whatever means you need in order to trigger the sending of the email by your application. Once the email is sent, use a regular expression to pull the URL out of the link in the email body (note that match returns only the first hit, so this is reliable only for an email that contains a single link), and then visit the path from that URL with Capybara to continue with your test:

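path_regex = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/

# the last email delivered (ActionMailer collects deliveries in this array when delivery_method is :test)
email = ActionMailer::Base.deliveries.last
# capture group 1 holds the path portion of the first quoted URL in the body
path = email.body.match(path_regex)[1]
visit(path)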

Regular expression explained

A regular expression (regex) itself is demarcated by forward slashes, and this regex in particular consists of three groups, each demarcated by pairs of parentheses. The first and third groups both begin with ?:, indicating that they are non-capturing groups, while the second is a capturing group (no ?:). I will explain the significance of this distinction below.

The first group, (?:"https?\:\/\/.*?), is a:

non-capturing group, ?:
that matches a single double quote, "
- we match a quote since we anticipate the URL to be in the href="..." attribute of a link tag
followed by the string http
optionally followed by a lowercase s, s?
- the question mark makes the preceding match, in this case s, optional
followed by a colon and two forward slashes, \:\/\/
- note the backslashes, which are used to escape characters that otherwise have a special meaning in a regex
followed by a wildcard, .*?, which will match any character any number of times up until the next match in the regex is reached
- the period, or wildcard, matches any character
- the asterisk, *, repeats the preceding match zero or more times, with no upper limit; how far it actually goes depends on the match that follows it
- the question mark makes this a lazy match, meaning the wildcard will match as few characters as possible while still allowing the next match in the regex to be satisfied
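To see what the lazy modifier changes, here is a small sketch in plain Ruby. The HTML string is made up for illustration and is not taken from any real email; it just has two quoted attributes on one line so the difference shows up.

```ruby
# Hypothetical sample input: one link tag with two quoted attributes.
html = '<a href="http://www.example.com/welcome" class="button">Open</a>'

# Greedy: .* grabs as much as it can, so the match runs to the LAST double quote.
greedy = html[/".*"/]

# Lazy: .*? grabs as little as it can, so the match stops at the FIRST closing quote.
lazy = html[/".*?"/]

puts greedy  # "http://www.example.com/welcome" class="button"
puts lazy    # "http://www.example.com/welcome"
```

Without the question mark, the wildcard would run past the end of the href and swallow the other attributes as well, which is why both wildcards in the regex are lazy.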

The second group, (\/.*?), is a capturing group that:

matches a single forward slash, \/
- this will match the first forward slash after the host portion of the URL (e.g. the slash at the end of http://www.example.com/) since the slashes in http:// were already matched by the first group
followed by another lazy wildcard, .*?

The third group, (?:"), is:

another non-capturing group, ?:
that matches a single double quote, "

And thus, our second group will match the portion of the URL starting with the forward slash after the host and going up to, but not including, the double quote at the end of our href="...".
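If you want to try the regex outside of a test run, something like the following should work in plain Ruby. The helper name first_link_path and the sample body are my own assumptions for illustration; in a real test the string would come from the delivered email instead.

```ruby
PATH_REGEX = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/

# Hypothetical helper: returns the path of the first quoted http(s) URL
# found in the given HTML string, or nil when there is no such URL.
def first_link_path(html)
  match = html.match(PATH_REGEX)
  match && match[1]
end

# Made-up email body containing a single link.
sample_body = '<p>Hello,</p><a href="https://www.example.com/invitations/abc123?ref=email">Open the app</a>'

puts first_link_path(sample_body)             # /invitations/abc123?ref=email
puts first_link_path('no links here').inspect # nil
```

Returning nil instead of indexing straight into the result avoids a NoMethodError when the email turns out to contain no link, so the test can fail with a clearer message.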

When we call the match method using our regex, it returns an instance of MatchData, which behaves much like an array. The element at index 0 is a string containing the entire matched string (from all of the groups in the regex), while elements at subsequent indices contain only the portions of the string matched by the regex's capturing groups (only our second group, in this case). Thus, to get the corresponding match of our second group—which is the path we want to visit using Capybara—we grab the element at index 1.
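A quick way to check the indices for yourself, again with a made-up string standing in for the email body:

```ruby
path_regex = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/

# Hypothetical sample input, not from a real email.
html = '<a href="http://www.example.com/confirm/42">Confirm</a>'

m = html.match(path_regex)

puts m[0]               # "http://www.example.com/confirm/42" (whole match, quotes included)
puts m[1]               # /confirm/42 (only the capturing group)
puts m.captures.inspect # ["/confirm/42"]
```

The non-capturing groups still take part in the match, which is why their text shows up at index 0, but they get no index of their own.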
