In your test, trigger the email however your application normally sends it. Once the email is sent, use a regular expression to extract the URL of the link in the email body (note that this picks up only the first matching link, so it is reliable only for an email that contains a single link), and then visit the path from that URL with Capybara to continue with your test:
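# Capture the path of the first double-quoted http(s) URL in the body
path_regex = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/
# Most recent email; deliveries is populated when the delivery method is :test
email = ActionMailer::Base.deliveries.last
path = email.body.match(path_regex)[1]
visit(path)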
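If the email contains no matching link, match returns nil and the [1] lookup raises a NoMethodError that says little about the real problem. A minimal sketch of a guard is shown below. The helper name extract_path, its error message, and the sample markup are all hypothetical and only for illustration. It takes a plain string so that it can run outside Rails.

```ruby
# Hypothetical helper (not part of Capybara or ActionMailer): returns the
# path of the first double-quoted http(s) URL, or raises a descriptive error.
def extract_path(body)
  path_regex = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/
  match = body.to_s.match(path_regex)
  raise ArgumentError, "no quoted http(s) link found in email body" if match.nil?

  match[1]
end

# Stand-in for the delivered email's body; the URL and token are made up.
sample_body = '<p>Welcome! <a href="https://www.example.com/confirm?token=abc123">Confirm</a></p>'
puts extract_path(sample_body)
```

In a real test you would presumably pass the delivered email's body to the helper and hand the result to visit.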
Regular expression explained
A regular expression (regex) itself is demarcated by forward slashes, and this regex in particular consists of three groups, each demarcated by pairs of parentheses. The first and third groups both begin with ?:, indicating that they are non-capturing groups, while the second is a capturing group (no ?:). I will explain the significance of this distinction below.
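As a quick illustration of the capturing versus non-capturing distinction, here is a small sketch on a toy string. The date string and variable names are made up and unrelated to the email regex. Both patterns match the same text, but only groups without ?: show up in the captures.

```ruby
date = "2024-05"  # toy input, chosen only for illustration

# Two capturing groups: both the year and the month are captured.
both_captured = date.match(/(\d{4})-(\d{2})/)

# The first group is non-capturing (?:...), so only the month is captured.
month_only = date.match(/(?:\d{4})-(\d{2})/)

puts both_captured.captures.inspect
puts month_only.captures.inspect
```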
The first group, (?:"https?\:\/\/.*?), is a:
- non-capturing group, ?:
- that matches a single double quote, "
  - we match a quote since we anticipate the URL to be in the href="..." attribute of a link tag
- followed by the string http
- optionally followed by a lowercase s, s?
  - the question mark makes the preceding match, in this case s, optional
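- followed by a colon and two forward slashes, \:\/\/
  - note the backslashes: the forward slashes must be escaped because an unescaped slash would end the regex literal, while the colon has no special meaning in a regex, so escaping it is harmless but unnecessary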
- followed by a wildcard, .*?, which will match any character any number of times up until the next match in the regex is reached
  - the period, or wildcard, matches any character except a newline
  - the asterisk, *, repeats the preceding match zero or more times, depending on the successive match that follows
  - the question mark makes this a lazy match, meaning the wildcard will match as few characters as possible while still allowing the next match in the regex to be satisfied
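The effect of the optional s and of the lazy wildcard can be seen in isolation with the sketch below. The sample strings are invented for the demonstration. The greedy pattern runs to the last double quote in the string, while the lazy one stops at the first.

```ruby
# s? makes the "s" optional, so both schemes match.
scheme = /https?:\/\//
http_ok = "http://example.com".match?(scheme)
https_ok = "https://example.com".match?(scheme)

# Made-up markup with two quoted attribute values.
html = '<a href="/one">1</a> <a href="/two">2</a>'

greedy = html[/".*"/]   # .* grabs as much as possible
lazy = html[/".*?"/]    # .*? grabs as little as possible

puts greedy
puts lazy
```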
The second group, (\/.*?), is a capturing group that:
- matches a single forward slash, \/
  - this will match the first forward slash after the host portion of the URL (e.g. the slash at the end of http://www.example.com/) since the slashes in http:// were already matched by the first group
- followed by another lazy wildcard, .*?
The third group, (?:"), is:
- another non-capturing group, ?:
- that matches a single double quote, "
And thus, our second group will match the portion of the URL starting with the forward slash after the host and going up to, but not including, the double quote at the end of our href="...".
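To see what the second group ends up capturing, the sketch below runs the regex against a few link tags. The hosts, paths, and token are hypothetical and stand in for whatever your mailer generates.

```ruby
# Same regex as in the test above.
path_regex = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/

# Hypothetical link markup; hosts, paths, and tokens are made up.
links = [
  '<a href="http://www.example.com/">Home</a>',
  '<a href="https://www.example.com/users/confirmation?token=abc123">Confirm</a>',
  '<a href="http://localhost:3000/password/edit">Reset</a>'
]

# Index 1 of each match is the text captured by the second group.
paths = links.map { |html| html.match(path_regex)[1] }
puts paths
```

In each case the scheme, host, and optional port are consumed by the first group, and the captured path includes any query string but not the closing quote.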
When we call the match method using our regex, it returns an instance of MatchData (or nil when nothing matches), which behaves much like an array. The element at index 0 is a string containing the entire matched string (from all of the groups in the regex), while elements at subsequent indices contain only the portions of the string matched by the regex's capturing groups (only our second group, in this case). Thus, to get the corresponding match of our second group, which is the path we want to visit using Capybara, we grab the element at index 1.
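The array-like behavior of MatchData can be checked with a short sketch. The markup below is an invented example, and a plain string stands in for the email body.

```ruby
path_regex = /(?:"https?\:\/\/.*?)(\/.*?)(?:")/
html = '<a href="http://www.example.com/welcome">Welcome</a>'  # made-up markup

m = html.match(path_regex)

whole = m[0]          # entire match, including both double quotes
path = m[1]           # only the text matched by the capturing group
as_array = m.to_a     # [whole, path]
captures = m.captures # capturing groups only, without the whole match

puts whole
puts path
```

Note that index 0 still contains the surrounding quotes and the host, because the non-capturing groups take part in the overall match even though they get no index of their own.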