33
votes

I fairly frequently match strings against regular expressions. In Java:

java.util.regex.Pattern.compile("\\w+").matcher("this_is").matches()

Ouch. Scala has many alternatives.

  1. "\\w+".r.pattern.matcher("this_is").matches
  2. "this_is".matches("\\w+")
  3. "\\w+".r unapplySeq "this_is" isDefined
  4. val R = "\\w+".r; "this_is" match { case R() => true; case _ => false}

The first is just as heavy-weight as the Java code.

The problem with the second is that you can't supply a compiled pattern ("this_is".matches("\\w+".r) does not compile). (This seems to be an anti-pattern, since almost every time there is a method that takes a regex string to compile, there is also an overload that takes a compiled regex.)

The problem with the third is that it abuses unapplySeq and thus is cryptic.

The fourth is great when decomposing parts of a regular expression, but is too heavy-weight when you only want a boolean result.

Am I missing an easy way to check for matches against a regular expression? Is there a reason why String#matches(regex: Regex): Boolean is not defined? In fact, where is String#matches(uncompiled: String): Boolean defined?

4
It's worth noting that String#matches(string: String) is not defined by either the 2.9 spec or the StringLike type from the standard library. It is, in fact, an artifact of the definition of Strings in Java. - ig0774
I don't understand what you mean by too heavy-weight in the first example? Do you mean that the code is too long, or do you mean that it's doing too much work? - Ian McLaird
too much code, the work is exactly what I want - schmmd
@ig0774, thanks for that point. I was confused why I couldn't find it. - schmmd

4 Answers

33
votes

You can define a pattern like this:

scala> val Email = """(\w+)@([\w\.]+)""".r

findFirstIn will return Some[String] if it matches or else None.

scala> Email.findFirstIn("test@example.com")
res1: Option[String] = Some(test@example.com)

scala> Email.findFirstIn("test")
res2: Option[String] = None

You could even extract the groups:

scala> val Email(name, domain) = "test@example.com"
name: String = test
domain: String = example.com

Finally, you can also use the conventional String.matches method (and even recycle the previously defined Email Regex):

scala> "test@example.com".matches(Email.toString)
res6: Boolean = true
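
One caveat worth illustrating: findFirstIn looks for a matching substring anywhere in the input, while String.matches requires the whole input to match. A minimal sketch of the difference follows; the object name AnchoringDemo and the sample input are made up for illustration.

```scala
import scala.util.matching.Regex

object AnchoringDemo {
  val Email: Regex = """(\w+)@([\w\.]+)""".r

  // Hypothetical input: a valid address embedded in a longer string.
  val embedded = "contact: test@example.com!"

  // findFirstIn searches for a matching substring anywhere in the input.
  val found: Option[String] = Email.findFirstIn(embedded)

  // String.matches only succeeds when the entire input matches the pattern.
  val wholeEmbedded: Boolean = embedded.matches(Email.toString)
  val wholeExact: Boolean = "test@example.com".matches(Email.toString)
}
```

So if you need a yes/no answer for the whole string, findFirstIn(...).isDefined can report a match on input that String.matches would reject.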

Hope this will help.

16
votes

I created a little "Pimp my Library" pattern for that problem. Maybe it'll help you out.

import scala.util.matching.Regex

object RegexUtils {
  class RichRegex(self: Regex) {
    // true only if the whole string matches the pattern
    def =~(s: String): Boolean = self.pattern.matcher(s).matches
  }
  implicit def regexToRichRegex(r: Regex): RichRegex = new RichRegex(r)
}

Example of use

scala> import RegexUtils._
scala> """\w+""".r =~ "foo"
res12: Boolean = true
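
On Scala 2.10 or later, the same idea can probably be written more compactly as an implicit value class. This is only a sketch of that variant; the names RegexSyntax, RegexOps, and RegexSyntaxDemo are illustrative and not part of any library.

```scala
import scala.util.matching.Regex

object RegexSyntax {
  // Implicit value class (Scala 2.10+): adds =~ to Regex without a separate
  // implicit def. RegexOps is an illustrative name.
  implicit class RegexOps(val underlying: Regex) extends AnyVal {
    // true only if the whole string matches the pattern
    def =~(s: String): Boolean = underlying.pattern.matcher(s).matches()
  }
}

object RegexSyntaxDemo {
  import RegexSyntax._

  val word: Boolean = """\w+""".r =~ "foo"        // whole input is word characters
  val twoWords: Boolean = """\w+""".r =~ "foo bar" // the space is not matched by \w
}
```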

4
votes

I usually use

val regex = "...".r
if (regex.findFirstIn(text).isDefined) ...

but I think that is pretty awkward.

1
votes

As of Aug 2014 (Scala 2.11), @David's reply reflects the norm.

However, it seems an r"..." string interpolator may be on its way to help with this. See How to pattern match using regular expression in Scala?
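
Until something like that ships, a user-defined interpolator can be sketched along the lines discussed in that question. Treat this as an assumption-laden sketch for Scala 2.10+, not a standard library feature; RegexInterpolator, RegexContext, and isWord are names invented here.

```scala
import scala.util.matching.Regex

object RegexInterpolator {
  // User-defined helper (not part of the standard library): adds an `r`
  // method to StringContext so that r"..." can be used in pattern position.
  implicit class RegexContext(val sc: StringContext) {
    def r: Regex = new Regex(sc.parts.mkString)
  }
}

object InterpolatorDemo {
  import RegexInterpolator._

  // Regex extractors are anchored by default, so the whole string must match.
  def isWord(s: String): Boolean = s match {
    case r"\w+" => true
    case _      => false
  }
}
```

It still needs a match expression to get a Boolean, so it is mostly useful when you are already pattern matching.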