45
votes

I have a String str, from which I want to extract the substring that excludes a possible prefix "abc".

The first solution that comes to mind is:

if (str.startsWith("abc"))
    return str.substring("abc".length());
return str;

My questions are:

  1. Is there a "cleaner" way to do it using split and a regular expression for an "abc" prefix?

  2. If yes, is it less efficient than the method above (because it searches "throughout" the string)?

  3. If yes, is there any better way of doing it (where "better way" = clean and efficient solution)?

Please note that the "abc" prefix may appear elsewhere in the string, and should not be removed.

Thanks

7
Concerns about "efficiency" are rather silly here, unless you're attempting to do this millions of times per X and that's the bottleneck. Your code, as is, reads well and conveys the intent. - Brian Roach

7 Answers

56
votes

A shorter alternative to the code above is this line:

return str.replaceFirst("^abc", "");

In terms of performance, though, I'd guess there won't be any substantial difference between the two. One uses a regex; the other does a prefix check followed by a substring.

11
votes

Use String.replaceFirst with ^abc (to match a leading abc):

"abcdef".replaceFirst("^abc", "")     // => "def"
"123456".replaceFirst("^abc", "")     // => "123456"
"123abc456".replaceFirst("^abc", "")  // => "123abc456"
6
votes

A regex-free solution (I needed this because the string I'm removing is configurable and contains backslashes, which need escaping for literal use in a regex):

Apache Commons Lang StringUtils.removeStart(str, remove) removes the string remove from the start of str, using String.startsWith and String.substring.

The source code of the method is informative:

public static String removeStart(final String str, final String remove) {
    if (isEmpty(str) || isEmpty(remove)) {
        return str;
    }
    if (str.startsWith(remove)){
        return str.substring(remove.length());
    }
    return str;
}
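
To illustrate the escaping problem mentioned above, here is a minimal sketch comparing a regex-free helper with a regex version that wraps the prefix in Pattern.quote. The class name, method names, and the sample path are made up for illustration; only startsWith, substring, replaceFirst and Pattern.quote come from the JDK. Used unquoted, a prefix such as C:\temp\ would be treated as regex syntax rather than literal text.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Apache Commons Lang.
final class LiteralPrefixDemo {

    // Regex-free: same startsWith/substring logic, so no escaping is needed.
    static String removeStartPlain(String str, String prefix) {
        if (str == null || prefix == null || prefix.isEmpty()) {
            return str;
        }
        return str.startsWith(prefix) ? str.substring(prefix.length()) : str;
    }

    // Regex-based: Pattern.quote makes the configurable prefix match literally.
    static String removeStartRegex(String str, String prefix) {
        return str.replaceFirst("^" + Pattern.quote(prefix), "");
    }

    public static void main(String[] args) {
        String prefix = "C:\\temp\\";            // the text C:\temp\
        String path = "C:\\temp\\report.txt";    // the text C:\temp\report.txt
        System.out.println(removeStartPlain(path, prefix)); // report.txt
        System.out.println(removeStartRegex(path, prefix)); // report.txt
    }
}
```

Both helpers leave the string untouched when the prefix is absent or only appears later in the string.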
3
votes

Try this

str = str.replaceAll("^abc", "");
1
votes
  1. String#split can do this, but it is not a better solution. The result would be unclear, and I wouldn't recommend using it for this purpose.
  2. Don't spend time on efficiency in this case; it's not significant. Focus on logic and clarity. Note, though, that working with regex is usually slower because it involves additional operations, so you might want to keep startsWith.
  3. Your approach is fine. If you want to check whether the String begins with "abc", String#startsWith was designed for that.
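
For point 1, this is roughly what a split-based version could look like. It is only a sketch to show why it reads poorly; the class and method names are hypothetical. The limit of 2 means the pattern is applied at most once, and the caller has to know that the interesting part is the last array element.

```java
// Hypothetical demo class showing a split-based prefix removal.
final class SplitPrefixDemo {

    static String stripWithSplit(String str) {
        // Limit 2: the pattern is applied at most once, giving at most two parts.
        String[] parts = str.split("^abc", 2);
        // With a leading match the array is ["", rest]; without one it is [str].
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(stripWithSplit("abcdef"));     // def
        System.out.println(stripWithSplit("123456"));     // 123456
        System.out.println(stripWithSplit("123abc456"));  // 123abc456
    }
}
```

Compared with startsWith plus substring, the intent is hidden behind the array indexing, which is why it is hard to recommend.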

You can easily measure how long a piece of code takes to run. Here is what you can do:

Create a big loop; inside it, append the loop counter to a dummy String to simulate the Strings you want to check, then time startsWith first and replaceAll after:

// Time the startsWith version
long start = System.currentTimeMillis();
for (int i = 0; i < 900000; i++) {
    StringBuilder sb = new StringBuilder("abc");
    sb.append(i);
    if (sb.toString().startsWith("abc")) { ... }
}
long time = System.currentTimeMillis() - start;
System.out.println(time); // Prints ~130

// Time the replaceAll version (reset the start time first)
start = System.currentTimeMillis();
for (int i = 0; i < 900000; i++) {
    StringBuilder sb = new StringBuilder("abc");
    sb.append(i);
    sb.toString().replaceAll("^abc", "");
}
time = System.currentTimeMillis() - start;
System.out.println(time); // Prints ~730
0
votes

If you are concerned about performance, you can improve the str.replaceFirst("^abc", "") solution by reusing the same pre-compiled prefix Pattern for matching multiple strings.

final Pattern prefix = Pattern.compile("^abc"); // Could be static constant etc
for ... {
    final String result = prefix.matcher(str).replaceFirst("");
}

I guess the difference will be noticeable if you are stripping the same prefix from a lot of strings.
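
A self-contained version of the same idea might look like the sketch below. The class name, the stripAll helper, and the sample inputs are assumptions made for illustration; the Pattern is compiled once as a static constant and reused for every input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class reusing one compiled Pattern across many strings.
final class PrecompiledPrefixDemo {

    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern PREFIX = Pattern.compile("^abc");

    static List<String> stripAll(List<String> inputs) {
        List<String> results = new ArrayList<>(inputs.size());
        for (String s : inputs) {
            // A new Matcher per input is cheap; the expensive compile step is not repeated.
            results.add(PREFIX.matcher(s).replaceFirst(""));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(stripAll(Arrays.asList("abcdef", "123456", "123abc456")));
        // prints [def, 123456, 123abc456]
    }
}
```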

-1
votes

As far as efficiency is concerned, you may use StringBuilder when you perform multiple operations on one string, such as taking a substring, then finding an index, then taking another substring, and so on.

Where cleanliness and efficiency are concerned, StringUtils (Apache Commons Lang) can be used.

Hope it helps.