I have a text consisting of a sequence of delimiters, each matching one of several regexes and each followed by free text. In this example there are 3 delimiter patterns (PatternA, PatternB, PatternC), and the text looks like this:
|..StringMatchingA..|..Text1..|..StringMatchingB..|..Text2..|..StringMatchingA..|..Text3..|..StringMatchingC..|..Text4..|
I am looking for an efficient Java solution to extract this information as a list of triplets:
{PatternA, StringMatchingA, Text1}
{PatternB, StringMatchingB, Text2}
{PatternA, StringMatchingA, Text3}
{PatternC, StringMatchingC, Text4}
With this information I know, for each triplet, which pattern was matched as well as the string that matched it.
For the moment I use the approach below, but I suspect something far more efficient is possible with more advanced regex usage?
// Zero-width lookahead: split just before each delimiter, keeping the delimiter in the token.
String pattern = "(?=(PatternA|PatternB|PatternC))";
String[] tokens = input.split(pattern);
for (String token : tokens)
{
    // if the start of token matches PatternA ...
    // else if the start of token matches PatternB ...
    // etc.
}
Remarks:
- Patterns are mutually exclusive.
- The input always starts with a match of one of the patterns.
[...] private static final Pattern if you call split(pattern) frequently, because String.split(String) creates a new Pattern object every time it is called, which is costly in a loop. – Bobulous
[...] ((PatternA)|(PatternB)|(PatternC)). However, it's not clear whether the patterns are mutually exclusive, or whether there exists a string that two of them can match. It's also not clear whether you want the "bump-along" to happen when none of the patterns match at a certain position. – nhahtdh
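The two suggestions in the comments (a precompiled Pattern, and one capturing group per alternative) could be combined roughly as follows. This is only a sketch under stated assumptions: the class name TripletExtractor, the method extract, and the three delimiter regexes (A, B or C followed by digits) are hypothetical stand-ins for the real PatternA, PatternB and PatternC. It uses named groups instead of numbered ones to tell which alternative matched, relies on the patterns being mutually exclusive, and ignores any text before the first delimiter, since the input is said to always start with one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TripletExtractor {

    // Hypothetical delimiter patterns, for illustration only.
    // Compiled once, so the regex is not recompiled on every call.
    private static final Pattern DELIMITER =
            Pattern.compile("(?<A>A\\d+)|(?<B>B\\d+)|(?<C>C\\d+)");

    private static final String[] GROUP_NAMES = {"A", "B", "C"};

    /** Returns {patternName, matchedDelimiter, followingText} triplets in input order. */
    static List<String[]> extract(String input) {
        List<String[]> triplets = new ArrayList<>();
        Matcher m = DELIMITER.matcher(input);
        String name = null;      // name of the pattern matched by the previous delimiter
        String delimiter = null; // the text that matched it
        int textStart = 0;       // index just after the previous delimiter
        while (m.find()) {
            if (name != null) {
                // The text runs from the end of the previous delimiter to the start of this one.
                triplets.add(new String[] {name, delimiter, input.substring(textStart, m.start())});
            }
            name = matchedGroupName(m);
            delimiter = m.group();
            textStart = m.end();
        }
        if (name != null) {
            // The text of the last delimiter extends to the end of the input.
            triplets.add(new String[] {name, delimiter, input.substring(textStart)});
        }
        return triplets;
    }

    // Only one alternative takes part in a given match, so only one named group is non-null.
    private static String matchedGroupName(Matcher m) {
        for (String g : GROUP_NAMES) {
            if (m.group(g) != null) {
                return g;
            }
        }
        throw new IllegalStateException("no group matched");
    }

    public static void main(String[] args) {
        for (String[] t : extract("A1firstB22secondA3thirdC4fourth")) {
            System.out.println(String.join(" | ", t));
        }
    }
}
```

This makes a single pass over the input with Matcher.find(), so each token does not have to be re-tested against every pattern after splitting. If two patterns could match at the same position, the leftmost alternative in the regex would win, which may or may not be the desired behaviour.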