Consider creating your own parser for this task (it is not that complicated).
- Iterate over the characters of the string to find the ranges from which you must not remove `AND`. Keep a variable that tracks the nesting level: increase it when you find `(` and decrease it when you find `)`.
  - If you find `(` and the level changed from 0 to 1, it is the start of a range.
  - If you find `)` and the level changed from 1 to 0, it is the end of a range.
- Find the positions of `AND` in your string (`indexOf(data,fromIndex)` can be helpful here) and check whether each one is outside the ranges you shouldn't split on.
- When you have all the positions you should split on, create substrings from `start` to `position` and update the next `start` to `position + "AND".length()`. Then take the next substring the same way.

After the last step you should have all the parts you are interested in.
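The first step (finding the ranges you must not split inside) can be sketched on its own as follows. This is only an illustration, assuming the parentheses in the input are balanced; the class name `TopLevelRanges` and the method `find` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelRanges {
    // Returns {start, end} index pairs (both inclusive) for every
    // top-level "(...)" group in data. Assumes balanced parentheses.
    static List<int[]> find(String data) {
        List<int[]> ranges = new ArrayList<>();
        int level = 0;          // current nesting depth
        int startOfRange = 0;   // index of the '(' that opened the current top-level group
        for (int i = 0; i < data.length(); i++) {
            char ch = data.charAt(i);
            if (ch == '(') {
                level++;
                if (level == 1) startOfRange = i;   // level went from 0 to 1
            } else if (ch == ')') {
                level--;
                if (level == 0) ranges.add(new int[] {startOfRange, i}); // level went from 1 to 0
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        String data = "word1 AND ((word2 AND word3) AND word4) AND word5";
        for (int[] r : find(data))
            System.out.println(r[0] + ".." + r[1]); // prints 10..38
    }
}
```

Any `AND` whose index falls inside one of these ranges is nested and should be skipped in the second step.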
Below is an example parser class that seems to do what you want. But before you use it, try to create your own implementation.
    import java.util.ArrayList;
    import java.util.List;

    class Parser {

        // Inclusive index range [start, end] covering one top-level "(...)" group.
        private static class Range {
            private final int start, end;

            public Range(int start, int end) {
                this.start = start;
                this.end = end;
            }

            boolean isInside(int i) {
                return start <= i && i <= end;
            }

            public int getStart() {
                return start;
            }

            @Override
            public String toString() {
                return "Range [start=" + start + ", end=" + end + "]";
            }
        }

        private List<Range> ranges = new ArrayList<Range>();

        private boolean checkIfOutsideRanges(int i) {
            if (ranges.size() == 0) return true;
            if (ranges.get(0).getStart() > i) return true;
            for (Range r : ranges) {
                if (r.isInside(i))
                    return false;
            }
            return true;
        }

        private List<Range> setUpRanges(String data) {
            // build a fresh list so ranges from an earlier parse() call are not reused
            List<Range> found = new ArrayList<Range>();
            int level = 0;
            int startOfRange = 0;
            int i = 0;
            for (char ch : data.toCharArray()) {
                if (ch == '(') {
                    level++;
                    if (level == 1)
                        startOfRange = i;
                }
                if (ch == ')') {
                    level--;
                    if (level == 0)
                        found.add(new Range(startOfRange, i));
                }
                i++;
            }
            return found;
        }

        public List<String> parse(String data) {
            String toFind = "AND";
            ranges = setUpRanges(data);

            // find indexes of "AND" we should split on
            List<Integer> toSplit = new ArrayList<Integer>();
            int i = -1;
            do {
                i = data.indexOf(toFind, i + 1);
                if (i != -1 && checkIfOutsideRanges(i))
                    toSplit.add(i);
            } while (i != -1);

            // split on correct AND indexes
            List<String> results = new ArrayList<String>();
            int start = 0;
            for (Integer index : toSplit) {
                results.add(data.substring(start, index));
                start = index + toFind.length();
            }
            if (start < data.length())
                results.add(data.substring(start));
            return results;
        }
    }
Usage example
    String data = "word1 AND ((word2 AND word3) AND word4) AND word5";
    Parser p = new Parser();
    for (String s : p.parse(data))
        System.out.println(s);
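Note that the parts returned by `parse` keep the spaces that surrounded each `AND`. If you would rather get trimmed parts, the same idea can also be folded into a single pass that tracks the nesting level and splits only at level 0. The following is a rough sketch of that variant, not a replacement for the class above: `TopLevelSplitter` and `splitTopLevel` are hypothetical names, and it assumes balanced parentheses and that `AND` never occurs inside a longer word.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitter {
    // Splits data on every "AND" found at nesting level 0 and trims each part.
    // Assumes balanced parentheses.
    static List<String> splitTopLevel(String data) {
        final String toFind = "AND";
        List<String> parts = new ArrayList<>();
        int level = 0;  // current parenthesis depth
        int start = 0;  // start index of the part being collected
        int i = 0;
        while (i < data.length()) {
            char ch = data.charAt(i);
            if (ch == '(') level++;
            else if (ch == ')') level--;

            if (level == 0 && data.startsWith(toFind, i)) {
                parts.add(data.substring(start, i).trim());
                i += toFind.length();   // skip over the separator
                start = i;
            } else {
                i++;
            }
        }
        if (start < data.length())
            parts.add(data.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        String data = "word1 AND ((word2 AND word3) AND word4) AND word5";
        for (String s : splitTopLevel(data))
            System.out.println(s);
        // prints:
        // word1
        // ((word2 AND word3) AND word4)
        // word5
    }
}
```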
Should the result for `word1 AND ((word2 AND word3) AND word4) AND word5` be `word1`, `((word2 AND word3) AND word4)`, `word5`, or do you also want to split the middle part into `((word2 AND word3)` and `word4)`? I am asking because you accepted an answer which also splits the middle part. - Pshemo