following the idea of Mijoja, and drawing from the problems exposed by JasonS, i had this idea; i checked a bit but am not sure of myself, so a verification by someone more expert than me in js regex would be great :)
var re = /(?=(..|^.?)(ll))/g
// matches empty string position
// whenever this position is followed by
// a string of length equal or inferior (in case of "^")
// to "lookbehind" value
// + actual value we would want to match
, str = "Fall ball bill balll llama"
, str_done = str
, len_difference = 0
, doer = function (where_in_str, to_replace)
{
str_done = str_done.slice(0, where_in_str + len_difference)
+ "[match]"
+ str_done.slice(where_in_str + len_difference + to_replace.length)
len_difference = str_done.length - str.length
/* if str smaller:
len_difference will be positive
else will be negative
*/
} /* the actual function that would do whatever we want to do
with the matches;
this above is only an example from Jason's */
/* function input of .replace(),
only there to test the value of $behind
and if negative, call doer() with interesting parameters */
, checker = function ($match, $behind, $after, $where, $str)
{
if ($behind !== "ba")
doer
(
$where + $behind.length
, $after
/* one will choose the interesting arguments
to give to the doer, it's only an example */
)
return $match // empty string anyhow, but well
}
str.replace(re, checker)
console.log(str_done)
my personal output:
Fa[match] ball bi[match] bal[match] [match]ama
the principle is to call checker
at each point in the string between any two characters, whenever that position is the starting point of:
--- any substring of the size of what is not wanted (here 'ba'
, thus ..
) (if that size is known; otherwise it must be harder to do perhaps)
--- --- or smaller than that if it's the beginning of the string: ^.?
and, following this,
--- what is to be actually sought (here 'll'
).
At each call of checker
, there will be a test to check if the value before ll
is not what we don't want (!== 'ba'
); if that's the case, we call another function, and it will have to be this one (doer
) that will make the changes on str, if the purpose is this one, or more generically, that will get in input the necessary data to manually process the results of the scanning of str
.
here we change the string so we needed to keep a trace of the difference of length in order to offset the locations given by replace
, all calculated on str
, which itself never changes.
since primitive strings are immutable, we could have used the variable str
to store the result of the whole operation, but i thought the example, already complicated by the replacings, would be clearer with another variable (str_done
).
i guess that in terms of performances it must be pretty harsh: all those pointless replacements of '' into '', this str.length-1
times, plus here manual replacement by doer, which means a lot of slicing...
probably in this specific above case that could be grouped, by cutting the string only once into pieces around where we want to insert [match]
and .join()
ing it with [match]
itself.
the other thing is that i don't know how it would handle more complex cases, that is, complex values for the fake lookbehind... the length being perhaps the most problematic data to get.
and, in checker
, in case of multiple possibilities of nonwanted values for $behind, we'll have to make a test on it with yet another regex (to be cached (created) outside checker
is best, to avoid the same regex object to be created at each call for checker
) to know whether or not it is what we seek to avoid.
hope i've been clear; if not don't hesitate, i'll try better. :)
(?:[^abcdefg]|^)(m)
? As in"mango".match(/(?:[^abcdefg]|^)(m)/)[1]
– slebetman