I would like to capture all occurrences within a string that match a specific regular expression. I'm using DataWeave 2.0 (which means Mule Runtime 4.3 and, in my case, Anypoint Studio 7.5).
I've tried to use scan() and match() from the DataWeave core library, but I can't quite get the result I want.
Here are some of the things I've tried:
%dw 2.0
output application/json
// sample input with hashtag keywords
var microList = 'Someone is giving away millions. See @realmcsrooge at #downtownmalls now!
#shoplocal and tell them #giveaway @barry sent you. #downtowndancehalls'
---
{
  withscan: microList scan /(#[^\s]*).*/,
  sanitized: microList replace /\n/ with ' ',
  sani_match: microList replace /\n/ with ' ' match /.*(#[^\s]*).*/, // gives full string and last match
  sani_scan: microList replace /\n/ with ' ' scan /.*(#[^\s]*).*/ // gives array of arrays, string and last match
}
Here are the respective results:
{
"withscan": [
[
"#downtownmalls now!",
"#downtownmalls"
],
[
"#shoplocal and tell them #giveaway @barry sent you. #downtowndancehalls",
"#shoplocal"
]
],
"sanitized": "Someone is giving away millions. See @realmcsrooge at #downtownmalls now! #shoplocal and tell them #giveaway @barry sent you. #downtowndancehalls",
"sani_match": [
"Someone is giving away millions. See @realmcsrooge at #downtownmalls now! #shoplocal and tell them #giveaway @barry sent you. #downtowndancehalls",
"#downtowndancehalls"
],
"sani_scan": [
[
"Someone is giving away millions. See @realmcsrooge at #downtownmalls now! #shoplocal and tell them #giveaway @barry sent you. #downtowndancehalls",
"#downtowndancehalls"
]
]
}
In the first example (withscan), the pattern appears to be applied line by line, presumably because . does not match a newline, so the result array has one element for each line. Each element consists of the full match followed by the captured group, which is the first occurrence of the pattern on that line.
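A reduced version of the same behavior, as a sketch: the two-line string twoLines is a hypothetical input, and the expected result is inferred from the withscan output above rather than from a separate run.

```dataweave
%dw 2.0
output application/json
// Hypothetical two-line input; "\n" is a newline inside the string
var twoLines = "#a x\n#b y"
---
// The trailing .* stops at the end of each line, so scan finds one match
// per line: the full match followed by the captured group
twoLines scan /(#[^\s]*).*/
```

Going by the withscan result, this should give one inner array per line, ["#a x", "#a"] and ["#b y", "#b"]. Any further hashtag on the same line would be swallowed by the trailing .* instead of being reported as its own match.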
After stripping newlines, the third example (sani_match) gave me an array with the full match and the captured group, this time the last occurrence of the pattern in the string.
The final example (sani_scan) gives similar results; the only difference is that the result is wrapped as the single element of an array of arrays.
What I want is simply an array with all occurrences of a specified pattern.
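To make the target concrete, here is a minimal, untested sketch of one expression that might produce such an array. It assumes that scan returns one single-element inner array per match when the pattern has no capture group and no surrounding .*, and it uses flatten from the core library to merge those inner arrays. The variable sample is a shortened, hypothetical stand-in for microList.

```dataweave
%dw 2.0
output application/json
// Shortened, hypothetical stand-in for microList, including the embedded newline
var sample = "See @realmcsrooge at #downtownmalls now!\n#shoplocal and tell them #giveaway"
---
// Without .* each match is a single hashtag; flatten merges the inner arrays
flatten(sample scan /#[^\s]+/)
```

Under those assumptions, the intended result for this input is ["#downtownmalls", "#shoplocal", "#giveaway"]. The newline should not matter here, because the pattern contains no . that would stop at a line break.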