0
votes

I'm using Apache JMeter's "Regular Expression Extractor" and I'm trying to extract some parameters from an XML response like this one:

http://search.spotxchange.com/vast/2.00/101458?VPAID=1&cb=1421845139

I'm extracting parameters such as height, width, and type of a video file from this tag:

<MediaFile delivery="progressive" apiFramework="VPAID" bitrate="0" height="360" width="480" type="application/x-shockwave-flash">

I used this regular expression to extract these parameters:

<MediaFile delivery="(.+?)" type="(.+?)" bitrate="(.+?)" height="(.+?)" width="(.+?)"> 

The main problem is that the tag's attributes don't appear in a fixed order. For example, sometimes width="" appears at the beginning of the MediaFile attributes and sometimes it's the last one, whereas my expression only matches one particular order.

So, how can I write an efficient regular expression to extract these parameters regardless of their order?

2
Use a real XML parser instead of a regex, especially if the format is not fixed; otherwise you'll have to try ORed regexes and it will turn into a nightmare. IMHO regexes are OK for extracting a single attribute or doing a bulk change to one parameter, but they're not meant for parsing HTML or XML structures. - Tensibai

2 Answers

1
votes

Assuming you don't mind matching the entire list of attributes from between the < and >, you could try this:

<MediaFile(\s\w+=\"[^"]+\")+>

<MediaFile # match '<MediaFile' exactly
(          # start of capturing group
\s         # exactly one whitespace character
\w+=       # one or more word characters (letters, digits, underscore) followed immediately by an equals sign
\"         # escaped (starting) quote mark
[^"]+      # match anything EXCEPT a double quote, one or more times
\"         # escaped (ending) quote mark
)+         # close capturing group and expect that group one or more times
>          # match '>'

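The grouping '()' is used so that the whole expression within it can have a + added at the end, for multiple attributes. It doesn't have to be a capturing group, though. It should really be a non-capturing group ('(?:regex here)'), but that looks a little more confusing and it doesn't appear to matter too much in this case. Keep in mind that a repeated capturing group only retains its last repetition, so group 1 ends up holding just the final attribute; the full match (group 0) is what contains the entire attribute list.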
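To make the idea concrete outside of JMeter, here is a minimal Python sketch of two order-independent approaches. The first captures the whole attribute list and then splits it into name/value pairs; the second uses one lookahead per attribute so that a single expression yields fixed group numbers. The names `TAG_RE`, `ATTR_RE`, `LOOKAHEAD_RE` and `media_file_attrs` are made up for this example, and whether the lookahead form is accepted by JMeter's regex engine is an assumption you should verify in your own test plan.

```python
import re

# Sample tag taken from the question.
SAMPLE = ('<MediaFile delivery="progressive" apiFramework="VPAID" bitrate="0" '
          'height="360" width="480" type="application/x-shockwave-flash">')

# Approach 1, step 1: capture the whole attribute list as a single group.
TAG_RE = re.compile(r'<MediaFile((?:\s+\w+="[^"]*")+)\s*>')
# Approach 1, step 2: split that list into (name, value) pairs.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

# Approach 2: one lookahead per attribute, so the order inside the tag is
# irrelevant and the group numbers stay fixed (1=height, 2=width, 3=type).
LOOKAHEAD_RE = re.compile(
    r'<MediaFile'
    r'(?=[^>]*\sheight="([^"]*)")'
    r'(?=[^>]*\swidth="([^"]*)")'
    r'(?=[^>]*\stype="([^"]*)")'
)


def media_file_attrs(text):
    """Return the attributes of the first MediaFile tag as a dict (hypothetical helper)."""
    tag = TAG_RE.search(text)
    if tag is None:
        return {}
    return dict(ATTR_RE.findall(tag.group(1)))


attrs = media_file_attrs(SAMPLE)
print(attrs["height"], attrs["width"], attrs["type"])
# prints: 360 480 application/x-shockwave-flash

match = LOOKAHEAD_RE.search(SAMPLE)
print(match.groups())
# prints: ('360', '480', 'application/x-shockwave-flash')
```

The two-step version is easier to read but needs code around the regex; the lookahead version fits into a single expression field, at the cost of re-scanning the tag once per attribute.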

1
votes

JMeter offers an XPath Extractor designed for getting values from XML/XHTML responses. For example, to get the width attribute of the MediaFile tag you can use the following XPath expression:

//MediaFile/@width

To get the delivery attribute:

//MediaFile/@delivery

etc.
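If you want to sanity-check the attribute lookups outside JMeter, the same idea can be sketched with Python's standard `xml.etree.ElementTree`. This is only an illustration: the surrounding `VAST`/`MediaFiles` elements and the `example.com` URL are a made-up stand-in for the real response, and only the MediaFile attributes come from the question. ElementTree supports a limited XPath subset that cannot select an attribute directly with `/@width`, so the element is located first and the attribute is read from it.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the real response; only the MediaFile attributes
# are taken from the question.
SAMPLE_XML = """<VAST><MediaFiles>
<MediaFile delivery="progressive" apiFramework="VPAID" bitrate="0" height="360" width="480" type="application/x-shockwave-flash">http://example.com/ad.swf</MediaFile>
</MediaFiles></VAST>"""

root = ET.fromstring(SAMPLE_XML)

# Rough equivalent of //MediaFile: first MediaFile element anywhere below the root.
media = root.find(".//MediaFile")

# Attribute order in the source is irrelevant once the XML is parsed.
print(media.get("width"), media.get("height"), media.get("delivery"))
# prints: 480 360 progressive
```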

For more information on the XPath Extractor and XPath language see the following references: