I've been struggling with the flip-flop operator while trying to replace a particular value in an input XML file.

The file structure is as follows:

<a>BLE</a>
<-- hur dur -->
<b>12345</b>
<-- hur der -->
<c>VER1.2.3</c>
<-- hur dest -->
<d>VER1.2.3</d>

<a>CLE</a>
<-- hur dur -->
<b>12345</b>
<-- hur der -->
<c>VERX'.Y'.Z'</c>
<-- hur dest -->
<d>VERX'.Y'.Z'</d>

<a>DLE</a>
<-- hur dur -->
<b>12345</b>
<-- hur der -->
<c>VERX".Y".Z"</c>
<-- hur dest -->
<d>VERX".Y".Z"</d>
etc ...

So let's say I want to change the version fields (<c> and <d>) of one particular variable. For brevity's sake, let it be BLE. I first have to find the block that contains BLE (lines 1-7) and then replace fields <c> and <d> with the new value.

I've been trying different stuff, like:

perl -pi.bak -ne 'if ( /BLE/ .. /<\/d/ ){ s/VER[[:digit:]]\+\.[[:digit:]]\+\.[A-Z][[:digit:]]\+[A-Z]\?/$ver/}' dur.xml

or

perl -ne "print if ( /BLE/ .. /<\/d>/ ) " dur.xml | sed "/VER[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+[A-Z]\?/$ver/"

where $ver is a set variable (ver="VERX.Y.Z"). However the first one doesn't do anything to the input file (I'd prefer to do it in-place); the second produces more or less what I want but the output is limited to the BLE block:

<a>BLE</a>
<-- hur dur -->
<b>12345</b>
<-- hur der -->
<c>VERX.Y.Z</c>
<-- hur dest -->
<d>VERX.Y.Z</d>

This output can't simply be redirected to a file with >, though, because it drops everything besides the BLE block.

Is there any way one can modify the original file in a similar fashion as the one described above?

Thanks a bunch, drinker

Is that input supposed to be real XML? It doesn't look like it. Please include an exact file. - simbabque
...also, there isn't any buildversion string in your example data... - jm666
@simbabque No, it's not really XML. The file I'm editing just has a .xml extension, but the structure is kind of similar. Also, the original looks exactly like this, only the tags a, b, c and d have different names. - drinker
@jm666 Oops, copy-paste fail. Fixed. - drinker
So the file has blocks of tags, and empty lines, and those not-comments <-- ... -->? And you only want to change the <d> where <a> has BLE? - simbabque

1 Answer

Here is one way to do the task as I see it

perl -pe's{<d>VER\K.*?(</d>)}{xxx$1} if m{BLE} .. m{</d>}' data.txt

This replaces whatever is between <d>VER and the first </d> with xxx, on the lines in the given range. The \K is a form of positive lookbehind: it drops everything matched before it from the match, so we don't have to capture that part and put it back in.

The -p switch prints $_ for every line, while s{}{} changes it, so you get all lines printed, changed or not. It serves no purpose to have both -p and -n (the same, but without printing); if you do, -p overrides -n. To change the file in place with a backup, add -i.bak, as you did. See perlrun.

If you need to access the individual numbers, you can capture them in the regex instead:

s{<d>VER\K(\d+)\.(\d+)\.(\d+)</d>}{$1.$new.$3</d>}

To allow for an optional letter following a number, which your code hints at, use (\d+[A-Z]?). If ' and " in X' and X" are actual characters, please adjust this (the first example is fine). If you meant them as "X-prime" (and double-prime), don't use them that way. It is very misleading here.

For more flexibility, convert this to a script, which is probably a good idea anyway. If this lives in a shell script, you can use variables instead of the literals above (VER and xxx).


The problem in the posted code isn't with the "flip-flop" (range operator), which merely selects lines and which you use correctly. It is with the regex, in which you have [A-Z] before the last number and another one following it. The data just doesn't have a letter there, so it doesn't match.

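Also, note that in Perl \? turns the quantifier ? into a literal ?, which isn't in the data either. The same goes for \+, which matches a literal + character; escaping these is sed's basic-regex syntax, not Perl's.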