0
votes

I need to transform href links into a different kind of link (Confluence has its own link format). I'm getting close using an Ant build.xml file with replaceregexp, but I'm not quite there yet.

Basically I need to start with links like this:

<a class="xref" href="../Test_Topic_2/Test_Topic_2.txt">Test Topic 2</a>

And turn them into this:

<ac:link><ri:page ri:content-title="Test_Topic_2" /></ac:link>

I've got an Ant build.xml file that works on the above link, but it doesn't work if the path starts with ../../ instead of ../.

Since the best place to pick up the topic name is the 'Test_Topic_2.txt' entry, I'm wondering whether a regular expression can work backwards from '.txt': match everything from '.txt' back to the first slash it encounters, keep that part, and replace the rest.

There may be an entirely different approach; if anyone has any ideas, please let me know.

Thanks,

2 Answers

0
votes

Assuming the input links are in a file named input.txt, in the same directory as the buildfile, with the following content:

<a class="xref" href="../Test_Topic_1/Test_Topic_1.txt">Test Topic 1</a>
<a class="xref" href="../Test_Topic_2/Test_Topic_2.txt">Test Topic 2</a>
<a class="xref" href="../Test_Topic_3/Test_Topic_3.txt">Test Topic 3</a>
<a class="xref" href="../Test_Topic_4/Test_Topic_4.txt">Test Topic 4</a>

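Load the file into a property, then loop over its lines. For each line, build the updated link, store it in a property, and append it to an output file. The snippet below uses the for, var, and propertyregex tasks from Ant-Contrib, which the taskdef makes available.

<!-- Assumes the Ant-Contrib jar is on Ant's classpath (for example in ANT_HOME/lib) -->
<taskdef resource="net/sf/antcontrib/antlib.xml" />

<loadfile property="file.content" srcFile="./input.txt" />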

<for list="${file.content}" param="original.href" delimiter="${line.separator}">
    <sequential>
        <var name="updated.href" unset="true" />  
        <propertyregex input="@{original.href}" property="updated.href" regexp="&lt;a class=&quot;xref&quot; href=&quot;.+/([^/]+)\.txt&quot;&gt;.+&lt;/a&gt;"
              replace="&lt;ac:link&gt;&lt;ri:page ri:content-title=&quot;\1&quot; /&gt;&lt;/ac:link&gt;" />
        <echo message="${updated.href}${line.separator}" file="output.txt" append="true" />
    </sequential>
</for>

The output in this case would be:

<ac:link><ri:page ri:content-title="Test_Topic_1" /></ac:link>
<ac:link><ri:page ri:content-title="Test_Topic_2" /></ac:link>
<ac:link><ri:page ri:content-title="Test_Topic_3" /></ac:link>
<ac:link><ri:page ri:content-title="Test_Topic_4" /></ac:link>
0
votes
<replaceregexp byline="true">

   <regexp pattern="&lt;a class=.*?href=&quot;.*?([^\/]+)\.txt&quot;&gt;.*?&lt;\/a&gt;"/>
   <substitution expression="&lt;ac:link&gt;&lt;ri:page ri:content-title=&quot;\1&quot; \/&gt;&lt;\/ac:link&gt;"/>

   <fileset dir="${your.directory.containing.txt.files}">
      <include name="**/*.txt"/>
   </fileset>

</replaceregexp>  

Here, replaceregexp processes each line of the input file (because of byline="true"), matches it against the <regexp pattern="..."/>, and replaces a successful match with the <substitution expression="..."/>.
This is done for every file in the <fileset>.

For example, if you have the following directory structure:

../ a / b / 1.txt, 2.txt  
../ a / b / c / 3.txt

and you set ${your.directory.containing.txt.files} to ../a/b/, then 1.txt, 2.txt, and 3.txt are all processed line by line by <replaceregexp>, and matching links are replaced. Without flags="g", only the first match on each line is replaced.
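For reference, here is a minimal sketch of a complete buildfile built around the same task. The project name, the target name, and the docs.dir property (pointing at a folder called docs) are hypothetical placeholders, not something taken from the question. The pattern is a slightly tightened variant: [^/&quot;]+ cannot cross a slash or the closing quote, so group 1 is always the last path segment before .txt, however many ../ prefixes the href has. Adding flags="g" should also convert several links on the same line.

```xml
<project name="convert-links" default="convert-links" basedir=".">

    <!-- Assumption: docs.dir is a hypothetical property naming the folder that holds the files to convert -->
    <property name="docs.dir" location="docs"/>

    <target name="convert-links">
        <!-- byline="true" applies the pattern to one line at a time; flags="g" replaces every match on that line -->
        <replaceregexp byline="true" flags="g">
            <!-- Group 1 captures the file name without its path and without the .txt extension -->
            <regexp pattern="&lt;a class=&quot;xref&quot; href=&quot;[^&quot;]*?([^/&quot;]+)\.txt&quot;&gt;.*?&lt;/a&gt;"/>
            <substitution expression="&lt;ac:link&gt;&lt;ri:page ri:content-title=&quot;\1&quot; /&gt;&lt;/ac:link&gt;"/>
            <fileset dir="${docs.dir}">
                <include name="**/*.txt"/>
            </fileset>
        </replaceregexp>
    </target>

</project>
```

Running ant in the directory containing this build.xml rewrites the matching files in place, so it is worth trying it on a copy first.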
