11
votes

I need to extract the string bimbo999 from this <a> tag with a JavaScript RegExp: <a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a>

The numbers in the URL parameters (village and id) change every time, so the RegExp has to match them generically.

</tr>
                    <tr><td>Sent</td><td >Oct 22, 2011  17:00:31</td></tr>
                                <tr>
                        <td colspan="2" valign="top" height="160" style="border: solid 1px black; padding: 4px;">
                            <table width="100%">
    <tr><th width="60">Supported player:</th><th>
    <a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></th></tr>
    <tr><td>Village:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=848" >bimbo999s village (515|520) K55</a></td></tr>
    <tr><td>Origin of the troops:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=828" >KaLa I (514|520) K55</a></td></tr>
    </table><br />

    <h4>Units:</h4>
    <table class="vis">

I tried with this:

var match = h.match(/Supported player:</th>(.*)<\/a><\/th></i);

but it is not working. Can you help me?

2
Why are you manipulating the HTML directly? It's much safer (and usually easier) to work through the DOM. Find the right <table>, then the appropriate <a> tags in the table using jQuery or a cross-browser selector library like Sizzle, and then just get the innerHTML of the <a> tag to get bimbo999. - jfriend00
Using regex to traverse HTML tags is not very good practice. Have you tried making a DOM element from the tag and getting its innerHTML? - name

2 Answers

24
votes

Try this:

/<a[^>]*>([\s\S]*?)<\/a>/
  • <a[^>]*> matches the opening a tag
  • ([\s\S]*?) matches any characters before the closing tag, as few as possible
  • <\/a> matches the closing tag

The ([\s\S]*?) group captures the text between the tags; it is available at index 1 of the array returned from an exec or match call.
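
As a rough sketch of how that capture group can be read, the snippet below runs a variant of the pattern against the markup from the question. The function name getSupportedPlayer and the variable names are made up for illustration, and prefixing the pattern with the "Supported player:" label is an assumption about how to pick out the right link, not part of the pattern above.

```javascript
// Sample markup copied from the question (note the newline after <th>).
var html = '<tr><th width="60">Supported player:</th><th>\n' +
    '<a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></th></tr>';

// Start at the label so that only the player link is considered.
// [\s\S]*? is used instead of .* because . does not match newlines.
var playerRe = /Supported player:<\/th>[\s\S]*?<a[^>]*>([\s\S]*?)<\/a>/i;

// Hypothetical helper: returns the link text, or null when nothing matches.
function getSupportedPlayer(htmlStr) {
    var m = playerRe.exec(htmlStr);
    // m[0] is the whole match, m[1] is the first capture group.
    return m ? m[1] : null;
}

console.log(getSupportedPlayer(html)); // bimbo999
```

Because [^>]* skips over everything inside the opening tag, the changing village and id numbers never need to be matched explicitly.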

This is really only good for pulling the text out of <a> elements. It is not especially safe or reliable, but if you've got a big page of links and you just need their text, it will do the job.
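
For the "big page of links" case, one possible approach is to add the g flag and call exec in a loop. This is only a sketch: getAnchorTextsByRegex and sample are hypothetical names, and the pattern shares the weaknesses mentioned above (for instance, <a[^>]*> would also match an <abbr> tag).

```javascript
// Hypothetical helper: collects the text of every <a>...</a> pair in htmlStr.
function getAnchorTextsByRegex(htmlStr) {
    // With the g flag, each exec() call resumes after the previous match.
    var re = /<a[^>]*>([\s\S]*?)<\/a>/g;
    var texts = [];
    var m;
    while ((m = re.exec(htmlStr)) !== null) {
        texts.push(m[1]); // index 1 holds the captured link text
    }
    return texts;
}

// Two links taken from the markup in the question.
var sample =
    '<td><a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></td>' +
    '<td><a href="/game.php?village=828&amp;screen=info_village&amp;id=828" >KaLa I (514|520) K55</a></td>';

console.log(getAnchorTextsByRegex(sample)); // [ 'bimbo999', 'KaLa I (514|520) K55' ]
```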


A much safer way to do this without RegExp would be:

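function getAnchorTexts(htmlStr) {
    var div,
        anchors,
        i,
        texts;
    // Let the browser parse the markup inside a detached <div>.
    div = document.createElement('div');
    div.innerHTML = htmlStr;
    // Collect every <a> element from the parsed fragment.
    anchors = div.getElementsByTagName('a');
    texts = [];
    for (i = 0; i < anchors.length; i += 1) {
        // .text holds the anchor's text content.
        texts.push(anchors[i].text);
    }
    return texts;
}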
3
votes

I don't have much experience with regex, but I think you can use jQuery's .text() method instead.

JQuery API - .text()

For example:

var hrefText = $("a").text(); 

This gives you the link text without any regex. Note that if the selector matches several links, .text() returns their combined text.

To handle links one at a time, call .find("a") on a container to get the list of <a> elements, loop over that list with .each(), and call .text() on each one.

Or you can use a class selector, an id, or any other selector you want.