3
votes

I have a block of text that may contain <img> tags. I need to find the tags whose src attribute contains a specific string ("/storyimages/") and delete them. So for instance, if I have the text

<br><img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131"><b>(CNS) </b>Lorem ipsum dolor...

I just want to get rid of the whole tag and replace it with ''. The regex pattern I'm trying to use is

/<img src=.*\/storyimages\/.*>/

but it's not working. What happens is that it identifies the start of the string OK, but it's not stopping at the tag's closing > character, so if I use preg_match(), the match starts at <img but runs on past the end of the tag.

I know you're not supposed to use a regex on HTML, but this isn't embedded tags; it's just one tag in the midst of a bunch of text, so I should be ok. From what I can see, the > isn't a special character, but even if I escape it, I still get the same result.

Is there something simple I'm missing that would make this work? Or do I need to write some sort of function that loops over the string character by character to find the positions of the open and close brackets and then replace them?

The interesting thing is that when I try this with a regex tester, it works fine, but when I actually run the code, I get the problem described above.

Thanks.

1
Here's one way that doesn't use regex: eval.in/309744 - scrowler
If your issue is that you are not stopping at the first > after the img, you need to use ? to make the quantifier lazy; the regex would be <img src=.*\/storyimages\/.*?> , tested here - arkoak

1 Answer

2
votes

Use the regex <img src=.*?\/storyimages\/.*?> instead.

The main point is using the *? quantifier to make the matching non-greedy (i.e. match as few characters as possible). A greedy .* runs to the last > in the string, while .*? stops at the first one.
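To make the difference concrete, here is a minimal sketch that runs both the greedy pattern from the question and the lazy pattern on the question's sample text. The variable names $greedy and $lazy are only illustrative.

```php
<?php
// Sample text taken from the question.
$str = '<br><img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131"><b>(CNS) </b>Lorem ipsum dolor...';

// Greedy: the final .* runs to the LAST > in the string (the one closing </b>).
preg_match('/<img src=.*\/storyimages\/.*>/', $str, $greedy);

// Lazy: the final .*? stops at the FIRST > after /storyimages/.
preg_match('/<img src=.*?\/storyimages\/.*?>/', $str, $lazy);

echo $greedy[0], PHP_EOL; // the <img> tag plus "<b>(CNS) </b>"
echo $lazy[0], PHP_EOL;   // only the <img> tag
```

The greedy version swallows everything up to the > of </b>, which matches the symptom described in the question.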

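Here is some sample PHP code:

// Lazy quantifiers (.*?) stop at the first closing > instead of the last one.
$re = '/<img src=.*?\/storyimages\/.*?>/';
$str = '<br><img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131"><b>(CNS) </b>Lorem ipsum dolor...';
preg_match($re, $str, $matches);
// $matches[0] now holds the matched <img> tag.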

The match will look like <img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131">.
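Since the goal in the question is to delete the tag rather than just find it, the same idea can be used with preg_replace(). The sketch below is one possible variant, not the only way to do it: removeStoryImages is a hypothetical helper name, and the pattern assumes the src value is wrapped in double quotes, as in the sample text. It swaps the lazy .*? for negated character classes, because a lazy .*? can still run across an earlier, unrelated <img src=...> tag on its way to /storyimages/.

```php
<?php
// Hypothetical helper (not from the original answer): strips <img> tags whose
// double-quoted src attribute contains /storyimages/.
function removeStoryImages(string $html): string
{
    // [^>]* and [^">]* cannot cross a closing > or ", so a match stays inside one tag.
    // "~" is the delimiter, so the slashes need no escaping; "i" ignores case.
    $pattern = '~<img\b[^>]*\bsrc="[^">]*/storyimages/[^">]*"[^>]*>~i';

    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

$str = '<br><img src="/storyimages/myimage.jpg" align="right" WIDTH="105" HEIGHT="131"><b>(CNS) </b>Lorem ipsum dolor...';
echo removeStoryImages($str), PHP_EOL;
```

On the sample text this leaves <br><b>(CNS) </b>Lorem ipsum dolor..., and an <img> tag whose src does not contain /storyimages/ is left in place.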