preg_match REGEX cleanup, identifying strings based on starting and ending

Question

I need to clean up Google News links in pages dynamically and extract the actual links to the content.

A Google News link looks like this:

http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115

I want to keep the actual link, everything after &url= :

http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115

I need a preg_match or preg_replace pattern that eliminates the "non-essential" part of the URL, in essence targeting everything starting with http://news.google.com and ending with &url=, that is:

http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=

As you can probably tell, I'm no regex expert. :)

Thanks a lot!

mathematical.coffee · Accepted Answer · 2012-01-16T07:57:11

You could use preg_replace with ~http://news\.google\.com.*?&url=~, replacing with ''. The lazy .*? stops at the first &url=, so only the Google News prefix is removed.

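Or, you could use preg_match with ~&url=(.*)$~ and pull out the first capture group, which PHP stores at index 1 of the matches array passed as the third argument.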
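Both suggestions can be sketched in PHP roughly as follows. This is a minimal illustration, not a definitive implementation: the variable names ($link, $stripped, $captured) are made up for the example, the sample URL is the one from the question, and the hostname is assumed to be news.google.com as shown there.

```php
<?php
// Sample link copied from the question.
$link = 'http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg'
      . '&url=http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115';

// Approach 1: remove the Google News prefix, up to and including "&url=".
// "~" is used as the delimiter so the slashes in the URL need no escaping.
$stripped = preg_replace('~http://news\.google\.com.*?&url=~', '', $link);

// Approach 2: capture everything after "&url=" to the end of the string.
// preg_match() returns 1 on a match and fills $matches; group 1 is the target URL.
$captured = null;
if (preg_match('~&url=(.*)$~', $link, $matches)) {
    $captured = $matches[1];
}

echo $stripped, "\n";
echo $captured, "\n";
```

For the sample link, both variables end up holding the Reuters URL. If the links in your pages carry the target percent-encoded (an assumption, since the example in the question is not encoded), passing the result through urldecode() would probably be needed as well.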
