I want to match a web address with a regex that captures http://www.google.com as well as www.google.com, i.e. with and without the protocol.
0
votes
4 Answers
3
votes
Well, it depends on exactly what you want to capture ("FTP"? "/index.htm"?), because a general URI matcher based on the RFC standard is very hard to write, but you could start with:
/^((https?\:\/\/)?([\w\d\-]+\.){2,}([\w\d]{2,})((\/[\w\d\-\.]+)*(\/[\w\d\-]+\.[\w\d]{3,4}(\?.*)?)?)?)$/
Complicated, see?
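As a quick check, here is a minimal Python sketch of the pattern above. It assumes Python's re module rather than whatever engine the slash-delimited literal was written for. The redundant \d inside classes that already contain \w and the unneeded backslashes are dropped, re.ASCII keeps \w limited to ASCII, and the name is_web_address is made up for illustration.

```python
import re

# Same pattern as above, minus redundant \d and unneeded escapes.
# re.ASCII makes \w mean [A-Za-z0-9_] only (an assumption about the intent).
URL_PATTERN = re.compile(
    r"^((https?://)?([\w\-]+\.){2,}(\w{2,})"
    r"((/[\w\-.]+)*(/[\w\-]+\.\w{3,4}(\?.*)?)?)?)$",
    re.ASCII,
)


def is_web_address(text: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return URL_PATTERN.match(text) is not None


if __name__ == "__main__":
    for candidate in (
        "http://www.google.com",
        "www.google.com",
        "https://www.google.com/index.htm",
        "google.com",
        "ftp://www.google.com",
    ):
        print(candidate, is_web_address(candidate))
```

Note that google.com on its own is rejected, because the {2,} quantifier requires at least two dot-terminated labels before the final one, and ftp:// is rejected because only http and https are allowed as the protocol.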
2
votes
1
votes
Read RFC 3986. It is not as easy as you might think. The job is easier if you only have a small set of URLs to parse.
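To see why, here is a small Python sketch using the component-splitting regular expression given in Appendix B of RFC 3986. The function name split_uri and the dictionary keys are made up for illustration. The expression only splits a URI reference into parts; it does not validate anything.

```python
import re

# Regular expression from RFC 3986, Appendix B.
# Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
RFC3986_SPLIT = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(text: str) -> dict:
    """Hypothetical helper: return the five components (None if absent)."""
    m = RFC3986_SPLIT.match(text)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    print(split_uri("http://www.google.com"))
    print(split_uri("www.google.com"))
```

With the protocol, www.google.com lands in the authority component. Without it, there is no scheme and no authority, and the whole string is treated as a path, which is one reason the "with and without protocol" case needs extra handling.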
0
votes
Why not
/google\.com/
?
It catches http://www.google.com, www.google.com, and even google.com for free! :-)
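For what it's worth, a tiny Python sketch of this approach (the name mentions_google is made up for illustration). Since the pattern is unanchored, it is a substring search, so it also fires when google.com appears anywhere inside some other address.

```python
import re

# Unanchored: matches "google.com" anywhere in the string.
GOOGLE = re.compile(r"google\.com")


def mentions_google(text: str) -> bool:
    """Hypothetical helper: True if 'google.com' occurs anywhere in text."""
    return GOOGLE.search(text) is not None


if __name__ == "__main__":
    for candidate in (
        "http://www.google.com",
        "www.google.com",
        "google.com",
        "www.example.com",
        "http://example.org/?q=google.com",
    ):
        print(candidate, mentions_google(candidate))
```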