0
votes

I want to match a web address with a regex that captures both http://www.google.com and www.google.com, i.e. with and without the protocol.

4
Did you mean to match specific domain names (e.g. google), or do you need to match arbitrary domain names? - Zach Scrivena
I want to catch domain names with and without protocols - shabby
I found this answer very useful: stackoverflow.com/a/4820675/1740705 - Philipp

4 Answers

3
votes

Well, it's going to depend on exactly what you want to capture (FTP? A path such as "/index.htm"?), because a general URI pattern based on the RFC standard is very hard to write. But you could start with:

/^((https?\:\/\/)?([\w\d\-]+\.){2,}([\w\d]{2,})((\/[\w\d\-\.]+)*(\/[\w\d\-]+\.[\w\d]{3,4}(\?.*)?)?)?)$/

Complicated, see?
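To see what that pattern accepts and rejects, here is a minimal sketch in Python (my choice of language; the answer does not name one). The pattern is copied verbatim into a raw string. The helper name `looks_like_url` and the sample inputs are made up for illustration.

```python
import re

# The pattern from the answer, copied verbatim (split over two lines only for width).
URL_PATTERN = re.compile(
    r"^((https?\:\/\/)?([\w\d\-]+\.){2,}([\w\d]{2,})"
    r"((\/[\w\d\-\.]+)*(\/[\w\d\-]+\.[\w\d]{3,4}(\?.*)?)?)?)$"
)


def looks_like_url(text: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return URL_PATTERN.match(text) is not None


# Illustrative inputs only; extend with whatever you actually need to accept.
SAMPLES = [
    "http://www.google.com",
    "www.google.com",
    "google.com",
    "ftp://www.google.com",
    "http://www.google.com/index.html?q=1",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, looks_like_url(sample))
```

Note that bare google.com is rejected, because `([\w\d\-]+\.){2,}` demands at least two dot-terminated labels before the final one, and ftp:// is rejected because only http and https are allowed as the protocol.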

1
votes

Read RFC 3986. It is not as easy as you might think. The job is easier if you only have a small set of URLs to parse.
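For a taste of why, RFC 3986 (Appendix B) gives a regular expression that only splits a URI reference into its components; it does not validate anything. A rough sketch in Python (my choice of language), where the function name `split_uri` and the dictionary keys are my own labels for the RFC's numbered groups:

```python
import re

# Regular expression from RFC 3986, Appendix B. It splits, it does not validate.
URI_PARTS = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(text: str) -> dict:
    """Hypothetical helper: map the RFC's group numbers to component names."""
    m = URI_PARTS.match(text)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    print(split_uri("http://www.google.com"))
    print(split_uri("www.google.com"))
```

Without a protocol there is no `//`, so www.google.com comes back as a path with no authority at all. That is the heart of the problem in the question: the protocol-less form is not a URI with a host as far as the RFC's grammar is concerned, so you have to decide for yourself what counts as a domain.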

0
votes

Why not

/google\.com/

?

It catches http://www.google.com, www.google.com, and even google.com for free! :-)
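A quick way to check that claim, sketched in Python (my choice of language) with an unanchored search. The helper name `mentions_domain` and the extra sample hosts are made up for illustration.

```python
import re

# Unanchored pattern from the answer: matches "google.com" anywhere in the string.
DOMAIN = re.compile(r"google\.com")


def mentions_domain(text: str) -> bool:
    """Hypothetical helper: True if the text contains the literal google.com."""
    return DOMAIN.search(text) is not None


if __name__ == "__main__":
    for sample in (
        "http://www.google.com",
        "www.google.com",
        "google.com",
        "notgoogle.com.example.org",  # made-up host, also contains the substring
        "www.example.org",
    ):
        print(sample, mentions_domain(sample))
```

Because nothing anchors the pattern, it also fires on any longer string that merely contains google.com, so it only fits when a substring test for one known domain is all you need.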