17
votes

I have a list of absolute URLs. I need to make sure that they all have trailing slashes, as applicable. So:

I'm guessing I need to use a regex, but matching URLs is a pain. I was hoping for an easier solution. Any ideas?

5
So what you actually want is for the URL path to be non-empty, right? - Gumbo
What about http://www.domain.com?message=hello ? - Kobi
@Gumbo - I'm not sure what you mean. - StackOverflowNewbie
@Kobi - good point. I suppose that should have a slash right before the question mark. - StackOverflowNewbie
@StackOverflowNewbie: The path comes right after the authority (i.e. the host name www.domain.com) and before an optional query or fragment. - Gumbo

5 Answers

14
votes

Rather than using a regex, you could use parse_url(). For example:

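$url = parse_url("http://www.example.com/ab/abc.html?a=b#xyz");
// An absolute URL with nothing after the host has no 'path' key; default it to '/'.
if (!isset($url['path'])) $url['path'] = '/';
$surl = $url['scheme'] . "://" . $url['host'] . $url['path'];
// Query and fragment are optional, so append them only when present.
if (isset($url['query'])) $surl .= '?' . $url['query'];
if (isset($url['fragment'])) $surl .= '#' . $url['fragment'];
echo $surl;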
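
To apply this to a whole list, the same idea can be wrapped in a function. The following is only a sketch: the name `add_root_slash` is made up for illustration, and it assumes every input is an absolute URL with a scheme and a host. It follows the reading from the comments above, i.e. it only fills in an empty path and leaves URLs that already have a path untouched.

```php
<?php
// Hypothetical helper: use "/" as the path when an absolute URL has none.
function add_root_slash(string $url): string
{
    $parts = parse_url($url);
    // Assumption: inputs are absolute URLs; anything else is returned unchanged.
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url;
    }
    // A URL that already has a path is left alone.
    if (isset($parts['path'])) {
        return $url;
    }
    // Rebuild the authority, keeping optional credentials and port.
    $auth = '';
    if (isset($parts['user'])) {
        $auth = $parts['user'];
        if (isset($parts['pass'])) {
            $auth .= ':' . $parts['pass'];
        }
        $auth .= '@';
    }
    $result = $parts['scheme'] . '://' . $auth . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= '/';
    // Query and fragment are optional, so append them only when present.
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

$urls = [
    'http://www.domain.com',
    'http://www.domain.com?message=hello',
    'http://www.domain.com/ab/abc.html?a=b#xyz',
];
print_r(array_map('add_root_slash', $urls));
```

With this sketch, the URL from Kobi's comment gets its slash right before the question mark, while the third URL is returned as-is because it already has a path.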
24
votes

For this very specific problem, not using a regex at all might be an option as well. If your list is long (several thousand URLs) and time is of any concern, you could choose to hand-code this very simple manipulation.

This will do the same:

$str .= (substr($str, -1) == '/' ? '' : '/');

It is of course not nearly as elegant or flexible as a regular expression, but it avoids the overhead of parsing the regular expression string and it will run as fast as PHP is able to do it.
It is arguably less readable than the regex, though this depends on how comfortable the reader is with regex syntax (some people might actually find it more readable).

Unlike, say, zerkms' regex, it does not check that the string is really a well-formed URL, but you already know that your strings are URLs, so that check would be redundant.

That said, if your list is only something like 10 or 20 URLs, forget this post and use a regex; the difference will be negligible.
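
For a list of URLs, the one-liner above could be wrapped up roughly like this. The function name `with_trailing_slash` and the sample URLs are made up for illustration. Note that it appends blindly: a URL such as the one from Kobi's comment would end up with the slash after the query string, so this sketch assumes the list contains no queries or fragments.

```php
<?php
// Hypothetical helper: append "/" unless the string already ends with one.
function with_trailing_slash(string $str): string
{
    return substr($str, -1) === '/' ? $str : $str . '/';
}

// Assumed sample input: URLs without query strings or fragments.
$urls = ['http://www.domain.com', 'http://www.domain.com/dir/'];
$urls = array_map('with_trailing_slash', $urls);
print_r($urls);
```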

4
votes
$url = 'http://www.domain.com';

$need_to_add_trailing_slash = preg_match('~^https?://[^/]+$~', $url);
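
The flag above only says whether a slash is needed; a possible way to act on it is sketched below. The second part is a variant, not the original answer's method: it swaps in preg_replace() with a lookahead so that a query string directly after the host (as in Kobi's comment) gets the slash in front of it. The function name `add_slash_after_host` is made up for illustration.

```php
<?php
$url = 'http://www.domain.com';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
if (preg_match('~^https?://[^/]+$~', $url) === 1) {
    $url .= '/';
}
echo $url, PHP_EOL;

// Hypothetical variant: insert "/" right after the host when the host is
// followed by "?", "#", or the end of the string.
function add_slash_after_host(string $url): string
{
    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace('~^(https?://[^/?#]+)(?=[?#]|$)~', '$1/', $url) ?? $url;
}

echo add_slash_after_host('http://www.domain.com?message=hello'), PHP_EOL;
```

URLs that already have a path after the host do not match the variant's pattern and come back unchanged.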
1
votes

Try this:

if (!preg_match('~/$~', $url)) {
    $url .= '/';
}
1
votes

This may not be the most elegant solution, but it works like a charm. First we get the full URL, then check whether it has a trailing slash. If not, we check that there is no query string and that the URL does not point to an actual file or directory. If the URL meets all these conditions, we do a 301 redirect with the trailing slash added.

If you're unfamiliar with PHP headers... note that there cannot be any output - not even whitespace - before this code.

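$url = $_SERVER['REQUEST_URI'];
$lastchar = substr( $url, -1 );
if ( $lastchar != '/' ):
    // Redirect only when there is no query string and the path is not an existing file or directory.
    if ( empty( $_SERVER['QUERY_STRING'] ) and !is_file( $_SERVER['DOCUMENT_ROOT'].$url ) and !is_dir( $_SERVER['DOCUMENT_ROOT'].$url ) ):
        header("HTTP/1.1 301 Moved Permanently");
        header( "Location: $url/" );
        exit; // stop here so nothing else is sent after the redirect headers
    endif;
endif;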