41
votes

Does someone have a regular expression that gets a link to a Youtube video (not embedded object) from (almost) all the possible ways of linking to Youtube?

I think this is a pretty common problem and I'm sure there are a lot of ways to link that.

A starting point would be:

19

19 Answers

66
votes

So far I got this Regular expression working for the examples I posted, and it gets the ID on the first group:

http(?:s?):\/\/(?:www\.)?youtu(?:be\.com\/watch\?v=|\.be\/)([\w\-\_]*)(&(amp;)?‌​[\w\?‌​=]*)?
16
votes

You can use this expression below.

(?:https?:\/\/)?(?:www\.)?youtu\.?be(?:\.com)?\/?.*(?:watch|embed)?(?:.*v=|v\/|\/)([\w\-_]+)\&?

I'm using it, and it cover the most used URLs. I'll keep updating it on This Gist. You can test it on this tool.

5
votes

I improved the links posted above with a friend for a script I wrote for IRC to recognize even links without http at all. It worked on all stress tests I got so far, including garbled text with barely recognizable youtube urls, so here it is:

~(?:https?://)?(?:www\.)?youtu(?:be\.com/watch\?(?:.*?&(?:amp;)?)?v=|\.be/)([\w\-]+)(?:&(?:amp;)?[\w\?=]*)?~
4
votes

I testet all the regular expressions that are shown here and none could cover all url types that my client was using.

I built this pretty much through trial and error, but it seems to work with all the patterns that Poppy Deejay posted.

"(?:.+?)?(?:\/v\/|watch\/|\?v=|\&v=|youtu\.be\/|\/v=|^youtu\.be\/)([a-zA-Z0-9_-]{11})+"

Maybe it helps someone who is in a similar situation that I had today ;)

4
votes

Piggy backing on Fanmade, this covers the below links including the url encoded version of attribution_links:

(?:.+?)?(?:\/v\/|watch\/|\?v=|\&v=|youtu\.be\/|\/v=|^youtu\.be\/|watch\%3Fv\%3D)([a-zA-Z0-9_-]{11})+



https://www.youtube.com/attribution_link?a=tolCzpA7CrY&u=%2Fwatch%3Fv%3DMoBL33GT9S8%26feature%3Dshare
https://www.youtube.com/watch?v=MoBL33GT9S8&feature=share
http://www.youtube.com/watch?v=iwGFalTRHDA 
https://www.youtube.com/watch?v=iwGFalTRHDA 
http://www.youtube.com/watch?v=iwGFalTRHDA&feature=related 
http://youtu.be/iwGFalTRHDA 
http://www.youtube.com/embed/watch?feature=player_embedded&v=iwGFalTRHDA
http://www.youtube.com/embed/watch?v=iwGFalTRHDA
http://www.youtube.com/embed/v=iwGFalTRHDA
http://www.youtube.com/watch?feature=player_embedded&v=iwGFalTRHDA
http://www.youtube.com/watch?v=iwGFalTRHDA
www.youtube.com/watch?v=iwGFalTRHDA 
www.youtu.be/iwGFalTRHDA 
youtu.be/iwGFalTRHDA 
youtube.com/watch?v=iwGFalTRHDA 
http://www.youtube.com/watch/iwGFalTRHDA
http://www.youtube.com/v/iwGFalTRHDA
http://www.youtube.com/v/i_GFalTRHDA
http://www.youtube.com/watch?v=i-GFalTRHDA&feature=related 
http://www.youtube.com/attribution_link?u=/watch?v=aGmiw_rrNxk&feature=share&a=9QlmP1yvjcllp0h3l0NwuA
http://www.youtube.com/attribution_link?a=fF1CWYwxCQ4&u=/watch?v=qYr8opTPSaQ&feature=em-uploademail
http://www.youtube.com/attribution_link?a=fF1CWYwxCQ4&feature=em-uploademail&u=/watch?v=qYr8opTPSaQ
4
votes

I like @brunodles's solution the most but you can still match non video links like https://www.youtube.com/feed/subscriptions

I went with this solution

(?:https?:\/\/)?(?:www\.)?youtu(?:\.be\/|be.com\/\S*(?:watch|embed)(?:(?:(?=\/[^&\s\?]+(?!\S))\/)|(?:\S*v=|v\/)))([^&\s\?]+)

It can also be used to match multiple whitespace separated links. The video id will be captured in the first group.

Tested with the following urls:

youtu.be/iwGFalTRHDA
youtube.com/watch?v=iwGFalTRHDA
www.youtube.com/watch?v=iwGFalTRHDA
http://www.youtube.com/watch?v=iwGFalTRHDA
https://www.youtube.com/watch?v=iwGFalTRHDA
https://www.youtube.com/watch?v=MoBL33GT9S8&feature=share
https://www.youtube.com/embed/watch?feature=player_embedded&v=iwGFalTRHDA
https://www.youtube.com/embed/watch?v=iwGFalTRHDA
https://www.youtube.com/embed/v=iwGFalTRHDA
https://www.youtube.com/watch/iwGFalTRHDA
http://www.youtube.com/attribution_link?u=/watch?v=aGmiw_rrNxk&feature=share

// will not match
https://www.youtube.com/feed/subscriptions
https://www.youtube.com/channel/UCgc00bfF_PvO_2AvqJZHXFg
https://www.youtube.com/c/NatGeoEdOrg/videos

https://regex101.com/r/mPyKKP/5

2
votes

I've been having problems lately with the atttribution_link urls so i tried making my own regex that works for those too.

Here is my regex string:

(https?://)?(www\\.)?(yotu\\.be/|youtube\\.com/)?((.+/)?(watch(\\?v=|.+&v=))?(v=)?)([\\w_-]{11})(&.+)?

and here are some test cases i've tried:

http://www.youtube.com/watch?v=iwGFalTRHDA 
https://www.youtube.com/watch?v=iwGFalTRHDA 
http://www.youtube.com/watch?v=iwGFalTRHDA&feature=related 
http://youtu.be/iwGFalTRHDA 
http://www.youtube.com/embed/watch?feature=player_embedded&v=iwGFalTRHDA
http://www.youtube.com/embed/watch?v=iwGFalTRHDA
http://www.youtube.com/embed/v=iwGFalTRHDA
http://www.youtube.com/watch?feature=player_embedded&v=iwGFalTRHDA
http://www.youtube.com/watch?v=iwGFalTRHDA
www.youtube.com/watch?v=iwGFalTRHDA 
www.youtu.be/iwGFalTRHDA 
youtu.be/iwGFalTRHDA 
youtube.com/watch?v=iwGFalTRHDA 
http://www.youtube.com/watch/iwGFalTRHDA
http://www.youtube.com/v/iwGFalTRHDA
http://www.youtube.com/v/i_GFalTRHDA
http://www.youtube.com/watch?v=i-GFalTRHDA&feature=related 
http://www.youtube.com/attribution_link?u=/watch?v=aGmiw_rrNxk&feature=share&a=9QlmP1yvjcllp0h3l0NwuA
http://www.youtube.com/attribution_link?a=fF1CWYwxCQ4&u=/watch?v=qYr8opTPSaQ&feature=em-uploademail
http://www.youtube.com/attribution_link?a=fF1CWYwxCQ4&feature=em-uploademail&u=/watch?v=qYr8opTPSaQ

Also remember to check the string you get for your video url, sometimes it may get the percent characters. If so just do this

url = [url stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

and it should fix it.

Remember also that the index of the youtube key is now index 9.

NSRange youtubeKey = [result rangeAtIndex:9]; //the youtube key
NSString * strKey = [url substringWithRange:youtubeKey] ;
1
votes

It'd be the longest RegEx in the world if you managed to cover all link formats, but here's one to get you started which will cover the first couple of link formats:

http://(www\.)?youtube\.com/watch\?.*v=([a-zA-Z0-9]+).*

The second group will match the video ID if you need to get that out.

1
votes
(?:http?s?:\/\/)?(?:www.)?(?:m.)?(?:music.)?youtu(?:\.?be)(?:\.com)?(?:(?:\w*.?:\/\/)?\w*.?\w*-?.?\w*\/(?:embed|e|v|watch|.*\/)?\??(?:feature=\w*\.?\w*)?&?(?:v=)?\/?)([\w\d_-]{11})(?:\S+)?

https://regex101.com/r/nJzgG0/3

Detects YouTube and YouTube Music link in any string

1
votes

I took all variants from here:

https://gist.github.com/rodrigoborgesdeoliveira/987683cfbfcc8d800192da1e73adc486#file-youtubeurlformats-txt

And built this regexp (YouTube ID is in group 2):

(\/|%3D|v=|vi=)([0-9A-z-_]{11})[%#?&\s]

Check it here: https://regexr.com/4u4ud

Edit: Works for any single string w/o breaks.

0
votes

I'm working with that kind of links:

http://www.youtube.com/v/M-faNJWc9T0?fs=1&rel=0

And here's the regEx I'm using to get ID from it:

"(.+?)(\/v/)([a-zA-Z0-9_-]{11})+"
0
votes

This is iterating on the existing answers and handles edge cases better. (for example http://thisisnotyoutu.be/thing)

/(?:https?:\/\/|www\.|m\.|^)youtu(?:be\.com\/watch\?(?:.*?&(?:amp;)?)?v=|\.be\/)([\w‌​\-]+)(?:&(?:amp;)?[\w\?=]*)?/
0
votes

here is the complete solution for getting youtube video id for java or android, i didn't found any link which doesn't work with this function

public static String getValidYoutubeVideoId(String youtubeUrl)
{
    if(youtubeUrl == null || youtubeUrl.trim().contentEquals(""))
    {
        return "";
    }
    youtubeUrl = youtubeUrl.trim();
    String validYoutubeVideoId = "";
    String regexPattern = "^(?:https?:\\/\\/)?(?:[0-9A-Z-]+\\.)?(?:youtu\\.be\\/|youtube\\.com\\S*[^\\w\\-\\s])([\\w\\-]{11})(?=[^\\w\\-]|$)(?![?=&+%\\w]*(?:['\"][^<>]*>|<\\/a>))[?=&+%\\w]*";
    Pattern regexCompiled = Pattern.compile(regexPattern, Pattern.CASE_INSENSITIVE);
    Matcher regexMatcher = regexCompiled.matcher(youtubeUrl);
    if(regexMatcher.find())
    {
        try
        {
            validYoutubeVideoId = regexMatcher.group(1);
        }
        catch(Exception ex)
        {
        }
    }
    return validYoutubeVideoId;
}
0
votes

This is my answer to use in Scala. This is useful to extract 11 digits from Youtube's URL.

"https?://(?:[0-9a-zA-Z-]+.)?(?:www.youtube.com/|youtu.be\S*[^\w-\s])([\w -]{11})(?=[^\w-]|$)(?![?=&+%\w](?:[\'"][^<>]>|))[?=&+%\w-]*"

def getVideoLinkWR: UserDefinedFunction = udf(f = (videoLink: String) => {
    val youtubeRgx = """https?://(?:[0-9a-zA-Z-]+\.)?(?:youtu\.be/|youtube\.com\S*[^\w\-\s])([\w \-]{11})(?=[^\w\-]|$)(?![?=&+%\w]*(?:[\'"][^<>]*>|</a>))[?=&+%\w-./]*""".r
    videoLink match {
        case youtubeRgx(a) => s"$a".toString
        case _ => videoLink.toString
    }
}
0
votes

Youtube video URL Change to iframe supported link:

REGEX: https://regex101.com/r/LeZ9WH/2/

http://www.youtube.com/watch?v=iwGFalTRHDA
http://www.youtube.com/watch?v=iwGFalTRHDA&feature=related
http://youtu.be/iwGFalTRHDA
http://youtu.be/n17B_uFF4cA
http://www.youtube.com/embed/watch?feature=player_embedded&v=r5nB9u4jjy4
http://www.youtube.com/watch?v=t-ZRX8984sc
http://youtu.be/t-ZRX8984sc
https://youtu.be/2sFlFPmUfNo?t=1

Php function example:

if (!function_exists('clean_youtube_link')) {

        /**
         * @param $link
         * @return string|string[]|null
         */
        function clean_youtube_link($link)
        {
            return preg_replace(
                '#(.+?)(\/)(watch\x3Fv=)?(embed\/watch\x3Ffeature\=player_embedded\x26v=)?([a-zA-Z0-9_-]{11})+#',
                "https://www.youtube.com/embed/$5",
                $link
            );
        }
}
0
votes

This should work for almost all youtube links when extracting from a string:

((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube\.com|youtu.be))(\/(?:[\w\-]+\?v=|embed\/|v\/)?)([\w\-]{10}).\b
0
votes
    var isValidYoutubeLink: Bool{
        // working for all the youtube url's
        NSPredicate(format: "SELF MATCHES %@", "(?:http?s?:\\/\\/)?(?:www.)?(?:m.)?(?:music.)?youtu(?:\\.?be)(?:\\.com)?(?:(?:\\w*.?:\\/\\/)?\\w*.?\\w*-?.?\\w*\\/(?:embed|e|v|watch|.*\\/)?\\??(?:feature=\\w*\\.?\\w*)?&?(?:v=)?\\/?)([\\w\\d_-]{11})(?:\\S+)?").evaluate(with: self)
    }
0
votes

With this Javascript Regex, the first capture is a video ID :

^(?:https?:)?(?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube(?:\-nocookie)?\.(?:[A-Za-z]{2,4}|[A-Za-z]{2,3}\.[A-Za-z]{2})\/)(?:watch|embed\/|vi?\/)*(?:\?[\w=&]*vi?=)?([^#&\?\/]{11}).*$

-1
votes

This regex solve my problem, I can get youtube link having watch, embed or shared link

(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)

You can check here https://regex101.com/r/Kvk0nB/1