13
votes

I am curious what type of ID YouTube uses to identify its videos. They seem to be the same type of ID used for short URLs on sites like Digg and Bit.ly.

4 Answers

5
votes

It's probably a modified Base64 representation of a GUID. (Common Base64 implementations include characters that are problematic in URLs.)

33
votes

It's modified Base64, as Ishmael guessed. Normal Base64 is [a-zA-Z0-9+/]+={0,2}. That is, the encoded string contains A-Z, a-z, 0-9, +, or /, and is padded with zero, one, or two "=" at the end. YouTube seems to skip the padding (like the modified Base64 used by UTF-7), and since + and / pose problems for URLs, - and _ are substituted respectively.
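
As a rough illustration of that encoding, here is a minimal Python sketch. The helper names and the 8 sample bytes are made up for the example; how YouTube actually generates its IDs isn't stated here, so this only shows the "URL-safe alphabet, no padding" variant described above.

```python
import base64


def to_url_id(raw: bytes) -> str:
    """Encode bytes as URL-safe Base64 ('-' and '_' replace '+' and '/'), dropping '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def from_url_id(text: str) -> bytes:
    """Decode an unpadded URL-safe Base64 string back to bytes."""
    # Restore the stripped padding so the standard decoder accepts the input.
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


# Arbitrary sample bytes (hypothetical, chosen so '+' and '/' show up in standard Base64).
raw = bytes([0xFB, 0xFF, 0xBE, 0x00, 0x01, 0x02, 0x03, 0x04])

print(base64.b64encode(raw).decode("ascii"))  # standard Base64, with padding
print(to_url_id(raw))                         # URL-safe variant, no padding
```

With these sample bytes the standard encoding contains +, / and a trailing =, while the URL-safe version contains only letters, digits, - and _.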

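Therefore, a YouTube ID should match the regular expression /[a-zA-Z0-9\-_]+/ or /[\w\-]+/ (they're equivalent as long as \w means [A-Za-z0-9_]; some Unicode-aware regex engines make \w broader). To validate a whole string rather than find a match inside it, anchor the pattern: /^[\w\-]+$/.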
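
For what it's worth, a small Python sketch of that check might look like the following. The function name and the sample strings are hypothetical; it only tests the character set, not whether a video with that ID exists.

```python
import re

# Same character class as the regexp above.
# re.ASCII keeps \w limited to [A-Za-z0-9_]; without it, Python's \w also matches Unicode letters.
ID_PATTERN = re.compile(r"[\w-]+", re.ASCII)


def looks_like_video_id(candidate: str) -> bool:
    """Return True if the whole string uses only A-Z, a-z, 0-9, '-' and '_'."""
    # fullmatch requires the entire string to match, so no explicit ^ and $ are needed.
    return ID_PATTERN.fullmatch(candidate) is not None


# Made-up sample strings, not real video IDs.
for sample in ["abc-DEF_123", "abc+DEF/123", "abc=", ""]:
    print(repr(sample), looks_like_video_id(sample))
```

Using fullmatch rather than search matters here: an unanchored search would accept any string that merely contains one valid character.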

I use this in a dynamic YouTube SWFObject loader implementation and it works fine. I've observed both - and _ in YouTube IDs, but never any other non-alpha-numeric character. More Base64 info can be found on Wikipedia: URL applications of Base64

Best of luck!

3
votes

They use this kind of non-sequential ID to prevent people from farming/spamming the videos by simply incrementing a number.

1
votes

I've seen at least one with an underscore ("_") in the mix, which surprised me, since until now I had assumed the same regexp as Piskvor.