I am curious what type of ID YouTube uses to identify its videos. They seem to be the same type of ID used for short URLs on sites like Digg and Bit.ly.
4 Answers
It's modified Base64, as Ishmael guessed. Standard Base64 matches [a-zA-Z0-9+/]*={0,2}. That is, the digits are A-Z, a-z, 0-9, +, and /, and the output is padded with 0, 1, or 2 "=" at the end. YouTube seems to skip the padding (as the modified Base64 used by UTF-7 does), and since + and / pose problems in URLs, - and _ are substituted for them, respectively.
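To make the alphabet swap concrete, here is a minimal Python sketch using the standard library's URL-safe Base64 functions. The helper names `to_url_safe_id` and `from_url_safe_id` are made up for illustration, and this only shows the encoding itself; it makes no claim about how YouTube actually generates its IDs.

```python
import base64


def to_url_safe_id(raw: bytes) -> str:
    """Encode bytes with the URL-safe alphabet (- and _) and drop '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def from_url_safe_id(text: str) -> bytes:
    """Put back the padding that was dropped, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


# Two bytes chosen so that the standard encoding contains '+', '/' and '='.
raw = bytes([0xFB, 0xFF])
standard = base64.b64encode(raw).decode("ascii")  # "+/8="
url_safe = to_url_safe_id(raw)                    # "-_8"
```

Decoding needs the padding restored because Python's decoder expects the input length to be a multiple of 4.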
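Therefore, a YouTube ID should match the regexp /^[a-zA-Z0-9\-_]+$/ or /^[\w\-]+$/ (they're equivalent as long as \w is ASCII-only, i.e. [A-Za-z0-9_]; in Unicode-aware regex engines \w can match more). The ^ and $ anchors make the pattern cover the whole ID rather than just part of it.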
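For what it's worth, a check along those lines might look like this in Python. This is only a sketch: `looks_like_video_id` is a hypothetical helper name, the sample IDs are invented, and it tests the character set only, not whether a video actually exists.

```python
import re

# Explicit character class: letters, digits, underscore and hyphen.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def looks_like_video_id(candidate: str) -> bool:
    """True if the whole string uses only URL-safe Base64 characters."""
    return ID_PATTERN.fullmatch(candidate) is not None


looks_like_video_id("abc-DEF_123")  # True
looks_like_video_id("abc+DEF/123")  # False: + and / are not in the alphabet

# In Python 3, \w on str patterns is Unicode-aware unless re.ASCII is passed,
# so the \w form accepts accented letters by default.
unicode_match = re.fullmatch(r"[\w-]+", "é")
ascii_match = re.fullmatch(r"[\w-]+", "é", re.ASCII)
```

I used `fullmatch` rather than `search` so that a string with stray characters around a valid-looking run is rejected.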
I use this in a dynamic YouTube SWFObject loader implementation and it works fine. I've observed both - and _ in YouTube IDs, but never any other non-alphanumeric character. More Base64 info can be found on Wikipedia, in the section on URL applications of Base64.
Best of luck!