The Media Source Extensions (MSE) API needs fragmented MP4 for playback in the browser.

1 Answer

A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.

Boxes aka Atoms

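All MP4 files use an object-oriented format made up of boxes (also called atoms). Each box starts with a header giving its size and a four-character type such as ftyp, moov or mdat.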
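To make the box structure concrete, here is a minimal sketch that walks the top-level boxes of a file already loaded into memory. The names `BoxInfo` and `listTopLevelBoxes` are made up for illustration, and the sketch does not descend into nested boxes.

```typescript
// Illustrative only: BoxInfo and listTopLevelBoxes are hypothetical names.
interface BoxInfo {
  type: string;   // four-character code, e.g. "ftyp", "moov", "mdat"
  offset: number; // byte offset of the box header within the buffer
  size: number;   // total box size in bytes, header included
}

function listTopLevelBoxes(data: Uint8Array): BoxInfo[] {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const boxes: BoxInfo[] = [];
  let offset = 0;

  while (offset + 8 <= data.byteLength) {
    // Header: 32-bit big-endian size followed by a 4-byte type.
    let size = view.getUint32(offset);
    const type = String.fromCharCode(
      view.getUint8(offset + 4),
      view.getUint8(offset + 5),
      view.getUint8(offset + 6),
      view.getUint8(offset + 7)
    );

    if (size === 1) {
      // A size of 1 means a 64-bit size follows the type field.
      if (offset + 16 > data.byteLength) break;
      size = view.getUint32(offset + 8) * 4294967296 + view.getUint32(offset + 12);
    } else if (size === 0) {
      // A size of 0 means the box runs to the end of the file.
      size = data.byteLength - offset;
    }

    if (size < 8) break; // malformed header; stop rather than loop forever
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}
```

Run against a non-fragmented file, this should list a handful of boxes including one large mdat; against a fragmented file, it should list a repeating moof, mdat pattern.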

You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or, if you're on Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:

Non-Fragmented MP4

This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Media Data) box.

Representation of boxes within a normal, non-fragmented MP4, generated using MP4 Parser

If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10-second mark in a 0.5 Mbps file and in a 1 Mbps file in order to switch the video source between the two at that moment. Determining this exact byte position within one massive mdat in each file is not trivial.

Fragmented MP4

This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.

Representation of boxes within a fragmented MP4, generated using MP4 Parser

You'll notice the sidx box and the series of moof+mdat boxes. The sidx is the Segment Index; it stores metadata giving the precise byte ranges of the moof+mdat segments.
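As a rough illustration of what the sidx gives you, the sketch below turns one sidx box into a list of byte ranges and start times. `SegmentRange`, `parseSidx` and `readUint64` are hypothetical names, and the sketch assumes every entry points directly at a media segment rather than at another nested sidx. The field layout follows the Segment Index box definition in ISO/IEC 14496-12.

```typescript
// Illustrative only: these names are not part of any library.
interface SegmentRange {
  start: number;     // absolute offset of the segment's first byte
  end: number;       // absolute offset of its last byte (inclusive)
  startTime: number; // seconds
  duration: number;  // seconds
}

function readUint64(view: DataView, pos: number): number {
  // Exact up to 2^53, which is plenty for file offsets.
  return view.getUint32(pos) * 4294967296 + view.getUint32(pos + 4);
}

function parseSidx(data: Uint8Array, sidxOffset: number): SegmentRange[] {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const boxSize = view.getUint32(sidxOffset);
  const type = String.fromCharCode(
    view.getUint8(sidxOffset + 4), view.getUint8(sidxOffset + 5),
    view.getUint8(sidxOffset + 6), view.getUint8(sidxOffset + 7)
  );
  if (type !== "sidx") throw new Error("not a sidx box: " + type);

  const version = view.getUint8(sidxOffset + 8); // then 3 bytes of flags
  // Bytes 12..15 hold reference_ID, which this sketch ignores.
  const timescale = view.getUint32(sidxOffset + 16);
  let pos = sidxOffset + 20;

  let earliestTime: number;
  let firstOffset: number;
  if (version === 0) {
    earliestTime = view.getUint32(pos);
    firstOffset = view.getUint32(pos + 4);
    pos += 8;
  } else {
    earliestTime = readUint64(view, pos);
    firstOffset = readUint64(view, pos + 8);
    pos += 16;
  }

  pos += 2; // reserved
  const referenceCount = view.getUint16(pos);
  pos += 2;

  // Offsets are counted from the first byte after the sidx box.
  let byteStart = sidxOffset + boxSize + firstOffset;
  let time = earliestTime;
  const segments: SegmentRange[] = [];

  for (let i = 0; i < referenceCount; i++) {
    // Top bit is reference_type; the low 31 bits are the referenced size.
    const referencedSize = view.getUint32(pos) & 0x7fffffff;
    const subsegmentDuration = view.getUint32(pos + 4);
    pos += 12; // skip the 4 bytes of SAP information as well

    segments.push({
      start: byteStart,
      end: byteStart + referencedSize - 1,
      startTime: time / timescale,
      duration: subsegmentDuration / timescale,
    });
    byteStart += referencedSize;
    time += subsegmentDuration;
  }
  return segments;
}
```

Each resulting `start`/`end` pair can be used directly as the bounds of an HTTP byte-range request for that segment.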

Essentially, you can load the sidx on its own (its byte range will be defined in the accompanying .mpd Media Presentation Description file) and then choose which segments you'd like to load next and append to the MSE SourceBuffer.
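A rough sketch of that load-and-append step might look like the following. The helper names (`byteRangeHeader`, `fetchRange`, `appendAndWait`, `loadSegment`) are invented for this example, and `AppendTarget` is a small stand-in type for the browser's SourceBuffer so that the snippet does not depend on DOM typings. It assumes the server answers range requests with 206 Partial Content.

```typescript
// Stand-in for the parts of SourceBuffer this sketch uses (an assumption,
// not the full DOM interface).
interface AppendTarget {
  appendBuffer(data: ArrayBuffer): void;
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean }
  ): void;
}

// HTTP byte ranges are inclusive at both ends.
function byteRangeHeader(start: number, end: number): string {
  return `bytes=${start}-${end}`;
}

async function fetchRange(url: string, start: number, end: number): Promise<ArrayBuffer> {
  const response = await fetch(url, {
    headers: { Range: byteRangeHeader(start, end) },
  });
  if (response.status !== 206) {
    throw new Error(`expected 206 Partial Content, got ${response.status}`);
  }
  return response.arrayBuffer();
}

// appendBuffer is asynchronous: wait for "updateend" before appending again.
function appendAndWait(target: AppendTarget, chunk: ArrayBuffer): Promise<void> {
  return new Promise<void>((resolve) => {
    target.addEventListener("updateend", () => resolve(), { once: true });
    target.appendBuffer(chunk);
  });
}

async function loadSegment(
  url: string,
  range: { start: number; end: number },
  target: AppendTarget
): Promise<void> {
  const chunk = await fetchRange(url, range.start, range.end);
  await appendAndWait(target, chunk);
}
```

In a player you would typically append the initialization data (ftyp and moov) first, then call something like `loadSegment` once per segment, in playback order.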

Importantly, each segment is created at a regular interval of your choosing (e.g. every 5 seconds), so the segments are temporally aligned across files of different bitrates, making it easy to adapt the bitrate during playback.
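Because the segments line up in time, switching bitrate comes down to keeping the same segment index and picking a different file. Here is a small sketch of that idea; `Representation`, `segmentIndexAt` and `pickRepresentation` are hypothetical names, and the selection rule (highest bandwidth that fits the measured throughput) is just one simple policy, not what any particular player does.

```typescript
// Illustrative only: a deliberately simple bitrate selection policy.
interface Representation {
  id: string;
  bandwidth: number; // bits per second
}

// With fixed-length, aligned segments, the index depends only on time,
// so it is the same in every representation.
function segmentIndexAt(timeSec: number, segmentDurationSec: number): number {
  return Math.floor(timeSec / segmentDurationSec);
}

// Pick the highest bandwidth that does not exceed the measured throughput,
// falling back to the lowest one if nothing fits.
function pickRepresentation(
  representations: Representation[],
  measuredBps: number
): Representation {
  const sorted = representations.slice().sort((a, b) => a.bandwidth - b.bandwidth);
  let choice = sorted[0];
  if (!choice) throw new Error("no representations available");
  for (const candidate of sorted) {
    if (candidate.bandwidth <= measuredBps) choice = candidate;
  }
  return choice;
}
```

For example, with 5-second segments the 10-second mark is the start of segment index 2 in both the 0.5 Mbps and the 1 Mbps file, so the player can fetch index 2 from whichever file `pickRepresentation` returns.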