5
votes

I've been trying to figure out the best way to go about implementing an idea I've had for a while.

Currently, I have an icecast mp3 stream of a radio scanner, with "now playing" metadata that is updated in realtime depending on what channel the scanner has landed on. When using a dedicated media player such as VLC, the metadata is perfectly lined up with the received audio and it functions exactly as I want it to - essentially a remote radio scanner. I would like to implement something similar via a webpage, and on the surface this seems like a simple task.

If all I wanted to do was stream audio, using simple <audio> tags would suffice. However, HTML5 audio players have no concept of the embedded in-stream metadata that icecast encodes along with the mp3 audio data. While I could query the current "now playing" metadata from the icecast server status json, due to client & serverside buffering there could be upwards of 20 seconds of delay between audio and metadata when done in this fashion. When the scanner is changing its "now playing" metadata upwards of every second in some cases, this is completely unsuitable for my application.

There is a very interesting Node.JS solution that was developed with this exact goal in mind - realtime metadata in a radio scanner application: icecast-metadata-js. This shows that it is indeed possible to handle both audio and metadata from a single icecast stream. The live demo is particularly impressive: https://eshaz.github.io/icecast-metadata-js/

However, I'm looking for a solution that can run totally clientside without needing a Node.JS installation and it seems like that should be relatively trivial.

After searching most of the day today, it seems that there are several similar questions asked on this site and elsewhere, without any cohesive, well-laid out answers or recommendations. From what I've been able to gather so far, I believe my solution is to use a Javascript streaming function (such as fetch) to pull the raw mp3 & metadata from the icecast server, playing the audio via Web Audio API and handling the metadata blocks as they arrive. Something like the diagram below:

Data Processing Block Diagram

I'm wondering if anyone has any good reading and/or examples for playing mp3 streams via the Web Audio API. I'm still a relative novice at most things JS, but I get the basic idea of the API and how it handles audio data. What I'm struggling with is the proper way to implement a) the live processing of data from the mp3 stream, and b) detecting metadata chunks embedded in the stream and handling those accordingly.

Apologies if this is a long-winded question, but I wanted to give enough backstory to explain why I want to go about things the specific way I do.

Thanks in advance for the suggestions and help!

1

1 Answers

4
votes

I'm glad you found my library icecast-metadata-js! This library can actually be used both client-side and in NodeJS. All of the source code for the live demo, which runs completely client side, is here in the repository: https://github.com/eshaz/icecast-metadata-js/tree/master/src/demo. The streams in the demo are unaltered and are just normal Icecast streams on the server side.

What you have in your diagram is essentially correct. ICY metadata is interlaced within the actual MP3 "stream" data. The metadata interval or frequency that ICY metadata updates happen can be configured in the Icecast server configuration XML. Also, it may depend on your how frequent / accurate your source is for sending metadata updates to Icecast. The software used in the police scanner on my demo page updates almost exactly in time with the audio.

Usually, the default metadata interval is 16,000 bytes meaning that for every 16,000 stream (mp3) bytes, a metadata update will sent from Icecast. The metadata update always contains a length byte. If the length byte is greater than 0, the length of the metadata update is the metadata length byte * 16.

ICY Metadata is a string of key='value' pairs delimited by a semicolon. Any unused length in the metadata update is null padded.

i.e. "StreamTitle='The Stream Title';StreamUrl='https://example.com';\0\0\0\0\0\0"

read [metadataInterval bytes]        -> Stream data
read [1 byte]                        -> Metadata Length
if [Metadata Length > 0]
   read [Metadata Length * 16 bytes] -> Metadata

byte length response data action
ICY Metadata Interval stream data send to your audio decoder
1 metadata length byte use to determine length of metadata string (do not send to audio decoder)
Metadata Length * 16 metadata string decode and update your "Now Playing" (do not send to audio decoder)

The initial GET request to your Icecast server will need to include the Icy-MetaInt: 1 header, which tells Icecast to supply the interlaced metadata. The response header will contain the ICY metadata interval Icy-MetaInt, which should be captured (if possible) and used to determine the metadata interval.

In the demo, I'm using the client-side fetch API to make that GET request, and the response data is supplied into an instance of IcecastReadableStream which splits out the stream and metadata, and makes each available via callbacks. I'm using the Media Source API to play the stream data, and to get the timing data to properly synchronize the metadata updates.

This is the bare-minimum CORS configuration needed for reading ICY Metadata:

Access-Control-Allow-Origin: '*' // this can be scoped further down to your domain also
Access-Control-Allow-Methods: 'GET, OPTIONS'
Access-Control-Allow-Headers: 'Content-Type, Icy-Metadata'

icecast-metadata-js can detect the ICY metadata interval if needed, but it's better to allow clients to read it from the header with this additional CORS configuration:

Access-Control-Expose-Headers: 'Icy-MetaInt'

Also, I'm planning on releasing a new feature (after I finish with Ogg metadata) that encapsulates the fetch api logic so that all a user needs to do is supply an Icecast endpoint, and get audio / metadata back.