I've been trying over the past week to implement H.264 streaming over RTP, using x264 as an encoder and libavformat to pack and send the stream. Problem is, as far as I can tell it's not working correctly.
Right now I'm just encoding random data (x264_picture_alloc) and extracting NAL frames from libx264. This is fairly simple:
x264_picture_t pic_out;
x264_nal_t* nals;
int num_nals;
int frame_size = x264_encoder_encode(this->encoder, &nals, &num_nals, this->pic_in, &pic_out);

if (frame_size <= 0)
{
    return frame_size;
}

// push NALs into the queue
for (int i = 0; i < num_nals; i++)
{
    // create a NAL storage unit
    NAL nal;
    nal.size = nals[i].i_payload;
    nal.payload = new uint8_t[nal.size];
    memcpy(nal.payload, nals[i].p_payload, nal.size);

    // push the storage into the NAL queue
    {
        // lock and push the NAL to the queue
        boost::mutex::scoped_lock lock(this->nal_lock);
        this->nal_queue.push(nal);
    }
}
nal_queue is used to safely pass frames over to a Streamer class, which then sends the frames out. Right now it's not threaded, as I'm just testing to get this working. Before encoding individual frames, I've made sure to initialize the encoder.
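The initialization looks roughly like this (a trimmed-down sketch; the preset, tune, and colorspace here are just the obvious choices rather than anything exact, and error handling is omitted):
// rough sketch of the encoder setup
x264_param_t param;
x264_param_default_preset(&param, "veryfast", "zerolatency");
param.i_width = width;
param.i_height = height;
param.i_fps_num = fps;
param.i_fps_den = 1;
param.i_csp = X264_CSP_I420;
param.rc.i_bitrate = bitrate / 1000; // x264 expects kbit/s here
x264_param_apply_profile(&param, "baseline");

// open the encoder and allocate the input picture that gets filled with data
this->encoder = x264_encoder_open(&param);
this->pic_in = new x264_picture_t;
x264_picture_alloc(this->pic_in, X264_CSP_I420, width, height);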
But I don't believe x264 is the issue, as I can see frame data in the NALs it returns. Streaming the data is accomplished with libavformat, which is first initialized in a Streamer class:
Streamer::Streamer(Encoder* encoder, string rtp_address, int rtp_port, int width, int height, int fps, int bitrate)
{
    this->encoder = encoder;

    // initialize the AV context
    this->ctx = avformat_alloc_context();
    if (!this->ctx)
    {
        throw runtime_error("Couldn't initialize AVFormat output context");
    }

    // get the output format
    this->fmt = av_guess_format("rtp", NULL, NULL);
    if (!this->fmt)
    {
        throw runtime_error("Unsuitable output format");
    }
    this->ctx->oformat = this->fmt;

    // try to open the RTP stream
    snprintf(this->ctx->filename, sizeof(this->ctx->filename), "rtp://%s:%d", rtp_address.c_str(), rtp_port);
    if (url_fopen(&(this->ctx->pb), this->ctx->filename, URL_WRONLY) < 0)
    {
        throw runtime_error("Couldn't open RTP output stream");
    }

    // add an H.264 stream
    this->stream = av_new_stream(this->ctx, 1);
    if (!this->stream)
    {
        throw runtime_error("Couldn't allocate H.264 stream");
    }

    // initialize the codec context
    AVCodecContext* c = this->stream->codec;
    c->codec_id = CODEC_ID_H264;
    c->codec_type = AVMEDIA_TYPE_VIDEO;
    c->bit_rate = bitrate;
    c->width = width;
    c->height = height;
    c->time_base.den = fps;
    c->time_base.num = 1;

    // write the header
    av_write_header(this->ctx);
}
This is where things seem to go wrong. The av_write_header call above seems to do absolutely nothing; I've used Wireshark to verify this. For reference, I use Streamer streamer(&enc, "10.89.6.3", 49990, 800, 600, 30, 40000); to initialize the Streamer instance, with enc being the Encoder object that handles x264 as shown earlier.
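For what it's worth, av_write_header returns a status code, so wrapping it like this (a minimal sketch, with an illustrative error message) should at least show whether the muxer rejects the stream setup:
// check whether the muxer accepted the stream configuration
int ret = av_write_header(this->ctx);
if (ret < 0)
{
    // negative values are AVERROR codes; the message text is just illustrative
    throw runtime_error("av_write_header failed");
}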
Now when I want to stream out a NAL, I use this:
// grab a NAL
NAL nal = this->encoder->nal_pop();
cout << "NAL popped with size " << nal.size << endl;
// initialize a packet
AVPacket p;
av_init_packet(&p);
p.data = nal.payload;
p.size = nal.size;
p.stream_index = this->stream->index;
// send it out
av_write_frame(this->ctx, &p);
At this point, I can see RTP data appearing over the network, and it looks like the frames I've been sending, even including a little copyright blob from x264. But no player I've used has been able to make any sense of the data. VLC quits, wanting an SDP description, which apparently isn't required.
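In case the SDP really is the missing piece, libavformat can apparently generate one for the session; a minimal sketch of what I'd try (the function is av_sdp_create in newer libavformat, avf_sdp_create in older releases):
// dump an SDP description of the RTP session so a player can be pointed at it
char sdp[2048];
AVFormatContext* contexts[1] = { this->ctx };
av_sdp_create(contexts, 1, sdp, sizeof(sdp));
cout << sdp << endl;
The output could presumably be saved to a .sdp file and opened in VLC.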
I then tried to play it through gst-launch:
gst-launch udpsrc port=49990 ! rtph264depay ! decodebin ! xvimagesink
This will sit waiting for UDP data, but when it is received, I get:
ERROR: element /GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay0: No RTP format was negotiated.
Additional debug info:
gstbasertpdepayload.c(372): gst_base_rtp_depayload_chain (): /GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay0:
Input buffers need to have RTP caps set on them. This is usually achieved by setting the 'caps' property of the upstream source element (often udpsrc or appsrc), or by putting a capsfilter element before the depayloader and setting the 'caps' property on that. Also see http://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/rtp/README
As I'm not using GStreamer to do the streaming itself, I'm not quite sure what it means by RTP caps. But it makes me wonder whether I'm sending enough information over RTP to describe the stream. I'm pretty new to video, and I feel like there's some key thing I'm missing here. Any hints?