
I am trying to display my IP camera's stream in a Qt Widgets application. First, I connect to the UDP port of the IP camera, which streams H.264-encoded video. After the socket is bound, on each readyRead() signal I fill a buffer with the received datagrams in order to assemble a full frame.

Variable initialization:

AVCodec *codec;
AVCodecContext *codecCtx;
AVFrame *frame;
AVPacket packet;
this->buffer.clear();
this->socket = new QUdpSocket(this);

QObject::connect(this->socket, &QUdpSocket::connected, this, &H264VideoStreamer::connected);
QObject::connect(this->socket, &QUdpSocket::disconnected, this, &H264VideoStreamer::disconnected);
QObject::connect(this->socket, &QUdpSocket::readyRead, this, &H264VideoStreamer::readyRead);
QObject::connect(this->socket, &QUdpSocket::hostFound, this, &H264VideoStreamer::hostFound);
QObject::connect(this->socket, SIGNAL(error(QAbstractSocket::SocketError)), this, SLOT(error(QAbstractSocket::SocketError)));
QObject::connect(this->socket, &QUdpSocket::stateChanged, this, &H264VideoStreamer::stateChanged);

avcodec_register_all();

codec = avcodec_find_decoder(AV_CODEC_ID_H264);
if (!codec){
   qDebug() << "Codec not found";
   return;
}

codecCtx = avcodec_alloc_context3(codec);
if (!codecCtx){
    qDebug() << "Could not allocate video codec context";
    return;
}

if (codec->capabilities & CODEC_CAP_TRUNCATED)
      codecCtx->flags |= CODEC_FLAG_TRUNCATED;

codecCtx->flags2 |= CODEC_FLAG2_CHUNKS;

AVDictionary *dictionary = nullptr;

if (avcodec_open2(codecCtx, codec, &dictionary) < 0) {
    qDebug() << "Could not open codec";
    return;
}
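
For context, the truncated/chunks flags are one way to let the decoder accept partial data; newer FFmpeg code usually pairs the decoder with an AVCodecParserContext that splits the raw byte stream into complete packets first. A minimal sketch of that alternative, assuming the same codecCtx and that data/size point to the bytes accumulated from the network, would be:

    // Sketch only (not from the original code): use an H.264 parser to cut the
    // incoming byte stream into whole packets before handing them to the decoder.
    AVCodecParserContext *parser = av_parser_init(AV_CODEC_ID_H264);

    uint8_t *outData = nullptr;
    int outSize = 0;
    // 'data' and 'size' are assumed to describe the accumulated network bytes.
    int consumed = av_parser_parse2(parser, codecCtx, &outData, &outSize,
                                    data, size, AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
    // 'consumed' tells how many input bytes were used; when outSize > 0,
    // outData/outSize describe one complete packet ready for decoding.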

The algorithm is as follows:

void H264VideoStreamer::readyRead() {
    QByteArray datagram;
    datagram.resize(this->socket->pendingDatagramSize());
    QHostAddress sender;
    quint16 senderPort;

    this->socket->readDatagram(datagram.data(), datagram.size(), &sender, &senderPort);

    QByteArray rtpHeader = datagram.left(12);
    datagram.remove(0, 12);

    int nal_unit_type = datagram[0] & 0x1F;
    bool start = (datagram[1] & 0x80) != 0;

    int seqNo = rtpHeader[3] & 0xFF;

    qDebug() << "H264 video decoder::readyRead()"
             << "from: " << sender.toString() << ":" << QString::number(senderPort)
             << "\n\tDatagram size: " << QString::number(datagram.size())
             << "\n\tH264 RTP header (hex): " << rtpHeader.toHex()
             << "\n\tH264 VIDEO data (hex): " << datagram.toHex();

    qDebug() << "nal_unit_type = " << nal_unit_type << " - " << getNalUnitTypeStr(nal_unit_type);
    if (start)
        qDebug() << "START";

    if (nal_unit_type == 7){
        this->sps = datagram;
        qDebug() << "Sequence parameter found = " << this->sps.toHex();
        return;
    } else if (nal_unit_type == 8){
        this->pps = datagram;
        qDebug() << "Picture parameter found = " << this->pps.toHex();
        return;
    }

    //VIDEO_FRAME
    if (start){
        if (!this->buffer.isEmpty())
            decode();

        this->buffer.clear();
        qDebug() << "Initializing new buffer...";

        this->buffer.append(char(0x00));
        this->buffer.append(char(0x00));
        this->buffer.append(char(0x00));
        this->buffer.append(char(0x01));

        this->buffer.append(this->sps);

        this->buffer.append(char(0x00));
        this->buffer.append(char(0x00));
        this->buffer.append(char(0x00));
        this->buffer.append(char(0x01));

        this->buffer.append(this->pps);

        this->buffer.append(char(0x00));
        this->buffer.append(char(0x00));
        this->buffer.append(char(0x00));
        this->buffer.append(char(0x01));
    }

    qDebug() << "Appending buffer data...";
    this->buffer.append(datagram);
}
  • the first 12 bytes of the datagram are the RTP header

  • everything else is VIDEO DATA

  • the last 5 bits of the first VIDEO DATA byte give the NAL unit type. I always get one of the following 4 values (1 - coded non-IDR slice, 5 - coded IDR slice, 7 - SPS, 8 - PPS)

  • the most significant bit (0x80) of the 2nd VIDEO DATA byte says whether this datagram is the START of a frame

  • all VIDEO DATA is accumulated in the buffer, starting with the START datagram

  • once a new frame arrives (START is set), the previous buffer is decoded and a new buffer is started

  • the frame passed to the decoder is assembled like this:

     00 00 00 01
     SPS
     00 00 00 01
     PPS
     00 00 00 01
     concatenated VIDEO DATA
  • decoding is done using the avcodec_decode_video2() function from the FFmpeg library:

     void H264VideoStreamer::decode() {
         // Wrap the assembled Annex B buffer in an AVPacket.
         av_init_packet(&packet);
         av_new_packet(&packet, this->buffer.size());
         memcpy(packet.data, this->buffer.constData(), this->buffer.size());

         frame = av_frame_alloc();
         if (!frame){
             qDebug() << "Could not allocate video frame";
             return;
         }

         int got_frame = 0;
         int len = avcodec_decode_video2(codecCtx, frame, &got_frame, &packet);
         if (len < 0){
             qDebug() << "Error while decoding frame.";
             return;
         }
         //if (got_frame > 0){ // got_frame is always 0
         //    qDebug() << "Data decoded: " << frame->data[0];
         //}

         char *frameData = (char *) frame->data[0];
         QByteArray decodedFrame;
         decodedFrame.setRawData(frameData, len);
         qDebug() << "Data decoded: " << decodedFrame;

         av_frame_unref(frame);
         av_free_packet(&packet);
         emit imageReceived(decodedFrame);
     }
    

My idea is that the UI thread, which receives the imageReceived signal, converts decodedFrame directly into a QImage and refreshes the display each time a new frame is decoded and sent to the UI.
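
For reference, converting a decoded AVFrame into a QImage could look roughly like the sketch below, assuming the decoder outputs AV_PIX_FMT_YUV420P at codecCtx->width x codecCtx->height; the toQImage() helper is a name made up for illustration, not part of FFmpeg or Qt:

    // Sketch: convert a decoded AVFrame to a QImage via libswscale.
    #include <QImage>
    extern "C" {
    #include <libavcodec/avcodec.h>
    #include <libswscale/swscale.h>
    }

    QImage toQImage(AVCodecContext *ctx, AVFrame *src) {
        SwsContext *sws = sws_getContext(ctx->width, ctx->height, ctx->pix_fmt,
                                         ctx->width, ctx->height, AV_PIX_FMT_RGB24,
                                         SWS_BILINEAR, nullptr, nullptr, nullptr);
        QImage img(ctx->width, ctx->height, QImage::Format_RGB888);
        uint8_t *dstData[4] = { img.bits(), nullptr, nullptr, nullptr };
        int dstLinesize[4] = { static_cast<int>(img.bytesPerLine()), 0, 0, 0 };

        // Convert the planar YUV frame into the packed RGB buffer owned by QImage.
        sws_scale(sws, src->data, src->linesize, 0, ctx->height, dstData, dstLinesize);
        sws_freeContext(sws);
        return img;
    }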

Is this a good approach for decoding an H.264 stream? I am facing the following problems:

  • avcodec_decode_video2() returns a value that is the same as the encoded buffer size. Is it possible that the encoded and decoded data are always the same size?
  • got_frame is always 0, which means I never actually get a full frame in the result. What could the reason be? Is the video frame assembled incorrectly, or incorrectly converted from QByteArray to AVFrame?
  • How can I convert the decoded AVFrame back to a QByteArray, and can it simply be converted to a QImage?
This statement, this->buffer.append(datagram);, looks wrong to me. If you receive UDP packets, the order in which you receive them is not known (well, it is on a Unix socket, but not on a regular network connection). That means you must read the packet's sequence number and use it to decide where to add the UDP packet data in the buffer. – Alexis Wilke
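
To illustrate the point of the comment above, a minimal sketch of reading the full 16-bit RTP sequence number (header bytes 2-3) inside readyRead() and detecting lost or reordered datagrams might look like this; lastSeq is an assumed int member initialized to -1, not part of the original code:

    // Sketch only: 'lastSeq' is a hypothetical member variable (int, initialized to -1).
    quint16 seq = (quint16(quint8(rtpHeader[2])) << 8) | quint16(quint8(rtpHeader[3]));

    if (this->lastSeq != -1 && seq != quint16(this->lastSeq + 1)) {
        // A datagram was lost or reordered; the partially assembled frame is
        // no longer contiguous and is better discarded than decoded.
        qDebug() << "RTP sequence jump:" << this->lastSeq << "->" << seq;
        this->buffer.clear();
    }
    this->lastSeq = seq;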

1 Answer


The whole process of manually decoding and rendering the frames can be left to another library. If the only purpose is a Qt GUI with a live feed from the IP camera, you can use the libVLC library. You can find an example here: https://wiki.videolan.org/LibVLC_SampleCode_Qt
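
A stripped-down sketch of that approach, under the assumptions that the stream URL is only a placeholder and that the application runs on X11 (on Windows, libvlc_media_player_set_hwnd() would be used instead), could look like this:

    // Sketch: render a network stream into a QWidget with libVLC (error handling omitted).
    #include <QWidget>
    #include <vlc/vlc.h>

    void playInWidget(QWidget *videoWidget) {
        libvlc_instance_t *vlc = libvlc_new(0, nullptr);
        libvlc_media_t *media =
            libvlc_media_new_location(vlc, "rtsp://192.168.1.10/stream1"); // placeholder URL
        libvlc_media_player_t *player = libvlc_media_player_new_from_media(media);
        libvlc_media_release(media);

        // Draw directly into the Qt widget (X11; use libvlc_media_player_set_hwnd() on Windows).
        libvlc_media_player_set_xwindow(player, static_cast<uint32_t>(videoWidget->winId()));
        libvlc_media_player_play(player);
    }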