I am trying to write a live video broadcaster over RTSP from an iOS device. I am using AVAssetWriter so I can take advantage of hardware encoding. To send over RTSP I have to get the avcC information (the SPS/PPS) out of the MOOV Atom; however, AVAssetWriter only writes the MOOV when you finish the session, which of course never happens while I am streaming live.
I have gotten around this for the video by encoding and writing a single sample buffer to a file, finishing that session, and then parsing the finished file to get the avcC information out. That works just fine.
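For context, the parsing step doesn't need a full MOOV/TRAK/STBL box walk; since I control the file, scanning for the 'avcC' fourCC and reading the box that encloses it is enough. A rough sketch of the idea (extractAvcC is just my own name for it):

```swift
import Foundation

// Sketch: scan a finished one-sample MP4 for the 'avcC' box and return its
// payload, the AVCDecoderConfigurationRecord (the SPS/PPS live inside it).
func extractAvcC(from url: URL) throws -> Data? {
    let data = try Data(contentsOf: url)
    let fourCC = Array("avcC".utf8)
    var i = 4  // a box type is always preceded by its 4-byte big-endian size
    while i + 4 <= data.count {
        if data[i] == fourCC[0], data[i + 1] == fourCC[1],
           data[i + 2] == fourCC[2], data[i + 3] == fourCC[3] {
            // The 4 bytes before the type are the size of the whole box.
            let size = (UInt32(data[i - 4]) << 24) | (UInt32(data[i - 3]) << 16)
                     | (UInt32(data[i - 2]) << 8)  |  UInt32(data[i - 1])
            let start = i + 4              // payload begins after size + type
            let end = (i - 4) + Int(size)  // box start plus box size
            guard size >= 8, end <= data.count else { return nil }
            return data.subdata(in: start..<end)
        }
        i += 1
    }
    return nil
}
```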
After that, for the live stream, since AVAssetWriter will only write to a file, I write to a file and then read back from it with a chasing file offset. When I do this with video only, I can read the NALUs out of the MDAT Atom without any MOOV information, because the size of each NALU is given in its first 4 bytes. So I can read that many bytes, process them, and send them on their way over the RTSP stream. With video only, everything works perfectly and I get a really good HD stream to the streaming server.
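The video read loop amounts to something like this (a sketch; readNextNALU is my own name, and the offset is assumed to start just past the MDAT header):

```swift
import Foundation

// Sketch of the chasing read: in AVCC format each NALU in the MDAT is
// prefixed with its 4-byte big-endian length, so no MOOV is needed.
func readNextNALU(_ handle: FileHandle, offset: inout UInt64) -> Data? {
    handle.seek(toFileOffset: offset)
    let lengthBytes = handle.readData(ofLength: 4)
    guard lengthBytes.count == 4 else { return nil }     // writer hasn't caught up
    let length = lengthBytes.reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
    let nalu = handle.readData(ofLength: Int(length))
    guard nalu.count == Int(length) else { return nil }  // partial write; retry later
    offset += 4 + UInt64(length)
    return nalu                                          // ready to packetize for RTP
}
```

When either guard fails I just leave the offset where it is and try again on the next pass, once the writer has gotten further ahead.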
The problem I am now having is when I try to incorporate audio from the mic into the stream. AVAssetWriter encodes it just fine and gives me a properly interleaved MP4 file to read from; however, unlike the H264 NALUs, the audio samples in the file are not prefixed with their size. As far as I can tell, the only place the sample sizes are defined is the STSZ and STCO Atoms in the MOOV, which of course I don't have because it is a live stream.
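Just to make concrete what I'm missing: if I did have the MOOV, reading the per-sample sizes out of STSZ would be trivial, along these lines (parseSampleSizes is my own illustrative name, taking the STSZ payload after its size/type header):

```swift
import Foundation

// Sketch of the table I'm missing: STSZ is version/flags (4 bytes), a
// default sample size (4 bytes, 0 means sizes vary), a sample count
// (4 bytes), then one 4-byte size per sample when sizes vary.
func parseSampleSizes(stszPayload: Data) -> [UInt32] {
    let d = Data(stszPayload)  // copy so indices start at 0
    guard d.count >= 12 else { return [] }
    func be32(_ i: Int) -> UInt32 {
        (UInt32(d[i]) << 24) | (UInt32(d[i + 1]) << 16)
            | (UInt32(d[i + 2]) << 8) | UInt32(d[i + 3])
    }
    let defaultSize = be32(4)             // offset 0 is version/flags
    let count = Int(be32(8))
    if defaultSize != 0 {                 // fixed-size samples
        return Array(repeating: defaultSize, count: count)
    }
    return (0..<count).map { be32(12 + 4 * $0) }
}
```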
With all that in mind, does anyone know a way to identify the audio sample boundaries in an MDAT Atom without the information from the MOOV Atom? As soon as I figure that out, I'm home free.
Thanks in advance for any insight.