1
votes

I'm receiving H.264 video over RTP and decoding it with libavcodec. I'm unpackaging the NAL units from the RTP packets before feeding them to avcodec (including reassembling fragmentation units).

I'm trying to show effective decoding frame rate. I used to log the time after a successful decode video call where *got_picture_ptr is non-zero. So far this worked since I only ever got video where there was one slice per frame. But now I receive video where both I and P frames consist of 2 NAL units each, of types 5 and 1 respectively. Now when I feed the either slice of a frame, decode_video return that it got a picture, and the pAVFrame->coded_picture_number is increased from every slice.

How do I go about reliably finding the beginning or end of a video frame/picture/access unit?

I've dumped out a few NAL units from the stream and run them through h264_analyze from h264bitstream.

Output from h264_analyze on 4 NAL Units

!! Found NAL at offset 695262 (0xA9BDE), size 25 (0x0019) 
==================== NAL ====================
 forbidden_zero_bit : 0 
 nal_ref_idc : 1 
 nal_unit_type : 7 ( Sequence parameter set ) 
======= SPS =======
 profile_idc : 66 
 constraint_set0_flag : 1 
 constraint_set1_flag : 1 
 constraint_set2_flag : 1 
 constraint_set3_flag : 0 
 reserved_zero_4bits : 0 
 level_idc : 32 
 seq_parameter_set_id : 0 
 chroma_format_idc : 0 
 residual_colour_transform_flag : 0 
 bit_depth_luma_minus8 : 0 
 bit_depth_chroma_minus8 : 0 
 qpprime_y_zero_transform_bypass_flag : 0 
 seq_scaling_matrix_present_flag : 0 
 log2_max_frame_num_minus4 : 12 
 pic_order_cnt_type : 2 
   log2_max_pic_order_cnt_lsb_minus4 : 0 
   delta_pic_order_always_zero_flag : 0 
   offset_for_non_ref_pic : 0 
   offset_for_top_to_bottom_field : 0 
   num_ref_frames_in_pic_order_cnt_cycle : 0 
 num_ref_frames : 1 
 gaps_in_frame_num_value_allowed_flag : 0 
 pic_width_in_mbs_minus1 : 79 
 pic_height_in_map_units_minus1 : 44 
 frame_mbs_only_flag : 1 
 mb_adaptive_frame_field_flag : 0 
 direct_8x8_inference_flag : 1 
 frame_cropping_flag : 0 
   frame_crop_left_offset : 0 
   frame_crop_right_offset : 0 
   frame_crop_top_offset : 0 
   frame_crop_bottom_offset : 0 
 vui_parameters_present_flag : 1 
=== VUI ===
 aspect_ratio_info_present_flag : 1 
   aspect_ratio_idc : 1 
     sar_width : 0 
     sar_height : 0 
 overscan_info_present_flag : 0 
   overscan_appropriate_flag : 0 
 video_signal_type_present_flag : 1 
   video_format : 5 
   video_full_range_flag : 1 
   colour_description_present_flag : 0 
     colour_primaries : 0 
   transfer_characteristics : 0 
   matrix_coefficients : 0 
 chroma_loc_info_present_flag : 0 
   chroma_sample_loc_type_top_field : 0 
   chroma_sample_loc_type_bottom_field : 0 
 timing_info_present_flag : 1 
   num_units_in_tick : 1 
   time_scale : 25 
   fixed_frame_rate_flag : 0 
 nal_hrd_parameters_present_flag : 0 
 vcl_hrd_parameters_present_flag : 0 
   low_delay_hrd_flag : 0 
 pic_struct_present_flag : 0 
 bitstream_restriction_flag : 1 
   motion_vectors_over_pic_boundaries_flag : 1 
   max_bytes_per_pic_denom : 0 
   max_bits_per_mb_denom : 0 
   log2_max_mv_length_horizontal : 6 
   log2_max_mv_length_vertical : 6 
   num_reorder_frames : 0 
   max_dec_frame_buffering : 1 
=== HRD ===
 cpb_cnt_minus1 : 0 
 bit_rate_scale : 0 
 cpb_size_scale : 0 
 initial_cpb_removal_delay_length_minus1 : 0 
 cpb_removal_delay_length_minus1 : 0 
 dpb_output_delay_length_minus1 : 0 
 time_offset_length : 0 
!! Found NAL at offset 695290 (0xA9BFA), size 4 (0x0004) 
==================== NAL ====================
 forbidden_zero_bit : 0 
 nal_ref_idc : 1 
 nal_unit_type : 8 ( Picture parameter set ) 
======= PPS =======
 pic_parameter_set_id : 0 
 seq_parameter_set_id : 0 
 entropy_coding_mode_flag : 0 
 pic_order_present_flag : 0 
 num_slice_groups_minus1 : 0 
 slice_group_map_type : 0 
 num_ref_idx_l0_active_minus1 : 0 
 num_ref_idx_l1_active_minus1 : 0 
 weighted_pred_flag : 0 
 weighted_bipred_idc : 0 
 pic_init_qp_minus26 : 3 
 pic_init_qs_minus26 : 0 
 chroma_qp_index_offset : 0 
 deblocking_filter_control_present_flag : 1 
 constrained_intra_pred_flag : 0 
 redundant_pic_cnt_present_flag : 0 
 transform_8x8_mode_flag : 1 
 pic_scaling_matrix_present_flag : 0 
 second_chroma_qp_index_offset : 1 
!! Found NAL at offset 695297 (0xA9C01), size 50725 (0xC625) 
==================== NAL ====================
 forbidden_zero_bit : 0 
 nal_ref_idc : 1 
 nal_unit_type : 5 ( Coded slice of an IDR picture ) 
======= Slice Header =======
 first_mb_in_slice : 0 
 slice_type : 2 ( I slice ) 
 pic_parameter_set_id : 0 
 frame_num : 0 
 field_pic_flag : 0 
 bottom_field_flag : 0 
 idr_pic_id : 0 
 pic_order_cnt_lsb : 0 
 delta_pic_order_cnt_bottom : 0 
 redundant_pic_cnt : 0 
 direct_spatial_mv_pred_flag : 0 
 num_ref_idx_active_override_flag : 0 
 num_ref_idx_l0_active_minus1 : 0 
 num_ref_idx_l1_active_minus1 : 0 
 cabac_init_idc : 0 
 slice_qp_delta : 5 
 sp_for_switch_flag : 0 
 slice_qs_delta : 0 
 disable_deblocking_filter_idc : 0 
 slice_alpha_c0_offset_div2 : 0 
 slice_beta_offset_div2 : 0 
 slice_group_change_cycle : 0 
=== Prediction Weight Table ===
 luma_log2_weight_denom : 0 
 chroma_log2_weight_denom : 0 
 luma_weight_l0_flag : 0 
 chroma_weight_l0_flag : 0 
 luma_weight_l1_flag : 0 
 chroma_weight_l1_flag : 0 
=== Ref Pic List Reordering ===
 ref_pic_list_reordering_flag_l0 : 0 
 ref_pic_list_reordering_flag_l1 : 0 
=== Decoded Ref Pic Marking ===
 no_output_of_prior_pics_flag : 0 
 long_term_reference_flag : 0 
 adaptive_ref_pic_marking_mode_flag : 0 
!! Found NAL at offset 746025 (0xB6229), size 38612 (0x96D4) 
==================== NAL ====================
 forbidden_zero_bit : 0 
 nal_ref_idc : 1 
 nal_unit_type : 5 ( Coded slice of an IDR picture ) 
======= Slice Header =======
 first_mb_in_slice : 1840 
 slice_type : 2 ( I slice ) 
 pic_parameter_set_id : 0 
 frame_num : 0 
 field_pic_flag : 0 
 bottom_field_flag : 0 
 idr_pic_id : 0 
 pic_order_cnt_lsb : 0 
 delta_pic_order_cnt_bottom : 0 
 redundant_pic_cnt : 0 
 direct_spatial_mv_pred_flag : 0 
 num_ref_idx_active_override_flag : 0 
 num_ref_idx_l0_active_minus1 : 0 
 num_ref_idx_l1_active_minus1 : 0 
 cabac_init_idc : 0 
 slice_qp_delta : 5 
 sp_for_switch_flag : 0 
 slice_qs_delta : 0 
 disable_deblocking_filter_idc : 0 
 slice_alpha_c0_offset_div2 : 0 
 slice_beta_offset_div2 : 0 
 slice_group_change_cycle : 0 
=== Prediction Weight Table ===
 luma_log2_weight_denom : 0 
 chroma_log2_weight_denom : 0 
 luma_weight_l0_flag : 0 
 chroma_weight_l0_flag : 0 
 luma_weight_l1_flag : 0 
 chroma_weight_l1_flag : 0 
=== Ref Pic List Reordering ===
 ref_pic_list_reordering_flag_l0 : 0 
 ref_pic_list_reordering_flag_l1 : 0 
=== Decoded Ref Pic Marking ===
 no_output_of_prior_pics_flag : 0 
 long_term_reference_flag : 0 
 adaptive_ref_pic_marking_mode_flag : 0 

Both I slices show the frame_num = 0. The next 2 (not shown) have frame_num = 1.

1

1 Answers

0
votes

What kind of packetization do you have with this H.264 stream? For example, with FU-A/FU-B fragmentation http://tools.ietf.org/html/rfc3984#page-11 you always can tell end of NAL unit since it's aligned with end of fragment marked as last fragment for current NALU.