I am trying to understand how rendering works for the YV12 format. As a simple example, consider this graph:
The webcam produces 640x480 frames in RGB24 or MJPEG. The LAV decoder then converts the frames to YV12 and sends them to a DirectShow renderer (EVR or VMR9).
The decoder changes the frame width (stride) from 640 to 1024. Hence the output frame size becomes 1.5*1024*480 = 737280 bytes, while the normal size for a 640x480 YV12 frame is 1.5*640*480 = 460800. I know the stride can be larger than the real frame width (https://docs.microsoft.com/en-us/windows/desktop/medfound/image-stride). My first question: why did the renderer select that particular value (1024) rather than some other? Can I get it programmatically?
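My understanding (which may be wrong) is that the renderer requests this stride through a dynamic format change: output samples from its allocator arrive with an attached media type whose bmiHeader.biWidth is the stride. Here is a minimal sketch of how I would read it in a CTransformFilter::Transform override (CMyRgbToYv12 and m_stride are just illustrative names; accepting the new type is only hinted at):

    #include <streams.h>    // DirectShow base classes (CTransformFilter, DeleteMediaType)
    #include <dvdmedia.h>   // VIDEOINFOHEADER2

    HRESULT CMyRgbToYv12::Transform(IMediaSample *pIn, IMediaSample *pOut)
    {
        // Dynamic format change: the renderer attaches a media type to the
        // output sample when the stride (biWidth) differs from the visible width.
        AM_MEDIA_TYPE *pmt = NULL;
        if (pOut->GetMediaType(&pmt) == S_OK && pmt != NULL)
        {
            if (pmt->formattype == FORMAT_VideoInfo2 && pmt->pbFormat != NULL)
            {
                const VIDEOINFOHEADER2 *vih2 =
                    reinterpret_cast<const VIDEOINFOHEADER2 *>(pmt->pbFormat);
                m_stride = vih2->bmiHeader.biWidth;   // e.g. 1024 instead of 640
            }
            // ...the filter should also adopt *pmt as its current output type here...
            DeleteMediaType(pmt);
        }

        // ...convert RGB24 -> YV12 into pOut's buffer using m_stride...
        return S_OK;
    }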
When I replace the LAV decoder with my own RGB24-to-YV12 transform filter (https://gist.github.com/thedeemon/8052fb98f8ba154510d7), the renderer shows a shifted image, even though all parameters are the same as in the first graph.
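For reference, this is roughly how I copy the converted frame into the output buffer (a simplified sketch, not the exact code from the gist; it assumes the Y plane uses the full stride and the V/U planes use half of it):

    #include <windows.h>
    #include <cstring>

    // Copy a tightly packed width x height YV12 frame (Y plane, then V, then U)
    // into a destination buffer whose luma rows are 'stride' bytes apart.
    static void CopyYV12(const BYTE *src, BYTE *dst, int width, int height, int stride)
    {
        // Y plane: 'height' rows of 'width' bytes, destination rows 'stride' apart.
        for (int y = 0; y < height; ++y)
            memcpy(dst + y * stride, src + y * width, width);

        // V plane followed by U plane, each (width/2) x (height/2); together they
        // form 'height' rows of width/2 bytes with a destination pitch of stride/2.
        const BYTE *srcC = src + width * height;
        BYTE       *dstC = dst + stride * height;
        for (int y = 0; y < height; ++y)
            memcpy(dstC + y * (stride / 2), srcC + y * (width / 2), width / 2);
    }

If I copied rows with a pitch of 640 instead of the renderer's 1024, I would expect exactly this kind of shifted picture, but I am not sure that is the only possible cause.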
Why does the image end up shifted? I also noticed that the VIDEOINFOHEADER2 has its interlacing flags (dwInterlaceFlags) set. So my next question: do I have to handle interlacing in my filter for the renderer to work correctly?
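For completeness, this is roughly the VIDEOINFOHEADER2 I would expect to set for progressive 640x480 YV12 output (a sketch, not the exact code from my filter; leaving dwInterlaceFlags at 0 should mean progressive frames):

    #include <streams.h>
    #include <dvdmedia.h>   // VIDEOINFOHEADER2

    VIDEOINFOHEADER2 vih2 = {};
    vih2.rcSource           = { 0, 0, 640, 480 };
    vih2.rcTarget           = { 0, 0, 640, 480 };
    vih2.dwInterlaceFlags   = 0;           // progressive: no AMINTERLACE_* flags set
    vih2.dwPictAspectRatioX = 4;           // 4:3 picture aspect ratio for 640x480
    vih2.dwPictAspectRatioY = 3;
    vih2.AvgTimePerFrame    = 333333;      // ~30 fps in 100 ns units
    vih2.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    vih2.bmiHeader.biWidth       = 640;    // the renderer may later raise this to the stride (1024)
    vih2.bmiHeader.biHeight      = 480;
    vih2.bmiHeader.biPlanes      = 1;
    vih2.bmiHeader.biBitCount    = 12;     // YV12 = 12 bits per pixel
    vih2.bmiHeader.biCompression = MAKEFOURCC('Y', 'V', '1', '2');
    vih2.bmiHeader.biSizeImage   = 640 * 480 * 3 / 2;   // 460800 bytes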