4 votes

I have to use MS DirectShow to capture video frames from a camera (I just want the raw pixel data).
I was able to build the Graph/Filter network (capture device filter and ISampleGrabber) and implement the callback (ISampleGrabberCB). I receive samples of appropriate size.

However, they are always upside down (that is, flipped vertically, not rotated), and the color channels are in BGR order (not RGB).

I tried setting the biHeight field in the BITMAPINFOHEADER to both positive and negative values, but it doesn't have any effect. According to the MSDN documentation, ISampleGrabber::SetMediaType() ignores the format block for video data anyway.

Here is what I see (recorded with a different camera, not DirectShow) versus what the DirectShow ISampleGrabber gives me. The "RGB" text is actually drawn in red, green and blue respectively:

[Image: what I see, recorded with a different camera (not DirectShow)]

[Image: what the DirectShow ISampleGrabber gives me]

Sample of the code I'm using, slightly simplified:

// Setting the media type...
AM_MEDIA_TYPE* media_type = 0;
this->ds.device_streamconfig->GetFormat(&media_type); // The IAMStreamConfig of the capture device
// Find the BMI header in the media type struct
BITMAPINFOHEADER* bmi_header;
if (media_type->formattype == FORMAT_VideoInfo) {
    bmi_header = &((VIDEOINFOHEADER*)media_type->pbFormat)->bmiHeader;
} else if (media_type->formattype == FORMAT_VideoInfo2) {
    bmi_header = &((VIDEOINFOHEADER2*)media_type->pbFormat)->bmiHeader;
} else {
    return false;
}
// Apply changes
media_type->subtype  = MEDIASUBTYPE_RGB24;
bmi_header->biWidth  = width;
bmi_header->biHeight = height;
// Set format to video device
this->ds.device_streamconfig->SetFormat(media_type);
// Set format for sample grabber
// bmi_header->biHeight = -(height); // tried this for either and both interfaces, no effect
this->ds.sample_grabber->SetMediaType(media_type);

// Connect filter pins
IPin* out_pin = getFilterPin(this->ds.device_filter, OUT, 0); // IBaseFilter interface for the capture device
IPin* in_pin  = getFilterPin(this->ds.sample_grabber_filter, IN, 0); // IBaseFilter interface for the sample grabber filter
out_pin->Connect(in_pin, media_type);

// Start capturing by callback
this->ds.sample_grabber->SetBufferSamples(false);
this->ds.sample_grabber->SetOneShot(false);
this->ds.sample_grabber->SetCallback(this, 1);
// start recording
this->ds.media_control->Run(); // IMediaControl interface

I'm checking the return value of every call and don't get any errors.

I'm thankful for any hint or idea.

Things I already tried:

  1. Setting the biHeight field to a negative value for the capture device filter, for the sample grabber, for both, or for neither - no effect in any combination.

  2. Using IGraphBuilder to connect the pins - same problem.

  3. Connecting the pins before changing the media type - same problem.

  4. Checking if the media type was actually applied by the filter by querying it again - but it apparently is applied or at least stored.

  5. Interpreting the image buffer as fully byte-reversed (last byte first, first byte last) - then the image would be flipped horizontally.

  6. Checking if it's a problem with the video camera - when I test it with VLC (DirectShow capture) it looks normal.

I suppose that when you get data back from the Sample Grabber, you treat the row order incorrectly. It is typically bottom to top, and you process the lines in the opposite order - hence the issue. - Roman R.
Roman, thanks for your reply, but isn't it possible to receive the frames in normal row order (starting at the top)? I don't think the camera sends them this way anyway. It also does not explain the BGR color channel flip. Since the code should work for other cameras later as well, I would like to figure out what is going on... - Makx
"Normal" Windows RGB order is bottom to top. Some components are capable to reverse it but it's a fragile assumption. Way more robust is to let it go either original order, or force bottom to top. Then having the buffer already available to process either actual order of rows or reverse rows yourself if needed. I suppose camera does not let you down, and your code snippet does not persuade me you make it top to bottom on Sample Grabber buffer.Roman R.
Roman, thanks again for the answer. "Your code snippet does not persuade me you make it top to bottom on Sample Grabber buffer" - this is exactly the point: I'm trying to get it top-to-bottom, but I can't get it to work. The optimal solution would be for the capture device to deliver it that way right away, to avoid unnecessary flipping. I tried setting biHeight to a negative value, which according to the Windows documentation should have that effect, but it doesn't work. Also, I'm still stuck with the BGR color channel flip, which apparently only happens to me. Can you think of any reason for that? - Makx
Most capture devices and transform filters out there are simply not capable of producing top-to-bottom RGB. Only rare filters are. - Roman R.
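
To illustrate what the comments suggest (this is a sketch, not code from the thread): treat the RGB24 buffer delivered to ISampleGrabberCB::BufferCB as bottom-up rows of BGR pixels and reorder it yourself. The width, height and rgb_out members are assumptions, and row padding is ignored, which is safe as long as width * 3 is a multiple of 4:

// Minimal sketch: turn the bottom-up BGR24 frame from the sample grabber
// into a top-down RGB24 copy. 'width', 'height' and 'rgb_out' are assumed members.
STDMETHODIMP BufferCB(double /*sample_time*/, BYTE* buffer, long buffer_len)
{
    const int stride = width * 3;                 // assumes no extra row padding
    if (buffer_len < stride * height)
        return E_FAIL;

    for (int y = 0; y < height; ++y)
    {
        // Row 0 of a bottom-up DIB is the bottom line of the picture.
        const BYTE* src = buffer + (height - 1 - y) * stride;
        BYTE*       dst = rgb_out + y * stride;

        for (int x = 0; x < width; ++x)
        {
            dst[3 * x + 0] = src[3 * x + 2];      // R (byte 2 of a BGR pixel)
            dst[3 * x + 1] = src[3 * x + 1];      // G
            dst[3 * x + 2] = src[3 * x + 0];      // B (byte 0 of a BGR pixel)
        }
    }
    return S_OK;
}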

2 Answers

0 votes

I noticed that when using the I420 color space, the flipping disappears. In addition, most current codecs (e.g. VP8) use the I420 color space as their raw input/output format.

I wrote a simple function that mirrors a frame in the I420 color space.

void Camera::OutputCallback(unsigned char* data, int len, uint32_t timestamp, void *instance_)
{
    Camera *instance = reinterpret_cast<Camera*>(instance_);

    Transport::RTPPacket packet;

    packet.rtpHeader.ts = timestamp;

    packet.payload = data;
    packet.payloadSize = len;

    if (instance->mirror)
    {
        Video::ResolutionValues rv = Video::GetValues(instance->resolution);
        int k = 0;

        // Y plane (luma): reverse the pixels within each row
        for (int i = 0; i != rv.height; ++i)
        {
            for (int j = rv.width - 1; j >= 0; --j)
            {
                int l = (rv.width * i) + j;
                instance->buffer[k++] = data[l];
            }
        }

        // U plane: half resolution, located right after the Y plane
        for (int i = 0; i != rv.height / 2; ++i)
        {
            for (int j = (rv.width / 2) - 1; j >= 0; --j)
            {
                int l = ((rv.width / 2) * i) + j + rv.height * rv.width;
                instance->buffer[k++] = data[l];
            }
        }

        // V plane: half resolution, located after the Y and U planes
        for (int i = 0; i != rv.height / 2; ++i)
        {
            for (int j = (rv.width / 2) - 1; j >= 0; --j)
            {
                int l = ((rv.width / 2) * i) + j
                        + rv.height * rv.width + (rv.width / 2) * (rv.height / 2);
                instance->buffer[k++] = data[l];
            }
        }

        packet.payload = instance->buffer;
    }

    instance->receiver->Send(packet);
}
-1 votes

My quick hack for this:

void Camera::OutputCallback(unsigned char* data, int len, void *instance_)
{
    Camera *instance = reinterpret_cast<Camera*>(instance_);

    // Reverse the order of the 4-byte RGB32 pixels (last pixel first),
    // keeping the channel order within each pixel intact.
    int j = 0;
    for (int i = len - 4; i >= 0; i -= 4)
    {
        instance->buffer[j]     = data[i];
        instance->buffer[j + 1] = data[i + 1];
        instance->buffer[j + 2] = data[i + 2];
        instance->buffer[j + 3] = data[i + 3];
        j += 4;
    }

    Transport::RTPPacket packet;

    packet.payload = instance->buffer;
    packet.payloadSize = len;

    instance->receiver->Send(packet);
}

This works for the RGB32 color space; for other color spaces the code needs to be adjusted.
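
For RGB24, which is what the question actually requests, the same idea would step through 3-byte pixels instead. A rough sketch (not from the original answer), assuming len is a multiple of 3 and the rows carry no DWORD padding:

// Sketch only: the same pixel-order reversal for 24-bit data.
int j = 0;
for (int i = len - 3; i >= 0; i -= 3)
{
    instance->buffer[j]     = data[i];      // channel order within the pixel is kept
    instance->buffer[j + 1] = data[i + 1];
    instance->buffer[j + 2] = data[i + 2];
    j += 3;
}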