2
votes

For benchmarking purposes, let's take this well-known PBO read-back code.

Problem:

  1. Using a PBO has no effect on my PC, even with the latest driver update and the correct pixel format (BGRA).

Update 1: I also tried the same example with 3 PBOs, but it made no difference.

NOTE: Intel(R) Core(TM) i5-3470S CPU @ 2.90GHz, 2901 MHz, 4 Core(s), Video Card: Intel(R) HD Graphics 2500

PBO: off  
Read Time: 9 ms
Process Time: 2 ms
Transfer Rate: 39.5 Mpixels/s. (45.0 FPS)

PBO: on 
Read Time: 7 ms
Process Time: 2 ms
Transfer Rate: 38.8 Mpixels/s. (44.2 FPS)

UPDATE 2: PBO works correctly with an external (discrete) GPU, and also on the Intel i7 series.

PC config: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 3400 MHz, 4 Core(s), 8 Logical Processor(s), Video Card: GeForce 210. So the difference turns out to be between an integrated GPU and a discrete GPU. I believe this will be a useful hint for the many people who are wondering why their code is not working!

PBO: on 
Read Time: 0.06 ms
Process Time: 2 ms
Transfer Rate: 112.4 Mpixels/s. (127.9 FPS)

PBO: off 
Read Time: 4 ms
Process Time: 2 ms
Transfer Rate: 93.3 Mpixels/s. (106.1 FPS)
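
To put the two machines on the same footing, the reported transfer rates can be reduced to a relative change. The following is only a minimal sketch in C: `pbo_speedup`, `pbo_percent_change`, and `print_pbo_comparison` are hypothetical helper names, and the inputs are simply the Mpixels/s figures quoted above.

```c
#include <stdio.h>

/* Hypothetical helper: ratio of the PBO-on transfer rate to the PBO-off rate.
 * Values above 1.0 mean the PBO path was faster. */
double pbo_speedup(double rate_off, double rate_on)
{
    return rate_on / rate_off;
}

/* Hypothetical helper: the same comparison expressed as a percentage change. */
double pbo_percent_change(double rate_off, double rate_on)
{
    return (rate_on / rate_off - 1.0) * 100.0;
}

/* Prints one comparison line for a given machine label. */
void print_pbo_comparison(const char *label, double rate_off, double rate_on)
{
    printf("%s: x%.3f (%+.1f%%)\n", label,
           pbo_speedup(rate_off, rate_on),
           pbo_percent_change(rate_off, rate_on));
}
```

Calling `print_pbo_comparison("GeForce 210", 93.3, 112.4)` gives a gain of roughly 20%, while `print_pbo_comparison("HD Graphics 2500", 39.5, 38.8)` gives a change of about -2%, which is plausibly within measurement noise.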
1
Did you try yourself? I don't want to sound offensive but this sounds like you're questioning everyone else's competence but you're not willing to do your own research; If you have any benchmarks of your own, add them. Actually, this is rather ill-fitting for stackoverflow, so maybe write an article about your benchmarks and put it online somewhere, then cite it here. - Marcus Müller
@Marcus You are also welcome to improve this question to make it better! - Balaji R
It is likely to be related to having separate GPU/VRAM and integrated one. I see very close results on intel GPU. Will check on separate nvidia card in a few hours. - keltar
@BalajiR yes I see very significant difference on discrete graphics card (64 vs 41 Mpix/s). On integrated GPU, system RAM and VRAM is the same, so CPU can access data directly without using relatively narrow/slow PCIe. - keltar
@keltar Sounds great! It would be great if you can demonstrate that in code as an answer. - Balaji R

1 Answer

1
votes

From the linked article:

Mapping PBO ...

Note that if GPU is still working with the buffer object, glMapBufferARB() will not return until GPU finishes its job with the corresponding buffer object. To avoid this stall(wait), call glBufferDataARB() with NULL pointer right before glMapBufferARB(). Then, OpenGL will discard the old buffer, and allocate new memory space for the buffer object.

You may need to apply the change proposed above to the code below:

// "index" is used to read pixels from framebuffer to a PBO
// "nextIndex" is used to update pixels in the other PBO
index = (index + 1) % 2;
nextIndex = (index + 1) % 2;

// set the target framebuffer to read
glReadBuffer(GL_FRONT);

// read pixels from framebuffer to PBO
// glReadPixels() should return immediately.
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pboIds[index]);
glReadPixels(0, 0, WIDTH, HEIGHT, GL_BGRA, GL_UNSIGNED_BYTE, 0);

// map the PBO to process its data by CPU
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pboIds[nextIndex]);
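// orphan the buffer as suggested above; the size must match the original
// allocation (WIDTH * HEIGHT * 4 bytes for BGRA), since a size of 0 would
// leave the PBO with no storage.
// Caution: re-specifying the data store leaves its contents undefined, so
// verify that the pixels from the previous read are still what you map.
glBufferDataARB(GL_PIXEL_PACK_BUFFER_ARB, WIDTH * HEIGHT * 4, NULL, GL_STREAM_READ_ARB);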
GLubyte* ptr = (GLubyte*)glMapBufferARB(GL_PIXEL_PACK_BUFFER_ARB,
                                        GL_READ_ONLY_ARB);
if(ptr)
{
    processPixels(ptr, ...);
    glUnmapBufferARB(GL_PIXEL_PACK_BUFFER_ARB);
}

// back to conventional pixel operation
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);
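
The index arithmetic at the top of the snippet is hard-coded for two PBOs. Since the question also mentions trying three, here is a minimal sketch of the same ping-pong scheme for any number of buffers. `PboRing` and its helper functions are hypothetical names introduced for illustration only; they contain no OpenGL calls and just compute which entry of `pboIds` to bind for reading and which for mapping.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for a ring of PBOs. */
typedef struct {
    size_t count;  /* number of PBOs in the ring, assumed to be >= 2 */
    size_t index;  /* PBO that glReadPixels() writes into this frame */
} PboRing;

/* Advance to the next frame: same as "index = (index + 1) % 2" for count == 2. */
static void pbo_ring_advance(PboRing *ring)
{
    ring->index = (ring->index + 1) % ring->count;
}

/* PBO to bind before glReadPixels(). */
static size_t pbo_ring_read_target(const PboRing *ring)
{
    return ring->index;
}

/* PBO to bind before mapping: the one written longest ago,
 * which matches "nextIndex = (index + 1) % 2" for count == 2. */
static size_t pbo_ring_map_target(const PboRing *ring)
{
    return (ring->index + 1) % ring->count;
}
```

With `count` set to 2 this reproduces the `index`/`nextIndex` alternation above; with 3, the mapped buffer is always two frames behind the one being read into, which gives the driver more time to finish the transfer before the CPU touches it.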