1
votes

In OpenGL wiki on Performance, it says:

"OpenGL implementations are almost always pipelined - that is to say, things are not necessarily drawn when you tell OpenGL to draw them - and the fact that an OpenGL call returned doesn't mean it finished rendering."

Since it says "almost", that means there are some implementations that are not pipelined.

Here is one I found: OpenGL Pixel Buffer Object (PBO)

"Conventional glReadPixels() blocks the pipeline and waits until all pixel data are transferred. Then, it returns control to the application. On the contrary, glReadPixels() with PBO can schedule asynchronous DMA transfer and returns immediately without stall. Therefore, the application (CPU) can execute other process right away, while transferring data with DMA by OpenGL (GPU)."

So this means conventional glReadPixels() (not with PBO) blocks the pipeline. But I cannot find this stated in the OpenGL reference page for glReadPixels.

Then I am wondering: which OpenGL implementations are not pipelined?

How about glDrawArrays?

3
Not 100% sure, but I think all operations that do not return anything are pipelined. For example, glDrawArrays is pipelined, while glGenBuffers is not. - dari

3 Answers

3
votes

The OpenGL specification itself does not specify the term "pipeline" but rather "command stream". The runtime behavior of command stream execution is deliberately left open, to give implementors maximal flexibility.

The important term is "OpenGL synchronization point": https://www.opengl.org/wiki/Synchronization

Here I find one: (Link to songho article)

Note that this is not an official OpenGL specification resource. The wording "blocks the OpenGL pipeline" is a bit unfortunate, because it turns the actual blocking and bottleneck "upside down". Essentially it means that glReadPixels can only return once all the commands leading up to the image it will fetch have been executed.

So this means conventional glReadPixels() (not with PBO) blocks the pipeline. But I cannot find this stated in the OpenGL reference page for glReadPixels.

Actually it's not the OpenGL pipeline that gets blocked, but the execution of the program on the CPU. This means that the GPU sees no further commands coming from the CPU. So the pipeline doesn't get "blocked"; it gets drained. When a pipeline drains or needs to be restarted, one says the pipeline has stalled (i.e. the flow in the pipeline came to a halt).

From the GPU's point of view, everything happens at maximum throughput: render everything up to the point where glReadPixels was called, then do a DMA transfer. Unfortunately, no further commands are available once the transfer has been initiated, because the CPU is still waiting inside glReadPixels.
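The difference between "queue and return" and "drain before returning" can be sketched with a toy model. This is not real OpenGL and says nothing about how any driver is implemented; the names CommandStream, stream_draw and stream_read_back are made up for illustration, and the "GPU" is simply a loop that runs inside the blocking call.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a command stream. NOT real OpenGL; all names are hypothetical. */
enum { QUEUE_CAPACITY = 64 };

typedef struct {
    int    pending[QUEUE_CAPACITY]; /* queued command ids, oldest first */
    size_t head;                    /* next command the "GPU" will execute */
    size_t tail;                    /* one past the last queued command */
    int    executed;                /* commands the "GPU" has completed */
} CommandStream;

static void stream_init(CommandStream *s) {
    s->head = 0;
    s->tail = 0;
    s->executed = 0;
}

/* Behaves like a draw call: records the command and returns immediately,
 * without executing anything. */
static void stream_draw(CommandStream *s, int command_id) {
    assert(s->tail < QUEUE_CAPACITY);
    s->pending[s->tail++] = command_id;
}

/* Number of commands that were queued but have not been executed yet. */
static size_t stream_pending(const CommandStream *s) {
    return s->tail - s->head;
}

/* Behaves like a synchronous readback: the result depends on every earlier
 * command, so the caller cannot get control back until the queue is empty.
 * Returns the total number of commands executed so far. */
static int stream_read_back(CommandStream *s) {
    while (s->head < s->tail) {
        (void)s->pending[s->head++]; /* "execute" the oldest command */
        s->executed++;
    }
    return s->executed;
}
```

After three stream_draw calls, stream_pending reports 3 and nothing has executed; a single stream_read_back then executes all three before it returns, which is the CPU-side wait described above.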

How about glDrawArrays?

glDrawArrays returns immediately after the command has been queued and any necessary preparations have been made (for example, copying vertex data that lives in client memory).

1
votes

Actually it means that this specific operation can't be pipelined, because all data needs to be transferred before the function returns. It doesn't mean other things can't be.

Operations like that are said to stall the pipeline. One function that will always stall the pipeline is glFinish.

Usually, when a function returns a value, such as the contents of a buffer, it will induce a stall.

Depending on the driver implementation, creating programs, buffers and similar objects can be done without stalling.

0
votes

Then I am wondering: which OpenGL implementations are not pipelined?

I could imagine that a pure software implementation might not be pipelined. Not much reason to queue up work if you end up executing it on the same CPU. Unless you wanted to take advantage of multi-threading.

But it's probably safe to say that any OpenGL implementation that uses dedicated hardware (commonly called a GPU) will be pipelined. This allows the CPU and GPU to work in parallel, which is critical for good system performance. Also, submitting work to the GPU incurs a certain amount of overhead, so it's beneficial to queue up work and then submit it in larger batches.

But I cannot find this stated in the OpenGL reference page for glReadPixels.

True. The man pages don't directly specify which calls cause a synchronization. In general, anything that returns values/data produced by the GPU causes synchronization. Examples that come to mind:

  • glFinish(). Explicitly requires a full synchronization, which is actually its only purpose.
  • glReadPixels(), in the non-PBO case. The GPU has to finish rendering before you can read back the result.
  • glGetQueryObjectiv(id, GL_QUERY_RESULT, ...). Blocks until the GPU reaches the point where the query was submitted.
  • glClientWaitSync(). Waits until the GPU reaches the point where the corresponding glFenceSync() was submitted.
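To illustrate how a fence-style wait differs from a full glFinish(), here is a small toy model, not actual OpenGL code. Timeline, Fence and the timeline_* functions are hypothetical names; the only idea taken from the list above is that a fence wait blocks until the GPU reaches the point where the fence was submitted, while a finish waits for everything.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of fence-style synchronization. NOT real OpenGL; all names
 * are hypothetical. */
typedef struct {
    size_t submitted; /* commands queued by the CPU so far */
    size_t completed; /* commands the simulated GPU has finished */
} Timeline;

/* A fence remembers the stream position at which it was inserted. */
typedef size_t Fence;

/* Queue one command; returns immediately. */
static void timeline_submit(Timeline *t) {
    t->submitted++;
}

/* Insert a fence after everything submitted so far. */
static Fence timeline_insert_fence(const Timeline *t) {
    return t->submitted;
}

/* Non-blocking check: has the GPU passed the fence yet? */
static int fence_signaled(const Timeline *t, Fence f) {
    return t->completed >= f;
}

/* Blocking wait: the simulated GPU runs until it reaches the fence.
 * Returns how many commands had to complete while the CPU waited. */
static size_t timeline_wait_fence(Timeline *t, Fence f) {
    size_t waited = 0;
    assert(f <= t->submitted);
    while (t->completed < f) {
        t->completed++;
        waited++;
    }
    return waited;
}

/* Full synchronization: wait for every submitted command. */
static size_t timeline_finish(Timeline *t) {
    return timeline_wait_fence(t, t->submitted);
}
```

With three commands submitted, a fence inserted, and two more commands submitted, waiting on the fence only requires the first three commands to complete; the last two stay queued until something like timeline_finish drains them.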

Note that there can be different types of synchronization that are not directly tied to specific OpenGL calls. For example, in the case where the whole workload is GPU limited, the CPU would queue up an unbounded amount of work unless there is some throttling. So the driver will block the CPU at more or less arbitrary points to let the GPU catch up to a certain point. This could happen at frame boundaries, but it does not have to. Similar synchronization can be necessary if memory runs low, or if internal driver resources are exhausted.
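The throttling idea can be sketched as a cap on the number of frames in flight. This is only an assumed policy for illustration: real drivers choose their own limits and blocking points, and Throttle, max_in_flight and the throttle_* functions are invented names.

```c
#include <assert.h>

/* Toy model of driver-side throttling. NOT real OpenGL or real driver
 * behavior; the "max frames in flight" policy is an assumption. */
typedef struct {
    int max_in_flight; /* hypothetical limit on queued frames */
    int in_flight;     /* frames submitted but not yet finished */
    int cpu_stalls;    /* how often the CPU had to wait for the GPU */
} Throttle;

static void throttle_init(Throttle *t, int max_in_flight) {
    assert(max_in_flight > 0);
    t->max_in_flight = max_in_flight;
    t->in_flight = 0;
    t->cpu_stalls = 0;
}

/* The simulated GPU finishes its oldest frame, if there is one. */
static void throttle_gpu_retire(Throttle *t) {
    if (t->in_flight > 0) {
        t->in_flight--;
    }
}

/* The CPU submits a frame. If the queue is already full, the CPU is
 * blocked until the GPU has retired one frame. */
static void throttle_submit_frame(Throttle *t) {
    if (t->in_flight == t->max_in_flight) {
        t->cpu_stalls++;
        throttle_gpu_retire(t); /* CPU waits here for the GPU */
    }
    t->in_flight++;
}
```

With a limit of 2, submitting five frames back to back leaves two frames in flight and records three CPU stalls, none of which is tied to a call that returns data.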