Let me first describe how I handle uniform buffers. I have one buffer in device-local memory and one staging buffer in host-coherent memory, each divided into as many sections as there are framebuffers. Every frame, before beginning the render pass, I update the host-side buffer, copy it to the device-local buffer, and then wait until the command buffer finishes executing.
(Assume my GPU is a discrete one with no memory shared between the CPU and GPU.)
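For illustration, the sectioned layout described above could be computed roughly like this. It is only a sketch, not my actual code: the names `alignUp`, `UniformRing`, and `makeUniformRing` are hypothetical, and the alignment argument stands in for the device's `minUniformBufferOffsetAlignment` limit, which Vulkan guarantees to be a power of two. The same stride and offsets would apply to both the staging buffer and the device-local buffer.

```cpp
#include <cassert>
#include <cstdint>

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::uint64_t alignUp(std::uint64_t size, std::uint64_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper describing a buffer split into one section per framebuffer.
struct UniformRing {
    std::uint64_t stride;        // bytes reserved per section (aligned)
    std::uint32_t sectionCount;  // number of framebuffers

    // Total size to allocate for the whole buffer.
    std::uint64_t totalSize() const { return stride * sectionCount; }

    // Byte offset of the section used by the given frame.
    std::uint64_t offsetOf(std::uint32_t frameIndex) const {
        return stride * (frameIndex % sectionCount);
    }
};

// `uniformSize` is the size of one frame's uniform data in bytes;
// `minAlignment` is assumed to come from minUniformBufferOffsetAlignment.
inline UniformRing makeUniformRing(std::uint64_t uniformSize,
                                   std::uint64_t minAlignment,
                                   std::uint32_t framebufferCount) {
    assert(minAlignment != 0 && (minAlignment & (minAlignment - 1)) == 0);
    assert(framebufferCount > 0);
    return UniformRing{alignUp(uniformSize, minAlignment), framebufferCount};
}
```

With this, each frame's copy would use `offsetOf(frameIndex)` as both the source and destination offset and `stride` (or the unaligned data size) as the copy size.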
Now my questions:
- Is staging and copying every frame the best way to manage a uniform buffer?
- I know the synchronization mechanism I use is not OK. What is the best way to synchronize this?
- If your answer is to synchronize with barriers, what exactly is the way to do that? (I have not seen any sample that does this.)
So far, every sample I have seen uses host-coherent uniform buffers. I would appreciate it if you could point me to sample code for something like this.