
The MSDN documentation explains that there are multiple ways to fill a DirectX 11 texture programmatically:

(1) Create the texture with default usage and initialize it with data from memory

(2) Create the texture with dynamic usage, use the device context's Map to get a pointer to texture memory, write to it, then use Unmap to indicate you are done (at which point I guess it is copied to the GPU)

(3) Create the texture with staging usage and follow the same steps as for a dynamic texture, but follow that with a call to ID3D11DeviceContext::CopyResource to use this staging texture to in turn fill a (non-immutable) default or dynamic texture.

However, the documentation doesn't explain the pros and cons of each method at all, and I am still quite new to DirectX, so it is not at all clear to me.

What are the pros and cons of each of these ways of creating a texture programmatically in DirectX 11?

Side note: I have read that in the context of staging textures, reading back from the GPU isn't buffered, so you have to do your own double buffering. But I don't know whether this is accurate and whether it applies to writing using staging textures (or even really what it means).

Second side note: The Map method documentation says it gets a pointer to data in a subresource and denies the GPU access to that subresource. When the GPU wants to access a texture whose underlying data is currently mapped, what does it do? Stall? (I ask because this sounds like part of the pros and cons I inquired about.)

Adding notes as I learn. Apparently you can't create a multisampled default texture programmatically using method (1) above. - user334911
While extremely vague, the D3D11_MAP_FLAG in the call to Map in DirectX 11 apparently tells the GPU what to do if it needs the mapped resource. Only one flag is currently documented, and it isn't clear whether using 0 means to stall. - user334911
Apparently a D3D11_USAGE_DYNAMIC resource must have MipLevels equal to 1. - user334911

1 Answer


The right answer depends on what you're going to use the texture for. Those three options are different ways of getting data from the CPU into the texture. If this is a render target, you generally aren't providing initial data from the CPU, so you can ignore these: create the texture, and when you're ready, render into it (perhaps clearing it first).

So assuming you do have data in application memory that you want to get into the texture:

If this is just a static texture (by that I mean the texture is read from much more than it is written to), then you want a USAGE_DEFAULT or USAGE_IMMUTABLE texture. These are generally optimized for GPU read performance compared to USAGE_DYNAMIC. If you have the data handy when you create the texture, then option (1) is easiest, uses the least intermediate memory, and in DX11 the data transfer to the GPU can be done on a separate thread from your rendering thread. If you don't have the data at the time you create the texture, use UpdateSubresource() or option (3) to provide the data when you have it; that rules out USAGE_IMMUTABLE, which must be given its contents at creation.

If it's a dynamic texture, meaning that you provide new contents from the CPU frequently (CPU-based video playback is the canonical case: data is provided by CPU once per frame, then read by the GPU once per frame) then you probably want to use USAGE_DYNAMIC and option (2). USAGE_DYNAMIC textures are optimized for streaming data from the CPU to the GPU rather than simply for GPU reads. The details (and performance implications) vary between hardware vendors, but usually you only want to use USAGE_DYNAMIC if you really are streaming data from CPU to GPU, rather than simply because it's a convenient way to load static data up-front.
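One practical detail with option (2): Map fills in a D3D11_MAPPED_SUBRESOURCE whose RowPitch can be larger than width times bytes per pixel, so the data generally has to be copied row by row rather than in one memcpy. A minimal sketch of that copy, assuming a tightly packed source image; `CopyRows` is a hypothetical helper rather than anything in D3D11, and the actual Map/Unmap calls appear only in comments:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not a D3D11 API): copies `height` rows of `rowBytes`
// bytes each from a tightly packed source into a destination whose rows
// start `dstPitch` bytes apart.
inline void CopyRows(std::uint8_t* dst, std::size_t dstPitch,
                     const std::uint8_t* src, std::size_t rowBytes,
                     std::size_t height)
{
    assert(dstPitch >= rowBytes);  // destination rows must be able to hold a source row
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst + y * dstPitch, src + y * rowBytes, rowBytes);
    }
}

// With a real USAGE_DYNAMIC texture the destination would come from Map
// (names `ctx`, `tex`, `pixels`, `width`, `height` are placeholders, and a
// 4-byte-per-pixel format is assumed):
//
//   D3D11_MAPPED_SUBRESOURCE mapped;
//   ctx->Map(tex, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
//   CopyRows(static_cast<std::uint8_t*>(mapped.pData), mapped.RowPitch,
//            pixels, width * 4, height);
//   ctx->Unmap(tex, 0);
```

The padding bytes between `rowBytes` and `dstPitch` are left untouched; the sketch only writes the part of each row that carries pixel data.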

Option (3) is more specialized, and can be used for either initial data load into a static texture (reuse the staging surface(s) for loading data for many textures) or for streaming data for relatively dynamic use. It gives you precise control over GPU/CPU synchronization and over the intermediate memory used for transfers. Usually you'd use a ring of staging buffers, and D3D11_MAP_FLAG_DO_NOT_WAIT to check whether each buffer is still in use by a previous CopyResource. I consider this an expert option -- if you're not careful you can hurt perf badly by preventing the CPU and GPU from running asynchronously.
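The ring-of-staging-buffers idea can be sketched without any D3D calls. The class below is a hypothetical model, not driver or API behavior: `busy` stands in for "a previous CopyResource from this slot has not finished on the GPU", and `TryAcquire` stands in for a Map call with D3D11_MAP_FLAG_DO_NOT_WAIT, which returns DXGI_ERROR_WAS_STILL_DRAWING instead of blocking when the resource is still in use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a ring of staging textures (no real D3D11 calls).
class StagingRing {
public:
    explicit StagingRing(std::size_t slots) : busy_(slots, false), next_(0) {
        assert(slots > 0);
    }

    // Returns the index of the next slot in ring order if it is free,
    // or -1 if the GPU is still using it. In real code the "is it free"
    // check would be Map(..., D3D11_MAP_WRITE, D3D11_MAP_FLAG_DO_NOT_WAIT, ...).
    int TryAcquire() {
        if (busy_[next_]) {
            return -1;  // caller can do other work and retry instead of stalling
        }
        const int slot = static_cast<int>(next_);
        busy_[next_] = true;  // stands in for write + Unmap + CopyResource being queued
        next_ = (next_ + 1) % busy_.size();
        return slot;
    }

    // Stands in for the GPU finishing the copy out of `slot`.
    void MarkGpuDone(int slot) {
        busy_[static_cast<std::size_t>(slot)] = false;
    }

private:
    std::vector<bool> busy_;
    std::size_t next_;
};
```

With two slots, two uploads can be queued back to back; a third attempt reports "still in use" until the oldest copy completes, which is the point where a blocking Map would have stalled the CPU.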

Full disclosure: I work on the D3D driver at Nvidia, but these are my personal opinions.