In Vulkan, there are two global functions that need to be loaded with dlopen/LoadLibrary: vkGetInstanceProcAddr and vkGetDeviceProcAddr.

I have one GPU with a Vulkan driver installed. Is it better to load the library at run time or to link against it at link time? I passed different logical devices (created from the same GPU) to vkGetDeviceProcAddr to query the same function, and both calls returned the same address. So I think reloading the pointers could be wasteful.

My question is: where does this design come from? Is it meant for multiple implementations or for multiple GPUs?

My loader currently looks like this:

#include <vulkan/vulkan.h>

class VulkanDevice
{
public:
    explicit VulkanDevice(VkDevice device) : m_Device(device) {}

    VkDevice m_Device;
    void LoadAllCoreFunctions();              // fills the core function pointers below
    void LoadExtension(const char *name);
    void LoadExtensions(const char *postfix); // for example: "KHR"

    // Vulkan has no vkCreateCommandBuffers; command buffers are allocated.
    PFN_vkAllocateCommandBuffers vkAllocateCommandBuffers = nullptr;
    // Then a lot of function pointers.....
};

int main() {
    // After creating instance and creating device with vkCreateDevice
    VulkanDevice vkd(device);
    vkd.LoadAllCoreFunctions();
    vkd.vkAllocateCommandBuffers(vkd.m_Device, ....);
}

As you can see, if I have multiple devices it will be pretty wasteful to reload everything, and the function pointers could use a lot of memory too.

2 Answers

4
votes

The distinction between "instance function pointers" and "device function pointers" is for people who want faster function call performance.

You can use Vulkan just fine with only vkGetInstanceProcAddr. This function will retrieve function pointers for all Vulkan functions. Those function pointers will use dispatch information stored in the various Vulkan objects you pass in order to determine which device you're talking to. These pointers can be used with any instance, device, or device-dependent object.

The pointers you get from vkGetDeviceProcAddr know that they work with a specific device. They don't need to use dispatch logic to call into that device. So they're slightly lower-level. The downside is that you can only use them with that specific VkDevice or device-derived objects.

The dispatch overhead probably won't be significant enough to be worth the bother for most people. If you really care about such things, however, you have the option to avoid it.
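To make the difference concrete, here is a toy model of the two kinds of pointers. It does not use the real Vulkan headers: Device, DispatchTable, DrawFn, the two "drivers", and both getter functions are names invented for this sketch. It only assumes the behaviour described above, namely that an instance-level pointer looks up the target through the object it is handed, while a device-level pointer goes straight to the driver.

```cpp
#include <cassert>

// Toy model only: none of these names come from the Vulkan API.
struct Device;
using DrawFn = int (*)(Device*, int);

// Per-driver table of entry points (one command here, for brevity).
struct DispatchTable { DrawFn draw; };

// Stands in for a dispatchable handle such as VkDevice: the object
// itself carries the dispatch information.
struct Device { const DispatchTable* table; };

// Two pretend drivers with different behaviour.
int driverA_draw(Device*, int x) { return x + 1; }
int driverB_draw(Device*, int x) { return x * 2; }

const DispatchTable kDriverA{driverA_draw};
const DispatchTable kDriverB{driverB_draw};

// "Instance-level" style: a single trampoline that works with any device
// and pays for one extra lookup on every call.
int trampoline_draw(Device* dev, int x) {
    assert(dev != nullptr && dev->table != nullptr);
    return dev->table->draw(dev, x);
}

DrawFn getInstanceLevelDraw() { return trampoline_draw; }

// "Device-level" style: resolve once, then call the driver directly.
// The result is only meaningful for devices that share this driver.
DrawFn getDeviceLevelDraw(Device* dev) { return dev->table->draw; }
```

In this model, two devices backed by the same driver get the same direct pointer, which matches what the question observed with two logical devices on one GPU. That is a property of the toy (and of that particular setup), not something the model proves about every platform.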

if I have multiple device it will be pretty wasteful to reload... and the function pointer could use a lot of memory too....

Those functions exist whether you query their pointers or not; getting their pointers therefore only takes up the memory you use to store the pointers to them. Vulkan's API contains about 140 functions. At 8 bytes per function pointer, that's 1120 bytes, just over 1KB.
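The arithmetic can be written down as a small compile-time check. This is only a sketch: the count of 140 is the approximate figure quoted above, not a number read from the Vulkan headers, and kApproxFunctionCount and tableBytes are names made up for the example.

```cpp
#include <cstddef>

// Approximate number of functions, taken from the estimate above (assumption).
constexpr std::size_t kApproxFunctionCount = 140;

// Size of one function pointer on the platform being compiled for.
constexpr std::size_t kPointerBytes = sizeof(void (*)());

// Bytes needed to keep one full pointer table per logical device.
constexpr std::size_t tableBytes(std::size_t deviceCount) {
    return deviceCount * kApproxFunctionCount * kPointerBytes;
}

// With 8-byte function pointers, tableBytes(1) is 1120 bytes.
static_assert(tableBytes(1) == 140 * sizeof(void (*)()),
              "one table holds one pointer per function");
```

Even with a separate table for each of several devices, the total grows linearly and stays in the range of a few kilobytes.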

As for the time spent loading them, I would be shocked if loading 140 function pointers took more than a few microseconds, and you only do it once, at start-up time.

4
votes

In Vulkan, we have two global functions that need to load with dlopen/LoadLibrary. They are vkGetInstanceProcAddr and vkGetDeviceProcAddr.

Wrong. Only vkGetInstanceProcAddr needs that. vkGetDeviceProcAddr itself can be obtained through vkGetInstanceProcAddr.

Should I better load library at runtime or linktime? I give difference logical devices(created from the same gpu) to vkGetDeviceProcAddr to query the same function. They both return the same address to me.

Commands that take a VkInstance and were obtained from vkGetInstanceProcAddr may only be used with the same instance that was used to obtain them.

Similarly, commands obtained from vkGetDeviceProcAddr are to be used only with the device (of type VkDevice) used to obtain them.

The pointers may, and often will, be the same, but you cannot know that in advance or rely on it on every platform/PC. You could load a single command just to test that the pointer is the same and infer that the others will be too, but that is skating on thin ice for no reasonable benefit.

Link against the official loader at link time for convenience, unless you have a reason not to.

I think the reload could be wasteful.

Unless you plan to build hardware with millions of GPUs constantly connecting and disconnecting, don't worry about it and load the commands properly. It is a small one-time (or few-time) cost, which is quickly amortized by all the rendering you will undoubtedly do afterwards.

Also, there is no "reload". The new pointers can peacefully coexist with the old ones. In C++ you would probably make those member functions of an Instance or Device class.
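A rough sketch of that idea follows. To stay self-contained it does not include the Vulkan headers: VoidFn and GetProcFn stand in for PFN_vkVoidFunction and a vkGetDeviceProcAddr-style lookup, while DeviceCommands, loadCommand, the "submit" command, and the fake loader are all hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for PFN_vkVoidFunction and a device-level lookup.
using VoidFn = void (*)();
using GetProcFn = VoidFn (*)(void* device, const char* name);

// Look a command up by name and cast it back to its real signature.
template <typename Fn>
Fn loadCommand(GetProcFn getProc, void* device, const char* name) {
    return reinterpret_cast<Fn>(getProc(device, name));
}

using SubmitFn = int (*)(void* device, int work);

// Each object owns the pointers loaded for its own device, so tables for
// several devices coexist and nothing is ever "reloaded".
class DeviceCommands {
public:
    DeviceCommands(GetProcFn getProc, void* device)
        : device_(device),
          submit_(loadCommand<SubmitFn>(getProc, device, "submit")) {
        assert(submit_ != nullptr && "command not found");
    }

    // Member function wrapper: callers never pass the device by hand.
    int submit(int work) const { return submit_(device_, work); }

private:
    void* device_;
    SubmitFn submit_;
};

// Fake "driver" and loader so the sketch can run without a GPU.
// The fake device is just an int holding an id.
int fakeSubmit(void* device, int work) {
    return *static_cast<int*>(device) * 100 + work;
}

VoidFn fakeGetProc(void*, const char* name) {
    return std::strcmp(name, "submit") == 0
               ? reinterpret_cast<VoidFn>(fakeSubmit)
               : nullptr;
}
```

Creating one DeviceCommands per device keeps every pointer tied to the device it was queried for, which is the usage rule described earlier in this answer.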

My question is how this idea come from? Is it for multi-implementation or multi-gpus?

Yes, sort of.

You can certainly see the need if you have two GPUs (from different vendors, no less), each with its own driver file (usually some *.dll, or the equivalent on other platforms): the function pointer into the correct file has to be chosen somehow.

vkGetInstanceProcAddr solves this by giving you a pointer to another function, which in turn chooses and calls the right driver pointer. ("Everything can be solved by another level of indirection", right?) The statically linked Khronos/official/LunarG SDK loader will probably do something similar.

With vkGetDeviceProcAddr you simply pass the exact device, and in return it gives you a direct function pointer into the driver of that particular GPU.

The same can happen for the instance (a loader is only required to export vkGetInstanceProcAddr; the rest can live elsewhere). Usually, though (as in the case of the official loader), it exports all the instance-level and even the indirect device-level functions for convenience.

As you can see, if I have multiple device it will be pretty wasteful to reload... and the function pointer could use a lot of memory too....

It is not really that wasteful in CPU time, as said before.

If you cannot spare those few kB for the pointers, you probably won't be able to build a viable Vulkan-based application anyway. You could load only the commands you actually use, but I don't see a reason for that kind of premature optimization. Do you have some special hardware that needs such desperate measures, or are you making a 64K demo?