Recently I'm reading https://github.com/ARM-software/vulkan_best_practice_for_mobile_developers/blob/master/samples/vulkan_basics.md, and it said:
OpenGL ES uses a synchronous rendering model, which means that an API call must behave as if all earlier API calls have already been processed. In reality no modern GPU works this way, rendering workloads are processed asynchronously and the synchronous model is an elaborate illusion maintained by the device driver. To maintain this illusion the driver must track which resources are read or written by each rendering operation in the queue, ensure that workloads run in a legal order to avoid rendering corruption, and ensure that API calls which need a data resource block and wait until that resource is safely available.
Vulkan uses an asynchronous rendering model, reflecting how the modern GPUs work. Applications queue rendering commands into a queue, use explicit scheduling dependencies to control workload execution order, and use explicit synchronization primitives to align dependent CPU and GPU processing.
The impact of these changes is to significantly reduce the CPU overhead of the graphics drivers, at the expense of requiring the application to handle dependency management and synchronization.
Could someone help explain why asynchronous rendering model could reduce CPU overhead? Since in Vulkan you still have to track state yourself.