This feature is indeed useful, and surprisingly difficult to implement even with commercial tools such as LoadRunner. I would compare it to finding a loudspeaker's maximum volume. You manually turn the volume up until it starts to crackle, then turn it back down slightly to hold that maximum. In the same way, to find the peak capacity of an application, you want a control that lets you 'turn the volume up' until errors appear, then back it off slightly to see whether it stabilizes. You can then hold that load while you look for the bottleneck.
Anyway, to answer the question, what I have done in the past is use an external signal, such as a file name or similar. By combining that with each thread's unique reference, you can control which threads run and which are held (by pausing or similar).
For example, say you start 100 threads and create a file called '5.txt' in a specific location. You can add code so that each thread runs only if its own reference is equal to or lower than the number in the file name; otherwise it drops into a pause. At the start of this example 5 threads would run and 95 would pause. You can then rename the file to '25.txt', and threads 6 to 25 would start running. It works the other way too: renaming it to '20.txt' would make threads 21 to 25 pause again.
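To make the idea concrete, here is a minimal sketch in Python rather than in any particular load tool's scripting language. Every name in it (read_active_limit, should_run, worker, do_request, control_dir, poll_seconds) is made up for illustration. It also assumes that thread references start at 1, that no control file means nothing runs, and that a paused thread rechecks the directory about once a second.

```python
import threading
from pathlib import Path


def read_active_limit(control_dir: Path) -> int:
    """Return N taken from a control file named '<N>.txt', or 0 if none exists.

    Assumption: if several numeric files are present, the largest number wins.
    """
    limits = [int(p.stem) for p in control_dir.glob("*.txt") if p.stem.isdigit()]
    return max(limits, default=0)


def should_run(thread_ref: int, control_dir: Path) -> bool:
    """A thread may run when its reference is at or below the current limit."""
    return thread_ref <= read_active_limit(control_dir)


def worker(thread_ref: int, control_dir: Path, stop_event: threading.Event,
           do_request, poll_seconds: float = 1.0) -> None:
    """Loop until stopped: send load while allowed, otherwise pause and recheck.

    do_request is a hypothetical callable standing in for one unit of load,
    for example a single HTTP request against the application under test.
    """
    while not stop_event.is_set():
        if should_run(thread_ref, control_dir):
            do_request(thread_ref)
        else:
            # Held thread: sleep, but wake early if the test is being stopped.
            stop_event.wait(poll_seconds)
```

You would start one threading.Thread per reference, for example 1 to 100, each with worker as its target and a shared stop_event, then rename the control file while the test runs. Checking the directory on every iteration keeps the sketch short; if that overhead matters, the limit could be cached and refreshed every few seconds instead.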
The key is to start enough threads to exceed your expected peak.