After doing some research, I can answer my own question.
os.getrandom is a wrapper for getrandom() syscall offered in Linux Kernel 3.17 and newer. The flag is a number (0, 1, 2 or 3) that corresponds to bitmask in following way:
GETRANDOM with ChaCha20 DRNG
os.getrandom(32, flags=0)
GRND_NONBLOCK = 0 (=Block until the ChaCha20 DRNG seed level reaches 256 bits)
GRND_RANDOM = 0 (=Use ChaCha20 DRNG)
= 00 (=flag 0)
This is a good default to use with all Python 3.6 programs on all platforms (including live distros) when no backwards compatibility with Python 3.5 and pre-3.17 kernels is needed.
The PEP 524 is incorrect when it claims
On Linux, getrandom(0) blocks until the kernel initialized urandom with 128 bits of entropy.
According to page 84 of the BSI report, the 128-bit limit is used during boot time for callers of kernel module's get_random_bytes() function, if the code was made to properly wait for the triggering of the add_random_ready_callback() function. (Not waiting means get_random_bytes() might return insecure random numbers.) According to page 112
When reaching the state of being fully seeded and thus having the ChaCha20 DRNG seeded with 256 bits of entropy -- the getrandom system call unblocks and generates random numbers.
So, GETRANDOM() never returns random numbers until the ChaCha20 DRNG is fully seeded.
os.getrandom(32, flags=1)
GRND_NONBLOCK = 1 (=If the ChaCha20 DRNG is not fully seeded, raise BlockingIOError instead of blocking)
GRND_RANDOM = 0 (=Use ChaCha20 DRNG)
= 01 (=flag 1)
Useful if the application needs to do other tasks while it waits for the ChaCha20 DRNG to be fully seeded. The ChaCha20 DRNG is almost always fully seeded during boot time, so flags=0 is most likely a better choice. Needs the try-except logic around it.
GETRANDOM with blocking_pool
The blocking_pool is also accessible via the /dev/random device file. The pool was designed with the idea in mind that entropy runs out. This idea applies only when trying to create one-time pads (that strive for information theoretic security). The quality of entropy in blocking_pool for that purpose is not clear, and the performance is really bad. For every other use, properly seeded DRNG is enough.
The only situation where blocking_pool might be more secure is with pre-4.17 kernels that have the CONFIG_RANDOM_TRUST_CPU flag set during compile time, and if the CPU HWRNG happened to have a backdoor. Since in that case the ChaCha20 DRNG is initially seeded with RDSEED/RDRAND instruction, bad CPU HWRNG would be a problem. However, according to page page 134 of the BSI report:
[As of kernel version 4.17] The Linux-RNG now considers the ChaCha20 DRNG fully seeded after it received 128 bit of entropy from the noise sources. Previously it was sufficient that it received at least 256 interrupts.
Thus the ChaCha20 DRNG wouldn't be considered fully seeded until entropy is also mixed from input_pool, that pools and mixes random events from all LRNG noise sources together.
By using os.getrandom() with flags 2 or 3, the entropy comes from blocking_pool, that receives entropy from input_pool, that in turn receives entropy from several additional noise sources. The ChaCha20 DRNG is reseeded also from the input_pool, thus the CPU RNG does not have permanent control over the DRNG state. Once this happens, ChaCha20 DRNG is as secure as blocking_pool.
os.getrandom(32, flags=2)
GRND_NONBLOCK = 0 (=Return 32 bytes or less if entropy counter of blocking_pool is low. Block if no entropy is available.)
GRND_RANDOM = 1 (=Use blocking_pool)
= 10 (=flag 2)
This needs an external loop that runs the function and stores returned bytes into a buffer until the buffer size is 32 bytes. The major problem here is due to the blocking behavior of the blocking_pool, obtaining the bytes needed might take a very long time, especially if other programs are also requesting random numbers from the same syscall or /dev/random. Another issue is loop that uses os.getrandom(32, flags=2) spends more time idle waiting for random bytes than it would with flag 3 (see below).
os.getrandom(32, flags=3)
GRND_NONBLOCK = 1 (=return 32 bytes or less if entropy counter of blocking_pool is low. If no entropy is available, raise BlockingIOError instead of blocking).
GRND_RANDOM = 1 (=use blocking_pool)
= 11 (=flag 3)
Useful if the application needs to do other tasks while it waits for blocking_pool to have some amount of entropy. Needs the try-except logic around it plus an external loop that runs the function and stores returned bytes into a buffer until the buffer size is 32 bytes.
Otheropen('/dev/urandom', 'rb').read(32)
To ensure backwards compatibility, unlike GETRANDOM() with ChaCha20 DRNG, reading from /dev/urandom device file never blocks. There is no guarantee for the quality of random numbers, which is bad. This is the least recommended option.
os.urandom(32)
os.urandom(n) provides best effort security:
Python3.6
On Linux 3.17 and newer, os.urandom(32) is the equivalent of os.getrandom(32, flags=0). On older kernels it quietly falls back to the equivalent of open('/dev/urandom', 'rb').read(32) which is not good.
os.getrandom(32, flags=0) should be preferred as it can not fall back to insecure mode.
Python3.5 and earlier
Always the equivalent of open('/dev/urandom', 'rb').read(32) which is not good. As os.getrandom() is not available, Python3.5 should not be used.
secrets.token_bytes(32) (Python 3.6 only)
Wrapper for os.urandom(). Default length of keys is 32 bytes (256 bits). On Linux 3.17 and newer, secrets.token_bytes(32) is the equivalent of os.getrandom(32, flags=0). On older kernels it quietly falls back to the equivalent of open('/dev/urandom', 'rb').read(32) which is not good.
Again, os.getrandom(32, flags=0) should be preferred as it can not fall back to insecure mode.
tl;dr
Use os.getrandom(32, flags=0).
What about other RNG sources, random, SystemRandom() etc?import random
random.<anything>()
is never safe for creating passwords, cryptographic keys etc.
import random
sys_rand = random.SystemRandom()
is safe for cryptographic use WITH EXCEPTIONS!
sys_rand.sample()
Generating a random password with sys_rand.sample(list_of_password_chars, counts=password_length) is not safe because to quote the documentation, the sample() method is used for "random sampling without replacement". This means that each consequtive character in the password is guaranteed not to contain any of the previous characters. This will lead to passwords that are not uniformly random.
sys_rand.choices()
The sample() method was used for random sampling without replacement. The choices() method is used for random sampling with replacement. However, the to quote the documentation on choices,
The algorithm used by choices() uses floating point arithmetic for internal consistency and speed. The algorithm used by choice() defaults to integer arithmetic with repeated selections to avoid small biases from round-off error.
The floating point arithmetic choices() method uses thus introduces cryptographically non-negligible biases to the sampled passwords. Thus, random.choices() must not be used for password/key generation!
sys_random.choice()
As per the previously quoted piece of documentation, the sys_random.choice() method uses integer arithmetic as opposed to floating point arithmetic, thus generating passwords/keys with repeated calls to sys_random.choice() is therefore safe.
secrets.choice()
The secrets.choice() is a wrapper for sys_random.choice(), and can be used interchangeably with random.SystemRandom().choice(): they are the same thing.
The recipe for best practice to generate a passphrase with secrets.choice() is
import secrets
# On standard Linux systems, use a convenient dictionary file.
# Other platforms may need to provide their own word-list.
with open('/usr/share/dict/words') as f:
words = [word.strip() for word in f]
passphrase = ' '.join(secrets.choice(words) for i in range(4))
How can I ensure the generated passphrase meets some security level, e.g. 128 bits?
Here's a recipe for that
import math
import secrets
def generate_passphrase() -> str:
PASSWORD_MIN_BIT_STRENGTH = 128 # Set desired minimum bit strength here
with open('/usr/share/dict/words') as f:
wordlist = [word.strip() for word in f]
word_space = len(wordlist)
word_count = math.ceil(math.log(2 ** PASSWORD_MIN_BIT_STRENGTH, word_space))
passphrase = ' '.join(secrets.choice(wordlist) for _ in range(word_count))
# pwd_bit_strength = math.floor(math.log2(word_space ** word_count))
# print(f"Generated {pwd_bit_strength}-bit passphrase.")
return passphrase