6
votes

libc has random which

uses a nonlinear additive feedback random number generator employing a default table of size 31 long integers to return successive pseudo-random numbers

I'm looking for a random function written in Rust with the same speed. It doesn't need to be crypto-safe, pseudo-random is good enough.

Looking through the rand crate it seems that XorShiftRng is the closest fit to this need:

The Xorshift algorithm is not suitable for cryptographic purposes but is very fast

When I use it like this:

extern crate time;
use rand::Rng;

let mut rng = rand::XorShiftRng::new_unseeded();
rng.next_u64();

it is about 33% slower than libc's random. (Sample code generating 8'000'000 random numbers).

In the end, I'll need i64 random numbers, so when I run rng.gen() this is already 100% slower than libc's random. When casting with rng.next_u64() as i64, then is is 60% slower.

Is there any way to achieve the same speed in rust without using any unsafe code?

1
Just to make sure: you are compiling both in Release mode, right? (Because when I do use Release mode, in your example, XorShiftRng is completely optimized away... => play.integer32.com/…)Matthieu M.
uh, no, I didn't! In release mode the XorShiftRng version with rng.gen() is 760'000x as fast as random(). So my case is solved, but I don't understand why the release version is so much faster, and why XorShiftRng is now so much faster than random()hansaplast
Because your benchmark is flawed (it's hard to benchmkar): everything is hard-coded, so the compiler optimizes the benchmark to let start = time(); let end = time(); println!("...", start.to(end));. This only happens for XorShiftRng because its implementation is available to the compiler, whereas the libc::random is dynamically loaded at run-time. My suggestion would be to use a realistic workload to evaluate them.Matthieu M.

1 Answers

14
votes

Be sure to compile the code you are measuring in release mode, otherwise your benchmark is not representative of Rust's performance.

In order to obtain meaningful numbers, you must also modify the benchmark to do something with the generated numbers, e.g. collect them into a vector1. Failing to do so can make the compiler optimize the whole loop away because it has no side effects. This is what happened in your second attempt that led you to conclude that XorShiftRng is 760,000 thousand times faster than libc::random.

With the changed benchmark run in release mode, XorShiftRng ends up approximately 2x faster than libc::random:

PT0.101378490S seconds for libc::random
PT0.050827393S seconds for XorShiftRng

1 The compiler could also be smart enough to realize that the vector is also unused and optimize it away as well, but current rustc does not do so, and storing elements into a vector is enough. A simple and future-proof way to ensure that the generation is not optimized away is to sum up the numbers and write out the result.