32
votes

This is something of a controversial topic, so let me start by explaining my use case, and then talk about the actual problem.

I find that for a bunch of unsafe things, it's important to make sure that you don't leak memory; this is actually quite easy to do if you start using transmute() and forget(). For example, passing a boxed instance to C code for an arbitrary amount of time, then fetching it back out and 'resurrecting it' by using transmute.

Imagine I have a safe wrapper for this sort of API:

trait Foo {}
struct CBox;

impl CBox {
    /// Stores value in a bound C api, forget(value)
    fn set<T: Foo>(value: T) {
        // ...
    }

    /// Periodically call this and maybe get a callback invoked
    fn poll(_: Box<Fn<(EventType, Foo), ()> + Send>) {
        // ...
    }
}

impl Drop for CBox {
    fn drop(&mut self) {
        // Safely load all saved Foo's here and discard them, preventing memory leaks
    }
}

To test this is actually not leaking any memory, I want some tests like this:

#[cfg(test)]
mod test {

    struct IsFoo;
    impl Foo for IsFoo {}
    impl Drop for IsFoo {
        fn drop(&mut self) {
            Static::touch();
        }
    }

    #[test]
    fn test_drops_actually_work() {
        guard = Static::lock(); // Prevent any other use of Static concurrently
        Static::reset(); // Set to zero
        {
            let c = CBox;
            c.set(IsFoo);
            c.set(IsFoo);
            c.poll(/*...*/);
        }
        assert!(Static::get() == 2); // Assert that all expected drops were invoked
        guard.release();
    }
}

How can you create this type of static singleton object?

It must use a Semaphore style guard lock to ensure that multiple tests do not concurrently run, and then unsafely access some kind of static mutable value.

I thought perhaps this implementation would work, but practically speaking it fails because occasionally race conditions result in a duplicate execution of init:

/// Global instance
static mut INSTANCE_LOCK: bool = false;
static mut INSTANCE: *mut StaticUtils = 0 as *mut StaticUtils;
static mut WRITE_LOCK: *mut Semaphore = 0 as *mut Semaphore;
static mut LOCK: *mut Semaphore = 0 as *mut Semaphore;

/// Generate instances if they don't exist
unsafe fn init() {
    if !INSTANCE_LOCK {
        INSTANCE_LOCK = true;
        INSTANCE = transmute(box StaticUtils::new());
        WRITE_LOCK = transmute(box Semaphore::new(1));
        LOCK = transmute(box Semaphore::new(1));
    }
}

Note specifically that unlike a normal program where you can be certain that your entry point (main) is always running in a single task, the test runner in Rust does not offer any kind of single entry point like this.

Other, obviously, than specifying the maximum number of tasks; given dozens of tests, only a handful need to do this sort of thing, and it's slow and pointless to limit the test task pool to one just for this one case.

3

3 Answers

39
votes

It looks like a use case for std::sync::Once:

use std::sync::{Once, ONCE_INIT};
static INIT: Once = ONCE_INIT;

Then in your tests call

INIT.doit(|| unsafe { init(); });

Once guarantees that your init will only be executed once, no matter how many times you call INIT.doit().

19
votes

See also lazy_static, which makes things a little more ergonomic. It does essentially the same thing as a static Once for each variable, but wraps it in a type that implements Deref so that you can access it like a normal reference.

Usage looks like this (from the documentation):

#[macro_use]
extern crate lazy_static;

use std::collections::HashMap;

lazy_static! {
    static ref HASHMAP: HashMap<u32, &'static str> = {
        let mut m = HashMap::new();
        m.insert(0, "foo");
        m.insert(1, "bar");
        m.insert(2, "baz");
        m
    };
    static ref COUNT: usize = HASHMAP.len();
    static ref NUMBER: u32 = times_two(21);
}

fn times_two(n: u32) -> u32 { n * 2 }

fn main() {
    println!("The map has {} entries.", *COUNT);
    println!("The entry for `0` is \"{}\".", HASHMAP.get(&0).unwrap());
    println!("A expensive calculation on a static results in: {}.", *NUMBER);
}

Note that autoderef means that you don't even have to use * whenever you call a method on your static variable. The variable will be initialized the first time it's Deref'd.

However, lazy_static variables are immutable (since they're behind a reference). If you want a mutable static, you'll need to use a Mutex:

lazy_static! {
    static ref VALUE: Mutex<u64>;
}

impl Drop for IsFoo {
    fn drop(&mut self) {
        let mut value = VALUE.lock().unwrap();
        *value += 1;
    }
}

#[test]
fn test_drops_actually_work() {
    // Have to drop the mutex guard to unlock, so we put it in its own scope
    {
        *VALUE.lock().unwrap() = 0;
    }
    {
        let c = CBox;
        c.set(IsFoo);
        c.set(IsFoo);
        c.poll(/*...*/);
    }
    assert!(*VALUE.lock().unwrap() == 2); // Assert that all expected drops were invoked
}
2
votes

If you're willing to use nightly Rust you can use SyncLazy instead of the external lazy_static crate:

#![feature(once_cell)]

use std::collections::HashMap;

use std::lazy::SyncLazy;

static HASHMAP: SyncLazy<HashMap<i32, String>> = SyncLazy::new(|| {
    println!("initializing");
    let mut m = HashMap::new();
    m.insert(13, "Spica".to_string());
    m.insert(74, "Hoyten".to_string());
    m
});

fn main() {
    println!("ready");
    std::thread::spawn(|| {
        println!("{:?}", HASHMAP.get(&13));
    }).join().unwrap();
    println!("{:?}", HASHMAP.get(&74));

    // Prints:
    //   ready
    //   initializing
    //   Some("Spica")
    //   Some("Hoyten")
}