I use cbindgen to generate a C interface for my Rust code. I want to return a structure that contains an Option<Vec<f64>> from Rust to C. In Rust I have created the following structure:

#[repr(C)]
pub struct mariettaSolverStatus {
    lagrange: *const c_double
}

which cbindgen translates into the following C structure:

/* Auto-generated structure */
typedef struct {
  const double *lagrange;
} mariettaSolverStatus;

The corresponding structure on the Rust side is:

pub struct AlmOptimizerStatus {
    lagrange_multipliers: Option<Vec<f64>>,
}

impl AlmOptimizerStatus {

    pub fn lagrange_multipliers(&self) -> &Option<Vec<f64>> {
        &self.lagrange_multipliers
    }

}

The idea is to map AlmOptimizerStatus (in Rust) to mariettaSolverStatus (in C). When lagrange_multipliers is None, a null pointer will be assigned to the pointer in C.
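A minimal sketch of that mapping on its own, assuming a hypothetical helper named as_nullable_ptr (it is not part of the original code). The pointer it returns only borrows the vector's buffer, so it is valid only while the owning Option<Vec<f64>> is alive and unmodified:

```rust
use std::os::raw::c_double;
use std::ptr;

/// Hypothetical helper: pointer to the first multiplier, or NULL for `None`.
/// The result borrows the vector's buffer; it must not outlive `multipliers`.
pub fn as_nullable_ptr(multipliers: &Option<Vec<f64>>) -> *const c_double {
    multipliers.as_ref().map_or(ptr::null(), |v| v.as_ptr())
}
```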

Now in Rust, I have the following function:

#[no_mangle]
pub extern "C" fn marietta_solve(
    instance: *mut mariettaCache,
    u: *mut c_double,
    params: *const c_double
) -> mariettaSolverStatus {

  /* obtain an instance of `AlmOptimizerStatus`, which contains
   *  an instance of `&Option<Vec<f64>>` 
   */
  let status = solve(params, &mut instance.cache, u, 0, 0);

  /* At this point, if we print status.lagrange_multipliers() we get 
   *
   *  Some([-14.079295698854809,
   *         12.321753192707693,
   *         2.5355683425384417
   *       ])
   *
   */

  /* return an instance of `mariettaSolverStatus` */
  mariettaSolverStatus {
    lagrange: match &status.lagrange_multipliers() {
        /* cast status.lagrange_multipliers() as a `*const c_double`,
         * i.e., get a constant pointer to the data
         */
        Some(y) => {y.as_ptr() as *const c_double},
        /* return NULL, otherwise */
        None => {0 as *const c_double},
    }
  }
}

cbindgen generates a C header that, together with the compiled library, allows us to call the Rust functions from C. Up to this point, I get no warnings from Rust.

However, when I call the above function from C, using the auto-generated C interface, the first element of mariettaSolverStatus.lagrange is always 0, whereas all subsequent elements are correct.

This is my C code:

#include <stdio.h>
#include "marietta_bindings.h"

int main() {
    int i;
    double p[MARIETTA_NUM_PARAMETERS] = {2.0, 10.0};  /* parameters    */
    double u[MARIETTA_NUM_DECISION_VARIABLES] = {0};  /* initial guess */
    double init_penalty = 10.0;
    double y[MARIETTA_N1] = {0.0};

    /* obtain cache */
    mariettaCache *cache = marietta_new();

    /* solve  */
    mariettaSolverStatus status = marietta_solve(cache, u, p, y, &init_penalty);

    /* prints:
     * y[0] = 0  <------- WRONG!
     * y[1] = 12.3218
     * y[2] = 2.5356
     */
    for (i = 0; i < MARIETTA_N1; ++i) {
        printf("y[%d] = %g\n", i, status.lagrange[i]);
    }


    /* free memory */
    marietta_free(cache);

    return 0;
}

I would guess that somehow, somewhere, some pointer goes out of scope.

1 Answer

I'm pretty sure the issue lies in your implementation of marietta_solve. Let's walk through it line by line.

let status = solve(params, &mut instance.cache, u, 0, 0);

Here status owns an AlmOptimizerStatus and all of its inner members. Up to this point, everything is fine (assuming solve doesn't do anything silly).

mariettaSolverStatus {
  lagrange: match &status.lagrange_multipliers() {
    /* cast status.lagrange_multipliers() as a `*const c_double`,
     * i.e., get a constant pointer to the data
     */
    Some(y) => {y.as_ptr() as *const c_double},
    /* return NULL, otherwise */
    None => {0 as *const c_double},
  }
}

You then return a raw pointer into data owned by status, which goes out of scope and is dropped when the function returns. The Option<Vec<f64>> you are pointing into is dropped along with it, and the vector's heap buffer is freed.

As a result, this leads to UB: the vector's buffer has been deallocated, but C still holds a raw pointer to it. Since Rust does not track lifetimes through raw pointers, the compiler reports no error. Once the buffer is freed, the allocator may reuse or overwrite it at any time; many allocators store their own bookkeeping data at the start of a freed block, which would be consistent with only the first element being corrupted.

You can convince yourself of this with this playground example, where I have replaced the raw pointers with references to trigger the borrow checker.
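A minimal sketch of that idea, where the Status type and the solve stub are stand-ins I made up for illustration, not the asker's real code. With a reference instead of a raw pointer, the compiler refuses the dangling version, while moving the vector out to the caller is accepted:

```rust
/// Stand-in for `AlmOptimizerStatus` (hypothetical, for illustration only).
pub struct Status {
    lagrange_multipliers: Option<Vec<f64>>,
}

/// Stand-in for the real solver call.
pub fn solve() -> Status {
    Status { lagrange_multipliers: Some(vec![-14.0, 12.0, 2.5]) }
}

// Rejected by the borrow checker if uncommented: the returned slice
// would borrow from `status`, which is dropped when the function returns.
//
// fn dangling<'a>() -> Option<&'a [f64]> {
//     let status = solve();
//     status.lagrange_multipliers.as_deref()
// }

/// Accepted: ownership of the vector moves out to the caller,
/// so nothing is left pointing into a dropped value.
pub fn owned_multipliers() -> Option<Vec<f64>> {
    let status = solve();
    status.lagrange_multipliers
}
```

The raw-pointer version in the question has the same problem as the commented-out function; the only difference is that the compiler cannot see it.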

To fix this, you need to make Rust give up ownership of the vector so that it is not dropped when the function returns, like so (playground):

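use std::os::raw::c_double;

impl AlmOptimizerStatus {

    /// Consumes the status and returns the multipliers (empty if `None`).
    pub fn lagrange_multipliers(self) -> Vec<f64> {
        self.lagrange_multipliers.unwrap_or_default()
    }

}

fn test() -> *const c_double {

   let status = solve();

   let output = status.lagrange_multipliers();
   // Take the pointer first, then leak the vector so that its buffer
   // is not freed when `output` goes out of scope.
   let ptr = output.as_ptr();
   std::mem::forget(output);
   ptr
}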

Notice the changes:

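  • lagrange_multipliers() now consumes your struct and moves the inner vector out of it. If you do not want this, you'll need to clone the vector instead. As this wasn't the purpose of the question, I went with consuming the struct to keep the code short. Note that the None case now produces an empty vector, whose as_ptr() is a non-null dangling pointer rather than NULL, so handle None explicitly if the C side relies on a null check.
  • std::mem::forget takes ownership of a Rust value and prevents its destructor from running, so the vector's buffer is not deallocated when output goes out of scope. This is one way to hand heap data across the FFI boundary; alternatives include Box::into_raw, std::mem::ManuallyDrop, or allocating the memory manually (MaybeUninit, std::ptr or other means).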

And the evident gotcha: this leaks the vector unless the memory is eventually released. The reliable way to do that is to hand the pointer back to Rust, rebuild the Vec with the same length and capacity, and drop it there. Calling free on the C side is only valid if Rust and C use the same allocator, which is not guaranteed.
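The Rust-side cleanup could look roughly like the sketch below. The Multipliers struct and the multipliers_into_raw and multipliers_free functions are names I made up for illustration; they are not part of the asker's API. The sketch uses Box::into_raw on a boxed slice rather than std::mem::forget, because a boxed slice has no spare capacity, so the length alone is enough to rebuild and drop it later. A real export would also need #[no_mangle] on the free function, as in the question.

```rust
use std::os::raw::c_double;
use std::ptr;

/// Hypothetical C-compatible view of the multipliers: pointer plus length.
#[repr(C)]
pub struct Multipliers {
    pub data: *mut c_double,
    pub len: usize,
}

/// Hands ownership of the buffer to the caller; `None` maps to NULL.
pub fn multipliers_into_raw(v: Option<Vec<f64>>) -> Multipliers {
    match v {
        Some(v) => {
            // Shrinks to fit, so capacity == length from here on.
            let boxed: Box<[f64]> = v.into_boxed_slice();
            let len = boxed.len();
            let data = Box::into_raw(boxed) as *mut c_double;
            Multipliers { data, len }
        }
        None => Multipliers { data: ptr::null_mut(), len: 0 },
    }
}

/// Takes ownership back and frees the buffer with Rust's allocator.
///
/// # Safety
/// `m` must come from `multipliers_into_raw` and must not be used or
/// freed again afterwards.
pub unsafe extern "C" fn multipliers_free(m: Multipliers) {
    if !m.data.is_null() {
        let slice = ptr::slice_from_raw_parts_mut(m.data, m.len);
        // Rebuilding the Box lets its destructor release the memory.
        drop(unsafe { Box::from_raw(slice) });
    }
}
```

On the C side, the caller would read data[0] through data[len - 1] after checking data for NULL, and then pass the struct back to multipliers_free exactly once instead of calling free.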