Rust Data Structure

Question

I am currently learning Rust for fun. I have some experience in C / C++ and other experience in other programming languages that use more complex paradigms like generics.

Background

For my first project (after the tutorial), I wanted to create a N-Dimensional array (or Matrix) data structure to practice development in Rust.

Here is what I have so far for my Matrix struct and a basic fill and new initializations.

Forgive the absent bound checking and parameter testing

pub struct Matrix<'a, T> {
    data: Vec<Option<T>>,
    dimensions: &'a [usize],
}

impl<'a, T: Clone> Matrix<'a, T> {
    pub fn fill(dimensions: &'a [usize], fill: T) -> Matrix<'a, T> {
        let mut total = if dimensions.len() > 0 { 1 } else { 0 };
        for dim in dimensions.iter() {
            total *= dim;
        }
        Matrix {
            data: vec![Some(fill); total],
            dimensions: dimensions,
        }
    }

    pub fn new(dimensions: &'a [usize]) -> Matrix<'a, T> {
        ...
        Matrix {
            data: vec![None; total],
            dimensions: dimensions,
        }
    }
}

I wanted the ability to create an "empty" N-Dimensional array using the New fn. I thought using the Option enum would be the best way to accomplish this, as I can fill the N-Dimensional with None and it would allocate space for this T generic automatically.

So then it comes down to being able to set the entries for this. I found the IndexMut and Index traits that looked like I could do something like m[&[2, 3]] = 23. Since the logic is similar to each other here is the IndexMut impl for Matrix.

impl<'a, T> ops::IndexMut<&[usize]> for Matrix<'a, T> {
    fn index_mut(&mut self, indices: &[usize]) -> &mut Self::Output {
        match self.data[get_matrix_index(self.dimensions, indices)].as_mut() {
            Some(x) => x,
            None => {
                NOT SURE WHAT TO DO HERE.
            }
        }
    }
}

Ideally what would happen is that the value (if there) would be changed i.e.

let mut mat = Matrix::fill(&[4, 4], 0)
mat[&[2, 3]] = 23

This would set the value from 0 to 23 (which the above fn does via returning &mut x from Some(x)). But I also want None to set the value i.e.

let mut mat = Matrix::new(&[4, 4])
mat[&[2, 3]] = 23

Question

Finally, is there a way to make m[&[2,3]] = 23 possible with what the Vec struct requires to allocate the memory? If not what should I change and how can I still have an array with "empty" spots. Open to any suggestions as I am trying to learn. :)

Final Thoughts

Through my research, the Vec struct impls I see that the type T is typed and has to be Sized. This could be useful as to allocate the Vec with the appropriate size via vec![pointer of T that is null but of size of T; total]. But I am unsure of how to do this.

What do you have for the Index trait? What do you return if the value at that index doesn't exist? — kmdreko
I ask because implementing IndexMut as desired is possible, but since IndexMut's output type must match a corresponding Index trait, it would have troubling implications for the Index implementation. — kmdreko
For the Index trait I have the same problem if the value of the Option<T> enum is None. I don't know what to return. Currently I just have it panic but I am unsure of what to do that is idiomatic to Rust and has the functionality to return a "null" pointer or something that makes sense to the user. — IpFruion
Ideally this would mean in situations such as println!("{}", m[&[1, 1]]); where m[&[1, 1]] is the option enum None it would error out because it would be doing a to_string on a "null" value or something similar. But for functionality it Index should return the value that is at that spot in the matrix. — IpFruion

Buzzec Buzzec · Accepted Answer · 2021-02-25T04:08:49

So there are a few ways to make this more similar to idiomatic rust, but first, let's look at why the none branch doesn't make sense.

So the Output type for IndexMut I'm going to assume is &mut T as you don't show the index definition but I feel safe in that assumption. The type &mut T means a mutable reference to an initialized T, unlike pointers in C/C++ where they can point to initialized or uninitialized memory. What this means is that you have to return an initialized T which the none branch cannot because there is no initialized value. This leads to the first of the more idiomatic ways.

Return an `Option<T>`

The easiest way would be to change Index::Output to be an Option<T>. This is better because the user can decide what to do if there was no value there before and is close to what you are actually storing. Then you can also remove the panic in your index method and allow the caller to choose what to do if there is no value. At this point, I think you can go a little further with gentrifying the structure in the next option.

Store a `T` directly

This method allows the caller to directly change what the type is that's stored rather than wrapping it in an option. This cleans up most of your index code nicely as you just have to access what's already stored. The main problem is now initialization, how do you represent uninitialized values? You were correct that option is the best way to do this¹, but now the caller can decide to have this optional initialization capability by storing an Option themselves. So that means we can always store initialized Ts without losing functionality. This only really changes your new function to instead not fill with None values. My suggestion here is to make a bound T: Default for the new function²:

impl<'a, T: Default> Matrix<'a, T> {
  pub fn new(dimensions: &'a [usize]) -> Matrix<'a, T> {
    Matrix {
      data: (0..total).into_iter().map(|_|Default::default()).collect(),
      dimensions: dimensions,
    }
  }
}

This method is much more common in the rust world and allows the caller to choose whether to allow for uninitialized values. Option<T> also implements default for all T and returns None So the functionality is very similar to what you have currently.

Aditional Info

As you're new to rust there are a few comments that I can make about traps that I've fallen into before. To start your struct contains a reference to the dimensions with a lifetime. What this means is that your structs cannot exist longer than the dimension object that created them. This hasn't caused you a problem so far as all you've been passing is statically created dimensions, dimensions that are typed into the code and stored in static memory. This gives your object a lifetime of 'static, but this won't occur if you use dynamic dimensions.

How else can you store these dimensions so that your object always has a 'static lifetime (same as no lifetime)? Since you want an N-dimensional array stack allocation is out of the question since stack arrays must be deterministic at compile time (otherwise known as const in rust). This means you have to use the heap. This leaves two real options Box<[usize]> or Vec<usize>. Box is just another way of saying this is on the heap and adds Sized to values that are ?Sized. Vec is a little more self-explanatory and adds the ability to be resized at the cost of a little overhead. Either would allow your matrix object to always have a 'static lifetime.

1. The other way to represent this without Option<T>'s discriminate is MaybeUninit<T> which is unsafe territory. This allows you to have a chunk of initialized memory big enough to hold a T and then assume it's initialized unsafely. This can cause a lot of problems and is usually not worth it as Option is already heavily optimized in that if it stores a type with a pointer it uses compiler magic to store the discriminate in whether or not that value is a null pointer.
2. The reason this section doesn't just use vec![Default::default(); total] is that this requires T: Clone as the way this macro works the first part is called once and cloned until there are enough values. This is an extra requirement that we don't need to have so the interface is smoother without it.

Rust Data Structure

1 Answers

Return an Option<T>

Store a T directly

Aditional Info

Return an `Option<T>`

Store a `T` directly