I need Rust code to read the lines of a file and break each one into an array of slices. The working code is
use std::io::{self, BufRead};

fn main() {
    let stdin = io::stdin();
    let mut f = stdin.lock();
    let mut line: Vec<u8> = Vec::new();
    loop {
        line.clear();
        let sz = f.read_until(b'\n', &mut line).unwrap();
        if sz == 0 { break; }
        let body: Vec<&[u8]> = line.split(|ch| *ch == b'\t').collect();
        DoStuff(body);
    }
}
However, that code is slower than I'd like. The code I want to write is
use std::io::{self, BufRead};

fn main() {
    let stdin = io::stdin();
    let mut f = stdin.lock();
    let mut line: Vec<u8> = Vec::new();
    let mut body: Vec<&[u8]> = Vec::new();
    loop {
        line.clear();
        let sz = f.read_until(b'\n', &mut line).unwrap();
        if sz == 0 { break; }
        body.extend(&mut line.split(|ch| *ch == b'\t'));
        DoStuff(body);
        body.clear();
    }
}
but that runs afoul of the borrow checker.
In general, I'd like a class containing a Vec<u8> and an associated Vec<&[u8]>, which is the basis of a lot of C++ code I'm trying to replace.
Is there any way I can accomplish this?
I realize that I could replace the slices with pairs of integers, but that seems clumsy.
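For reference, the integer-pair variant can be wrapped so that callers still see slices. The following is only a minimal sketch under stated assumptions: the names Record, read_from, len and field are hypothetical and not part of the original code, and, like the split on b'\t' above, it leaves the trailing newline in the last field.

```rust
use std::io::{self, BufRead};
use std::ops::Range;

/// One input line plus the byte ranges of its tab-separated fields.
/// (Hypothetical type; both buffers are cleared and reused for each line.)
struct Record {
    line: Vec<u8>,
    fields: Vec<Range<usize>>,
}

impl Record {
    fn new() -> Self {
        Record { line: Vec::new(), fields: Vec::new() }
    }

    /// Reads the next line and records where each field starts and ends.
    /// Returns Ok(false) at end of input.
    fn read_from<R: BufRead>(&mut self, reader: &mut R) -> io::Result<bool> {
        self.line.clear();
        self.fields.clear();
        if reader.read_until(b'\n', &mut self.line)? == 0 {
            return Ok(false);
        }
        let mut start = 0;
        for (i, &b) in self.line.iter().enumerate() {
            if b == b'\t' {
                self.fields.push(start..i);
                start = i + 1;
            }
        }
        // Last field; it keeps the trailing b'\n', as in the original split.
        self.fields.push(start..self.line.len());
        Ok(true)
    }

    /// Number of fields in the current line.
    fn len(&self) -> usize {
        self.fields.len()
    }

    /// Random access to field `i` as a slice borrowed from the line buffer.
    fn field(&self, i: usize) -> &[u8] {
        &self.line[self.fields[i].clone()]
    }
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut input = stdin.lock();
    let mut rec = Record::new();
    let mut first_column_bytes = 0usize;
    while rec.read_from(&mut input)? {
        // Every line has at least one field, so field(0) is always valid here.
        first_column_bytes += rec.field(0).len();
    }
    println!("bytes in first column: {}", first_column_bytes);
    Ok(())
}
```

Because the ranges are plain integers rather than references into line, the struct does not borrow from itself, so line.clear() on the next iteration does not conflict with anything. Whether this is actually faster than collect would need to be measured.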
No, I can't just use the items from the iterator as they come through -- I need random access to the individual column values. In the simplified case where I do use the iterator directly, I get a 3X speedup, which is why I suspect a significant speedup by replacing collect with extend.
Other comments on this code are also welcome.
DoStuff? It looks like it currently takes ownership of a Vec<&[u8]>. And is that signature open for changes? Transferring ownership is antithetical to reusing allocations. – kmdreko
&[&[u8]] wouldn't take ownership of the vec. – kmdreko
line.clear(). How much slower than the cpp implementation are we talking @Andy? Do you have an upper bound for the number of splits of a line? Does Vec::with_capacity() help? After trying that, I might measure the performance of body.extend(line.split(|ch| *ch == b'\t').map(|v| v as *const [u8])); and see if it really makes much of a difference – flumpb
std::iter::Extend<*const [u8]> is not implemented for std::vec::Vec<&[u8]> – Andy Jewell
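The two suggestions from the comments (a DoStuff that borrows &[&[u8]], and Vec::with_capacity()) could be combined roughly as follows. This is a sketch only: do_stuff, split_columns and EXPECTED_COLUMNS are hypothetical names, the body of do_stuff is a placeholder, and the Vec is still allocated once per line, so it only avoids regrowth rather than reusing the allocation.

```rust
use std::io::{self, BufRead};

// Hypothetical guess at a typical column count; tune for the real data.
const EXPECTED_COLUMNS: usize = 16;

// Placeholder for DoStuff: it borrows the columns instead of taking the Vec.
fn do_stuff(columns: &[&[u8]]) -> usize {
    columns.iter().map(|c| c.len()).sum()
}

// Splits one line on tabs into a Vec that is pre-sized for `expected` columns.
fn split_columns(line: &[u8], expected: usize) -> Vec<&[u8]> {
    let mut body: Vec<&[u8]> = Vec::with_capacity(expected);
    body.extend(line.split(|ch| *ch == b'\t'));
    body
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut f = stdin.lock();
    let mut line: Vec<u8> = Vec::new();
    let mut total = 0usize;
    loop {
        line.clear();
        if f.read_until(b'\n', &mut line)? == 0 {
            break;
        }
        // body borrows from line and is dropped before the next line.clear().
        let body = split_columns(&line, EXPECTED_COLUMNS);
        total += do_stuff(&body);
    }
    println!("total column bytes: {}", total);
    Ok(())
}
```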