I've only recently started dwelling into boost and it's containers, and I read a few articles on the web and on stackoverflow that a boost::unordered_map is the fastest performing container for big collections. So, I have this class State, which must be unique in the container (no duplicates) and there will be millions if not billions of states in the container. Therefore I have been trying to optimize it for small size and as few computations as possible. I was using a boost::ptr_vector before, but as I read on stackoverflow a vector is only good as long as there are not that many objects in it. In my case, the State descibes sensorimotor information from a robot, so there can be an enormous amount of states, and therefore fast lookup is of topemost priority. Following the boost documentation for unordered_map I realize that there are two things I could do to speed things up: use a hash_function, and use an equality operator to compare States based on their hash_function. So, I implemented a private hash() function which takes in State information and using boost::hash_combine, creates an std::size_t hash value. The operator== compares basically the state's hash values. So:
is std::size_t enough to cover billions of possible hash_function combinations ? In order to avoid duplicate states I intend to use their hash_values.
When creating a state_map, should I use as key the State* or the hash value ? i.e:
boost::unordered_map<State*,std::size_t> state_map;
Orboost::unordered_map<std::size_t,State*> state_map;
Are the lookup times with a boost::unordered_map::iterator = state_map.find() faster than going through a boost::ptr_vector and comparing each iterator's key value ?
Finally, any tips or tricks on how to optimize such an unordered map for speed and fast lookups would be greatly appreciated.
EDIT: I have seen quite a few answers, one being not to use boost but C++0X, another not to use an unordered_set, but to be honest, I still want to see how boost::unordered_set is used with a hash function. I have followed boost's documentation and implemented, but I still cannot figure out how to use the hash function of boost with the ordered set.
ordered_set
instead as a State* index only – lurscher