15
votes

I've read a lot about the time complexity of unordered_map (C++11) here on Stack Overflow, but I haven't found the answer to my question.

Let's assume indexing by integer (just for example):

insert and at run in constant time on average, so each insertion in this example takes O(1):

std::unordered_map<int, int> mymap = {
            { 1, 1},
            { 100, 2},
            { 100000, 3 }
};

What I am curious about is how long it takes to iterate through all (unsorted) values stored in the map, e.g.

for ( auto it = mymap.begin(); it != mymap.end(); ++it ) { ... }

Can I assume that each stored value is accessed only once (or twice, or some constant number of times)? That would imply that iterating through all values of an N-element map is O(N). The other possibility is that my example with keys {1, 100, 100000} could take up to 100000 iterations (if the map were represented by an array indexed by key).
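One way to check this empirically is to count the loop steps. The sketch below is only an illustration: the helper names count_iteration_steps and steps_for_question_map are made up for this example and are not part of the standard library.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: counts how many times the loop body runs
// while iterating over the whole map.
std::size_t count_iteration_steps(const std::unordered_map<int, int>& m) {
    std::size_t steps = 0;
    for (auto it = m.begin(); it != m.end(); ++it) {
        ++steps;  // one step per stored element, whatever the key values are
    }
    return steps;
}

// Builds the map from the question and returns the number of loop steps.
std::size_t steps_for_question_map() {
    std::unordered_map<int, int> mymap = {{1, 1}, {100, 2}, {100000, 3}};
    return count_iteration_steps(mymap);
}
```

Calling steps_for_question_map() yields the number of stored elements (3), not a number related to the largest key.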

Is there any other container that can be iterated in linear time and whose values can be accessed by key in constant time?

What I would really need is (pseudocode)

myStructure.add(key, value) // O(1)
value = myStructure.at(key) // O(1)
for (auto key : myStructure) {...} // O(1) for each key/value pair = O(N) for N values

Is std::unordered_map the structure I need?

Integer keys are sufficient, and average-case complexity is good enough.

3
If you're concerned that the enumeration will require walking over pairings that you've not inserted into your container, rest assured it won't. The decision to use a regular map vs unordered_map should be based on whether you need a relative strict weak ordering of your keys retained. If you do, you need a regular map. If you don't, unordered_map is the most logical choice (provided the keys can be hashed to a reasonable distribution). – WhozCraig
@WhozCraig: another functional factor to consider when choosing map or unordered_map is whether the latter's invalidation of existing iterators, when insert/emplace/[] triggers a rehash, is acceptable (references and pointers to elements stay valid across a rehash). Beyond that, there are performance differences, which tend to be in unordered_map's favour but should be measured by those whose profilers/instrumentation say they really must care. – Tony Delroy
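As a small check of what a rehash does to existing handles: the standard says rehashing an unordered container invalidates iterators but not pointers or references to elements. The sketch below forces a rehash explicitly and compares element addresses; the function name element_address_survives_rehash is made up for this example.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: returns true if the address of a stored value
// is unchanged after the table has been rehashed into more buckets.
bool element_address_survives_rehash() {
    std::unordered_map<int, int> m = {{1, 10}};
    const int* before = &m.at(1);

    // Force a rehash instead of relying on growth behaviour, which is
    // implementation-specific. rehash(n) leaves bucket_count() >= n.
    const std::size_t old_buckets = m.bucket_count();
    m.rehash(old_buckets * 2 + 1);
    const bool rehashed = m.bucket_count() > old_buckets;

    // Iterators obtained before the rehash must not be used any more,
    // but the element itself has not moved.
    return rehashed && before == &m.at(1) && *before == 10;
}
```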

3 Answers

16
votes

Regardless of how they're implemented, standard containers provide iterators that meet the iterator requirements. Incrementing an iterator is required to take amortized constant time, so iterating through all the elements of any standard container is O(N).

4
votes

The complexity guarantees of all standard containers are specified in the C++ Standard.

For std::unordered_map, element access and element insertion are required to be of complexity O(1) on average and O(N) in the worst case (cf. Sections 23.5.4.3 and 23.5.4.4; pages 797-798).

A specific implementation (that is, a specific vendor's implementation of the Standard Library) can choose whatever data structure it wants. However, to be compliant with the Standard, its complexity must be at least as good as specified.
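The O(N) worst case shows up when many keys collide. The sketch below provokes it on purpose with a deliberately bad hash functor; ConstantHash and largest_bucket_with_constant_hash are names invented for this illustration, while bucket_count and bucket_size are the standard bucket-interface members.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>

// Hypothetical, deliberately bad hash: every key gets the same hash code.
struct ConstantHash {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Inserts keys 0..n-1 and returns the size of the fullest bucket.
std::size_t largest_bucket_with_constant_hash(int n) {
    std::unordered_map<int, int, ConstantHash> m;
    for (int k = 0; k < n; ++k) {
        m.emplace(k, k);
    }
    std::size_t largest = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        largest = std::max(largest, m.bucket_size(b));
    }
    return largest;  // keys with equal hash codes share one bucket
}
```

With this hash every lookup has to search a single bucket holding all n elements, which is the linear worst case; with a reasonable hash the buckets stay short and the average cost is constant.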

3
votes

There are a few different ways that a hash table can be implemented, and I suggest you read more on those if you're interested, but the two main ones are chaining and open addressing.

In the first case you have an array of linked lists. Each entry in the array could be empty, and each item in the hash table will be in some bucket. So iteration is walking down the array and walking down each non-empty list in it. That is O(N) as long as the number of buckets stays proportional to N, but it could potentially be very memory inefficient depending on how the linked lists themselves are allocated.
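A minimal sketch of the chaining layout, to make the iteration cost concrete. This is a toy with a fixed bucket count, not how any particular standard library implements std::unordered_map; the class ChainedTable and its members are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy hash table using separate chaining (no resizing).
class ChainedTable {
public:
    explicit ChainedTable(std::size_t bucket_count) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Inserts the pair, or overwrites the value if the key already exists.
    void insert(int key, int value) {
        auto& chain = buckets_[index_of(key)];
        for (auto& kv : chain) {
            if (kv.first == key) { kv.second = value; return; }
        }
        chain.emplace_front(key, value);
    }

    // Returns a pointer to the value, or nullptr if the key is absent.
    const int* find(int key) const {
        for (const auto& kv : buckets_[index_of(key)]) {
            if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

    // Visits every element once; returns the number of array slots inspected.
    template <typename Visitor>
    std::size_t for_each(Visitor visit) const {
        std::size_t slots_inspected = 0;
        for (const auto& chain : buckets_) {
            ++slots_inspected;  // empty buckets still cost one check
            for (const auto& kv : chain) visit(kv.first, kv.second);
        }
        return slots_inspected;
    }

private:
    std::size_t index_of(int key) const {
        return std::hash<int>{}(key) % buckets_.size();
    }
    std::vector<std::forward_list<std::pair<int, int>>> buckets_;
};
```

Iterating this toy costs one check per bucket plus one visit per element, so with 8 buckets and the three keys from the question the visitor runs 3 times, never 100000 times.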

In the second case, you just have one very large array which will have lots of empty slots. Here, iteration is again linear in the size of the array, but it could be inefficient if the table is mostly empty (which it should be for lookup purposes), because the elements that are actually present will be in different cache lines.
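For comparison, a toy open-addressing table with linear probing. Again this is only a sketch under simplifying assumptions (fixed capacity, no erase, C++17 for std::optional); ProbingTable and its members are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy hash table using open addressing with linear probing.
class ProbingTable {
public:
    explicit ProbingTable(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Returns false only if the table is full and the key is not present.
    bool insert(int key, int value) {
        std::size_t i = std::hash<int>{}(key) % slots_.size();
        for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
            auto& slot = slots_[i];
            if (!slot || slot->first == key) {
                slot = std::make_pair(key, value);
                return true;
            }
            i = (i + 1) % slots_.size();  // try the next slot
        }
        return false;
    }

    // Iteration has to look at every slot, occupied or not.
    std::size_t occupied() const {
        std::size_t n = 0;
        for (const auto& slot : slots_) {
            if (slot) ++n;
        }
        return n;
    }

    std::size_t capacity() const { return slots_.size(); }

private:
    std::vector<std::optional<std::pair<int, int>>> slots_;
};
```

Here a full traversal inspects capacity() slots to find occupied() elements, which is why a sparsely filled table iterates slowly even though the work is still linear in the table size.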

Either way, you're going to have linear iteration and you're going to be touching every element exactly once. Note that this is true of std::map also: iteration will be linear there as well. But in the case of the maps, iteration will definitely be far less efficient than iterating a vector, so keep that in mind. If your use case requires BOTH fast lookup and fast iteration, and you insert all your elements up front and never erase, it could be much better to actually have both the map and the vector. Take the extra space for the added performance.
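One possible shape for the "map plus vector" idea, assuming insert-only usage as described above. The class name DenseMap and its members are invented for this sketch; the vector holds the pairs contiguously and the unordered_map only stores each key's position in the vector.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical insert-only container: O(1) average add/at, and iteration
// over contiguous storage.
class DenseMap {
public:
    // Adds the pair; returns false (and changes nothing) if the key exists.
    bool add(int key, int value) {
        auto inserted = index_.emplace(key, items_.size());
        if (!inserted.second) return false;
        items_.emplace_back(key, value);
        return true;
    }

    // Throws std::out_of_range if the key is absent (via unordered_map::at).
    int& at(int key) { return items_[index_.at(key)].second; }

    // Contiguous view of all pairs, in insertion order.
    const std::vector<std::pair<int, int>>& items() const { return items_; }

private:
    std::vector<std::pair<int, int>> items_;
    std::unordered_map<int, std::size_t> index_;
};
```

Iterating items() with a range-based for loop then touches N contiguous pairs, while at(key) still goes through the hash index. Supporting erase would need extra bookkeeping (for example swapping the erased slot with the last one and updating its index), which this sketch leaves out.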