Algorithm Design - When to use a dictionary vs a list to track values

Question

For the following problem, I used a dictionary to track values while the provided answer used a list. Is there a quick way to determine the most efficient data structures for problems like these?

A robot moves in a plane starting from the original point (0,0). The robot can move toward UP, DOWN, LEFT and RIGHT with a given steps. The trace of robot movement is shown as the following: UP 5 DOWN 3 LEFT 3 RIGHT 2. The numbers after the direction are steps. Please write a program to compute the distance from current position after a sequence of movement and original point. If the distance is a float, then just print the nearest integer. Example: If the following tuples are given as input to the program: UP 5 DOWN 3 LEFT 3 RIGHT 2 Then, the output of the program should be: 2

My answer uses a dictionary (origin["y"] for y and origin["x"] for x):

direction = 0
steps = 0
command = (direction, steps)
command_list = []
origin = {"x": 0, "y": 0}
while direction is not '':
    direction = input("Direction (U, D, L, R):")
    steps = input("Number of steps:")
    command = (direction, steps)
    command_list.append(command)
print(command_list)

while len(command_list) > 0:
    current = command_list[-1]
    if current[0] == 'U':
        origin["y"] += int(current[1])
    elif current[0] == 'D':
        origin["y"] -= int(current[1])
    elif current[0] == 'L':
        origin["x"] -= int(current[1])
    elif current[0] == 'R':
        origin["x"] += int(current[1])
    command_list.pop()
distance = ((origin["x"])**2 + (origin["y"])**2)**0.5
print(distance)

The provided answer uses a list (pos[0] for y, and pos[1] for x):

import math
pos = [0,0]
while True:
    s = raw_input()
    if not s:
        break
    movement = s.split(" ")
    direction = movement[0]
    steps = int(movement[1])
    if direction=="UP":
        pos[0]+=steps
    elif direction=="DOWN":
        pos[0]-=steps
    elif direction=="LEFT":
        pos[1]-=steps
    elif direction=="RIGHT":
        pos[1]+=steps
    else:
        pass

print int(round(math.sqrt(pos[1]**2+pos[0]**2)))

How is the distance calculated? In the example, the robot moved 2 up and 1 to the left. How does that result in a distance of 2? — user2390182
The big difference is that the provided answer is computing the robot position while reading the input. So there is no storage for the inputs. Your code reads all of the inputs, stores those inputs, and then computes the robot position. The choice of dictionary versus list to hold the position is a minor difference. I like the dictionary since it makes the code a little more readable, but the list is probably slightly faster. — user3386109
@schwobaseggl sorry that was unclear. the distance is computed by pythagorean theorem. The distance of 2 is rounded. — PanczerTank
If you're going for maximum efficiency, then I would use individual variables, x and y. Since there's only one robot to keep track of, there's really no need for a dictionary or a list. But to answer your question directly, it's slightly faster to index into a list, than it is to look up a key in a dictionary. — user3386109
I am sorry but I have to inform you that this question is totally opinion based in your situation. You could also have used a class Point (which btw would have been much nicer, in my opinion). — moooeeeep

Gene Gene · Accepted Answer · 2017-11-20T05:40:05

I'll offer a few points on your question because I strongly disagree with the close recommendations. There's much in your question that's not opinion.

In general, your choice of dictionary wasn't appropriate. For a toy program like this it doesn't make much difference, but I assume you're interested in best practice for serious programs. In production software, you wouldn't make this choice. Why?

Error prone-ness. A typo in future code, e.g. origin["t"] = 3 when you meant origin["y"] = 3 is a nasty bug, maybe difficult to find. t = 3 is more likely to cause a "fast failure." (In a statically typed language like C++ or Java, it's a sure compile-time error.)
Space overhead. A simple scalar variable requires essentially no space beyond the value itself. An array has a fixed overhead for the "dope vector" that tracks its location, current, and maximum size. A dictionary requires yet more extra space for open addressing, unused hash buckets, and fill tracking.
Speed.
- Accessing a scalar variable is very fast: just a few processor instructions.
- Accessing a tuple or array element when you know its index is also very fast, though not as fast as variable access. Extra instructions are needed to check array bounds. Adding one element to an array may take O(current array size) to copy current contents into a larger block of memory. The advantage of tuples and arrays is that you can access elements quickly based on a computed integer index. Scalar variables don't do this. Choose an array/tuple when you need integer index access. Favor tuples when you know the exact size and it's unlikely to change. Their immutability tends to make code more understandable (and thread safe).
- Accessing a dictionary element is still more expensive because a hash value must be computed and buckets traversed with possible collision resolution. Adding a single element can also trigger a table reorganization, which is O(table size) with constant factor much bigger than list reorganization because all the elements must be rehashed. The big advantage of dictionaries is that accessing all stored pairs is likely to take the same amount of time. You should choose a dict only when you need that capability: to store a "map" from keys to values.

Conclude from all the above that the best choice for your origin coordinates would have been simple variables. If you later enhance the program in a way that requires passing (x, y) pairs to/from methods, then you'd consider a Point class.

Algorithm Design - When to use a dictionary vs a list to track values

1 Answers