99
votes

Why doesn't 'example'[999:9999] result in an error? Since 'example'[9] does, what is the motivation behind it?

From this behavior I can assume that 'example'[3] is, essentially/internally, not the same as 'example'[3:4], even though both result in the same 'm' string.

3
[999:9999] isn't an index, it's a slice, and has different semantics. From the Python intro: "Degenerate slice indices are handled gracefully: an index that is too large is replaced by the string size, an upper bound smaller than the lower bound returns an empty string." - Wooble
@Wooble that is the actual answer - jondavidjohn
@Wooble And do you know why it’s this way? Thank you for your clarification. - ijverig
Why? You'd have to ask Guido, but I think it's elegant to be able to assume a slice is always the same type of sequence as the original sequence, myself. - Wooble
@Lapinot yes I've written code that depends on this behavior. Unfortunately I can't remember the exact code so I can't tell you why. Probably had to do with substrings; getting an empty string can be exactly what you want at times. - Mark Ransom

3 Answers

75
votes

You're correct! 'example'[3:4] and 'example'[3] are fundamentally different, and slicing outside the bounds of a sequence (at least for built-ins) doesn't cause an error.

It might be surprising at first, but it makes sense when you think about it. Indexing returns a single item, but slicing returns a subsequence of items. So when you try to index a nonexistent value, there's nothing to return. But when you slice a sequence outside of bounds, you can still return an empty sequence.
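A minimal sketch of that difference, using the string from the question (the variable names index_failed and empty are only illustrative):

```python
s = 'example'

# Indexing past the end has no item to return, so it raises IndexError.
try:
    s[9]
    index_failed = False
except IndexError:
    index_failed = True

# Slicing past the end still has a valid answer: the empty string.
empty = s[999:9999]

print(index_failed)  # True
print(repr(empty))   # ''
```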

Part of what's confusing here is that strings behave a little differently from lists. Look what happens when you do the same thing to a list:

>>> [0, 1, 2, 3, 4, 5][3]
3
>>> [0, 1, 2, 3, 4, 5][3:4]
[3]

Here the difference is obvious. In the case of strings, the results appear to be identical because in Python, there's no such thing as an individual character outside of a string. A single character is just a 1-character string.
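To make the string case concrete, here is a small sketch comparing the two operations on a str and on a list (the names s and lst are just for illustration):

```python
s = 'example'
lst = [0, 1, 2, 3, 4, 5]

# Indexing a string yields another str of length 1,
# so it compares equal to the one-character slice.
char_is_str = type(s[3]) is str
str_same = s[3] == s[3:4]

# Indexing a list yields the element itself; slicing yields a list.
list_same = lst[3] == lst[3:4]

print(char_is_str)  # True
print(str_same)     # True
print(list_same)    # False (3 is not equal to [3])
```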

(For the exact semantics of slicing outside the range of a sequence, see mgilson's answer.)

38
votes

For the sake of adding an answer that points to a robust section in the documentation:

Given a slice expression like s[i:j:k],

The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). When k is positive, i and j are reduced to len(s) if they are greater.

If you write s[999:9999], Python returns s[len(s):len(s)], since len(s) < 999 and the step is positive (1, the default).
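One way to see this clamping directly is the built-in slice.indices method, which reports the start, stop and step a slice resolves to for a given length. This is only a quick check with the string from the question, not a description of how the interpreter implements slicing internally:

```python
s = 'example'

# slice.indices(length) returns the (start, stop, step) that the slice
# resolves to for a sequence of that length.
start, stop, step = slice(999, 9999).indices(len(s))
print(start, stop, step)  # 7 7 1

# Both bounds were reduced to len(s), so the result is empty.
same_as_clamped = s[999:9999] == s[len(s):len(s)] == ''
print(same_as_clamped)    # True
```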

7
votes

Slicing is not bounds-checked by the built-in types. And although both of your examples appear to have the same result, they work differently; try them with a list instead.
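For instance, a short sketch with a list (the name items is only illustrative) shows that a slice running past the end returns whatever part overlaps the list, while an out-of-range index still raises:

```python
items = [0, 1, 2, 3, 4, 5]

# A slice entirely past the end returns an empty list.
past_end = items[999:9999]

# A slice that only partly overlaps returns the overlapping part.
partial = items[4:9999]

# An index past the end is still an error.
try:
    items[999]
    raised = False
except IndexError:
    raised = True

print(past_end)  # []
print(partial)   # [4, 5]
print(raised)    # True
```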