34
votes

We are used to different forms of indexing in Matlab:

  • standard (using integers along each dimension),
  • logical (using logical values),
  • linear (using a single index to traverse an array with more than one dimension).

At first sight, it may appear that these forms are exclusive: an index is either standard, or logical, or linear. However, sometimes there appears to be a blend between several of these forms. For example,

>> A = magic(3)
A =
     8     1     6
     3     5     7
     4     9     2
>> A(A>5)
ans =
     8
     9
     6
     7

This is logical indexing, right? But it also has some features of linear indexing, because a column vector is returned. In fact, the logical index A>5 has the same effect as the linear index find(A>5).

As a second example, consider

>> A = magic(3)
A =
     8     1     6
     3     5     7
     4     9     2
>> A(1:2, [true false true])
ans =
     8     6
     3     7

In this expression, standard (integer-valued) indexing is used for the first coordinate, and logical indexing is used for the second.

These examples (and more complicated ones that arise in practice) pose the following questions:

  • What types of indexing are there in Matlab?
  • How can they be combined?
  • How should they be referred to?
1

1 Answers

41
votes

In the following I use terminology that I think is more or less in line with standard Matlab practice. However, in some cases I've had to sort-of make up a name because I wasn't aware of an existing one. Please let me know if there are more standard names than those I'm using.

This answer tries to clarify the different types of indexing and how they can be combined. A different question is how the shape (size) of the output array is determined as a function of the shape of the index variables. A good post on this is Essence of indexing by Loren Shure.

The description to follow focuses on indexing of numerical arrays, but it can be applied to cell arrays with either parenthesis or curly-brace indexing, with the obvious change of output type (cell array or comma-separated list, respectively). This will be briefly discussed at the end.

Types of indexing in numerical arrays

Indexing can be classified considering the following two attributes.

  1. According to the number of dimensions each index variable refers to, indexing can be multidimensional or linear. But these are only two extreme cases. An intermediate situation exists, which may be termed partially linear indexing:

    • Pure multidimensional indexing specifies an index variable for each dimension of the array. The individual indices are sometimes referred to as subscripts in Matlab documentation (see for example sub2ind).
    • Pure linear indexing specifies a single index variable that traverses the array across all dimensions (this can be viewed as if all dimensions collapse into one). As we know, the traversal is along columns first, then along rows, then along third-dim slices, etc (so-called column-major order).
    • Partially linear indexing: given an array with m+n dimensions, n>=2, one can specify m index variables for the first m dimensions (thus using multidimensional indexing in those dimensions) and one index variable for the last n dimensions, which is interpreted as a linear index for those dimensions only (the last n dimensions collapse into one).
  2. Acccording to the type of the index values, each index variable can be integer-valued or logical:

    • It is integer-valued if the index variable contains positive integers;
    • It is logical if the index variable contains logical values.

Classification criteria 1 and 2 are independent. The category of the index from the point of view of criterion 1 has no relationship with its category according to criterion 2. All combinations are possible.

Thus, according to the above classification, there are 6 basic types of indexing. To clarify, following is an example for each. All examples use the array A = cat(3, magic(3), 9+magic(3)), that is,

A(:,:,1) =
     8     1     6
     3     5     7
     4     9     2
A(:,:,2) =
    17    10    15
    12    14    16
    13    18    11
  1. Multidimensional, integer-valued:

    >> A([1 2], 2, 2)
    ans =
        10
        14
    
  2. Linear, integer-valued:

    >> A([2 5:7])
    ans =
         3     5     9     6
    
  3. Partially linear, integer-valued:

    >> A([1 2], 2:4)
    ans =
         1     6    17
         5     7    12
    
  4. Multidimensional, logical:

    >> A([true true false], [false true false], [false true])
    ans =
        10
        14
    

    Interestingly, the number of logical values may be smaller, or even larger, than the size in the dimension the index refers to:

    >> A([true true], [false true false false], [false true])
    ans =
        10
        14
    

    Missing values are interpreted as false, and surplus values must be false or an error will occur. See for example this page by Mathworks or this answer by Jonas.

  5. Linear, logical:

    >> A([false true false false true true true])
    ans =
         3     5     9     6
    

    (Note that 11 trailing false values have been left out in the indexing vector.)

  6. Partially linear, logical:

    >> A([true true false], [false true true true false false])
    ans =
         1     6    17
         5     7    12
    

In multidimensional or partially linear indexing, in which there is more than one index variable, each can independently be integer-valued or logical. This gives rise to different mixed types. For example:

  1. Multidimensional, logical/integer-valued:

    >> A([true false true], [false true true], 2)
    ans =
        10    15
        18    11
    
  2. Partially linear, integer-valued/logical:

    >> A([1 2], [true false true false true false])
    ans =
         8     6    10
         3     7    14
    

If the array that is being indexed is a sparse matrix all of the above still applies, except that partially linear indexing doesn't exist for matrices; and of course the result is also sparse.

Indexing of cell arrays

All the types of indexing described for numerical arrays can be applied to cell arrays, with one additional consideration. Cell arrays can be indexed with parentheses or with curly braces. In the first case the result of the indexing is a cell array. In the second it is a comma-separated list of the cell contents.

As an example, suppose the numerical array used in the previous examples is transformed into the cell array C = num2cell(A), that is,

C(:,:,1) = 
    [8]    [1]    [6]
    [3]    [5]    [7]
    [4]    [9]    [2]
C(:,:,2) = 
    [17]    [10]    [15]
    [12]    [14]    [16]
    [13]    [18]    [11]

Then the indexing used in example 8 above would yield the cell array

>> C([1 2], [true false true false true false])
ans = 
    [8]    [6]    [10]
    [3]    [7]    [14]

whereas using curly braces would yield the comma-separated list

>> C{[1 2], [true false true false true false]}
ans =
     8
ans =
     3
ans =
     6
ans =
     7
ans =
    10
ans =
    14 

Take-away message / TL;DR

Logical and linear indexing are not exclusive types of indexing. Rather, they are two independent features of indexing. "Logical" refers to the type of the index values, and "linear" indicates that several dimensions are being collapsed and indexed as one. Both features can happen simultaneously.