586
votes

R provides two different methods for accessing the elements of a list or data.frame: [] and [[]].

What is the difference between the two, and when should I use one over the other?

12

12 Answers

362
votes

The R Language Definition is handy for answering these types of questions:

R has three basic indexing operators, with syntax displayed by the following examples

    x[i]
    x[i, j]
    x[[i]]
    x[[i, j]]
    x$a
    x$"a"

For vectors and matrices the [[ forms are rarely used, although they have some slight semantic differences from the [ form (e.g. it drops any names or dimnames attribute, and that partial matching is used for character indices). When indexing multi-dimensional structures with a single index, x[[i]] or x[i] will return the ith sequential element of x.

For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements.

The [[ form allows only a single element to be selected using integer or character indices, whereas [ allows indexing by vectors. Note though that for a list, the index can be a vector and each element of the vector is applied in turn to the list, the selected component, the selected component of that component, and so on. The result is still a single element.

190
votes

The significant differences between the two methods are the class of the objects they return when used for extraction and whether they may accept a range of values, or just a single value during assignment.

Consider the case of data extraction on the following list:

foo <- list( str='R', vec=c(1,2,3), bool=TRUE )

Say we would like to extract the value stored by bool from foo and use it inside an if() statement. This will illustrate the differences between the return values of [] and [[]] when they are used for data extraction. The [] method returns objects of class list (or data.frame if foo was a data.frame) while the [[]] method returns objects whose class is determined by the type of their values.

So, using the [] method results in the following:

if( foo[ 'bool' ] ){ print("Hi!") }
Error in if (foo["bool"]) { : argument is not interpretable as logical

class( foo[ 'bool' ] )
[1] "list"

This is because the [] method returned a list and a list is not valid object to pass directly into an if() statement. In this case we need to use [[]] because it will return the "bare" object stored in 'bool' which will have the appropriate class:

if( foo[[ 'bool' ]] ){ print("Hi!") }
[1] "Hi!"

class( foo[[ 'bool' ]] )
[1] "logical"

The second difference is that the [] operator may be used to access a range of slots in a list or columns in a data frame while the [[]] operator is limited to accessing a single slot or column. Consider the case of value assignment using a second list, bar():

bar <- list( mat=matrix(0,nrow=2,ncol=2), rand=rnorm(1) )

Say we want to overwrite the last two slots of foo with the data contained in bar. If we try to use the [[]] operator, this is what happens:

foo[[ 2:3 ]] <- bar
Error in foo[[2:3]] <- bar : 
more elements supplied than there are to replace

This is because [[]] is limited to accessing a single element. We need to use []:

foo[ 2:3 ] <- bar
print( foo )

$str
[1] "R"

$vec
     [,1] [,2]
[1,]    0    0
[2,]    0    0

$bool
[1] -0.6291121

Note that while the assignment was successful, the slots in foo kept their original names.

126
votes

Double brackets accesses a list element, while a single bracket gives you back a list with a single element.

lst <- list('one','two','three')

a <- lst[1]
class(a)
## returns "list"

a <- lst[[1]]
class(a)
## returns "character"
95
votes

From Hadley Wickham:

From Hadley Wickham

My (crappy looking) modification to show using tidyverse / purrr:

enter image description here

54
votes

[] extracts a list, [[]] extracts elements within the list

alist <- list(c("a", "b", "c"), c(1,2,3,4), c(8e6, 5.2e9, -9.3e7))

str(alist[[1]])
 chr [1:3] "a" "b" "c"

str(alist[1])
List of 1
 $ : chr [1:3] "a" "b" "c"

str(alist[[1]][1])
 chr "a"
23
votes

Just adding here that [[ also is equipped for recursive indexing.

This was hinted at in the answer by @JijoMatthew but not explored.

As noted in ?"[[", syntax like x[[y]], where length(y) > 1, is interpreted as:

x[[ y[1] ]][[ y[2] ]][[ y[3] ]] ... [[ y[length(y)] ]]

Note that this doesn't change what should be your main takeaway on the difference between [ and [[ -- namely, that the former is used for subsetting, and the latter is used for extracting single list elements.

For example,

x <- list(list(list(1), 2), list(list(list(3), 4), 5), 6)
x
# [[1]]
# [[1]][[1]]
# [[1]][[1]][[1]]
# [1] 1
#
# [[1]][[2]]
# [1] 2
#
# [[2]]
# [[2]][[1]]
# [[2]][[1]][[1]]
# [[2]][[1]][[1]][[1]]
# [1] 3
#
# [[2]][[1]][[2]]
# [1] 4
#
# [[2]][[2]]
# [1] 5
#
# [[3]]
# [1] 6

To get the value 3, we can do:

x[[c(2, 1, 1, 1)]]
# [1] 3

Getting back to @JijoMatthew's answer above, recall r:

r <- list(1:10, foo=1, far=2)

In particular, this explains the errors we tend to get when mis-using [[, namely:

r[[1:3]]

Error in r[[1:3]] : recursive indexing failed at level 2

Since this code actually tried to evaluate r[[1]][[2]][[3]], and the nesting of r stops at level one, the attempt to extract through recursive indexing failed at [[2]], i.e., at level 2.

Error in r[[c("foo", "far")]] : subscript out of bounds

Here, R was looking for r[["foo"]][["far"]], which doesn't exist, so we get the subscript out of bounds error.

It probably would be a bit more helpful/consistent if both of these errors gave the same message.

18
votes

Being terminological, [[ operator extracts the element from a list whereas [ operator takes subset of a list.

17
votes

Both of them are ways of subsetting. The single bracket will return a subset of the list, which in itself will be a list. ie:It may or may not contain more than one elements. On the other hand a double bracket will return just a single element from the list.

-Single bracket will give us a list. We can also use single bracket if we wish to return multiple elements from the list. consider the following list:-

>r<-list(c(1:10),foo=1,far=2);

Now please note the way the list is returned when I try to display it. I type r and press enter

>r

#the result is:-

[[1]]

 [1]  1  2  3  4  5  6  7  8  9 10

$foo

[1] 1

$far

[1] 2

Now we will see the magic of single bracket:-

>r[c(1,2,3)]

#the above command will return a list with all three elements of the actual list r as below

[[1]]

 [1]  1  2  3  4  5  6  7  8  9 10

$foo

[1] 1


$far

[1] 2

which is exactly the same as when we tried to display value of r on screen, which means the usage of single bracket has returned a list, where at index 1 we have a vector of 10 elements, then we have two more elements with names foo and far. We may also choose to give a single index or element name as input to the single bracket. eg:

> r[1]

[[1]]

 [1]  1  2  3  4  5  6  7  8  9 10

In this example we gave one index "1" and in return got a list with one element(which is an array of 10 numbers)

> r[2]

$foo

[1] 1

In the above example we gave one index "2" and in return got a list with one element

> r["foo"];

$foo

[1] 1

In this example we passed the name of one element and in return a list was returned with one element.

You may also pass a vector of element names like:-

> x<-c("foo","far")

> r[x];

$foo

[1] 1

$far
[1] 2

In this example we passed an vector with two element names "foo" and "far"

In return we got a list with two elements.

In short single bracket will always return you another list with number of elements equal to the number of elements or number of indices you pass into the single bracket.

In contrast, a double bracket will always return only one element. Before moving to double bracket a note to be kept in mind. NOTE:THE MAJOR DIFFERENCE BETWEEN THE TWO IS THAT SINGLE BRACKET RETURNS YOU A LIST WITH AS MANY ELEMENTS AS YOU WISH WHILE A DOUBLE BRACKET WILL NEVER RETURN A LIST. RATHER A DOUBLE BRACKET WILL RETURN ONLY A SINGLE ELEMENT FROM THE LIST.

I will site a few examples. Please keep a note of the words in bold and come back to it after you are done with the examples below:

Double bracket will return you the actual value at the index.(It will NOT return a list)

  > r[[1]]

     [1]  1  2  3  4  5  6  7  8  9 10


  >r[["foo"]]

    [1] 1

for double brackets if we try to view more than one elements by passing a vector it will result in an error just because it was not built to cater to that need, but just to return a single element.

Consider the following

> r[[c(1:3)]]
Error in r[[c(1:3)]] : recursive indexing failed at level 2
> r[[c(1,2,3)]]
Error in r[[c(1, 2, 3)]] : recursive indexing failed at level 2
> r[[c("foo","far")]]
Error in r[[c("foo", "far")]] : subscript out of bounds
15
votes

To help newbies navigate through the manual fog, it might be helpful to see the [[ ... ]] notation as a collapsing function - in other words, it is when you just want to 'get the data' from a named vector, list or data frame. It is good to do this if you want to use data from these objects for calculations. These simple examples will illustrate.

(x <- c(x=1, y=2)); x[1]; x[[1]]
(x <- list(x=1, y=2, z=3)); x[1]; x[[1]]
(x <- data.frame(x=1, y=2, z=3)); x[1]; x[[1]]

So from the third example:

> 2 * x[1]
  x
1 2
> 2 * x[[1]]
[1] 2
8
votes

For yet another concrete use case, use double brackets when you want to select a data frame created by the split() function. If you don't know, split() groups a list/data frame into subsets based on a key field. It's useful if when you want to operate on multiple groups, plot them, etc.

> class(data)
[1] "data.frame"

> dsplit<-split(data, data$id)
> class(dsplit)
[1] "list"

> class(dsplit['ID-1'])
[1] "list"

> class(dsplit[['ID-1']])
[1] "data.frame"
-1
votes

Please refer the below-detailed explanation.

I have used Built-in data frame in R, called mtcars.

> mtcars 
               mpg cyl disp  hp drat   wt ... 
Mazda RX4     21.0   6  160 110 3.90 2.62 ... 
Mazda RX4 Wag 21.0   6  160 110 3.90 2.88 ... 
Datsun 710    22.8   4  108  93 3.85 2.32 ... 
           ............

The top line of the table is called the header which contains the column names. Each horizontal line afterward denotes a data row, which begins with the name of the row, and then followed by the actual data. Each data member of a row is called a cell.

single square bracket "[]" operator

To retrieve data in a cell, we would enter its row and column coordinates in the single square bracket "[]" operator. The two coordinates are separated by a comma. In other words, the coordinates begin with row position, then followed by a comma, and ends with the column position. The order is important.

Eg 1:- Here is the cell value from the first row, second column of mtcars.

> mtcars[1, 2] 
[1] 6

Eg 2:- Furthermore, we can use the row and column names instead of the numeric coordinates.

> mtcars["Mazda RX4", "cyl"] 
[1] 6 

Double square bracket "[[]]" operator

We reference a data frame column with the double square bracket "[[]]" operator.

Eg 1:- To retrieve the ninth column vector of the built-in data set mtcars, we write mtcars[[9]].

mtcars[[9]] [1] 1 1 1 0 0 0 0 0 0 0 0 ...

Eg 2:- We can retrieve the same column vector by its name.

mtcars[["am"]] [1] 1 1 1 0 0 0 0 0 0 0 0 ...

-1
votes

In addition:

Following the L I N K of the A N S W E R here.

Here is a little example addressing the following point:

x[i, j] vs x[[i, j]]

df1   <- data.frame(a = 1:3)
df1$b <- list(4:5, 6:7, 8:9)

df1[[1,2]]
df1[1,2]

str(df1[[1,2]])
str(df1[1,2])