142
votes

I am a newbie for R, and I am quite confused with the usage of local and global variables in R.

I read some posts on the internet that say if I use = or <- I will assign the variable in the current environment, and with <<- I can access a global variable when inside a function.

However, as I remember in C++ local variables arise whenever you declare a variable inside brackets {}, so I'm wondering if this is the same for R? Or is it just for functions in R that we have the concept of local variables.

I did a little experiment, which seems to suggest that only brackets are not enough, am I getting anything wrong?

{
   x=matrix(1:10,2,5)
}
print(x[2,2])
[1] 4
3
Some code to run in addition to these answers: globalenv(); globalenv() %>% parent.env; globalenv() %>% parent.env %>% parent.env, …isomorphismes
@isomorphismes, Error: could not find function "%>%" . Is that another form of assignment?Aaron McDaid
Relevant thread on R-help: What does the "<<-" operator mean?.Henrik
@AaronMcDaid Hi, sorry for not responding sooner! That is from require(magrittr). It's a way of applying functions on the right (x | f1 | f2 | f3) instead of on the left (f3( f2( f1( x ) ) )).isomorphismes

3 Answers

165
votes

Variables declared inside a function are local to that function. For instance:

foo <- function() {
    bar <- 1
}
foo()
bar

gives the following error: Error: object 'bar' not found.

If you want to make bar a global variable, you should do:

foo <- function() {
    bar <<- 1
}
foo()
bar

In this case bar is accessible from outside the function.

However, unlike C, C++ or many other languages, brackets do not determine the scope of variables. For instance, in the following code snippet:

if (x > 10) {
    y <- 0
}
else {
    y <- 1
}

y remains accessible after the if-else statement.

As you well say, you can also create nested environments. You can have a look at these two links for understanding how to use them:

  1. http://stat.ethz.ch/R-manual/R-devel/library/base/html/environment.html
  2. http://stat.ethz.ch/R-manual/R-devel/library/base/html/get.html

Here you have a small example:

test.env <- new.env()

assign('var', 100, envir=test.env)
# or simply
test.env$var <- 100

get('var') # var cannot be found since it is not defined in this environment
get('var', envir=test.env) # now it can be found
144
votes

<- does assignment in the current environment.

When you're inside a function R creates a new environment for you. By default it includes everything from the environment in which it was created so you can use those variables as well but anything new you create will not get written to the global environment.

In most cases <<- will assign to variables already in the global environment or create a variable in the global environment even if you're inside a function. However, it isn't quite as straightforward as that. What it does is checks the parent environment for a variable with the name of interest. If it doesn't find it in your parent environment it goes to the parent of the parent environment (at the time the function was created) and looks there. It continues upward to the global environment and if it isn't found in the global environment it will assign the variable in the global environment.

This might illustrate what is going on.

bar <- "global"
foo <- function(){
    bar <- "in foo"
    baz <- function(){
        bar <- "in baz - before <<-"
        bar <<- "in baz - after <<-"
        print(bar)
    }
    print(bar)
    baz()
    print(bar)
}
> bar
[1] "global"
> foo()
[1] "in foo"
[1] "in baz - before <<-"
[1] "in baz - after <<-"
> bar
[1] "global"

The first time we print bar we haven't called foo yet so it should still be global - this makes sense. The second time we print it's inside of foo before calling baz so the value "in foo" makes sense. The following is where we see what <<- is actually doing. The next value printed is "in baz - before <<-" even though the print statement comes after the <<-. This is because <<- doesn't look in the current environment (unless you're in the global environment in which case <<- acts like <-). So inside of baz the value of bar stays as "in baz - before <<-". Once we call baz the copy of bar inside of foo gets changed to "in baz" but as we can see the global bar is unchanged. This is because the copy of bar that is defined inside of foo is in the parent environment when we created baz so this is the first copy of bar that <<- sees and thus the copy it assigns to. So <<- isn't just directly assigning to the global environment.

<<- is tricky and I wouldn't recommend using it if you can avoid it. If you really want to assign to the global environment you can use the assign function and tell it explicitly that you want to assign globally.

Now I change the <<- to an assign statement and we can see what effect that has:

bar <- "global"
foo <- function(){
    bar <- "in foo"   
    baz <- function(){
        assign("bar", "in baz", envir = .GlobalEnv)
    }
    print(bar)
    baz()
    print(bar)
}
bar
#[1] "global"
foo()
#[1] "in foo"
#[1] "in foo"
bar
#[1] "in baz"

So both times we print bar inside of foo the value is "in foo" even after calling baz. This is because assign never even considered the copy of bar inside of foo because we told it exactly where to look. However, this time the value of bar in the global environment was changed because we explicitly assigned there.

Now you also asked about creating local variables and you can do that fairly easily as well without creating a function... We just need to use the local function.

bar <- "global"
# local will create a new environment for us to play in
local({
    bar <- "local"
    print(bar)
})
#[1] "local"
bar
#[1] "global"
3
votes

A bit more along the same lines

attrs <- {}

attrs.a <- 1

f <- function(d) {
    attrs.a <- d
}

f(20)
print(attrs.a)

will print "1"

attrs <- {}

attrs.a <- 1

f <- function(d) {
   attrs.a <<- d
}

f(20)
print(attrs.a)

Will print "20"