0
votes

I am trying to run a pretty simple line of code in order to create a ggplot for homework. I am very new to R, so I suspect this is a simple problem but I'm just treading water at this point. My professor actually wrote this code and it has worked for other students I have spoken to. However, I am getting an error, which is pretty confusing.

Part of the problem may be that I previously attempted to coerce ggplot into a data frame (because I did not realise exactly what ggplot was for a while) and named that gg.

This line of code has been failing since I began the assignment. Note: this code was provided by my professor and works for others.

ggplot(filter(gapminder, gapminder$year==1987, group=1)) + geom_point(aes(gdpPercap, lifeExp, color=continent, size=pop)) + xlab("GDP per capita") + ylab("Life expectancy at birth")

I attempted to coerce ggplot to a data frame using:

gg = as.data.frame(ggplot)

Obviously this didn't work or help, but after deleting this code from the file it may still be affecting the former line of code??

I expected a plot of some kind at least, but instead I get the following error:

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class "c("gg", "ggplot")" to a data.frame

Any help would be appreciated!

1
Welcome to SO. Did you try running it in a fresh session? Also, it is hard to diagnose a problem without a reproducible example. Try using dput. - MatthewR
ggplot is a function. You should not want to convert it to a data.frame. Do you rather want to convert the result of ggplot to a data.frame? - Michael M

1 Answer

2
votes

ggplot takes a data frame as its input, and creates a plot object with many pieces corresponding to all the parameters that generated it. While it's technically possible to extract the data back out of a ggplot, it's a little complicated and probably wouldn't be in an introductory session. (See bottom for an example of this.)
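
To make the distinction concrete, here is a minimal sketch (not part of the original assignment). It uses the built-in mtcars data as a stand-in, and the cleanup line at the end assumes the leftover object is called gg, as in the question.

```r
library(ggplot2)

# ggplot itself is a function, so it cannot be coerced to a data frame.
is.function(ggplot)

# as.data.frame() on a function fails; tryCatch() captures the error
# message instead of stopping the script.
coerce_msg <- tryCatch(
  as.data.frame(ggplot),
  error = function(e) conditionMessage(e)
)
coerce_msg

# Calling ggplot() on a data frame returns a plot object, not a data frame.
# mtcars is only a stand-in data set for illustration.
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
inherits(p, "ggplot")
is.data.frame(p)

# Assumed cleanup: drop a leftover object named gg from the workspace,
# if one is still there from an earlier attempt.
if (exists("gg", envir = globalenv(), inherits = FALSE)) {
  rm("gg", envir = globalenv())
}
```

Restarting R (step 1 below) clears the workspace in the same way, provided RStudio is not restoring a saved .RData file on startup.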

Based on other ggplot tutorials I've seen (like this one from its creator), it's more typical to start by showing the data frame as it goes in, and showing how filtering that data changes the plot.

Here's a process that should work. If it doesn't work for you, please share any specific error messages you're getting.

  1. Restart R. If you're using RStudio, click Session -> Restart R.
  2. Load libraries. The example uses ggplot2 and gapminder at the least, maybe others as well.

library(ggplot2)
library(gapminder)
library(dplyr)   # provides the filter() used here; without it, R finds stats::filter() instead

  3. Look at the data frame. Here's the gapminder data, which has 1,704 rows. Each country has a row for each year in the data, e.g. 1952, 1957, etc.

> gapminder
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows

  4. We could filter to just the data from 1957. (I'm not sure what the group = 1 part was for -- perhaps there was an earlier step not mentioned in your question?)

# Note: equivalent to `filter(gapminder, year == 1957)`
> filter(gapminder, gapminder$year == 1957)
# A tibble: 142 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1957    30.3  9240934      821.
 2 Albania     Europe     1957    59.3  1476505     1942.
 3 Algeria     Africa     1957    45.7 10270856     3014.
 4 Angola      Africa     1957    32.0  4561361     3828.
 5 Argentina   Americas   1957    64.4 19610538     6857.
 6 Australia   Oceania    1957    70.3  9712569    10950.
 7 Austria     Europe     1957    67.5  6965860     8843.
 8 Bahrain     Asia       1957    53.8   138655    11636.
 9 Bangladesh  Asia       1957    39.3 51365468      662.
10 Belgium     Europe     1957    69.2  8989111     9715.
# … with 132 more rows

  5. Send that filtered data into ggplot. The first argument of the ggplot function is the input data. (I leave out group = 1 here since I don't know where it was defined. Might it actually belong inside the aes(...) part? group = 1 is sometimes used there when we want a stat to treat the whole data set as one group, e.g. if you want the average GDP for all countries instead of separate averages by continent.)

ggplot(filter(gapminder, gapminder$year==1987)) + 
  geom_point(aes(gdpPercap, lifeExp, color=continent, size=pop)) + 
  xlab("GDP per capita") + 
  ylab("Life expectancy at birth")

Here's the output I get for that. Any hiccups?

[image: scatter plot produced by the code above]
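
On the group = 1 question from step 5: here is a rough sketch of what it does when placed inside aes(). This is my guess at a plausible use, not necessarily what your professor intended. The trend line, the object names gap_1987 and p_grouped, and the explicit dplyr::filter call are all my additions.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

# Calling dplyr::filter explicitly avoids picking up stats::filter by accident.
gap_1987 <- dplyr::filter(gapminder, year == 1987)

# color = continent would normally split the data into one group per
# continent, so geom_smooth() would fit one line per continent.
# aes(group = 1) overrides that and fits a single line to all countries.
p_grouped <- ggplot(gap_1987, aes(gdpPercap, lifeExp, color = continent)) +
  geom_point(aes(size = pop)) +
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x,
              se = FALSE, color = "black") +
  xlab("GDP per capita") +
  ylab("Life expectancy at birth")

p_grouped
```

Note that group = 1 sits inside aes() of a layer here. Passing it as an extra argument to filter(), as in the question, is a different thing entirely.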


Extracting the data back out of a ggplot object.

Here's the same plot, assigned to an object called gg:

gg <- ggplot(filter(gapminder, gapminder$year==1987)) + 
        geom_point(aes(gdpPercap, lifeExp, color=continent, size=pop)) + 
        xlab("GDP per capita") + 
        ylab("Life expectancy at birth")

That gg object combines many components. In RStudio, you can examine them and extract the components interactively. One of them is the source data:

[image: RStudio view of the components of gg]

> gg[["data"]]
# A tibble: 142 x 6
   country     continent  year lifeExp       pop gdpPercap
   <fct>       <fct>     <int>   <dbl>     <int>     <dbl>
 1 Afghanistan Asia       1987    40.8  13867957      852.
 2 Albania     Europe     1987    72     3075321     3739.
 3 Algeria     Africa     1987    65.8  23254956     5681.
 4 Angola      Africa     1987    39.9   7874230     2430.
 5 Argentina   Americas   1987    70.8  31620918     9140.
 6 Australia   Oceania    1987    76.3  16257249    21889.
 7 Austria     Europe     1987    74.9   7578903    23688.
 8 Bahrain     Asia       1987    70.8    454612    18524.
 9 Bangladesh  Asia       1987    52.8 103764241      752.
10 Belgium     Europe     1987    75.4   9870200    22526.
# … with 132 more rows
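
If you would rather not click through the object in RStudio, the same thing can be done in code. This is a sketch under the assumption that your ggplot2 version still allows list-style access to the plot object; layer_data() is the ggplot2 helper for getting the values computed for a layer. The names source_data and computed are just for illustration.

```r
library(ggplot2)
library(gapminder)
library(dplyr)

gg <- ggplot(dplyr::filter(gapminder, year == 1987)) +
  geom_point(aes(gdpPercap, lifeExp, color = continent, size = pop)) +
  xlab("GDP per capita") +
  ylab("Life expectancy at birth")

# The data frame that was passed to ggplot(), unchanged.
source_data <- gg[["data"]]

# The values ggplot2 computed for the first (and only) layer:
# positions, colours, sizes and so on, one row per plotted point.
computed <- layer_data(gg, 1)

nrow(source_data)
names(computed)
```

The first is the data you put in; the second is what ggplot2 actually draws, which is why its columns are named after aesthetics (x, y, ...) rather than after the original variables.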