0
votes

I have three different data.frames (GRCYPT_flows, ESIEIT_flows, GRCYPT_flows) which contain the same variables (report_ctry, partner_ctry, indicator, year, value), but with different levels/observations. Now I want to create a plot for each of those data.frames. Since the plots are supposed to look the same, it seems reasonable to use an iterative command. I tried a foreach loop:

foreach(i=GRCYPT_flows, ESIEIT_flows, GRCYPT_flows) %do% {  ggplot(i, aes(year, value)) + 
geom_line(aes(colour=partner_ctry, linetype=indicator)) + facet_wrap(~report_ctry) +
theme(axis.text.x=element_text(angle=90, vjust=0.5)) + 
scale_x_continuous(breaks=seq(2002, 2012, 2), name="") +
scale_y_continuous(name="Billion Euros") + 
scale_colour_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("EA17", "Extra-EA17")) +
scale_linetype_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("Trade", "Capital")) +
theme(legend.title=element_blank())}

The code, as it is, does not work. I face two problems here:

  1. Assign a data.frame to an iteration variable.

  2. Tell the foreach loop to save each iteration to a different list with a distinct name (plot1, plot2, plot3, etc.).

I'm relatively sure this is quite easy to solve if you have some experience with R. I'm a total greenhorn, however, so I really don't know where to start (I could easily do it in Stata, with which I have at least some experience).

What I want to do is tell R: "Make a plot for each of these data.frames and save each of them in an individual list."

3
Is there a reason why you want to use foreach instead of a simple for (or list+lapply)? – talat
Perhaps you could pass a list of character names for data.frames to foreach and use get to fetch the actual data.frame. You can return a list using foreach infrastructure (e.g. see here). – Roman Luštrik
@docendodiscimus perhaps OP can upscale to use a parallel back-end at some point. – Roman Luštrik
@docendo discimus: Well, I'm open to any suggestion that works. It does not necessarily have to be a foreach loop. As I said, I'm totally new to R and not aware of many commands. – Laubsauger

3 Answers

1
votes

I would suggest separating the plotting code from the loop. That way you can test it on one example and then easily run it on the whole batch. You probably also want to save the batch to files.

library(tidyverse)

myplot <- function(df, filename = NULL) {
  df %>%
    ggplot(aes(Sepal.Length, Petal.Length)) +
    geom_point() ->
    result

  if(!is.null(filename)) ggsave(filename, plot = result, width = 6, height = 4)
  else result
}

# test the plot
myplot(iris)

# do the batch
l <- list(one = iris, two = iris)
l %>% names %>% walk(function(n) myplot(l[[n]], paste0(n, ".pdf")))
0
votes

Here's an example with three data.frames that are copies of iris, which I've named i1, i2 and i3 for simplicity's sake.

library(foreach)
library(ggplot2)
library(magrittr)  # provides %>%

i2 <- i3 <- i1 <- iris

foreach(m = 1:3) %do% {
  dat <- paste0("i" , m) %>% get
  ggplot(dat, aes(Sepal.Length, Petal.Length)) + geom_line()
}

Basically, the trick is to fetch the specific data.frame by name with get. In your case, this should work:

data.names <- c("GRCYPT_flows", "ESIEIT_flows", "GRCYPT_flows")
foreach(i=1:length(data.names)) %do% {
  dat <- get(data.names[i])
  ggplot(dat, aes(year, value)) + 
     geom_line(aes(colour=partner_ctry, linetype=indicator)) + 
     facet_wrap(~report_ctry) +
     theme(axis.text.x=element_text(angle=90, vjust=0.5)) + 
     scale_x_continuous(breaks=seq(2002, 2012, 2), name="") +
     scale_y_continuous(name="Billion Euros") + 
     scale_colour_discrete(breaks=c("EA17", "ROW_NON_EA17"), 
     labels=c("EA17", "Extra-EA17")) +
     scale_linetype_discrete(breaks=c("EA17", "ROW_NON_EA17"), 
     labels=c("Trade", "Capital")) +
     theme(legend.title=element_blank())
  }
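
As an alternative to get, here is a minimal sketch that lets foreach iterate over a list of data.frames directly. It assumes the foreach and ggplot2 packages are installed; the list flows, its element names and the use of iris are stand-ins for illustration, not the real flows data.

```r
library(foreach)
library(ggplot2)

# Stand-in data (assumption): two copies of iris play the role of the
# real data.frames.
flows <- list(first = iris, second = iris)

# foreach walks over the elements of the list, so `dat` is one data.frame
# per iteration; %do% collects the value of each iteration in a list.
plots <- foreach(dat = flows) %do% {
  ggplot(dat, aes(Sepal.Length, Petal.Length)) + geom_point()
}

# Set the names explicitly rather than relying on foreach to carry them over.
names(plots) <- names(flows)
```

Each plot can then be retrieved by name, for example plots[["first"]].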
0
votes

I think the most "R"-y solution here would be lapply. lapply takes a list (or vector) of things, applies the same function to each of them, and returns the outputs as a list. Since you're using ggplot, you may like a neatly organized list of all the similar plots.

First, organize your data frames together in a list:

my_data  <- list(GRCYPT_flows, ESIEIT_flows) 

Two of your "three" data frames have exactly the same name. I'm going to assume you actually meant two, but this would work with any number of data frames.

my_plots = lapply(my_data, function(i) {
ggplot(i, aes(year, value))
})

This takes each element of the list ("i") and applies the custom function to it, where the custom function is your elaborate plotting code.

Since you're using ggplot, the plots can be stored as objects, so my_plots will be a neat list with all your plots.

So with your full plotting code, try:

my_plots <- lapply(my_data, function(i) {
  ggplot(i, aes(year, value)) + 
    geom_line(aes(colour=partner_ctry, linetype=indicator)) + facet_wrap(~report_ctry) +
    theme(axis.text.x=element_text(angle=90, vjust=0.5)) + 
    scale_x_continuous(breaks=seq(2002, 2012, 2), name="") +
    scale_y_continuous(name="Billion Euros") + 
    scale_colour_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("EA17", "Extra-EA17")) +
    scale_linetype_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("Trade", "Capital")) +
    theme(legend.title=element_blank())
})
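
If you want to refer to the plots by name afterwards, one option is to name the elements of the input list, since lapply keeps those names. The following is only a sketch: iris stands in for the real data, and the output directory and file names are made up for illustration.

```r
library(ggplot2)

# Stand-in data (assumption): copies of iris in place of the real flows data.
my_data <- list(GRCYPT_flows = iris, ESIEIT_flows = iris)

# lapply keeps the names of its input, so my_plots is named like my_data.
my_plots <- lapply(my_data, function(d) {
  ggplot(d, aes(Sepal.Length, Petal.Length)) + geom_point()
})

# Pull out a single plot by name.
one_plot <- my_plots[["ESIEIT_flows"]]

# Write each plot to its own PDF (hypothetical file names) in a temp directory.
out_dir <- tempdir()
for (n in names(my_plots)) {
  ggsave(file.path(out_dir, paste0(n, ".pdf")), plot = my_plots[[n]],
         width = 6, height = 4)
}
```

Typing one_plot at the console draws that plot.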