0
votes

I'm trying to plot a faceted graph with ggplot2 from a simple data.frame extracted from the Lahman package. However, some observations end up in the wrong facet panel. I've tried several configurations of the facet_grid arguments, but all of them place the observations incorrectly.

Here is the code to reproduce the plot.

library(Lahman)
library(tidyverse)
library(plotly)

TmsStd <- Teams

TmsStd <- TmsStd %>% select(yearID, lgID, teamID, divID, Rank, W, L, DivWin, WCWin, LgWin, WSWin, name, teamIDBR)

TmsStd$WLPctg <- TmsStd$W / (TmsStd$W + TmsStd$L)

TmsStd <- TmsStd %>% arrange(yearID, desc(WLPctg))

TmsStd$OvSeaRank <- ave(TmsStd$WLPctg, TmsStd$yearID, FUN = seq_along)

TmPostS <- TmsStd %>% filter(OvSeaRank <= 4 & WSWin == "Y" & yearID > 1970) %>% select(yearID, teamIDBR, W, L, WLPctg, OvSeaRank)

Best_Post <- ggplot(data = TmPostS, aes(x = yearID)) +
  geom_bar() + 
  ggtitle("ABC") +
  xlab("Year") + ylab("") +
  facet_grid(OvSeaRank ~ .) +
  theme_light()

Best_Post

facet_grid plot

There is only one observation per year.

table(TmPostS$yearID)

1971 1972 1973 1974 1975 1976 1977 1978 1979 1981 1982 1983 1984 1986 1988 1989 1990 1991 1992 1993 1995 1996 
   1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
1997 1998 1999 2002 2004 2005 2007 2009 2013 2015 
   1    1    1    1    1    1    1    1    1    1 

So there should be only one bar per year, regardless of the value of the "OvSeaRank" variable.

Any hint of what I could be doing wrong?

Thanks in advance.

At the moment, you're only plotting the relationship between year and rank. This isn't very informative, especially given there is only one observation per year. Can you explain in more detail what you want to show? – Joe
Hello Joe, I want to plot one bar for each year in order to see what rank the champion teams had during the regular season. – darh78
All bars should be mutually exclusive across years in the graph. – darh78
Ahhh, I think I get what you mean. The answer is to use stat="identity". Will add the answer now. – Joe
BTW, there's a good explanation of that here: cookbook-r.com/Graphs/Bar_and_line_graphs_(ggplot2) – Joe

2 Answers

1
votes

By default geom_bar will count the number of occurrences of each year (which is always 1) rather than the value. You need to change the default behaviour with stat="identity" so it uses the column value.

ggplot(TmPostS, aes(x = yearID, y = OvSeaRank)) +
  geom_bar(stat = "identity") +
  ggtitle("ABC") + xlab("Year") + ylab("") +
  facet_grid(OvSeaRank ~ .) +
  theme_light()
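
To make the difference between the two stats concrete, here is a minimal sketch on a toy data frame. The `toy` data and its `year` and `rank` columns are made up for illustration and are not part of the Lahman data; `layer_data()` is used only to inspect the bar heights ggplot2 computes.

```r
library(ggplot2)

# Toy data (hypothetical): one row per year, each with a rank value.
toy <- data.frame(year = c(2001, 2002, 2003), rank = c(3, 1, 2))

# Default stat = "count": bar height is the number of rows per year.
p_count <- ggplot(toy, aes(x = year)) +
  geom_bar()

# stat = "identity": bar height is the value mapped to y.
p_identity <- ggplot(toy, aes(x = year, y = rank)) +
  geom_bar(stat = "identity")

# Extract the bar heights that each layer would draw.
count_heights    <- layer_data(p_count)$y
identity_heights <- layer_data(p_identity)$y

print(count_heights)
print(identity_heights)
```

With one row per year, the counted heights are all the same, while the identity heights follow the `rank` column. In ggplot2 2.2.0 and later, `geom_col()` should be an equivalent shorthand for `geom_bar(stat = "identity")`.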

It's actually better without faceting, because you don't really have enough variables in the plot. Leaving out facet_grid(OvSeaRank ~ .) puts all the bars in a single panel.

Idea: how about using geom_line and reversing the y-axis for rank?

ggplot(TmPostS, aes(x = yearID, y = OvSeaRank)) +
  geom_line() + geom_point() + scale_y_reverse() +
  ggtitle("ABC") + xlab("Year") + ylab("Rank of champion") +
  theme_light()

0
votes

Thanks to Joe's support, I found what I wanted to show in this question. I replaced stat = "identity" with stat = "bin" and set binwidth = 1:

ggplot(TmPostS, aes(x = yearID)) +
  geom_bar(stat = "bin", binwidth = 1, color = "red", fill = "darkblue") +
  ggtitle("World Series Champions based on their regular season W-L% overall rank") +
  xlab("Season") + ylab("") +
  facet_grid(OvSeaRank ~ .) +
  theme_bw() +
  theme(axis.text.y = element_blank(),
        axis.ticks = element_blank())

Wished graph using facets

In this case the data frame now includes all the MLB champions since 1884.
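
A note on the `geom_bar(stat = "bin", binwidth = 1)` call above: since ggplot2 2.0.0, `geom_bar()` defaults to `stat = "count"` and binning is handled by `geom_histogram()`, so depending on the installed version that call may warn or fail. A minimal sketch of the same faceted idea with `geom_histogram()` follows, assuming a made-up `toy` data frame that only mimics the `yearID` and `OvSeaRank` columns of `TmPostS`:

```r
library(ggplot2)

# Toy data (hypothetical): champion seasons and their regular-season rank.
toy <- data.frame(
  yearID    = c(1971, 1972, 1973, 1975, 1976),
  OvSeaRank = c(1, 2, 1, 3, 1)
)

# One-year-wide bins, one facet row per rank.
p <- ggplot(toy, aes(x = yearID)) +
  geom_histogram(binwidth = 1, color = "red", fill = "darkblue") +
  facet_grid(OvSeaRank ~ .) +
  theme_bw()

# Sum the binned counts within each facet panel: each row of the grid
# only counts the seasons whose champion had that rank.
built <- layer_data(p)
counts_per_panel <- tapply(built$count, built$PANEL, sum)
print(counts_per_panel)
```

Each year then shows a bar in exactly one panel, which is the "mutually exclusive" layout described in the comments.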

Finally, using Joe's geom_line idea:

ggplot(TmPostS, aes(x = yearID, y = OvSeaRank)) +
  geom_line(colour = "darkblue") + geom_point(colour = "red") + scale_y_reverse() +
  ggtitle("World Series Champions based on their regular season W-L% overall rank") +
  xlab("Year") + ylab("Rank of champion") + theme_light()

Alternative graph using geom_line