
I have a problem that looks common, but I could not find a solution for it. I have made a ggplot graph with linetype and colour set manually; both legends have the same name and the same variable labels, and the data frame is in long format. A single legend is produced, but each variable is shown twice. To explain what I want to achieve, I need to back up a little.

  1. I am working on a function that lets me update a data frame with monthly spending for this year and then generate different plots to follow up on my budgeting. My variables have two "properties", so to speak: each belongs to a particular item, and each is either a projection (i.e. planned) or actual spending. What I originally wanted was for each item to have one colour and two linetypes (solid for projected, dashed for actual spending). So, for example, green for saving, with projected savings as a solid line and actual savings as a dashed line. I wanted two legends for that, one showing only the colours (i.e. the items) and the other showing only the two linetypes (solid, dashed), so that it is left to the reader to put the two together (and so that there are fewer legend entries in total). If anyone has a solution for this problem, I'd be very happy to find out. However, the following is what I am trying to solve now:

  2. I have by now given up on this original intention and settled for a legend with each kind of line getting one legend entry. This is what the intro (above) was about. Despite having the same legend name and variable labels and correct number thereof, each variable appears twice now. I would like to know why I am getting these double entries and find a solution. I have tried all sorts of things over many hours and have found nobody with a similar problem (since I get the more "normal" problems with my keyword search).

  3. One strange thing I have also noticed is that the variable "Add. income" (addIncome, labelled "Extra pay" in the code) does not behave like the other variables, since it only appears once.

  4. There are many NA values in the data frame (below) because these figures will be filled in and then plotted as the year progresses.

Code:

ggplot(fin2019Long, aes(x=month, y=value, colour=variable)) +
geom_line(aes(linetype=variable)) + geom_point() +
labs(title = "Projected expenditure and saving", y = "Euros", x = "Month") +
scale_x_continuous("Month", breaks= c(1:12)) +
scale_colour_manual(name = "Items", 
                  values=c("green","green", "yellow", "yellow", "blue", "blue", "red", "red", "orange"), 
                  labels=c(rep("Living expend.", 2), rep("Debt repay.", 2), rep("Saving", 2), rep("Furn. fund", 2), "Extra pay")) +
scale_linetype_manual(name = "Items", 
                    values=c(rep(c("solid", "twodash"), 4), "twodash"), 
                    labels=c(rep("Living expend.", 2), rep("Debt repay.", 2), rep("Saving", 2), rep("Furn. fund", 2), "Extra pay"))

Data:

structure(list(month = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), variable = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("livingExpProj", 
"livingExp", "debtRepayProj", "debtRepay", "savingProj", "saving", 
"furnFundProj", "furnFund", "addIncome"), class = "factor"), 
value = c(1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 
1000, 1000, 1000, 1000, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 600, 600, 600, 600, 600, 600, 600, 600, 600, 
600, 600, 600, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 
500, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 100, 
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -108L
), class = "data.frame") 

1 Answer


Separating the variable column into two columns makes it much easier to control:

fin2019Long$type <- ifelse(grepl('Proj$', fin2019Long$variable), 'Planned', 'Spending')
fin2019Long$variable2 <- gsub('Proj$', '', fin2019Long$variable)
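As a quick check of what these two assignments produce, here is a minimal base-R sketch. It uses a toy factor with the same level names as in the question (an assumption standing in for the real data, one value per level) and cross-tabulates item against type:

```r
# Toy stand-in for fin2019Long$variable: one entry per level (assumption,
# not the real 108-row data frame).
variable <- factor(c("livingExpProj", "livingExp", "debtRepayProj", "debtRepay",
                     "savingProj", "saving", "furnFundProj", "furnFund",
                     "addIncome"))

# Same logic as above: a trailing "Proj" marks a planned figure.
type      <- ifelse(grepl("Proj$", variable), "Planned", "Spending")
variable2 <- gsub("Proj$", "", variable)

# One row per item, one column per type.
print(table(variable2, type))
```

Note that addIncome has no "Proj" counterpart, so with this rule it ends up in the 'Spending' group only.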

ggplot(fin2019Long, aes(x=month, y=value, colour=variable2)) +
    geom_line(aes(linetype=type)) + geom_point() +
    labs(title = "Projected expenditure and saving", y = "Euros", x = "Month") +
    scale_x_continuous("Month", breaks= c(1:12))
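With item and type mapped to separate aesthetics, colour and linetype each get their own legend. In the original code, each of the nine factor levels had its own legend key, and repeating a label did not merge those keys, which would explain the doubled entries (and why addIncome, with a single level, showed up only once). If you also want to pick the colours and linetypes yourself, something along these lines should work. It is only a sketch: the small data frame and its "Spending" values are made up for illustration, and the legend titles are my own choice.

```r
library(ggplot2)

# Made-up stand-in for fin2019Long after the split (assumption: two items,
# three months, invented actual-spending values).
df <- data.frame(
  month     = rep(1:3, times = 4),
  variable2 = rep(c("saving", "livingExp"), each = 6),
  type      = rep(rep(c("Planned", "Spending"), each = 3), times = 2),
  value     = c(500, 500, 500, 450, 520, 480,
                1000, 1000, 1000, 950, 1020, 990)
)

p <- ggplot(df, aes(x = month, y = value, colour = variable2)) +
  geom_line(aes(linetype = type)) +
  geom_point() +
  # One colour per item; named values are matched to the data values.
  scale_colour_manual(name   = "Items",
                      values = c(livingExp = "green", saving = "blue"),
                      breaks = c("livingExp", "saving"),
                      labels = c("Living expend.", "Saving")) +
  # One linetype per type; a different name keeps this legend separate.
  scale_linetype_manual(name   = "Type",
                        values = c(Planned = "solid", Spending = "twodash"))

print(p)
```

Using named vectors for values ties each colour and linetype to a specific data value rather than to its position among the factor levels.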
