0
votes

Any idea how to add labels directly to my plot (geom_text)?

Here is my sample data frame. I am plotting three curves (confirmed, deaths, recovered), but how can I also add the column names as labels on the plot? I read the data frame from a CSV file.

print (data)
        date confirmed  deaths recovered
1 2020-12-01  63883985 1481306  41034934
2 2020-12-02  64530517 1493742  41496318
3 2020-12-03  65221040 1506260  41932091
4 2020-12-04  65899441 1518670  42352021
5 2020-12-05  66540034 1528868  42789879
6 2020-12-06  67073728 1536056  43103827

Here is my code:


library(ggplot2)
library(scales)  # provides unit_format()

data <- structure(
  list(
    date = structure(1:6, .Label = c("2020-12-01", "2020-12-02", "2020-12-03",
                                     "2020-12-04", "2020-12-05", "2020-12-06"),
                     class = "factor"),
    confirmed = c(63883985L, 64530517L, 65221040L, 65899441L, 66540034L, 67073728L),
    deaths = c(1481306L, 1493742L, 1506260L, 1518670L, 1528868L, 1536056L),
    recovered = c(41034934L, 41496318L, 41932091L, 42352021L, 42789879L, 43103827L)
  ),
  row.names = c(NA, 6L), class = "data.frame"
)

ggplot(data, aes(x = date, y = confirmed, group = 1)) +
  geom_line(colour = "blue", size = 1, aes(date, confirmed)) +
  scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6)) +
  geom_line(colour = "red", size = 1, aes(date, deaths)) +
  geom_line(colour = "#1EACB0", size = 1, aes(date, recovered))
   

Here is my current plot without labels. I also tried passing label = colnames(stats_data) to ggplot, but that did not work.

[image: current plot without labels]

2
Hi, I would like to show the labels directly in the plot, as with geom_text(), not in a legend. How can I do that? - Andrew
Okay, I see. Then you should clarify where you want the labels. And please add your data as dput() output; see how to make a minimal reproducible example. - stefan
OK, I have updated my question. For the labels, let's say in the middle. - Andrew
Your example is not reproducible per se. Please paste the data in a form that is easy for us to import into R. The link provided by stefan lists all the ways you can do that. Also, have you seen this? stackoverflow.com/questions/29357612/… - Roman Luštrik

2 Answers

2
votes

As mentioned in the post linked by Roman, ggrepel is a good option for this. Note that you can adjust where the label falls using the variable lab_date, which I create below.

# load packages
library(tidyverse)
library(scales)
library(ggrepel)

# process data for plotting
data1 <- data %>%
  mutate(date = as.Date(date)) %>%
  pivot_longer(cols = -date, names_to = "category", values_to = "cases") %>%
  mutate(category = factor(category)) 


# set color scheme with named vector
color_scheme <- setNames(c("blue", "red", "#1EACB0"), unique(data1$category))

# determine position of label
lab_date <- data1$date %>%
  as.numeric(.) %>% # convert to numeric for finding desired position
  quantile(., 0.5) %>% # selects middle of range but you can adjust as needed
  as.Date(., origin = "1970-01-01") %>% # convert back to date
  as.character() # convert to string for matching in geom_label_repel call

# plot lines with labels and drop legend
data1 %>%
  ggplot(data = ., aes(x = date, y = cases, color = category)) +
  geom_line() +
  geom_label_repel(aes(label = category), 
                   data = data1 %>% filter(date == lab_date)) +
  scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6)) +
  scale_color_manual(values = color_scheme) +
  theme(legend.position = "none")

Gives the following plot:

labeled line plot
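One caveat with the lab_date step: the quantile of the date range is not guaranteed to be a date that actually occurs in the data (for example, when the series has gaps), and in that case the filter returns no rows and no labels are drawn. A minimal sketch of a more defensive variant follows. The helper nearest_observed_date is hypothetical (it is not part of the answer or of any package); it snaps the requested position to the closest observed date.

```r
# Dates from the question's sample data
dates <- as.Date(c("2020-12-01", "2020-12-02", "2020-12-03",
                   "2020-12-04", "2020-12-05", "2020-12-06"))

# Hypothetical helper (an assumption, not from the answer): return the
# observed date closest to the requested quantile of the date range
nearest_observed_date <- function(dates, prob = 0.5) {
  target <- quantile(as.numeric(dates), probs = prob, names = FALSE)
  observed <- sort(unique(dates))
  # which.min() returns the first minimum, so ties go to the earlier date
  observed[which.min(abs(as.numeric(observed) - target))]
}

lab_date <- nearest_observed_date(dates, 0.5)  # middle of the range
end_date <- nearest_observed_date(dates, 1)    # right-hand edge
```

Because the result is already a Date that exists in the data, it can be compared directly in filter(date == lab_date) without converting to a string. For the sample dates the midpoint lies between 2020-12-03 and 2020-12-04, and the tie resolves to the earlier of the two.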

A few notes with updates:

  1. I updated the color scheme as requested. Note the use of a named vector fed into scale_color_manual, which preserves the color scheme even if the order of the categories changes or one is absent.
  2. I modified the method of setting the position to make it more generalizable in case you decided you wanted to put it somewhere else. If you wanted to just specify the position manually you could just set lab_date <- "2020-12-03" or whatever you needed.
  3. If you want to avoid loading extra packages, geom_label gives almost exactly the same result as geom_label_repel, so ggrepel might be considered gratuitous for this relatively small number of labels, although it does help to get the label off the line if that's important.
  4. As you point out in your comment, plotly::ggplotly doesn't support ggrepel or even ggplot2::geom_label. Therefore, if you need this to go into plotly, one option is to change geom_label_repel to geom_text although then it will plot on top of the line if you don't adjust the y position. See below:
library(plotly)

ggplotly(
  data1 %>%
    ggplot(data = ., aes(x = date, y = cases, color = category)) +
    geom_line() +
    geom_text(aes(label = category),
              data = data1 %>%
                filter(date == lab_date) %>%
                mutate(cases = cases + 2e6)) + # this adjusts the y position of the label to avoid overplotting on the line
    scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6)) +
    scale_color_manual(values = color_scheme) +
    theme(legend.position = "none")
)

Produces this plot:

plotly output

The amount you need to adjust by will depend on the line thickness, the specific values in your data, and the size of your plot, so it's more of a hack than a robust solution.
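One way to make the offset a little less arbitrary is to derive it from the data instead of hard-coding 2e6. The sketch below is only an illustration under stated assumptions: the 3% factor and the names y_offset and label_data are my own choices, not part of the original answer, and the sample values are retyped from the question so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Sample values from the question
data <- data.frame(
  date = as.Date("2020-12-01") + 0:5,
  confirmed = c(63883985, 64530517, 65221040, 65899441, 66540034, 67073728),
  deaths = c(1481306, 1493742, 1506260, 1518670, 1528868, 1536056),
  recovered = c(41034934, 41496318, 41932091, 42352021, 42789879, 43103827)
)

data1 <- data %>%
  pivot_longer(cols = -date, names_to = "category", values_to = "cases")

# Assumed rule of thumb: lift the labels by 3% of the plotted y range
y_offset <- 0.03 * diff(range(data1$cases))

# One row per category at the label date, shifted upwards
label_data <- data1 %>%
  filter(date == as.Date("2020-12-03")) %>%
  mutate(cases = cases + y_offset)
```

label_data can then be passed as the data argument of geom_text in the ggplotly call above. For this sample the offset comes out close to the 2e6 used there, but it still does not account for line thickness or plot size, so it remains a heuristic.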

1
votes

This type of problem generally has to do with reshaping the data: ggplot2 works best with data in long format, and the data here are in wide format. See this post on how to reshape the data from wide to long format.

library(dplyr)
library(tidyr)
library(ggplot2)

stats_data %>%
  select(-starts_with("diff")) %>%
  pivot_longer(-date, names_to = "cases", values_to = "count") %>%
  mutate(cases = factor(cases, levels = c("confirmed", "deaths", "recovered"))) %>%
  ggplot(aes(date, count, colour = cases)) +
  geom_line() +
  scale_color_manual(values = c("blue", "red", "#1EACB0"))

[image: resulting plot]
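The code above identifies the series through a legend. If the labels should sit on the lines themselves, as the question asks, one possible sketch is to label the last observation of each series with geom_text. This is an assumption-laden illustration rather than part of the answer: it uses the question's sample values (retyped so the block is self-contained), and the names wide, long, and last_points, the nudge, and the axis expansion are arbitrary choices.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Sample values from the question
wide <- data.frame(
  date = as.Date("2020-12-01") + 0:5,
  confirmed = c(63883985, 64530517, 65221040, 65899441, 66540034, 67073728),
  deaths = c(1481306, 1493742, 1506260, 1518670, 1528868, 1536056),
  recovered = c(41034934, 41496318, 41932091, 42352021, 42789879, 43103827)
)

long <- wide %>%
  pivot_longer(-date, names_to = "cases", values_to = "count") %>%
  mutate(cases = factor(cases, levels = c("confirmed", "deaths", "recovered")))

# One label per series, placed at its last observation
last_points <- long %>%
  group_by(cases) %>%
  filter(date == max(date)) %>%
  ungroup()

p <- ggplot(long, aes(date, count, colour = cases)) +
  geom_line() +
  # left-align the text and nudge it slightly to the right of the line end
  geom_text(data = last_points, aes(label = cases), hjust = 0, nudge_x = 0.1) +
  # widen the right-hand side of the x axis so the labels are not clipped
  scale_x_date(expand = expansion(mult = c(0.05, 0.25))) +
  scale_color_manual(values = c(confirmed = "blue", deaths = "red",
                                recovered = "#1EACB0")) +
  theme(legend.position = "none")

p
```

When two series end close together the labels can overlap; in that case the ggrepel approach from the other answer is likely the better fit.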

Data

stats_data <- read.table(text = "
        date confirmed diff.x deaths diff.y recovered
'2020-01-22'       555    555     17     17        28
'2020-01-23'       654     99     18      1        30
'2020-01-24'       941    287     26      8        36
'2020-01-25'      1434    493     42     16        39
'2020-01-26'      2118    684     56     14        52
'2020-01-27'      2927    809     82     26        61
", header = TRUE, colClasses = c("Date", rep("numeric", 5)))