3
votes

I have a dataset similar to the below example

df <- structure(list(Species = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,3L, 
1L, 2L, 3L), .Label = c("setosa", "versicolor", "virginica"), class = 
"factor"), flower_att = c("Sepal.Length", "Sepal.Length", "Sepal.Length", 
"Sepal.Width", "Sepal.Width", "Sepal.Width", "Petal.Length", "Petal.Length", 
"Petal.Length", "Petal.Width", "Petal.Width", "Petal.Width"), measurement = 
c(5.1, 7, 6.3, 3.5, 3.2, 3.3, 1.4, 4.7, 6, 0.2, 1.4, 2.5), month = 
c("January", "February", "January", "February", "January", "February", 
"January", "February", "January", "February", "January", "February")), 
row.names = c(NA,-12L), class = "data.frame")

I want to display both sepal length and width for each species and month side by side. I was hoping to do this using a diagonal split cell in the heatmap with 2 different colour legends i.e. red for length and blue for width. If possible I would like the value to be displayed within the cell segment. My search so far has found this closest example but I am looking for a workable ggplot version.

My own attempt currently looks like the below. I cannot work out how to break up the cells.

ggplot(df, aes(x=month, y=Species)) +   geom_tile(aes(fill=measurement), 
color="black") +   theme(axis.text.x = element_text(angle=45, hjust = .5)) +   
geom_text(aes(label = round(measurement, .1))) +   scale_fill_gradient(low = 
"white", high = "red")

Update

After some serious digging through the internet I have found a potential option using geom_segment and geom_text_repel, see below. Could anyone tell me if this a viable option to pursue? If so how can I get it to meet the requirements above?

I am open to switching scale_fill_gradient to scale_fill_manual or other alternative, my main objective is to have the all data displayed side by side

ggplot(df, aes(x=month, y=Species)) +
geom_tile(aes(fill=measurement), color="black") +
theme(axis.text.x = element_text(angle=45, hjust = .5)) +
geom_text_repel(aes(label = round(measurement, .1))) +
scale_fill_gradient(low = "white", high = "red")

gb <- ggplot_build(p)

p + geom_segment(data=gb$data[[1]],
aes(x=xmin, xend=xmax, y=ymin, yend=ymax), color="black")
1
what have you tried so far?Mike
Hi Mike ive used the following so far but I cannot workout how to perform the split ggplot(df, aes(x=month, y=Species)) + geom_tile(aes(fill=measurement), color="black") + theme(axis.text.x = element_text(angle=45, hjust = .5)) + geom_text(aes(label = round(measurement, .1))) + scale_fill_gradient(low = "white", high = "red")AudileF
good question! I am not sure how to do that, I would also edit your question to put the ggplot code in there so other people can help troubleshoot.Mike
There appears to be other efforts on this, such as in this post.Ben
Thanks for the link @Ben I tried the askers example and it does not give the same result as they got. But Ill see if I can work it around :)AudileF

1 Answers

0
votes

This is slightly hacky, but to be honest, without creating a dedicated geom, I don't think you can get it less hacky - and creating a geom can also get somewhat hacky :)

  • Creating triangle polygons for each x/y coordinate with sapply. I guess you could use that approach for your compute_group layer in your future StatSplitTile.
  • The messing with factors is a necessary evil to get the order right. If you want a specific order in your y axis, you would also need to factorise Species first.
  • Using ggnewscale for a very simple way of having several fill scales.
  • set limits to the same for better comparability
  • coord_equal to make it look nicer
library(tidyverse)

mydat <- structure(list(Species = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("setosa", "versicolor", "virginica"), class = "factor"), flower_att = c("Sepal.Length", "Sepal.Length", "Sepal.Length", "Sepal.Width", "Sepal.Width", "Sepal.Width", "Petal.Length", "Petal.Length", "Petal.Length", "Petal.Width", "Petal.Width", "Petal.Width"), measurement = c(5.1, 7, 6.3, 3.5, 3.2, 3.3, 1.4, 4.7, 6, 0.2, 1.4, 2.5), month = c("January", "February", "January", "February", "January", "February", "January", "February", "January", "February", "January", "February")),
  row.names = c(NA, -12L), class = "data.frame"
)

make_triangles <- function(x, y, point = "up") {
  x <- as.integer(as.factor((x)))
  y <- as.integer(as.factor((y)))

  if (point == "up") {
    newx <- sapply(x, function(x) {
      c(x - 0.5, x - 0.5, x + 0.5)
    }, simplify = FALSE)
    newy <- sapply(y, function(y) {
      c(y - 0.5, y + 0.5, y + 0.5)
    }, simplify = FALSE)
  } else if (point == "down") {
    newx <- sapply(x, function(x) {
      c(x - 0.5, x + 0.5, x + 0.5)
    }, simplify = FALSE)
    newy <- sapply(y, function(y) {
      c(y - 0.5, y - 0.5, y + 0.5)
    }, simplify = FALSE)
  }
  data.frame(x = unlist(newx), y = unlist(newy))
}

# required, otherwise you cannot use the values as fill
mydat_wide <- mydat %>% pivot_wider(names_from = "flower_att", values_from = "measurement")
# making your ordered months factor
mydat_wide$month <- droplevels(factor(mydat_wide$month, levels = month.name))
# The actual triangle computation
newcoord_up <- make_triangles(mydat_wide$month, mydat_wide$Species)
newcoord_down <- make_triangles(mydat_wide$month, mydat_wide$Species, point = "down")
# just a dirty trick for renaming
newcoord_down <- newcoord_down %>% select(xdown = x, ydown = y)
# you need to repeat each row of your previous data frame 3 times
repdata <- map_df(1:nrow(mydat_wide), function(i) mydat_wide[rep(i, 3), ])
newdata <- bind_cols(repdata, newcoord_up, newcoord_down)

ggplot(newdata) +
  geom_polygon(aes(x = x, y = y, fill = Sepal.Length, group = interaction(Species, month)), color = "black") +
  scale_fill_gradient(low = "white", high = "red", limits = c(0, 10)) +
  ggnewscale::new_scale_fill() +
  geom_polygon(aes(x = xdown, y = ydown, fill = Sepal.Width, group = interaction(Species, month)), color = "black") +
  scale_fill_gradient(low = "white", high = "red", limits = c(0, 10)) +
  scale_x_continuous(breaks = seq_along(unique(mydat_wide$month)), 
                     labels = unique(levels(mydat_wide$month))) +
  scale_y_continuous(breaks = seq_along(unique(mydat_wide$Species)),
                     labels = unique(mydat_wide$Species))+
  coord_equal()

Created on 2021-01-27 by the reprex package (v0.3.0)