1
votes

I am working with ecological data (the percentage abundance of different diatom species present at different depths in a sediment core) and want to plot the results alongside a dendrogram representing the results of a hierarchical cluster analysis (CONISS) that I will use to split the core into ecological zones.

This is an example of the type of thing I am trying to achieve:

I have successfully run the cluster analysis but am struggling to plot the dendrogram the way I would like to. Whenever I plot the dendrogram, it plots the leaves of the dendrogram in sequential order (as shown here). However, I need the leaves to plot by depth from the top of the sediment core so that there is a leaf present at each depth of the core that I have examined and so that there are gaps in the dendrogram where I have missing samples (as shown here). (Note there is no y-axis shown here as the dendrogram will join onto the rest of the diagram, which already contains the y-axis.)

I managed to create the latter plot through the rioja package using plot.chclust and providing the depths for the leaves to the xvar argument. However, I have constructed the rest of my diagram (the species abundance data, PCA results etc.) with ggplot (and the help of the tidypaleo package) and then combined the various aspects of it using cowplot. I need to be able to create the dendrogram through ggplot in order to add it to my main diagram. I have investigated both the ggdendro and dendextend packages but can't find a way to plot the dendrogram according to depth using these. Is this possible? Do these packages have a function to do this that I am not aware of? I started looking into the source code for these packages as well as for as.dendrogram to see if I could figure out a way of modifying the functions to plot the leaves by depth but it is beyond my skill level. I was wondering if anyone has a solution to enable me to plot my dendrogram by depth in my sediment core rather than in sequential order?

My data

This is the data that I have used to calculate the distance matrix in the code below. (Apologies for the very long dput!)

My code to plot the dendrograms

# Load requirements
library(vegan) # designdist and chclust
library(rioja) # plot.chclust
library(ggplot2)
library(ggdendro) # dendro_data
library(dplyr)
library(tibble)

# Calculate distance matrix
dist_matrix <- designdist(coniss_data, method = "A+B-2*J", terms = "quadratic")

# Run cluster analysis
coniss_results <- chclust(dist_matrix, method = "coniss")


# Plot with rioja ---------------------------------------------------------

# Isolate depths
coniss_depths <- coniss_data %>%
  rownames_to_column("depth") %>%
  mutate(depth = as.numeric(depth)) %>%
  select(., "depth")

# Plot
plot(
  coniss_results,
  hang = -0.01, # make leaves extend to 0 and labels hang down from there
  horiz = TRUE, # rotate plot
  x.rev = TRUE, # reverse depths
  xvar = coniss_depths$depth, # plot leaves by depth of sediment core
  labels = FALSE,
  cex.axis = 0.8,
  tcl = -0.1
)


# Plot with ggdendro ------------------------------------------------------

# Extract dendrogram data
ddata <- dendro_data(coniss_results, type = "rectangle")

# Plot
ggplot(segment(ddata)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  coord_flip() +
  scale_y_continuous(expand = c(0, 0)) +
  scale_x_reverse(breaks = NULL,
                  labels = NULL) +
  labs(x = "",
       y = "Total sum of squares") +
  theme_classic(8)
1

1 Answers

0
votes

The actual depths are mapped to the x values in ddata$labels, so you can reverse-map the x values to actual depths. A handy way to do this is with approxfun:

new_x <- approxfun(ddata$labels$x, as.numeric(as.character(ddata$labels$label))) 
ddata$segments$x <- new_x(ddata$segments$x)
ddata$segments$xend <- new_x(ddata$segments$xend)

So now with your same plotting code:

ggplot(segment(ddata)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  coord_flip() +
  scale_y_continuous(expand = c(0, 0)) +
  scale_x_reverse(breaks = NULL,
                  labels = NULL) +
  labs(x = "",
       y = "Total sum of squares") +
  theme_classic(8)

We get:

enter image description here