I am working with ecological data (the percentage abundance of different diatom species present at different depths in a sediment core) and want to plot the results alongside a dendrogram representing the results of a hierarchical cluster analysis (CONISS) that I will use to split the core into ecological zones.
This is an example of the type of thing I am trying to achieve:
I have successfully run the cluster analysis but am struggling to plot the dendrogram the way I would like to. Whenever I plot the dendrogram, it plots the leaves of the dendrogram in sequential order (as shown here). However, I need the leaves to plot by depth from the top of the sediment core so that there is a leaf present at each depth of the core that I have examined and so that there are gaps in the dendrogram where I have missing samples (as shown here). (Note there is no y-axis shown here as the dendrogram will join onto the rest of the diagram, which already contains the y-axis.)
I managed to create the latter plot through the rioja package using plot.chclust
and providing the depths for the leaves to the xvar
argument. However, I have constructed the rest of my diagram (the species abundance data, PCA results etc.) with ggplot (and the help of the tidypaleo package) and then combined the various aspects of it using cowplot. I need to be able to create the dendrogram through ggplot in order to add it to my main diagram. I have investigated both the ggdendro and dendextend packages but can't find a way to plot the dendrogram according to depth using these. Is this possible? Do these packages have a function to do this that I am not aware of? I started looking into the source code for these packages as well as for as.dendrogram
to see if I could figure out a way of modifying the functions to plot the leaves by depth but it is beyond my skill level. I was wondering if anyone has a solution to enable me to plot my dendrogram by depth in my sediment core rather than in sequential order?
My data
This is the data that I have used to calculate the distance matrix in the code below. (Apologies for the very long dput!)
My code to plot the dendrograms
# Load requirements
library(vegan) # designdist and chclust
library(rioja) # plot.chclust
library(ggplot2)
library(ggdendro) # dendro_data
library(dplyr)
library(tibble)
# Calculate distance matrix
dist_matrix <- designdist(coniss_data, method = "A+B-2*J", terms = "quadratic")
# Run cluster analysis
coniss_results <- chclust(dist_matrix, method = "coniss")
# Plot with rioja ---------------------------------------------------------
# Isolate depths
coniss_depths <- coniss_data %>%
rownames_to_column("depth") %>%
mutate(depth = as.numeric(depth)) %>%
select(., "depth")
# Plot
plot(
coniss_results,
hang = -0.01, # make leaves extend to 0 and labels hang down from there
horiz = TRUE, # rotate plot
x.rev = TRUE, # reverse depths
xvar = coniss_depths$depth, # plot leaves by depth of sediment core
labels = FALSE,
cex.axis = 0.8,
tcl = -0.1
)
# Plot with ggdendro ------------------------------------------------------
# Extract dendrogram data
ddata <- dendro_data(coniss_results, type = "rectangle")
# Plot
ggplot(segment(ddata)) +
geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
coord_flip() +
scale_y_continuous(expand = c(0, 0)) +
scale_x_reverse(breaks = NULL,
labels = NULL) +
labs(x = "",
y = "Total sum of squares") +
theme_classic(8)