How do I determine the optimal number of clusters while using hierarchical clustering. If I am just having the distance matrix as I am measuring only pairwise distances (levenshtein distances), how do I find out the optimal number of clusters? I referred to other posts they all use k-means, hierarchical but not for string type of data as shown below. Any suggestions on how to use R to find the number of clusters?
set.seed(1)
rstr <- function(n,k){ # vector of n random char(k) strings
sapply(1:n,function(i) {do.call(paste0,as.list(sample(letters,k,replace=T)))})
}
str<- c(paste0("aa",rstr(10,3)),paste0("bb",rstr(10,3)),paste0("cc",rstr(10,3)))
# Levenshtein Distance
d <- adist(str)
rownames(d) <- str
hc <- hclust(as.dist(d))