2
votes

I want to do a cluster analysis in R, so I created the distance matrix shown below (Fig. 1):

Fig.1

matrix_a <- data.frame(n1=c(0,1,11,5),n2=c(1,0,2,3),n3=c(11,2,0,4),n4=c(5,3,4,0))

Then I use the code below for cluster analysis:

result <- hclust(matrix_a,method="average")

However, an error occurred:

Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") : missing value where TRUE/FALSE needed

Could anyone help me figure out where I went wrong?

1
Try this: hclust(as.dist(matrix_a), method = "average"). The argument to hclust needs to be of class dist, not a matrix or data frame. - Nick Criswell

1 Answer

4
votes

In ?hclust the d argument is described as:

d
a dissimilarity structure as produced by dist.

The object matrix_a is not such an object. In fact, it is not even an R matrix; it is a data frame.
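A quick way to confirm this is to inspect the class of the object and the "Size" attribute that a dist object carries. The sketch below reuses the data from the question; the explanation of the error in the comments is a reading of the error message, not something stated in the documentation.

```r
# Data from the question: a data frame, not a dist object
matrix_a <- data.frame(n1 = c(0, 1, 11, 5), n2 = c(1, 0, 2, 3),
                       n3 = c(11, 2, 0, 4), n4 = c(5, 3, 4, 0))

class(matrix_a)                # "data.frame"
inherits(matrix_a, "dist")     # FALSE

# A data frame has no "Size" attribute. The error message suggests that
# hclust checks this size, which would explain the failure (assumption).
is.null(attr(matrix_a, "Size"))  # TRUE

# as.dist() converts the square table into a proper dist object
d <- as.dist(matrix_a)
class(d)                       # "dist"
attr(d, "Size")                # 4
```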

Try the following. We have given a more accurate name to the input and then converted it to a dist object as required.

DF <- data.frame(n1 = c(0,1,11,5), n2 = c(1,0,2,3), n3 = c(11,2,0,4), n4 = c(5,3,4,0))
hclust(as.dist(DF), "ave")
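As a rough follow-up, the result can be stored and inspected. This is only a sketch using the same data; the variable names hc and groups are illustrative, and "average" is spelled out in full here although the abbreviation "ave" is also accepted.

```r
DF <- data.frame(n1 = c(0, 1, 11, 5), n2 = c(1, 0, 2, 3),
                 n3 = c(11, 2, 0, 4), n4 = c(5, 3, 4, 0))

# Convert to a dist object, then cluster with average linkage
hc <- hclust(as.dist(DF), method = "average")

class(hc)     # "hclust"
hc$merge      # which observations or clusters were joined at each step
hc$height     # the dissimilarity at which each merge happened

# Cut the tree into 3 groups: n1 and n2 (distance 1) end up together
groups <- cutree(hc, k = 3)
groups
```

Calling plot(hc) on the stored result draws the dendrogram.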