I am new to R and have a question about writing and reading vector data.
Example 1:
    n = 100
    g = 6
    set.seed(g)
    d <- data.frame(x = unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2))),
                    y = unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2))))
    plot(d)
    require(vegan)
    fit <- cascadeKM(scale(d, center = TRUE, scale = TRUE), 1, 10, iter = 1000)
    plot(fit, sortg = TRUE, grpmts.plot = TRUE)
    calinski.best <- as.numeric(which.max(fit$results[2,]))
    cat("Calinski criterion optimal number of clusters:", calinski.best, "\n")
This code (source) prints "Calinski criterion optimal number of clusters: 5", as expected.
Example 2: (write data frame d first, then read it)
    n = 100
    g = 6
    set.seed(g)
    d <- data.frame(x = unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2))),
                    y = unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2))))
    write.table(d, "d.txt", sep='\t', quote=FALSE)    # write data frame
    d = read.table("d.txt", header=TRUE, sep = '\t')  # read later
    plot(d)
    require(vegan)
    fit <- cascadeKM(scale(d, center = TRUE, scale = TRUE), 1, 10, iter = 1000)
    plot(fit, sortg = TRUE, grpmts.plot = TRUE)
    calinski.best <- as.numeric(which.max(fit$results[2,]))
    cat("Calinski criterion optimal number of clusters:", calinski.best, "\n")
However, example 2 prints "Calinski criterion optimal number of clusters: 1".
I think the format (or something else) changes when the data goes through file I/O in R, but I do not know how R reads and writes numbers. Can anyone give me some clues? Thanks.
EDIT: If the file is written without column names and row names, the problem is solved:
    write.table(d, "d.txt", sep='\t', quote=FALSE, row.names=FALSE, col.names=FALSE)
Otherwise, R also reads the row and column names back in when reading the file. Another option is to skip those names when reading.
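One way to see what actually comes back from the file is to compare the data frame before and after the round trip. The following is only a minimal sketch, not the code from the question: it assumes a small stand-in data frame, a temporary file, and a variant of the write call that drops only the row names while keeping the header line. The names `path` and `d2` are made up for illustration.

```r
# Stand-in data, smaller than the data frame in the question
set.seed(6)
d <- data.frame(x = rnorm(10), y = rnorm(10))

# Write to a temporary file: keep the header, drop the row names
path <- tempfile(fileext = ".txt")
write.table(d, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back, telling read.table that the first line is a header
d2 <- read.table(path, header = TRUE, sep = "\t")

# Inspect structure and contents of both objects
str(d)
str(d2)
head(d2)

# Same shape and column names?
print(identical(dim(d), dim(d2)))
print(identical(names(d), names(d2)))

# Values agree up to all.equal's numeric tolerance; text output is
# limited to about 15 significant digits, so the comparison is tolerant
print(isTRUE(all.equal(d, d2)))
```

If `str()` or `head()` shows an extra column, a missing row, or unexpected column names, the header or row-name handling of the write and read calls does not match.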
… head of it will do (head(d)) – llrs
… save or saveRDS instead. – Joshua Ulrich
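The saveRDS suggestion in the last comment can be sketched as follows. This is a minimal illustration under stated assumptions (a small stand-in data frame and a temporary file; `rds_path` and `d3` are hypothetical names), not the commenter's own code. `saveRDS` stores the R object in R's binary serialization format rather than as text, so no header or row-name parsing is involved when it is read back with `readRDS`.

```r
# Stand-in data, smaller than the data frame in the question
set.seed(6)
d <- data.frame(x = rnorm(10), y = rnorm(10))

# Serialize the object itself instead of writing a text table
rds_path <- tempfile(fileext = ".rds")
saveRDS(d, rds_path)

# Restore it under a new name
d3 <- readRDS(rds_path)

# The restored object matches the original exactly
print(identical(d, d3))
```

The trade-off is that an .rds file is not human-readable and is meant to be read by R, whereas a tab-separated text file can be opened by other tools.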