How to create clustering plots with long and wide format data for multiple variables

Question

Having a dataset like this one:

data.frame(id = c(1,1,1,2,2,3) snames  = c("stockA","stockB","stockC","stockA","stockB","stockc"), var1 = c(0.13,1.2,-1.5,3.45,-0.26,-2.1), var2 = c(-2.1,2.34,3.56,-1.53,-0.48,-0.29), var3 = c(0.04,-3.45,-0.22,-0.29,1.34,0.32), var4 = c(2.14,-1.34,-4.35,-1.56,0.13,-2.35), var5 = c(1.53,1.24,-0,32,-0.3,-4.25,-2.49))

How can I create clusters using the long and wide format data together?

Given these data, is there any way to cluster the values of the snames column together with var1, var2, var3, var4 and var5, to find which of them end up in the same cluster? For example, the first cluster might contain stockB, var2 and var3.

Like one from here

Andy Andy · Accepted Answer · 2020-01-22T15:44:16

I've been working on your code, but you need to provide more information for me to answer it.

First off, there are typos in the code you provided; I fixed them here.

# you wrote stockc and not stockC; also, var5 was written -0,32 and it needs to be -0.32
df <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3),
  snames = c("stockA", "stockB", "stockC", "stockA", "stockB", "stockC"),
  var1   = c(0.13, 1.2, -1.5, 3.45, -0.26, -2.1),
  var2   = c(-2.1, 2.34, 3.56, -1.53, -0.48, -0.29),
  var3   = c(0.04, -3.45, -0.22, -0.29, 1.34, 0.32),
  var4   = c(2.14, -1.34, -4.35, -1.56, 0.13, -2.35),
  var5   = c(1.53, 1.24, -0.32, -0.3, -4.25, -2.49)
)

Please provide the code you used to make the graph above.

I believe you could simply set pch from the levels of df$snames and col from the variable columns (var1 to var5) in the plot command, and it should do what you want.
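As a rough sketch of that idea (not the code behind the graph in the question, which was not posted), the wide columns can be stacked into long format so that each point carries both a stock and a variable. The symbol codes, the colours, and the names long, pch_by_stock and col_by_var are my own assumptions for illustration; pch needs numeric codes or single characters, and col needs real colour names, so both are looked up per row.

# Corrected data from above, repeated so the sketch runs on its own
df <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3),
  snames = c("stockA", "stockB", "stockC", "stockA", "stockB", "stockC"),
  var1   = c(0.13, 1.2, -1.5, 3.45, -0.26, -2.1),
  var2   = c(-2.1, 2.34, 3.56, -1.53, -0.48, -0.29),
  var3   = c(0.04, -3.45, -0.22, -0.29, 1.34, 0.32),
  var4   = c(2.14, -1.34, -4.35, -1.56, 0.13, -2.35),
  var5   = c(1.53, 1.24, -0.32, -0.3, -4.25, -2.49)
)
vars <- paste0("var", 1:5)

# Wide -> long: one row per (observation, variable) pair
long <- data.frame(
  id       = rep(df$id, times = length(vars)),
  snames   = rep(df$snames, times = length(vars)),
  variable = factor(rep(vars, each = nrow(df)), levels = vars),
  value    = unlist(df[vars], use.names = FALSE)
)

# Assumed lookups: one plotting symbol per stock, one colour per variable
pch_by_stock <- c(stockA = 15, stockB = 16, stockC = 17)
col_by_var   <- setNames(c("black", "red", "blue", "darkgreen", "orange"), vars)

plot(
  x = as.integer(long$variable), y = long$value,
  pch = pch_by_stock[long$snames],
  col = col_by_var[as.character(long$variable)],
  xaxt = "n", xlab = "variable", ylab = "value"
)
axis(1, at = seq_along(vars), labels = vars)
legend("topright", legend = names(pch_by_stock), pch = pch_by_stock, title = "snames")

This only draws the points grouped by stock and variable. It does not assign any clusters, which would need a clustering step chosen once the target graph is known.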
