I am currently trying to understand cluster analysis (using SPSS and R). The more I read about it, the less sure I am which clustering method I should use to answer my research question.
My research question investigates a) whether participants can be clustered according to their change in variable A across two assessments (one group that remains stable, one that worsens, and one that improves), and b) how these groups/clusters differ with regard to two other variables measured at assessment 1 (B and C). That is, do people with different patterns in B and C show a different change in A?
Question: I have standardised the data and, so far, have tried two step hierarchical and k-means clustering. However, I am unsure whether these are the right methods for answering my question. Where the number of clusters has to be fixed in advance, I chose 3 because I am interested in seeing clusters of people who improve, worsen, or stay stable over time, together with each cluster's pattern of B and C. Is this feasible? Am I missing something?
For k-means clustering I used the following syntax:
QUICK CLUSTER z_A_change z_B_mean z_C_mean
  /MISSING=LISTWISE
  /CRITERIA=CLUSTER(3) MXITER(10) CONVERGE(0)
  /METHOD=KMEANS(NOUPDATE)
  /SAVE CLUSTER DISTANCE.
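For comparison, here is a minimal sketch of roughly the same analysis in R. It is not my actual data: the values are simulated, the unstandardised column names (`A_change`, `B_mean`, `C_mean`) are assumed from the z-scored names in the SPSS syntax, and `nstart = 25` is my own choice. R's `kmeans()` also picks starting centres differently from QUICK CLUSTER, so the two solutions need not match exactly.

```r
# Simulated stand-in data (hypothetical values, not the real study data).
set.seed(42)
n <- 150
dat <- data.frame(
  A_change = c(rnorm(50, -1, 0.3), rnorm(50, 0, 0.3), rnorm(50, 1, 0.3)),
  B_mean   = rnorm(n),
  C_mean   = rnorm(n)
)

# Listwise deletion, then z-standardise each variable.
dat <- na.omit(dat)
z <- as.data.frame(scale(dat))
names(z) <- paste0("z_", names(dat))

# k-means with 3 clusters; nstart > 1 reruns from several random starts
# and keeps the solution with the lowest within-cluster sum of squares.
km <- kmeans(z, centers = 3, nstart = 25, iter.max = 10)

# Cluster membership and Euclidean distance to the assigned centre
# (a rough analogue of /SAVE CLUSTER DISTANCE).
z$cluster  <- km$cluster
z$distance <- sqrt(rowSums((as.matrix(z[, 1:3]) - km$centers[km$cluster, ])^2))

# Cluster centres on the standardised scale.
print(round(km$centers, 2))
```

Note that this clusters on all three variables at once, exactly as the SPSS syntax does.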
Finally, is there any way to visualise these clusters on a 3-D plot in SPSS? I am not quite as proficient in R's ggplot2 or scatterplot3d as I would like to be.
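To show the kind of plot I have in mind, here is a rough sketch using `scatterplot3d` in R. It assumes the package is installed, uses simulated values in place of my data, and the colours and axis assignment are arbitrary choices.

```r
library(scatterplot3d)  # assumed to be installed

# Simulated standardised data (hypothetical values, not the real study data).
set.seed(1)
z <- data.frame(
  z_A_change = c(rnorm(30, -1, 0.4), rnorm(30, 0, 0.4), rnorm(30, 1, 0.4)),
  z_B_mean   = rnorm(90),
  z_C_mean   = rnorm(90)
)
km <- kmeans(z, centers = 3, nstart = 25)

# One colour per cluster; points are coloured by cluster membership.
cols <- c("firebrick", "steelblue", "darkgreen")
s3d <- scatterplot3d(
  x = z$z_B_mean, y = z$z_C_mean, z = z$z_A_change,
  color = cols[km$cluster], pch = 16,
  xlab = "z_B_mean", ylab = "z_C_mean", zlab = "z_A_change",
  main = "k-means clusters (k = 3)"
)
legend("topright", legend = paste("Cluster", 1:3), col = cols, pch = 16)
```

Ideally I would like to produce something equivalent directly in SPSS.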
Thank you in advance.
dput() for at least a small sample of your data. – dcarlson