Insert new matrix in the R scatterplot

Question

I would like to add coordinates from another matrix to my scatterplot. I am using the fviz_cluster function to plot the clusters, and I would like to add the coordinates of the center-of-mass matrix (center_mass) to that plot, since they are the best location in each cluster for installing a manure composting machine. So far I can only generate the scatterplot for the properties, as shown in the attached image. The code is below:

> library(readxl)
> df <- read_excel('C:/Users/testbase.xlsx') #matrix containing waste production, latitude and longitude
> dim (df)
[1] 19  3
> d<-dist(df)
> fit.average<-hclust(d,method="average") 
> clusters<-cutree(fit.average, k=6) 
> df$cluster <- clusters # inserting column with determination of clusters
> df
    Latitude    Longitude  Waste   cluster
     <dbl>       <dbl>     <dbl>     <int>
 1    -23.8     -49.6      526.        1
 2    -23.8     -49.6      350.        2
 3    -23.9     -49.6      526.        1
 4    -23.9     -49.6      469.        3
 5    -23.9     -49.6      285.        4
 6    -23.9     -49.6      175.        5
 7    -23.9     -49.6      175.        5
 8    -23.9     -49.6      350.        2
 9    -23.9     -49.6      350.        2
10    -23.9     -49.6      175.        5
11    -23.9     -49.7      350.        2
12    -23.9     -49.7      175.        5
13    -23.9     -49.7      175.        5
14    -23.9     -49.7      364.        2
15    -23.9     -49.7      175.        5
16    -23.9     -49.6      175.        5
17    -23.9     -49.6      350.        2
18    -23.9     -49.6      45.5        6
19    -23.9     -49.6      54.6        6

> ########Generate scatterplot
> library(factoextra)
> fviz_cluster(list(data = df, cluster = clusters))
> 
> 
>  ##Center of mass, best location of each cluster for installation of manure composting machine
> center_mass<-matrix(nrow=6,ncol=2)
> for(i in 1:6){
+ center_mass[i,]<-c(weighted.mean(subset(df,cluster==i)$Latitude,subset(df,cluster==i)$Waste),
+ weighted.mean(subset(df,cluster==i)$Longitude,subset(df,cluster==i)$Waste))}
> center_mass<-cbind(center_mass,matrix(c(1:6),ncol=1)) #including the index of the clusters
> head (center_mass)
          [,1]      [,2] [,3]
[1,] -23.85075 -49.61419    1
[2,] -23.86098 -49.64558    2
[3,] -23.86075 -49.61350    3
[4,] -23.86658 -49.61991    4
[5,] -23.86757 -49.63968    5
[6,] -23.89749 -49.62372    6

New scatterplot

Scatterplot considering Longitude and Latitude

vars = c("Longitude", "Latitude")

gg <- fviz_cluster(list(data = df, cluster = df$cluster), choose.vars = vars)

gg

Thanks for the edits, Roman Luštrik and Tjebo. Could you give me any ideas for my problem above? — user13047398
It's not quite clear to me what exactly you want to achieve. Also, your problem is not reproducible. Please try to make it reproducible (see how here: stackoverflow.com/questions/5963269/… or r-bloggers.com/…). Ideally, don't post the output of your data but create pertinent sample data, and show the output you would expect. This will make it much more likely that you get help. — tjebo
Have a look at the reprex package. Tip: use RStudio instead of the R GUI. Install the reprex package and it will be integrated into RStudio. Then create a reprex from your code, and you will have nice reproducible code. — tjebo

mastropi · Accepted Answer · 2020-03-24T09:25:28

Since the fviz_cluster() function returns a ggplot object, you should be able to add new points to the plot just as you would with any plot created by ggplot().

Here is an example using mock data, where I only use functions from the ggplot2 package (since I don't have the factoextra package installed).

library(ggplot2)

# Dataset with all the points (it's your df data frame)
df <- data.frame(x=1:10, y=1:10)

# Dataset with two "center" points to add to the df points (it's your center_mass matrix)
dc <- data.frame(x=c(2.5, 7.5), y=c(2.5, 7.5))

# ggplot with the initial plot of the df points (it mimics the result from fviz_cluster())
# Note that the plot is not yet shown, it's simply stored in the gg variable
gg <- ggplot() + geom_point(data=df, mapping=aes(x,y))

# Create the plot by adding the center points to the above ggplot as larger red points
gg + geom_point(data=dc, mapping=aes(x,y), color="red", size=3)

which produces a scatterplot of the ten df points with the two center points drawn on top as larger red points.

In your case you should:

1. Replace the line
   fviz_cluster(list(data = df, cluster = clusters))
   with
   gg <- fviz_cluster(list(data = df, cluster = clusters))
2. Convert the center_mass matrix to a data frame (simply with as.data.frame(center_mass)) before passing it to the geom_point() call in the last line of my example above, and assign appropriate column names with the colnames() function so that you can refer to them in the mapping argument of geom_point().
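
As a rough sketch, the two steps could look like the code below on data shaped like yours. The sample values and the hard-coded cluster vector are made up for illustration (your Excel file is not available), and the choice of choose.vars together with stand = FALSE is an assumption on my part: by default fviz_cluster() standardizes the data and plots principal components when there are more than two variables, so raw latitude/longitude centers would not line up with the default axes. The plotting part only runs if factoextra is installed.

```r
# Hypothetical stand-in for the question's df (values are made up)
df <- data.frame(
  Latitude  = c(-23.85, -23.86, -23.84, -23.90, -23.89, -23.91),
  Longitude = c(-49.61, -49.60, -49.62, -49.66, -49.65, -49.67),
  Waste     = c(526, 350, 175, 350, 175, 45.5)
)
clusters <- c(1, 1, 1, 2, 2, 2)  # stand-in for the cutree() result

# Center of mass per cluster (weighted by Waste), as a matrix like in the question
center_mass <- t(sapply(1:2, function(i) {
  rows <- df[clusters == i, ]
  c(weighted.mean(rows$Latitude,  rows$Waste),
    weighted.mean(rows$Longitude, rows$Waste),
    i)
}))

# Step 2: convert the matrix to a data frame and name its columns
centers <- as.data.frame(center_mass)
colnames(centers) <- c("Latitude", "Longitude", "cluster")

# Step 1 plus the extra layer (skipped when factoextra is not installed)
if (requireNamespace("factoextra", quietly = TRUE)) {
  library(ggplot2)
  # Plot raw Longitude vs. Latitude so the centers share the same axes
  gg <- factoextra::fviz_cluster(
    list(data = df, cluster = clusters),
    choose.vars = c("Longitude", "Latitude"),
    stand = FALSE
  )
  # inherit.aes = FALSE keeps this layer independent of the cluster mappings
  gg <- gg + geom_point(data = centers,
                        mapping = aes(x = Longitude, y = Latitude),
                        color = "red", size = 3, inherit.aes = FALSE)
  print(gg)
}
```

If you keep the default plot (principal components of the standardized data), the centers would first have to be transformed into that same space, which this sketch does not attempt.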

Let me know if this works for you!
