
I'm rather new to R and am trying to graph a humanitarian aid distribution to see whether we can identify clusters. The data is very simple and consists of two columns: a unique identifier for each beneficiary and the unique identifier of the group providing them a service. Each row is one activity (i.e. one beneficiary, one provider). We have about 50,000 beneficiaries and about 6,000 groups, and I want to see if we can loosely identify "clusters" of beneficiaries who rely on the same set of groups.

I feel like I should be able to do this using igraph in R, where the beneficiaries are nodes, and shared groups create an edge, but I'm not sure how to structure that formula. Would really appreciate any help on this.

Please show us what you've tried to do and a reproducible example if possible. - Richard Erickson

1 Answer


Here's a starter:

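library(igraph)

# Toy edge list, generated once with the line below
# (ba.game() and get.data.frame() are older names for sample_pa() and as_data_frame()):
# set.seed(3); g <- ba.game(10); write.table(setNames(get.data.frame(g), c("beneficiary", "group")), sep=";", row.names = FALSE)
df <- read.table(sep=";", header=TRUE, text='
"beneficiary";"group"
2;1
3;1
4;3
5;1
6;1
7;3
8;3
9;1
10;1')

# Each row becomes one edge; the relationship has no direction, so build an undirected graph
g <- graph_from_data_frame(df, directed = FALSE)

# Detect communities with short random walks, then plot the graph coloured by community
cl <- cluster_walktrap(g)
plot(cl, g)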

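For the layout described in the question (beneficiaries as nodes, with an edge wherever two of them share a group), one possible approach is to treat the data as a two-mode (bipartite) graph and project it onto the beneficiary side. The following is only a minimal sketch under assumptions: the toy data frame and the "b"/"g" ID prefixes are made up for illustration, and walktrap is used simply because the starter uses it, not because it is known to be the best method for this data.

```r
library(igraph)

# Hypothetical toy data: one row per activity (one beneficiary, one provider group).
# The "b"/"g" prefixes are an assumption that keeps the two ID spaces from colliding.
df <- data.frame(
  beneficiary = c("b1", "b1", "b2", "b2", "b3", "b3", "b4", "b4", "b5"),
  group       = c("g1", "g2", "g1", "g2", "g2", "g3", "g3", "g4", "g3"),
  stringsAsFactors = FALSE
)

# Build an undirected two-mode graph and flag which vertices are beneficiaries.
g <- graph_from_data_frame(df, directed = FALSE)
V(g)$type <- V(g)$name %in% df$beneficiary

# Project onto the beneficiary side: two beneficiaries are linked when they
# share at least one group, and the edge weight counts the shared groups.
proj <- bipartite_projection(g, which = "true")

# Community detection on the weighted projection.
cl <- cluster_walktrap(proj, weights = E(proj)$weight)
membership(cl)
```

One caveat at the scale mentioned in the question: a group that serves k beneficiaries contributes k(k-1)/2 edges to the projection, so a few very large groups can make the projected graph much bigger than the original table. It may be worth checking the group sizes before projecting all 50,000 beneficiaries.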