There are several things going on here. You do have a mistake in your code, but the big problem is with the closeness
function - either its implementation or its documentation. First, what are we supposed to compute? The igraph documentation for closeness
says:
The closeness centrality of a vertex is defined by the inverse of the
average length of the shortest paths to/from all the other vertices in
the graph:
1/sum( d(v,i), i != v)
If there is no (directed) path between vertex v and i
then the total number of vertices is used in the formula instead of
the path length.
Let's compare that with what it says in the
Wikipedia article on closeness centrality:
Closeness was defined by Bavelas (1950) as the reciprocal of the
farness, that is:
C(x) = 1 / ∑ d(y,x)
where d(y,x) is the distance between vertices
x and y. When speaking of
closeness centrality, people usually refer to its normalized form
which represents the average length of the shortest paths instead of
their sum. It is generally given by the previous formula multiplied by
N − 1 , where N is
the number of nodes in the graph. For large graphs this difference
becomes inconsequential so the −1 is dropped
resulting in:
C(x) = N / ∑ d(y,x)
This adjustment allows comparisons between nodes of graphs
of different sizes.
First off, the igraph documentation takes the sum over i != v, that is, the vertex itself is left out of the sum.
The words say "the inverse of the average length", which would imply
C(x) = (N-1) / ∑ d(y,x), but the formula says 1 / ∑ d(y,x).
In fact, we will see that what the closeness function computes corresponds
to the original (un-normalized) definition, despite the words indicating the normalized version.
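To make the three candidate formulas concrete, here is a minimal base-R sketch. The distance matrix is hypothetical (an undirected path graph A - B - C - D, not the graph from the question), and the variable names are only for illustration.

```r
# Hypothetical shortest-path distances for an undirected path graph A - B - C - D.
# This is NOT the graph from the question; it only illustrates the formulas.
Dist <- matrix(c(0, 1, 2, 3,
                 1, 0, 1, 2,
                 2, 1, 0, 1,
                 3, 2, 1, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(LETTERS[1:4], LETTERS[1:4]))
N <- nrow(Dist)

# The diagonal is 0, so the row sum equals the sum over i != v.
farness <- rowSums(Dist)

raw        <- 1 / farness        # Bavelas: 1 / sum of distances
normalized <- (N - 1) / farness  # inverse of the average distance
large_n    <- N / farness        # large-graph variant with the -1 dropped

print(rbind(raw, normalized, large_n))
```

The three rows differ only by a constant factor (1, N-1 or N), so they rank the vertices identically; the choice matters only when comparing values across graphs of different sizes.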
But there is one other problem. You changed the Inf values to NA and then used na.rm=T. Notice the last sentence in the igraph documentation:
If there is no (directed) path between vertex v and i then the total
number of vertices is used in the formula instead of the path length.
You are not supposed to ignore these nodes. You are supposed to set the distance to the total number of nodes in the graph. So, to get the same result as produced by igraph, you need to compute:
Dist <- distances(g, mode="out")
Dist[Dist == Inf] <- vcount(g)
1/rowSums(Dist)
      Amy       Ram        Li      Kate
0.1666667 0.1428571 0.1428571 0.1666667
closeness(g, mode = "out")
      Amy       Ram        Li      Kate
0.1666667 0.1428571 0.1428571 0.1666667
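To see how much the treatment of unreachable vertices matters, the sketch below compares the two approaches on a hypothetical directed path A -> B -> C -> D (again, not the graph from the question). It uses only base R and applies the rule quoted from the documentation by hand, so it is worth checking against closeness() in your installed igraph version.

```r
# Hypothetical out-distances for a directed path A -> B -> C -> D.
# Inf marks pairs with no directed path.
Dist <- matrix(c(0,   1,   2,   3,
                 Inf, 0,   1,   2,
                 Inf, Inf, 0,   1,
                 Inf, Inf, Inf, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(LETTERS[1:4], LETTERS[1:4]))
N <- nrow(Dist)

# Approach from the question: turn Inf into NA and drop it from the sum.
Dropped <- Dist
Dropped[is.infinite(Dropped)] <- NA
closeness_dropped <- 1 / rowSums(Dropped, na.rm = TRUE)

# Rule quoted from the documentation: an unreachable vertex counts as distance N.
Replaced <- Dist
Replaced[is.infinite(Replaced)] <- N
closeness_replaced <- 1 / rowSums(Replaced)

print(rbind(closeness_dropped, closeness_replaced))
```

Dropping the unreachable vertices makes D, which reaches nobody, come out with infinite closeness, while replacing them with N gives it the lowest score of the four.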
Certainly, the igraph documentation is inconsistent. The words say that it computes the normalized closeness, but the formula (and what it actually computes) is the un-normalized form.
I hope that this makes it clear what is being computed and helps you pick what you want to use for your analysis.
BTW: When you compute 1/rowMeans(Dist), you are including the v=i case (where the distance is zero), which igraph leaves out. That means that you are computing C(x) = N / ∑ d(y,x) rather than C(x) = (N-1) / ∑ d(y,x). As noted in Wikipedia, for large graphs they are essentially the same, but I just want to be sure that you are aware of what you are computing.
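A quick way to convince yourself of this is the following base-R check on a hypothetical distance matrix (an undirected path graph A - B - C - D, not the graph from the question):

```r
# Hypothetical distances for an undirected path graph A - B - C - D.
Dist <- matrix(c(0, 1, 2, 3,
                 1, 0, 1, 2,
                 2, 1, 0, 1,
                 3, 2, 1, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(LETTERS[1:4], LETTERS[1:4]))
N <- nrow(Dist)

with_diagonal    <- 1 / rowMeans(Dist)       # averages over all N entries, zero diagonal included
n_over_sum       <- N / rowSums(Dist)        # C(x) = N / sum of distances
n_minus_one_form <- (N - 1) / rowSums(Dist)  # C(x) = (N-1) / sum of distances

# rowMeans divides by N, so the first two agree.
print(all.equal(with_diagonal, n_over_sum))

# The two normalizations differ by the constant factor (N-1)/N.
print(n_minus_one_form / n_over_sum)
```

The factor (N-1)/N approaches 1 as the graph grows, which is why the two forms become nearly indistinguishable for large graphs.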