My answer is based on the igraph
package in R. The situation is indeed quite confusing, and the questions are relevant since, as Newman (2004) says,
Since the publication of that work, the author has been asked a number
of times whether an appropriate generalization of the algorithm exists
for weighted networks.
In his paper he derives an appropriate generalization of the Newman-Girvan algorithm to weighted networks.
Weights
You are right about the interpretation of weights in the Newman-Girvan algorithm. edge_betweenness
uses a formula analogous to that of Brandes (2001), where the length of a path is defined as the sum of the weights of its edges. (You may also check the source code, but it is quite involved.) In ?edge_betweenness
and, in particular, ?cluster_edge_betweenness
it says
Edge weights are used to calculate weighted edge betweenness. This
means that edges are interpreted as distances, not as connection
strengths.
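To see concretely that weights act as distances, here is a minimal sketch (the toy graph and weights are made up purely for illustration):

    library(igraph)

    # The direct edge A-B is "long" (weight 10), so the shortest path
    # detours through C; the path length is the sum of the edge weights.
    g <- graph_from_literal(A - B, A - C, C - B)
    E(g)$weight <- c(10, 1, 1)           # edges in order: A-B, A-C, C-B
    shortest_paths(g, "A", "B")$vpath    # A -> C -> B, avoiding the heavy edge
    distances(g, "A", "B")               # 2 = 1 + 1, the sum of weights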
The implications are as follows. Let b(e, w) be the edge betweenness of an edge e with weight w. Then it can be shown (I could elaborate if you wish) that
w >= w* implies b(e, w) <= b(e, w*).
That is, the edge betweenness of e is (weakly) decreasing in its weight. The intuition: if the weight of e is raised from w to w* >> w, the shortest paths that used to cross e are likely to be dominated by alternative paths that avoid e. Hence a larger weight implies a (weakly) lower betweenness, and a lower betweenness makes it less likely that e will be recognized as an edge connecting two communities. This is strange if we see weights as distances. Conversely, if e lies within some community and we decrease its weight, the number of shortest paths through e potentially increases, and it becomes more likely to be seen as connecting two communities. I am not yet claiming anything about the corresponding modularity scores, though.
Now suppose that the weights actually correspond to connection strengths. Then the stronger a connection is, the fewer shortest paths go through that edge (since the algorithm still treats the weights as lengths when computing shortest paths), the lower its edge betweenness, and the less likely it is to be removed. That kind of makes sense.
What is strange, though, is that the length of a path is then defined as the sum of the connection strengths along it. We can, however, reinterpret the algorithm. Suppose the weights are >> 1 within communities and << 1 between them. Then the length of a path can be read as its privacy: a path within a community contains many close interactions, while an edge connecting two communities is comparatively public and open. Under this interpretation, the algorithm looks for the least private (most open) paths, computes the corresponding betweenness, and removes the edges that belong to many of the most open paths.
So perhaps I made a mistake somewhere, but it looks like it would make more sense to see weights as connection strengths.
Newman (2004) does something related:
...we will consider specifically those networks in which the weights
on edges take greater values for vertex pairs that have closer
connections or are more similar in some way.
It would seem that this should make sense. However, to keep the more natural definition of the shortest path, he writes:
One can define paths on a weighted network by assuming the “length” of
an edge to vary inversely with its weight, so that two vertices that
are connected twice as strongly will be half as far apart.
That is, the shortest path lengths are now inversely related to the weights. Since not applying this inversion seemed to give sensible results above, we now have a problem:
To see this, notice that any two vertices that are particularly
strongly connected to one another will have a particularly short
distance along the edge between them. Geodesic paths will thus, all
other things being equal, prefer to flow along such an edge than along
another longer edge between two less well connected vertices, and
hence closely connected pairs will tend to attract a lot of paths and
acquire high betweenness. This means that, as a general rule, we are
more likely to remove edges between well connected pairs than we are
between poorly connected pairs, and this is the precise opposite of
what we would like the algorithm to do.
This is exactly the outcome I described above when weights are seen as distances. As I mentioned at the beginning of the answer, to deal with this Newman (2004) proposes mapping the weighted graph to an unweighted multigraph and then proceeding very similarly to the standard case. I believe this multigraph idea can be implemented by setting weighted = NULL
while supplying a non-binary (integer) adjacency matrix when defining the graph; see weighted
in ?graph_from_adjacency_matrix
.
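A minimal sketch of that mapping (the integer matrix is made up for illustration):

    library(igraph)

    # With weighted = NULL, an integer entry k in the adjacency matrix
    # creates k parallel edges, which is Newman's (2004) mapping from a
    # weighted graph (with integer weights) to an unweighted multigraph.
    A <- matrix(c(0, 3, 1,
                  3, 0, 1,
                  1, 1, 0), nrow = 3, byrow = TRUE)
    g <- graph_from_adjacency_matrix(A, mode = "undirected", weighted = NULL)
    ecount(g)  # 5 edges: three parallel ones between vertices 1 and 2

For non-integer strengths, an alternative (my suggestion, not Newman's) would be to pass reciprocal weights directly, e.g. cluster_edge_betweenness(g, weights = 1 / E(g)$weight), so that stronger connections correspond to shorter distances.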
Modularity
First of all, one can use modularity with weighted graphs, as Newman (2004) does; that is not a problem. In general, though, it is not obvious how the weights affect modularity as the criterion for choosing the number of communities. I will perhaps elaborate with more examples in R. As Newman (2004) finds, there should be an improvement over the unweighted case when the interpretation of the weights is in line with the way the algorithm works. Otherwise, I think the graph structure and the weights themselves can matter a great deal for how far from the truth we end up.
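As a starting point, here is a minimal sketch of such an experiment (the karate club graph is built into igraph; the weights are randomly generated, purely for illustration):

    library(igraph)

    set.seed(1)
    g <- make_graph("Zachary")                   # Zachary's karate club
    E(g)$weight <- sample(1:5, ecount(g), replace = TRUE)  # made-up weights

    cl <- cluster_edge_betweenness(g)            # uses E(g)$weight by default
    modularity(g, membership(cl), weights = E(g)$weight)        # weighted Q
    modularity(g, membership(cl), weights = rep(1, ecount(g)))  # unweighted Q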
References
Newman, M.E., 2004. Analysis of weighted networks. Physical Review E, 70(5), p.056131.
Brandes, U., 2001. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25(2), pp.163-177.