I have a postcode for each vertex in an igraph object. I want to convert these into geographic coordinates using ggmap so that I can compute the geographic distance along each edge and store it as an edge attribute.

require(igraph)
require(ggmap)

g <- graph.ring(6)
V(g)$postcode <- c("Johannesburg 2017", 
                      "Rondebosch 8000",
                      "Durban 4001", 
                      "Pietermaritzburg 3201", 
                      "Jeffreys Bay 6330", 
                      "Pretoria 0001" )

I thought I could generate geographic coordinates for each vertex this way:

V(g)$coordinate <- geocode(V(g)$postcode, sensor = FALSE, 
                           output = "latlon", source = "google")

The result is a list in which the full vector of longitudes and the full vector of latitudes alternate from one vertex to the next, rather than a single lon/lat pair for each vertex.

head(V(g)$coordinate)
[[1]]
[1] 28.03837 28.31993 31.02204 30.36661 24.91015 28.18540

[[2]]
[1] -26.18825 -25.84222 -29.84962 -29.65119 -34.05067 -25.74895

[[3]]
[1] 28.03837 28.31993 31.02204 30.36661 24.91015 28.18540

[[4]]
[1] -26.18825 -25.84222 -29.84962 -29.65119 -34.05067 -25.74895

[[5]]
[1] 28.03837 28.31993 31.02204 30.36661 24.91015 28.18540

[[6]]
[1] -26.18825 -25.84222 -29.84962 -29.65119 -34.05067 -25.74895

The negative numbers are latitudes and the positive numbers are longitudes. What am I doing wrong?

geocode creates a list of tuples, one for each postcode. Each tuple contains a latitude and a longitude value. – aterhorst

1 Answer

The problem is that geocode returns a data frame. When you assign it to V(g)$coordinate, it is treated as a list of its two columns (lon and lat), and those columns are recycled to get a value for each vertex.

postcode_df <- geocode(V(g)$postcode, sensor = FALSE, 
                           output = "latlon", source = "google")

postcode_df
#        lon       lat
# 1 28.03837 -26.18825
# 2 28.31993 -25.84222
# 3 31.02204 -29.84962
# 4 30.36661 -29.65119
# 5 24.91015 -34.05067
# 6 28.18540 -25.74895
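
The recycling can be reproduced in base R, without igraph or a geocoding call. This is only a sketch of the mechanism described above, not igraph's actual internals: the coordinates are copied from the output shown, and `recycled` is a hypothetical name used for illustration.

```r
# Coordinates copied from the geocode() output above, so no API call is needed.
postcode_df <- data.frame(
  lon = c(28.03837, 28.31993, 31.02204, 30.36661, 24.91015, 28.18540),
  lat = c(-26.18825, -25.84222, -29.84962, -29.65119, -34.05067, -25.74895)
)

# A data frame is a list with one element per column, not one per row.
length(postcode_df)   # 2 (columns), even though there are 6 rows

# Recycling that two-element list to six slots imitates the assignment:
# the slots alternate between the whole lon column and the whole lat column.
recycled <- rep(as.list(postcode_df), length.out = nrow(postcode_df))
str(recycled)
```

The odd slots hold all six longitudes and the even slots hold all six latitudes, which matches the pattern in the question's output.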

You need to turn each row of the data frame into an element that can be assigned to a vertex. This can be done in lots of ways; here is a simple one:

V(g)$coordinate <- split(postcode_df, 1:nrow(postcode_df))

V(g)$coordinate
# [[1]]
#        lon       lat
# 1 28.03837 -26.18825
# 
# [[2]]
#        lon       lat
# 2 28.31993 -25.84222
# 
# [[3]]
#        lon       lat
# 3 31.02204 -29.84962
# 
# [[4]]
#        lon       lat
# 4 30.36661 -29.65119
# 
# [[5]]
#        lon       lat
# 5 24.91015 -34.05067
# 
# [[6]]
#       lon       lat
# 6 28.1854 -25.74895
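
Since the stated goal is a distance on each edge, one possible next step is sketched below. It makes several assumptions not in the original: longitude and latitude are stored as two separate numeric vertex attributes instead of one list attribute, distance is great-circle distance from the haversine formula with a mean Earth radius of 6371 km, and `haversine_km` and `distance_km` are hypothetical names. The coordinates are copied from the output above, and `make_ring` is used in place of the older `graph.ring`.

```r
# Coordinates copied from the geocode() output above, so no API call is needed.
postcode_df <- data.frame(
  lon = c(28.03837, 28.31993, 31.02204, 30.36661, 24.91015, 28.18540),
  lat = c(-26.18825, -25.84222, -29.84962, -29.65119, -34.05067, -25.74895)
)

# Great-circle distance in km between points given in decimal degrees.
# radius_km = 6371 is an assumed mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# The igraph part only runs if the package is installed.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::make_ring(6)

  # Store each coordinate as its own numeric vertex attribute.
  g <- igraph::set_vertex_attr(g, "lon", value = postcode_df$lon)
  g <- igraph::set_vertex_attr(g, "lat", value = postcode_df$lat)

  # One row per edge, holding the ids of its two end vertices.
  el  <- igraph::ends(g, igraph::E(g), names = FALSE)
  lon <- igraph::vertex_attr(g, "lon")
  lat <- igraph::vertex_attr(g, "lat")

  # Vectorised over edges: one distance per row of el.
  d <- haversine_km(lon[el[, 1]], lat[el[, 1]], lon[el[, 2]], lat[el[, 2]])
  g <- igraph::set_edge_attr(g, "distance_km", value = d)

  print(igraph::edge_attr(g, "distance_km"))
}
```

Keeping lon and lat as plain numeric attributes avoids list-valued attributes entirely, so the edge calculation reduces to ordinary vector indexing.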