0
votes

I am trying to create a map of all school districts in each state. The code below works for every state except Florida, where I get this error:

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 67, 121

library(dplyr)
library(sf)
library(tmap)
library(lwgeom)


  temp <- tempfile()  ### create a temporary file to download zip file to
  temp2 <- tempfile() ### create a temporary file to put unzipped files in
  download.file("https://s3.amazonaws.com/data.edbuild.org/public/Processed+Data/SD+shapes/2018/shapefile_1718.zip", temp) # downloading the data into the tempfile

  unzip(zipfile = temp, exdir = temp2) # unzipping the temp file and putting unzipped data in temp2

  filename <- list.files(temp2, full.names = TRUE) # getting the filename of the downloaded data

  shp_file <- filename %>%
    subset(grepl("\\.shp$", filename)) ## selecting only the .shp file to read in

  state_shape <- sf::st_read(shp_file) %>% ## reading in the downloaded data
    dplyr::mutate(GEOID = as.character(GEOID),
                  GEOID = stringr::str_pad(GEOID, width = 7, pad = "0")) %>% 
    filter(State == "Florida")

  url = "https://s3.amazonaws.com/data.edbuild.org/public/Processed+Data/Master/2017/full_data_17_geo_exc.csv"
  master <- read.csv(file = url, stringsAsFactors = FALSE) %>%
    dplyr::mutate(NCESID = as.character(NCESID),
                  NCESID = stringr::str_pad(NCESID, width = 7, pad = "0"),
                  year = "2017") %>%
    dplyr::select(-NAME, -State, -STATE_FIPS) ## removing variables that duplicate with shapes

  state_shape <- state_shape %>%
    dplyr::left_join(master, by = c("GEOID" = "NCESID")) %>% 
    select(GEOID, NAME, State, StPovRate)

  shape.clean <- lwgeom::st_make_valid(state_shape) # making all geometries valid

  povertyBlues <-  c('#dff3fe', '#92DCF0', '#49B4D6', '#2586a5', '#19596d')

  map <- tm_shape(shape.clean) + 
    tm_fill("StPovRate", breaks=c(0, .1, .2, .3, .4, 1), title = "Student Poverty",
            palette = povertyBlues, 
            legend.format=list(fun=function(x) paste0(formatC(x*100, digits=0, format="f"), " %"))) +
    tm_shape(shape.clean) +
    tm_borders(lwd=.25, col = "#e9e9e9", alpha = 1) +
    tm_layout(inner.margins = c(.05,.25,.1,.05)) 

  map  ### view the map

The length of the tm_shape$shp and state_shape are both 67. Does anyone know what could be causing the "arguments imply differing number of rows: 67, 121"?

Thanks!!

3
Welcome to Stack Overflow. When you ask a question, you should provide all of the data along with your code. Right now you are missing shape.clean, which I believe is the data containing StPovRate. Please provide it; otherwise nobody can reproduce your situation and work out how to help you. - jazzurro
state_shape has invalid geometry; this can be fixed with lwgeom::st_make_valid(). However, the downloaded file (which is 28 MB, for anyone trying on a weak connection) does not contain a "StPovRate" field. - Jindra Lacko
I edited my question to add StPovRate; apologies for the oversight. Jindra, you are correct that state_shape has invalid geometry, but unfortunately st_make_valid does not fix it. I added it to the example for clarity. - skhaji

3 Answers

0
votes

I was not able to get these shapes to print using tmap, but I was able to manually remove the problem points and lines so that they no longer trigger errors in tmap. Both Florida and Nebraska had geometry collections in them, so I used the following script to remove any lines or points and change the geometry collections to multipolygons. I am sure there is a better way, and I would be happy to hear it if others have a more elegant solution. This, at least, allows me to move on!

### Create an st_is function that works for many types
st_is = function(x, type) UseMethod("st_is")

st_is.sf = function(x, type)
  st_is(st_geometry(x), type)

st_is.sfc = function(x, type)
  vapply(x, sf:::st_is.sfg, type, FUN.VALUE = logical(1))

st_is.sfg = function(x, type)
  class(x)[2L] %in% type

####### Correct Florida #########

#### import my florida file
  temp <- tempfile()  ### create a temporary file to download zip file to
  temp2 <- tempfile() ### create a temporary file to put unzipped files in
  download.file("https://s3.amazonaws.com/data.edbuild.org/public/Processed+Data/SD+shapes/2018/shapefile_1718.zip", temp) # downloading the data into the tempfile

  unzip(zipfile = temp, exdir = temp2) # unzipping the temp file and putting unzipped data in temp2

  filename <- list.files(temp2, full.names = TRUE) # getting the filename of the downloaded data

  shp_file <- filename %>%
    subset(grepl("\\.shp$", filename)) ## selecting only the .shp file to read in

  florida <- sf::st_read(shp_file) %>% ## reading in the downloaded data
    filter(State == "Florida")

#### extract polygon shapes from any geometry collection
for (i in florida) {  ## i is each column of florida in turn
  for (j in seq_along(i)) {
    ## only geometries (class "sfg") can be geometry collections;
    ## inherits() avoids comparing a multi-element class vector inside if()
    if (inherits(i[[j]], "sfg") && st_is(i[[j]], "GEOMETRYCOLLECTION")) {
      i[[j]] <- st_collection_extract(i[[j]], type = "POLYGON")
    }
  }
}
## after the loop, i still holds the last column visited (here, the geometry column)

florida_clean <- florida
st_geometry(florida_clean) <- NULL #### remove geometry from original florida
sfc_geo <- sf::st_sfc(i)  #### define i as an sfc
florida_clean$geometry <- sfc_geo  #### attach i to florida
florida_clean <- sf::st_set_geometry( florida_clean, sfc_geo )  ### set florida's geometry as i, with the points and lines removed 
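
To see which features are geometry collections before cleaning them, something like the following may help. It is only a sketch on a hypothetical two-row layer (`toy`, standing in for `florida`); `st_geometry_type()` is the sf function that reports the type of each feature.

```r
library(sf)

# Hypothetical toy layer standing in for `florida`: one clean polygon and one
# GEOMETRYCOLLECTION that carries a stray point alongside its polygon.
toy <- st_sf(
  name = c("clean", "mixed"),
  geometry = st_as_sfc(c(
    "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))",
    "GEOMETRYCOLLECTION (POLYGON ((2 0, 3 0, 3 1, 2 1, 2 0)), POINT (5 5))"
  ))
)

# Count features per geometry type, then list the rows that are collections.
geom_types <- st_geometry_type(toy)
print(table(geom_types))
collection_rows <- toy$name[geom_types == "GEOMETRYCOLLECTION"]
print(collection_rows)
```

Running the same two calls on the real layer should show whether anything other than POLYGON or MULTIPOLYGON is present.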

0
votes

I just had a similar issue and managed to address it with the trick explained here https://www.r-spatial.org/r/2017/03/19/invalid.html

namely applying a buffer of 0.0 to the shapes that are invalid

invalid <- which(!st_is_valid(p))  # rows with invalid geometry
p[invalid, ] <- st_buffer(p[invalid, ], 0.0)

where p is the layer in question. I hope this helps.
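
As a self-contained illustration of the zero-width buffer trick, here is a minimal sketch on two hypothetical polygons (not the Florida data): a plain square and a self-intersecting "bowtie". It works on the geometry column directly. It is worth checking the result visually, since zero-width buffering can change the shape of a self-intersecting polygon.

```r
library(sf)

# Hypothetical toy geometries: a valid square and a self-intersecting bowtie.
geoms <- st_as_sfc(c(
  "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))",
  "POLYGON ((0 0, 0 10, 10 0, 10 10, 0 0))"
))

valid_before <- st_is_valid(geoms)

# Buffer only the invalid geometries by a distance of 0 and write them back.
invalid <- which(!valid_before)
geoms[invalid] <- st_buffer(geoms[invalid], 0.0)

valid_after <- st_is_valid(geoms)
print(valid_before)
print(valid_after)
```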

0
votes

I had a similar issue relating to GEOMETRYCOLLECTIONs mixed in with other geometry types (in my case, MULTIPOLYGONs), producing the same error re: differing numbers of rows. By unpacking the GEOMETRYCOLLECTIONs to POLYGONs, and then casting to MULTIPOLYGON, I got a uniform spatial file comprising only MULTIPOLYGONs without dropping the features contained in the GEOMETRYCOLLECTIONs, and without the tmap error. Something akin to:

florida <- sf::st_read(shp_file) %>% ## reading in the downloaded data
    filter(State == "Florida") %>%
    st_collection_extract(type = "POLYGON") %>% ##unpacking into POLYGON-type geometries
    st_cast() ##type-casting (in this case, st_cast automatically casts to MULTIPOLYGONs)
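
The same idea can be tried without downloading the shapefile. The sketch below uses a hypothetical two-row layer (`toy`) that mixes a MULTIPOLYGON with a GEOMETRYCOLLECTION, then applies `st_collection_extract()` followed by `st_cast()` and checks that only polygon types remain. The data and object names are made up for illustration.

```r
library(sf)

# Hypothetical toy layer: one MULTIPOLYGON and one GEOMETRYCOLLECTION
# that mixes a polygon with a stray point.
toy <- st_sf(
  name = c("a", "b"),
  geometry = st_as_sfc(c(
    "MULTIPOLYGON (((0 0, 1 0, 1 1, 0 1, 0 0)))",
    "GEOMETRYCOLLECTION (POLYGON ((2 0, 3 0, 3 1, 2 1, 2 0)), POINT (5 5))"
  ))
)

# Keep only the polygon parts, then let st_cast() pick a common type.
toy_clean <- st_cast(st_collection_extract(toy, type = "POLYGON"))

types_after <- as.character(st_geometry_type(toy_clean))
print(table(types_after))
```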