I've been trying to create a single TopoJSON file that consolidates several layers, including U.S. states, counties, and congressional districts.

Original .shp shapefiles come from the Census Bureau's Cartographic Boundary Files.

I converted these to GeoJSON with ogr2ogr.

I then combined them into TopoJSON with the server-side Node.js topojson library, using a quantization of 1e7 and a retain-proportion of 0.15. Up to this point there was no indication of any problem.

When I view the final TopoJSON file in mapshaper, everything seems to look OK: (image: rendered via mapshaper)

But when I render it with the topojson client library and d3.geo.path(), I get some strange paths in the congressionalDist layer. Notice the large rectangular paths around the continental US, AK, and HI: (image: square paths)

A working version of the page can be found here: http://jsl6906.net/D3/topojson_problem/map/

A couple of relevant notes:

  • If I change my topojson generation script to remove simplification, the paths then seem to show correctly via the same d3.js page
  • If I only keep the congressionalDist layer when creating the topojson, the paths then seem to show correctly via the same d3.js page:

(image: good)

After as much troubleshooting as I could manage, I figured I would ask here to see whether anyone has run into a similar issue. Thanks for any help.

This might be related to the problems mentioned in stackoverflow.com/questions/23953366/…. There, the bounds calculation went wrong for some of the regions, which also resulted in large rectangles. In your example, d3.geo.bounds(cds[84]) returns `[[-180, -90], [180, 90]]`, which seems to be incorrect. I do not know why this happens, though. - Jan van der Laan
Still not sure what's causing this, but one interesting thing I've noticed is that the id property of the data bound to the offending rectangles ends in ZZ, whereas all other objects have an id ending in two digits. The ids responsible are 09ZZ, 17ZZ, and 26ZZ. For example, try the following: d3.selectAll(d3.selectAll('.cd')[0].filter(function(d) { return d3.select(d).attr('id').slice(-2) === 'ZZ' })).style('stroke', 'red') and you will notice that only those rectangles are colored red. - jshanley
It seems ZZ is the code given to "undefined" congressional districts. I'm not exactly sure what this means, but you can see it occurring in this dataset under the column CD113FP, wherever the NAMELSAD column contains "Congressional Districts not defined". There is also a reference to removing such districts when running ogr2ogr in this file, which is part of us-atlas. - jshanley
In case this might be useful, here is my complete workflow: 1. Download shapefiles (census.gov/geo/maps-data/data/tiger-cart-boundary) 2. Convert shapefiles to GeoJSON (jsl6906.net/D3/topojson_problem/3create_geo_jsons.bat.txt) 3. Combine GeoJSON files into TopoJSON (jsl6906.net/D3/topojson_problem/3create_topo_json.js) - Josh
Is it possible your paths are "inside out" (counter-clockwise versus clockwise)? What happens if you assign a fill color to Alaska or Hawaii: does it fill everything in the rectangle except the islands/state? See, e.g., stackoverflow.com/q/21786168/3128209 - AmeliaBR

1 Answer

As I mentioned in the comments, I had noticed that the three offending rectangles all were bound to data with an id property ending in ZZ, while all other paths had IDs ending with numbers.
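To double-check that observation from the console, here is a minimal sketch (not code from the page itself). It lists the ids of features whose spherical bounds cover the whole globe, which is what one comment reported for cds[84]. The helper name idsSpanningGlobe, the sample features, and the stub bounds function are all made up for illustration.

```javascript
// Hypothetical helper: return the ids of features whose bounds are
// [[-180, -90], [180, 90]], i.e. the whole globe.
// `boundsFn` is injected so the sketch runs without D3; it must return
// [[west, south], [east, north]] for a feature.
function idsSpanningGlobe(features, boundsFn) {
  return features.filter(function (feature) {
    var b = boundsFn(feature);
    return b[0][0] === -180 && b[0][1] === -90 &&
           b[1][0] === 180 && b[1][1] === 90;
  }).map(function (feature) {
    return feature.id;
  });
}

// Made-up sample data: each feature carries precomputed bounds so the
// stub below can stand in for a real bounds calculation.
var sampleFeatures = [
  { id: '0901', fakeBounds: [[-73.7, 40.9], [-71.8, 42.1]] },
  { id: '09ZZ', fakeBounds: [[-180, -90], [180, 90]] },
  { id: '2614', fakeBounds: [[-83.6, 42.2], [-82.8, 42.7]] }
];

// Stub standing in for a real bounds function.
function stubBounds(feature) {
  return feature.fakeBounds;
}

var suspects = idsSpanningGlobe(sampleFeatures, stubBounds);
console.log(suspects);
```

On the page from the question, I would expect something like idsSpanningGlobe(cds, d3.geo.bounds) to work, assuming cds is the array of congressional district features mentioned in the comments and each feature has an id.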

After some Google searching, I came up with what I think is the answer.

According to this document on the census.gov website,

In Connecticut, Illinois, and Michigan the state participant did not assign the current (113th) congressional districts to cover all of the state or equivalent area. The code “ZZ” has been assigned to areas with no congressional district defined (usually large water bodies). These unassigned areas are treated within state as a single congressional district for purposes of data presentation.

It seems that these three undefined districts would account for the three rectangles. It is unclear at what point in the process they cause the issue, but I believe there is a simple solution to your immediate problem. While searching for information about the ZZ code, I stumbled across this makefile in a project by mbostock called us-atlas.

It seems he had encountered a similar issue and had managed to filter out the undefined congressional districts when running ogr2ogr. Here is the relevant code from that file:

# remove undefined congressional districts
shp/us/congress-ungrouped.shp: shp/us/congress-unfiltered.shp
    rm -f $@
    ogr2ogr -f 'ESRI Shapefile' -where "GEOID NOT LIKE '%ZZ'" $@ $<

I'm betting that if you run ogr2ogr on your congressional district shapefile with the -where filter shown here, it will solve the problem.
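If you would rather keep your ogr2ogr step unchanged, the same filter could be applied to the GeoJSON in your Node script before it is combined into TopoJSON. This is only a sketch under assumptions: the function name removeUndefinedDistricts is made up, the sample data is invented, and I am assuming the district code is exposed as a GEOID property (as in the us-atlas filter); your file may carry it under a different key.

```javascript
// Sketch: drop "ZZ" (undefined) congressional districts from a GeoJSON
// FeatureCollection before it is handed to the TopoJSON conversion.
// `idKey` is an assumption: us-atlas filters on GEOID, but your data
// may expose the code under another property.
function removeUndefinedDistricts(collection, idKey) {
  var key = idKey || 'GEOID';
  var kept = collection.features.filter(function (feature) {
    var value = String((feature.properties || {})[key] || '');
    return value.slice(-2) !== 'ZZ';
  });
  // Return a new collection so the input is left untouched.
  return { type: 'FeatureCollection', features: kept };
}

// Tiny made-up sample (geometry omitted for brevity).
var sample = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: { GEOID: '0901' }, geometry: null },
    { type: 'Feature', properties: { GEOID: '09ZZ' }, geometry: null },
    { type: 'Feature', properties: { GEOID: '2614' }, geometry: null }
  ]
};

var filtered = removeUndefinedDistricts(sample);
console.log(filtered.features.map(function (f) {
  return f.properties.GEOID;
}));
```

Filtering at the ogr2ogr stage is probably still the cleaner option, since the undefined districts then never reach the simplification step at all.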