1
votes

I am trying, in python 3.7, to join two files along a common attribute (ZCTA Zip Code number) and output a shapefile that contains the join along ZCTA and adds attribute Manufa_EMP (Manufacturing employment) and then change the projected coordinate system (PCS or CRS). However the shapefile that is being outputted by the following code is messing with the scale and size of the geographic projection of the file that is outputted. When I open the shapfile as a layer on a map in ArcMap and compare it to the first imported shapefile, the outputted one is projected as much much smaller. Does anyone know why this might be happening?

import sys
import pandas as pd
import geopandas as gpd
import numpy

# Set Working Directory
sys.path.append(r"G:\MAX-Filer\Collab\Labs-kbuzard-S18\Admin\Block Level Analysis")

# Read in gz.csv file as "ZCTA" Table
Table = pd.read_csv('DEC_00_SF3_DP3_with_ann.csv', skiprows = 1)

# Create new table "ZCTA_Manufa" with only Block ID and Total employment columns
Tab2 = Table.loc[:,["Id2", "Number; Employed civilian population 16 years and over - INDUSTRY - Manufacturing"]].values

# renaming headers
Tab2 = pd.DataFrame(data=Tab2, columns=["ZCTA5CE00", "Manufa_Emp"])

# Import Shapefile
zips = r"G:\MAX-Filer\Collab\Labs-kbuzard-S18\Admin\Block Level Analysis\tl_2010_06_zcta500.shp"
data = gpd.read_file(zips)

# To join the two together
Table3 = data.merge(Tab2, on='ZCTA5CE00')

zFeatures = gpd.read_file(zips).filter(['Manufa_Emp', 'ZCTA5CE00', 'geometry'], axis = 1)

# Set geometry and CRS
geometry = zFeatures.geometry
geo_df = gpd.GeoDataFrame(Table3, geometry = geometry)
geo_df.crs = {'init':'epsg:5070'}

# Export out as a shapefile
result = ("CA_ZCTA_Man.shp")
geo_df.to_file(result)
1

1 Answers

0
votes

Projecting GeoDataFrame to another CRS is done using .to_crs(). You cannot just assign it, that only overwrites the existing projection metadata without changing the geometry.

So the following line:

geo_df.crs = {'init':'epsg:5070'}

should look like this:

geo_df = geo_df.to_crs('epsg:5070')  # EPSG number would be enough