Pages

Saturday, February 12, 2011

Create a .prj Projection File for a Shapefile

An example of a cordiform map projection a.k.a. 
heart-shaped projection
. Happy Valentine's!
If you create a shapefile with ESRI software or receive one from someone who did you may see a ".prj" file included along with the shp, shx, and dbf files.  In fact, the prj file is one of up to 9 possible "official" file extensions for various indexes and other meta files.  Most of these file formats are proprietary.  There are an additional two formats created by the open source community to work around the closed formats created by ESRI for spatial indexing.

The shapefile format does not allow for specifying the map projection of the data. When ESRI created the shapefile format everyone worked with data in only one projection. If you tried to load a layer in a different projection into your GIS weird things would happen.  Not too long ago as hardware capability increased according to Moore's Law, GIS software packages developed the ability to reproject geospatial layers on the fly.  You could now load in layers in any projection and as long as you told the software what projections were involved the map would come together nicely.

ArcGIS 8.x allowed you to manually assign each layer a projection.  This information was stored in the prj file.  The prj file contains a WKT (Well-Known Text) string which has all the parameters for the map projection. So the format is quite simple and was created by the Open GIS Consortium.

But there are several thousand "commonly-used" map projections which were standardized by the European Survey Petroleum Group (EPSG). And there's no way to accurately detect the projection from the coordinates in the shapefile. For these reasons the Python Shapefile Library does not currently handle prj files.

If you need a prj file, the easiest thing to do is write one yourself. The following example creates a simple point shapefile and then the corresponding prj file using the WGS84 "unprojected" WKT.

import shapefile as sf
filename = 'test/point'

# create the shapefile
w = sf.Writer(sf.POINT)
w.point(37.7793, -122.4192)
w.field('FIRST_FLD')
w.record('First','Point')
w.save(filename)

# create the PRJ file
prj = open("%s.prj" % filename, "w")
epsg = 'GEOGCS["WGS 84",'
epsg += 'DATUM["WGS_1984",'
epsg += 'SPHEROID["WGS 84",6378137,298.257223563]]'
epsg += ',PRIMEM["Greenwich",0],'
epsg += 'UNIT["degree",0.0174532925199433]]'
prj.write(epsg)
prj.close()

I've thought about adding the ability to optionally write prj files but the list of "commonly-used" WKT strings is over .5 megs and would be bigger than the shapefile library itself.  I may eventually work something out though.

The easiest thing to do right now is just figure out what WKT string you need for your data and write a file after you save your shapefile. If you need a list of map projection names, epsg codes, and corresponding WKT strings you can download it from the geospatialpython Github "Learn" repository here.

A word of warning if you are new to GIS and shapefiles: the prj file is just metadata about your shapefile.  Changing the projection reference in the prj file will not change the actual projection of the geometry and will just confuse your GIS software.

No comments:

Post a Comment