Saturday, December 18, 2010

Subsetting a Shapefile by Attributes

If you want to select only certain features in one shapefile and export them to another, you have two options: you can select features spatially or by their database attributes.  You can subset by attributes using the Python Shapefile Library in just a few lines of code.  In this example I use a building footprint shapefile which spans three counties and extract the building footprints from just one of them.  The county name is one of the attributes.  The first step is to create a shapefile reader for the original 41-megabyte building footprint shapefile.  Next we create a shapefile writer as a target for the extracted features.  We copy the database fields from the first shapefile to the second.  We then make the selection based on attributes.  Next the features in this selection are added to the writer.  Finally the new shapefile is written.

import shapefile

# Create a reader instance
r = shapefile.Reader("Building_Footprint")
# Create a writer instance
w = shapefile.Writer(shapeType=shapefile.POLYGON)
# Copy the fields to the writer
w.fields = list(r.fields)
# Grab the geometry and records from all features 
# with the correct county name 
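selection = []
for rec in enumerate(r.records()):
    # rec is (feature index, record); record value 1 is the county name
    if rec[1][1].startswith("Hancock"):
        selection.append(rec)
# Add the geometry and records to the writer
# (this uses the pyshp 1.x writer interface)
for rec in selection:
    w._shapes.append(r.shape(rec[0]))
    w.records.append(rec[1])
# Save the new shapefile
w.save("HancockFootprints")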
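
The hard-coded rec[1][1] relies on knowing that the county name is the second attribute.  One way to avoid that, sketched here, is to look the position up by field name.  The reader's fields list begins with a DeletionFlag entry that has no counterpart in the records, so it is skipped.  The field names below are invented for illustration, since the actual names in the building footprint file are not shown in this post, and the plain lists stand in for r.fields and r.records().

```python
# Stand-in for r.fields; the names "BLDG_ID" and "COUNTY" are hypothetical
fields = [
    ("DeletionFlag", "C", 1, 0),
    ["BLDG_ID", "C", 10, 0],
    ["COUNTY", "C", 40, 0],
]

# Skip the DeletionFlag entry so positions line up with record values
field_names = [field[0] for field in fields[1:]]
county_index = field_names.index("COUNTY")

# Stand-in for a single record from r.records()
record = ["B001", "Hancock County"]
is_hancock = record[county_index].startswith("Hancock")
```

With a real reader, county_index would replace the literal 1 in the selection test.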

I originally used Python list comprehensions for the two loops in this example.  They usually run faster than "for" loops.  However, some basic testing showed them to be about the same speed in this case and a little harder to read.  If your selection were more complex, you would probably want to use a "for" loop anyway to select by multiple attributes or other filters.
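
For comparison, the selection loop could be written as a list comprehension along these lines.  This is only a sketch: the records list is a made-up stand-in for r.records(), and the matches helper and the third attribute are hypothetical, included to show how a more complex filter might look in a plain loop.

```python
# Made-up stand-in for r.records(): [building id, county name, storeys]
records = [
    ["B001", "Hancock County", 1],
    ["B002", "Harrison County", 2],
    ["B003", "Hancock County", 3],
]

# Loop form, mirroring the selection step in the example
selection_loop = []
for rec in enumerate(records):
    if rec[1][1].startswith("Hancock"):
        selection_loop.append(rec)

# Equivalent list comprehension
selection_comp = [rec for rec in enumerate(records)
                  if rec[1][1].startswith("Hancock")]

# Hypothetical filter on two attributes: county name and storey count
def matches(record):
    return record[1].startswith("Hancock") and record[2] >= 2

# A plain loop keeps a multi-attribute selection easy to read and extend
selection_multi = []
for index, record in enumerate(records):
    if matches(record):
        selection_multi.append((index, record))
```

Both single-attribute forms produce the same list of (index, record) pairs, so the choice between them comes down to readability.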

As usual the code for this example can be found on the "geospatialpython" Google Code project in the source tree. The shapefile can be found on the same site in the download section.
