Pages

Monday, February 28, 2011

Changing a Shapefile's Type

A polygon, line, and point version of the same shapefile.
Sometimes you want to convert a shapefile from one type to another.  For example you may want to convert a line shapefile to a polygon or a polygon to a point or multipoint shapefile.  There are many reasons for this type of operations ranging from error checking, to special queries, to inconvenient distribution formats.  For example a lot of coastline data is distributed as line data but you may want to convert it to a polygon to estimate coastal erosion using area comparisons between two different dates.

Performing this type of conversion is very straightforward using the Python Shapefile Library.  In fact the conversion is basically a one-off version of the shapefile merge example I wrote about recently.  You read in one shapefile and write the features and records out to another of the correct type.  There are a couple of pitfalls you need to be wary of though.  One is the current version (1.0) of the PSL requires you to explicitly set the shape type of each record if you want to convert them.  The second issue is if you are converting to a single point shapefile where each point feature is a record you must compensate for the imbalance in the dbf records by copying the record from the parent feature for each point.  Instead of dealing with this issue you could simply create a multi-point shapefile where each shape record is allowed to be a collection of points.  Which method you choose depends on what you are trying to do with the output.  The examples below cover both methods.

The example in this post takes a state boundary polygon file and converts it to a line shapefile, then a multipoint shapefile, then a regular point shapefile.  Note the difference between the point shapefile and the line and multipoint examples.

"""
Convert one shapefile type to another 
"""

import shapefile


# Create a line and a multi-point 
# and single point version of
# a polygon shapefile

# The shapefile type we are converting to
newType = shapefile.POLYLINE

# This is the shapefile we are trying
# to convert. In this case it's a
# state boundary polygon file for 
# Mississippi with one polygon and
# one dbf record.
r = shapefile.Reader("Mississippi")

## POLYLINE version
w = shapefile.Writer(newType)
w._shapes.extend(r.shapes())
# You must explicity set the shapeType of each record.
# Eventually the library will set them to the same
# as the file shape type automatically.
for s in w.shapes():
  s.shapeType = newType
w.fields = list(r.fields)
w.records.extend(r.records())
w.save("Miss_Line")

## MULTIPOINT version
newType = shapefile.MULTIPOINT

w = shapefile.Writer(newType)
w._shapes.extend(r.shapes())
for s in w.shapes():
  s.shapeType = newType
w.fields = list(r.fields)
w.records.extend(r.records())
w.save("Miss_MPoint")

## POINT version
newType = shapefile.POINT

w = shapefile.Writer(newType)
# For a single point shapefile
# from another type we
# "flatten" each shape
# so each point is a new record.
# This means we must also assign
# each point a record which means
# records are usually duplicated.
for s in r.shapeRecords():
  for p in s.shape.points:
    w.point(*p)
    w.records.append(s.record)  
w.fields = list(r.fields)
w.save("Miss_Point")

You can download the state boundary polygon shapefile used in the example from the GeospatialPython Google Code Project Downloads section.  You can download the sample script above from the subversion repository of that same project.

And of course the Python Shapefile Library is here.

No comments:

Post a Comment