
Tuesday, April 9, 2013

Add a Field to an Existing Shapefile

The dbf file of a shapefile is a simple file-based database with rows and columns.  The rows are "records" and the columns are "fields".  Sometimes you want to add an additional field to the dbf file to capture some new type of information not originally included.

Adding a field where there wasn't one before has limitless possibilities.

Today's example shows you how to use pyshp to add a new field to an existing shapefile.  This operation is a two-step process.  You must first update the dbf header to define the new field.  Then you must update each record to account for a new column in the database so everything is balanced.
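
To make the two steps concrete before touching a real file, here's a minimal sketch that uses plain Python lists as a stand-in for the dbf table.  Nothing here is pyshp; the names fields, records and add_field are made up for illustration, and the sample values are arbitrary.

```python
# Plain-Python stand-in for a dbf table: a list of field
# definitions (name, type, size) and one list of values per record.
fields = [("NAME", "C", 40), ("CODE", "N", 10)]
records = [["Alpha", 1], ["Beta", 2]]


def add_field(fields, records, definition, default=""):
    """Hypothetical helper showing the two-step process."""
    # Step 1: define the new column in the "header"
    fields.append(definition)
    # Step 2: give every record a value for that column
    for rec in records:
        rec.append(default)


add_field(fields, records, ("KINSELLA", "C", 40))

# "Balanced" means every record has exactly one value per field
assert all(len(rec) == len(fields) for rec in records)
```

If you skip step 2, the header promises a column that the records don't have, which is exactly the imbalance the real example avoids.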

In the past, I've demonstrated modifying existing shapefiles for other reasons including merging shapefiles and deleting features in shapefiles.  In every case you are actually reading in the existing shapefile, creating a new shapefile in memory and then writing out the new file either separately or on top of the old one.  Even in really high-end GIS packages that's basically all you're doing.  Some packages will use a temporary file in between. 
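
The temporary-file approach mentioned above might look roughly like the sketch below, using only the standard library.  The helper overwrite_via_temp and its write_func argument are hypothetical names for this example; write_func stands for whatever code writes the new .shp, .shx and .dbf files to a given base path.

```python
import os
import shutil
import tempfile


def overwrite_via_temp(target_base, write_func,
                       extensions=(".shp", ".shx", ".dbf")):
    """Write new files to a temp folder, then move them over the old ones.

    Hypothetical helper: write_func(temp_base) must create
    temp_base + ext for every extension listed.
    """
    # Put the temp folder next to the target so the final move
    # stays on the same filesystem
    target_dir = os.path.dirname(os.path.abspath(target_base))
    temp_dir = tempfile.mkdtemp(dir=target_dir)
    try:
        temp_base = os.path.join(temp_dir, "temp")
        write_func(temp_base)
        # Only touch the originals once the new files exist
        for ext in extensions:
            os.replace(temp_base + ext, target_base + ext)
    finally:
        shutil.rmtree(temp_dir, ignore_errors=True)
```

The point of the design is that a failure while writing leaves the original files untouched.  Each individual move replaces one file, but the three moves together are not a single atomic operation.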

Here's the example.  We'll create a counter that gives us unique sample data to append to each record just so we can see the changes clearly.  In the real world, you'd probably just insert a blank placeholder.

import shapefile

# Read in our existing shapefile
r = shapefile.Reader("Mississippi")

# Create a new shapefile in memory
w = shapefile.Writer()

# Copy over the existing fields
w.fields = list(r.fields)

# Add our new field using the pyshp API
w.field("KINSELLA", "C", "40")

# We'll create a counter in this example
# to give us sample data to add to the records
# so we know the field is working correctly.
i = 1

# Loop through each record and add a column.  We'll
# insert our sample data, but you could also just
# insert a blank string or a no-data number
# as a placeholder.
for rec in r.records():
    rec.append(i)
    i += 1
    # Add the modified record to the new shapefile
    w.records.append(rec)

# Copy over the geometry without any changes
w._shapes.extend(r.shapes())

# Save as a new shapefile (or write over the old one)
w.save("Miss") 

So there you have it.  Overall it's a pretty simple process that can be extended to do some sophisticated operations.  The sample Mississippi shapefile can be found here.  It only has one record, so it's not that interesting, but it's lightweight and its dbf file is easy to examine in your favorite spreadsheet program.

8 comments:

  1. hello Joel,
    I'm an urban hydrologist from Paris, France. I found your blog very interesting and useful! I would like to ask you a question to which I didn't find the answer. I wish to update a column in my attribute table with my hydrology modelling results (txt file), a sort of "join" on an id column which exists in both files. Can you give some clues? Thanks in advance.
    yinghao

    Replies
    1. yinghao,

      I'd be happy to send you an example. Would it be possible for you to send me data samples? That would make it much easier.

      - Joel

  2. Thank you. I'm learning this great tool. Unfortunately, there is very little information on the use of Pyshp

  3. Thanks for sharing "shapefile.py", it works perfectly!! The result shapefile does not have georeference information. It needs to be added manually.

    Replies
    1. Fangjun,

      Thank you for the feedback! You are correct that pyshp does not produce the georeference information defining the projection. But check out this post!

      http://geospatialpython.com/2014/12/wkt-epsg-strings-made-easy.html

  4. Hi, How can I change the datatype of the new field?

    Thanks!

    Replies
    1. The type is stored in the corresponding DBF file, so I assume you can use the types like C, N, F, I, ... (example doc). Each field has a mandatory width (size) and precision (number of decimals). For example, to add fields with numeric vals, you can use:

      w.field("numeric", "N", "8")

      For floats you should be able to use something like

      w.field("float", "N", "8", "3")

      Btw, thanks to the author of this post. Saved me a ton of time when merging thousands of single-shape shapefiles with separate CSV data into a new shapefile!

  5. Thank you very much brother. Your code saved my ass!!!!
    I love your approach.
