Arne pointed out that all the code samples out there iterate through each feature in a shapefile and add them to the merged file. He says this method is slow. I agree to an extent (no pun intended). However, at some point the underlying shapefile library MUST iterate through each feature in order to generate the summary information, namely the bounding box, required to write a valid shapefile header. But it is theoretically slightly more efficient to wait until the merge is finished so there is only one iteration cycle. At the very least, waiting till the end requires less code.
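To make the bounding-box point concrete, here is a minimal pure-Python sketch, not taken from the shapefile library itself. The helper names `bbox_of_points` and `union_bbox` and the sample coordinates are hypothetical, chosen only for illustration. It shows that the extent of a merged dataset can be found either in one pass over every point or by combining boxes computed per file, and that both give the same result.

```python
def bbox_of_points(points):
    """Return [xmin, ymin, xmax, ymax] for an iterable of (x, y) pairs."""
    xs, ys = zip(*points)
    return [min(xs), min(ys), max(xs), max(ys)]


def union_bbox(a, b):
    """Combine two [xmin, ymin, xmax, ymax] boxes into one enclosing box."""
    return [min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])]


# Hypothetical sample data standing in for the points of two input files
file_a = [(0.0, 0.0), (2.0, 1.0)]
file_b = [(-1.0, 3.0), (1.5, 4.0)]

# Option 1: a single pass over all points once the merge is finished
merged_bbox = bbox_of_points(file_a + file_b)

# Option 2: compute a box per file, then combine the boxes
combined_bbox = union_bbox(bbox_of_points(file_a), bbox_of_points(file_b))

print(merged_bbox)
print(combined_bbox)
```

Either way every point is visited at least once, which is why deferring the work to a single pass at save time is at best a small saving.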
The following example merges all the shapefiles in the current directory into one file and it is quite fast.
# Merge a bunch of shapefiles with attributes quickly!
import glob
import shapefile
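# Collect every shapefile in the current directory
files = glob.glob("*.shp")
w = shapefile.Writer()
for f in files:
    r = shapefile.Reader(f)
    # Append this file's geometries and attribute records to the writer
    w._shapes.extend(r.shapes())
    w.records.extend(r.records())
# Copy the field definitions from the last file read;
# this assumes all input files share the same fields
w.fields = list(r.fields)
w.save("merged")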
4 comments:
To make your code work I had to modify a few things, because the writer needs something to extend before extend can be used.
# Merge a bunch of shapefiles with attributes quickly!
import glob
import shapefile
files = glob.glob("*.shp")
w = shapefile.Writer()
r = shapefile.Reader()
w._shapes.append(shapefile.Reader(files[0]))
for f in files[1:]:
    print f
    r = shapefile.Reader(f)
    w._shapes.extend(r.shapes())
    w.records.extend(r.records())
w.fields = list(r.fields)
w.save("merged")
I hope this helps.
gionata,
That is strange... the writer initializes shapes as an empty array which is extendable:
>>> import shapefile
>>> w = shapefile.Writer()
>>> w._shapes
[]
>>> w._shapes.extend([1,2,3])
>>> w._shapes
[1, 2, 3]
What version of Python and what platform are you using? Could you post some sample code?
Very nice piece of code, and very useful for merging a lot of tiled data.
However, I had to modify shapefile.py slightly (date: 20110927, version: 1.1.4) to process my PointZ shapefiles. On line 699 I replaced s.points[0][2] with s.z[0], and on line 705 I replaced s.points[0][3] with s.m. Does that seem correct?
Thanks a lot,
David