Pages

Wednesday, January 19, 2011

Point in Polygon

The Ray Casting Method tests if a point is inside a polygon.
UPDATE: There's a newer version of this algorithm that accounts for points that fall on the boundary of a polygon which are included as inside the polygon. The title is "Point in Polygon 2: Walking the line" and was published Aug. 23, 2011.

A fundamental geospatial operation is checking to see if a point is inside a polygon.  This one operation is the atomic building block of many, many different types of spatial queries.  This operation seems deceptively simple because it's so easy to see and comprehend visually. But doing this check computationally gets quite complex.

At first glance there are dozens of algorithms addressing this challenge.  However they all have special cases where they fail.  The failures come from the infinite number of ways polygons can form which ultimately foil any sort of systematic check.  For a programmer the choice comes down to a compromise between computational efficiency (i.e. speed in this case) and thoroughness (i.e. how rare the exceptions are).

The best solution to this issue I've found is the "Ray Casting Method".  The idea is you start drawing an imaginary line from the point in question and stop drawing it when the line leaves the polygon bounding box. Along the way you count the number of times you crossed the polygon's boundary.  If the count is an odd number the point must be inside.  If it's an even number the point is outside the polygon.  So in summary, odd=in, even=out - got it?

This algorithm is fast and is accurate.  In fact, pretty much the only way you can stump it is if the point is ridiculously close to the polygon boundary where a rounding error would merge the point with the boundary.  In that case you can just blame your programming language and switch to Python.

I had no intention of implementing this algorithm myself so I googled several options, tried them out, and found a winner.  It's interesting but not surprising that most of the spatial algorithms I find and use come from computer graphics sites, usually gaming sites or computer vision sites, as opposed to geospatial sites.  My favorite ray casting point-in-polygon sample came from the "Simple Machine Forum" at "PSE Entertainment Corp".  It was posted by their anonymous webmaster.

# Determine if a point is inside a given polygon or not
# Polygon is a list of (x,y) pairs. This function
# returns True or False.  The algorithm is called
# the "Ray Casting Method".

def point_in_poly(x,y,poly):

    n = len(poly)
    inside = False

    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y

    return inside

## Test

polygon = [(0,10),(10,10),(10,0),(0,0)]

point_x = 5
point_y = 5

## Call the function with the points and the polygon
print point_in_poly(point_x,point_y,polygon)

Easy to read, easy to use.  In a previous post on creating a dot density profile, I used the "contains" method in OGR to check randomly-generated points representing population counts against US Census Bureau tracts.  That script created a point shapefile which could then be added as a layer.  It worked great but it wasn't pure python because of OGR.  The other problem with that recipe is creating a shapefile is overkill as dot density maps are just a visualization.

I decided to build on some other posts to combine this ray casting method, PNGCanvas, and the Python Shapefile Library to create a lightweight, pure Python dot density map implementation. The following code reads in a shapefile of census tracts, looks at the population value for that tract, then randomly draws a dot within that census tract for every 50 people.  The census tract boundaries are also added to the resulting PNG image.  The conventional wisdom, especially in the geospatial world, states if you need to do a large number of costly calculations it's worth using C because Python will be much slower.  To my surprise the pure Python version was just about as quick as the OGR version.  I figured the point-in-polygon calculation would be the most costly part.  The results are close enough to warrant further detailed profiling which I'll do at some point.  But regardless this operation is much, much quicker in pure Python than I expected.

import random
import shapefile
import pngcanvas

def pip(x,y,poly):
    n = len(poly)
    inside = False
    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y
    return inside

# Source shapefile - can be any polygon
r = shapefile.Reader("GIS_CensusTract_poly.shp")

# pixel to coordinate info
xdist = r.bbox[2] - r.bbox[0]
ydist = r.bbox[3] - r.bbox[1]
iwidth = 600
iheight = 500
xratio = iwidth/xdist
yratio = iheight/ydist

c = pngcanvas.PNGCanvas(iwidth,iheight,color=[255,255,255,0xff])

# background color
c.filledRectangle(0,0,iwidth,iheight)

# Pen color
c.color = [139,137,137,0xff]

# Draw the polygons 
for shape in r.shapes():
  pixels = []
  for x,y in shape.points:  
    px = int(iwidth - ((r.bbox[2] - x) * xratio))
    py = int((r.bbox[3] - y) * yratio)
    pixels.append([px,py])
  c.polyline(pixels)

rnum = 0
trnum = len(r.shapeRecords())
for sr in r.shapeRecords():
  rnum += 1
  #print rnum, " of ", trnum
  density = sr.record[20]
  total = int(density / 50)
  count = 0
  minx, miny, maxx, maxy = sr.shape.bbox   
  while count < total:    
    x = random.uniform(minx,maxx)
    y = random.uniform(miny,maxy)    
    if pip(x,y,sr.shape.points):
      count += 1
      #print " ", count, " of ", total
      px = int(iwidth - ((r.bbox[2] - x) * xratio))
      py = int((r.bbox[3] - y) * yratio)
      c.point(px,py,color=[255,0,0,0xff])

f = file("density_pure.png", "wb")
f.write(c.dump())
f.close()

The shapefile used above can be found here.

You can download PNGCanvas here.

And the Python Shapefile Library is here.

33 comments:

  1. Hey Joel,

    Just a note from the Webmaster at PSE Entertainment Corp. Glad the Point-in-Poly algorithm worked out for you. I too am a geospatial python enthusiast. I haven't posted any new code in a while - I'll have to get back on it. Lately, I've written a few scripts that wrap NGS tools to perform geodetic transformations. Geoid heights, coordinate transforms, etc. You have great blog - keep up the good work!

    Best Regards,

    J.D.

    ReplyDelete
  2. hi, i'm searching how to see if a point is inside a polygon.
    and i found your page.

    but i can't run your example with lat/lon.


    lat="-33.416032"
    lon= "-70.593016"
    poligono = [("-33.416032","-70.593016"), ("-33.415370","-70.589604"), ("-33.417340,-70.589046"), ("-33.417949,-70.592351"), ("-33.416032,-70.593016")]

    print point_in_poly(lat,lon,poligono)

    can you help me, please.

    thanks

    ReplyDelete
  3. Vicuña,

    Easy fix!

    There should be no "quotes" around the numbers. They must be numbers and not strings.

    lat is the "y" value and lon is the "x" value. In your example they are reversed.

    Your corrected example would look like:

    lat= -33.416032
    lon= -70.593016
    poligono = [(-33.416032,-70.593016), (-33.415370,-70.589604), (-33.417340,-70.589046), (-33.417949,-70.592351), (-33.416032,-70.593016)]

    print point_in_poly(lon,lat,poligono)

    Which is "False".

    Let me know if you have any other problems.

    Next time please post in the Geospatial Python forum:
    http://groups.google.com/group/geospatialpython

    Thanks,
    Joel

    ReplyDelete
  4. thanks for the answer.
    i'm new in python.

    but it didn't work weill, because the point is a vertex of the polygon, so it should be True.
    i try with another point inside and outside and the response was False too.

    Can you help me please.

    thanks

    ReplyDelete
  5. sorry, i didn't see the instructions:
    "lat is the "y" value and lon is the "x" value. In your example they are reversed."

    thanks

    ReplyDelete
  6. Hi,

    This solution works quit well most of the time.

    but I have a problem:
    I tried to use this for determining in what country a point is.
    The problem i got is countries that consist of multiple parts (greece, spain for instance). result in a false result.
    does anybody have the same problem and maybe a solution for this?

    Harmen

    ReplyDelete
    Replies
    1. I see shape.parts is never used in your code but many shapefiles actually use it to include nested polygons. Maybe this can explain the issue you describe here. The solution is quite simple: take care of the structure of the nested polygons (shape.points *must* be sliced according to the indexes in shape.parts). Also, if you have nested polygons, the same logic applies as if you where testing your point against edges: if (x,y) is inside an odd number of polygons then inside must be True, etc. (e.g. consider an island on a river and you're looking for a point no ground or water.)

      Delete
  7. Hi,

    Does it matter what order the vertexes for the polygon are in?

    ReplyDelete
  8. Yes - point order matters in a polygon. Points should be clockwise and polygon holes are counterclockwise according to the spec.

    ReplyDelete
  9. Thanks Joel! I was doing the points left to right, and got the right answer most of the times but couldn't figure out why certain times the answer was wrong. Big help. When you said spec, where is this spec?

    ReplyDelete
  10. I keep a copy here:

    http://pyshp.googlecode.com/files/shapefile.pdf

    ReplyDelete
  11. Does it matter what order the points are put in, as long as they are clockwise?

    ReplyDelete
  12. If I understand your question, no - as long as they end up clockwise before you save. When you add points for a polygon they go inside a nested list called "parts". A point Is just a list of [x,y] inside the parts list. You can create the parts list and add it to the poly method at any time.

    ReplyDelete
  13. Sorry, I was a little ambiguous. What I meant was do the order of the vertexes of the polygon list matter? Or can they be specified in any clockwise manner. For example, if you had a simple square defined by 4 points, does it matter which vertex is listed first, given that they are listed clockwise?

    ReplyDelete
  14. Ah - sorry - no, the starting point doesn't matter.

    ReplyDelete
  15. Joel: Do you have any suggestions for determining (with Python) if a convex or concave polygon runs clockwise? Your script works well, but the set of polygons I have are a mixture of clockwise and counterclockwise. Thank you in advance.

    ReplyDelete
  16. Mike,

    Sorry for the slow reply. I don't have anything coded up but here's some pretty straight forward math that may help:

    http://tinyurl.com/7cqzwdw

    If you implement this in Python please let me know. I'd love to share that with the community.

    ReplyDelete
  17. Joel,

    I would be happy to provide my script, but I want to make sure I have my head on straight. Don't you need to convert the lat/lon points into points on a Cartesian coordinate system for this point-in-polygon algorithm to work?

    ReplyDelete
  18. Mike,

    There's no conversion or measurement going on so I think this is fine for lat/long. It should be abstract enough and works on arbitrary polygons so I think it would be ok. Correct me if you discover otherwise.

    ReplyDelete
  19. There is a standard method that works every time.

    You the visual studio solution downloadable from http://www.inferentialpower.com/Bio/Ing/ObjectParser/.

    It contains an "InPolygon" method written in C#.

    The solution also contains an "Imprint" method for imprinting polygons onto bitmaps.

    ReplyDelete
  20. I recently dig into the point in polygon problem. I think I found two major references about this topic:

    http://www.ecse.rpi.edu/Homepages/wrf/Research/Short_Notes/pnpoly.html
    7 lines of C, assume semi-open polygons (means a point one the edge will always match only one polygon if many adjacent ones are to be considered). Note that is does not require min/max tests at all, really smart.

    while the other discusses about several algorithms:
    http://erich.realtimerendering.com/ptinpoly/

    ReplyDelete
  21. Here is the simplest Python port of the C code I mentioned above.

    # pnpoly_orig - the original pnpoly, the simplest port from the C-version. Written in "simple" Python.

    def pnpoly_orig(polyx, polyy, x, y):

    nvert = len(polyx)
    c = False
    i = 0
    j = nvert-1
    while i < nvert:
    if ((polyy[i]>y) != (polyy[j]>y)) and (x < (polyx[j]-polyx[i]) / (polyy[j]-polyy[i]) * (y-polyy[i]) + polyx[i]):
    c = not c
    j=i
    i=i+1
    return c


    I'm currently working on a really optimized version, vectorized in numpy and in which edge inverse slopes computation is done only once for each edge, even if the function is called multiple times (I keep them in a list).

    ReplyDelete
  22. The operation “point in polygon” is used every time a GIS user clicks with the mouse somewhere on a themecontaining polygons. The GIS program must then calculate if the point is inside or outside a polygon in order to identify the correct polygon where the user clicked. Other times this operation is used are e.g. when we want totest which cities that are situated in a certain district, or the distribution of soil fertility samples on different land use polygons.

    By tracing an indefinitely long line in any direction from a point, if the number of times this line intersects the edges of a polygon is uneven (or odd) (1, 3, 5, 7, …) then the point is inside the polygon. On the other hand, if the number of intersections is even (0, 2, 4, 6, 8, …), the point is outside the polygon. This implies that if we are able to calculate intersections between lines we are also able to check if a point lies inside a specific polygon.
    This method is always working if the line does not completely overlap a “border line” of the polygon. Then we have a special case, which is also possible to solve. However, this is very rare.

    Is any exception(s) when the procedure explained above, to identify if a point is within or outside a polygon would fail to work?

    lizthornblom@gmail.com

    ReplyDelete
  23. Is there any exception(s) when the procedure explained above, to identify if a point is within or outside a polygon would fail to work?

    lizthornblom@gmail.com

    ReplyDelete
  24. This code does not work. Just try your example for x=9, y=0 (the polygon is the same: (0,0) (0,10) (10,0) (10,10) )

    It should give true, but it gives false.

    ReplyDelete
    Replies
    1. You are correct. This algorithm does not account for points along the polygon boundary. Please see the following post from a few months later. I'll put a note on this post to steer people in the right direction:

      http://geospatialpython.com/2011/08/point-in-polygon-2-on-line.html

      Delete
  25. Ok, but let me show you another thing:

    Look at this question, it's about this code:
    http://stackoverflow.com/questions/16325720/point-in-convex-polygon

    I'm also trying to implement such a function in python, and this solution is what I've found on most of websites. I'm trying it for different shapes/points, but it's sometimes acting the opposite it should. (I'm not a programmer, just a beginner with python, so I don't know exactly what's wrong)

    ReplyDelete
  26. Hi, I was aiming something like that, and I tried your approach and it actually worked, thanks, but it took too long, around 0.5 secs to calculate in which of 60 polygons was 1 point (but i need to run it with a hundred million points)

    But soon, I realize, I needed to accelerate it, I was trying to do a point_on_polygon but for a list of more than 100.000 points and 60 (non overlapping) polygons (each of them with hundreds of vertices). So I decided to give it a try to shapely, I read my points and turned them into an array of points, and then did the same for the polygons, and then created a loop like:
    ```
    points = Array of shapely.Points
    polygons = Array of shapely.Polygons

    for p in points:
    for m in polygons:
    if p.within(m):
    whatever
    ```

    The performance for the point in polygon is spectacular, so if you are working on something like this, you should also try shapely.

    ReplyDelete
  27. Hi,
    thanks for providing the code. I presume that in the function point_in_poly(x,y,poly), the loop could really start at 1 (omitting evaluation of a degenerate line).
    Btw. point_in_poly(x,y,poly) should work both for clockwise and counter clockwise perimeters. Those who want to understand the algorithm should note that it relies on external point == (+infinity, y) for ray casting in the figure on the top of the page

    ReplyDelete
  28. Hi, Many thanks for this great tutorial. Regarding the second program in which you've generated some points on a shapefile, I need just the coordinates of the points. So I dropped the pngcanvas parts of your code and just stored the px and py values. But when I map them, it gives me straight line. What am I missing here?

    ReplyDelete
  29. This worked really well for me and I want to applaud your efforts. My only wish at this point is that the program worked a bit faster. I'm dealing with a large file, though. Cheers and nice work!

    ReplyDelete
  30. I know it has been years but just wanted to say thank you !

    ReplyDelete