There are several nice ways to remove duplicate lists from a list in Python, for example:

a = [[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [ 9.1514622, 47.1166004 ] ] 

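print(len(a))  # 5
b_set = set(map(tuple, a))  # lists are unhashable, so convert each point to a tuple first
b = list(map(list, b_set))  # back to a list of lists (note: a set does not preserve order)
print(len(b))  # 4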

Unfortunately, I have to convert my list to a Shapely Polygon object, because I need to simplify the geometry and apply some other geometric functions to it.

from shapely.geometry import Polygon
a = [[[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [ 9.1514622, 47.1166004 ] ] ]
polys = [Polygon(item) for item in a]  # convert each list of points to a Polygon
print(len(polys[0].exterior.coords))  # prints 5

This answer shows how to remove a duplicate Polygon from a list of Polygons, but how can I remove a duplicate point from the points that make up a single Shapely Polygon?

I guess it's possible to convert it back to a list, remove duplicates, and then re-convert to Polygon.

But that seems overly complicated. Any ideas on how to do this?

1 Answer

Let's use the data in your question as an example. You have a list of coordinates:

L = [[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [ 9.1514622, 47.1166004 ]]

which is then converted into a Polygon:

from shapely.geometry import Polygon
P = Polygon(L)

Now, it might seem that the last point in L is redundant, since it is the same as the first one. That is not actually a problem: if it were missing, Shapely would duplicate the first point anyway in order to close the boundary of the Polygon. You can see this with:

P = Polygon(L)
print(list(P.exterior.coords))
#[(9.1514622, 47.1166004), (9.1513045, 47.1164599), (9.1516278, 47.1163001), (9.1517832, 47.1164408), (9.1514622, 47.1166004)]

#now skip the last point
P = Polygon(L[:-1])
print(list(P.exterior.coords))
#[(9.1514622, 47.1166004), (9.1513045, 47.1164599), (9.1516278, 47.1163001), (9.1517832, 47.1164408), (9.1514622, 47.1166004)]
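
The same closing rule can be mimicked on a plain list, which is handy if you want to compare coordinate lists before building a Polygon. This is only a minimal sketch: close_ring is a hypothetical helper written for illustration, not part of Shapely.

```python
def close_ring(coords):
    """Return coords as a list of tuples whose last point equals the first."""
    ring = [tuple(pt) for pt in coords]
    # Append the first point only if the ring is not already closed.
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring

L = [[9.1514622, 47.1166004], [9.1513045, 47.1164599], [9.1516278, 47.1163001],
     [9.1517832, 47.1164408], [9.1514622, 47.1166004]]

closed_from_full = close_ring(L)       # already closed, left unchanged
closed_from_open = close_ring(L[:-1])  # last point dropped, gets re-added

print(closed_from_full == closed_from_open)  # True
```

Either way you end up with the same five points, so the repeated first/last point is not a duplicate that needs removing.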

If there is a duplicate point "inside" L, as for example in:

L = [[ 9.1514622, 47.1166004 ], [ 9.1513045, 47.1164599 ], [ 9.1513045, 47.1164599 ], [ 9.1516278, 47.1163001 ], [ 9.1517832, 47.1164408 ], [9.1514622, 47.1166004 ]]

then you can eliminate it by calling the simplify method with zero tolerance (so that the shape itself is not altered):

print(list(Polygon(L).simplify(0).exterior.coords))
#[(9.1514622, 47.1166004), (9.1513045, 47.1164599), (9.1516278, 47.1163001), (9.1517832, 47.1164408), (9.1514622, 47.1166004)]
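
If you would rather clean the list before constructing the Polygon, consecutive repeats can also be dropped in plain Python. This is a sketch under a couple of assumptions: drop_consecutive_duplicates is a hypothetical helper (not a Shapely function), and only exactly equal neighbouring points are treated as duplicates. Unlike the set approach from the question, it keeps the vertex order and the closing point. It may also be preferable if you want to keep vertices that lie exactly on a straight edge, which, as far as I know, simplify(0) can remove as well.

```python
from itertools import groupby

def drop_consecutive_duplicates(coords):
    """Collapse runs of identical neighbouring points, preserving order."""
    # groupby groups equal adjacent items; keeping one key per group
    # removes repeats without touching the non-adjacent closing point.
    return [key for key, _ in groupby(tuple(pt) for pt in coords)]

L = [[9.1514622, 47.1166004], [9.1513045, 47.1164599], [9.1513045, 47.1164599],
     [9.1516278, 47.1163001], [9.1517832, 47.1164408], [9.1514622, 47.1166004]]

cleaned = drop_consecutive_duplicates(L)
print(len(L), len(cleaned))  # 6 5
```

The cleaned list can then be passed to Polygon as before.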