The Clip Challenge

Following a recent discussion on the  Portuguese QGis user group mailing list, about an incomplete shootout for the clip function on raster and vector datasets, I had the idea to perform this basic challenge to some of the most used open-source geospatial libraries available.

About the discussion: The challenge was between ArcGIS and QGis and the results on the vector side where somehow surprising for the QGis side, with a significant underperform. The challenge itself was very controversial because it only included this spatial operation and came in contradiction with several other shootouts available.

About this challenge: We tested some of the available open-source geospatial libraries to check for results and performance timings. On the core side we had JTS with the help of GeoTools for data reading/writing. For GEOS we used PostGIS for native support and severall Python binding libraries to check for performance, as QGis uses GEOS via python bindings.

My Machine: For sure my laptop is not decent GIS worstation but this brings significance to the performance of old/weak processors laptops. It’s a Acer Travelmate 8372 with an Intel i5, 2.4Ghz (3mb cache) and 4Gb Ram (DDR3 1066)


Dataset:

EU-DEM: 25m contour line where generated via gdal_contour for this DEM and clipped to match Portuguese country limits. The total size is about 230Mb.

Coimbra: As in the original challenge, the overlay clip test was made targeting the Coimbra District derived from CAOP dataset.

Dataset link


Results


 

Lessons learned:

We can see above that core libraries perform likewise. However, vector clipping tool in QGis uses Python binding to link with GEOS and threading in these bindings is not enabled. So performance decrease should be expected. Tests with native OGR python bindings and QGis clip tool confirm this and timing is basically the same. Shapely was later added just to check the performance of this new Python geospatial library, with the worst performance of all (further investigation needed on this lib).


Attachments:

Core libs:

public static void doClip(URL inputFileURL, URL clipFileURL, File clipOutputShpFile)
            throws IOException, MalformedURLException, IllegalAttributeException {

        // Load clip shapefile
        File file = new File(clipFileURL.getFile());
        Map map = new HashMap();
        map.put("url", file.toURL());
        DataStore clipStore = DataStoreFinder.getDataStore(map);
        String typeName = clipStore.getTypeNames()[0];

        FeatureSource clipfs = clipStore.getFeatureSource(typeName);
        FeatureCollection clipcoll = clipfs.getFeatures();

        // Load input shapefile
        file = new File(inputFileURL.toString());
        map = new HashMap();
        map.put("url", file.toURL());
        //DataStore inputStore = DataStoreFinder.getDataStore(map);
        FileDataStore inputStore = FileDataStoreFinder.getDataStore(inputFileURL);
        typeName = inputStore.getTypeNames()[0];

        FeatureSource inputfs = inputStore.getFeatureSource(typeName);

        // Create the output shapefile
        ShapefileDataStore outStore = new ShapefileDataStore(clipOutputShpFile.toURL());
        outStore.createSchema((SimpleFeatureType) inputfs.getSchema());
        FeatureWriter writer = outStore.getFeatureWriter(outStore.getTypeNames()[0], Transaction.AUTO_COMMIT);

        // Filter features by clip geom
        FilterFactory2 ff = CommonFactoryFinder.getFilterFactory2(null);

        BoundingBox filterBB = clipcoll.features().next().getBounds();

        Geometry filterGeom = JTS.toGeometry(filterBB);

        Expression geomPropertyExpr = new AttributeExpressionImpl(inputfs.getSchema().getGeometryDescriptor().getName());
        Expression areaGeomExpr = ff.literal(filterGeom);

        Filter filter = ff.intersects(geomPropertyExpr, areaGeomExpr);

        // Apply filter
        FeatureCollection inputcoll = inputfs.getFeatures(filter);

        // Get clip geom
        FeatureIterator clipiterator = clipcoll.features();
        SimpleFeature clipFeature = (SimpleFeature) clipiterator.next();
        Geometry clipGeometry = (Geometry) clipFeature.getDefaultGeometry();

        // Iterate input features and write intersecting ones
        FeatureIterator inputiterator = inputcoll.features();

        try {

            SimpleFeatureCollection collection = FeatureCollections.newCollection();
            Object[] att = null;
            int count = 0;
            while (inputiterator.hasNext()) {
                // Extract feature and geometry
                SimpleFeature feature = (SimpleFeature) inputiterator.next();
                Geometry featureGeometry = (Geometry) feature.getDefaultGeometry();

                if (featureGeometry.intersects(clipGeometry)) {

                    // Clip
                    Geometry newGeom = featureGeometry.intersection(clipGeometry);

                    // Adjust the feature to represent the clipped geometry with same attributes
                    feature.setDefaultGeometry(newGeom);

                    // Get the next, empty feature from the writer
                    SimpleFeature writeFeature = (SimpleFeature) writer.next();
                    att = feature.getAttributes().toArray();

                    for (int n = 0; n < att.length; n++) {
                        writeFeature.setAttribute(n, att[n]);
                    }
                    writer.write();

                    count++;

                }

            }

            writer.close();

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            inputiterator.close();
        }

        // TODO: Dispose all datastores, etc..

    }
SELECT ST_Intersection(a.geom, b.geom) as clip
FROM contours a, dcoimbra b
WHERE ST_Intersects(a.geom, b.geom) = 'TRUE';

Python Bindings:

from osgeo import ogr
from time import time

start = time()

input = ogr.Open('/path/to/my_input.shp')

clip = ogr.Open('/path/to/my_clip.shp')

input_shape = input.GetLayer(0)
clip_shape = clip.GetLayer(0)
#feature of distrito
clip_f = clip_shape.GetFeature(0)
clip_geom = clip_f.GetGeometryRef()

driver = ogr.GetDriverByName('ESRI Shapefile')
output = driver.CreateDataSource('/path/to/my_output.shp')
output_layer = output.CreateLayer('clip',geom_type=ogr.wkbLineString)

id_field = ogr.FieldDefn ('ID', ogr.OFTInteger)
output_layer.CreateField(id_field)

for f in input_shape:
    geom_input = f.GetGeometryRef()
    id = f.GetField('ID')
    # select only the intersections
    if geom_input.Intersects(clip_geom):
        inner_geom = geom_input.Intersection(clip_geom)
        output_f = ogr.Feature(output_layer.GetLayerDefn())
        output_f.SetGeometry(inner_geom)
        output_f.SetField("ID", id)
        output_layer.CreateFeature(output_f)
        output_f = inner_geom = None

    geom_input = id = None

end = time()
elapsed = end - start
print "Time taken: ", elapsed, "seconds."
import fiona
from time import time
from shapely.geometry import shape, mapping

start = time()

with fiona.open("/path/to/my_input.shp", "r") as input:
    schema = input.schema.copy()
    with fiona.open("/path/to/my_clip.shp", "r") as clip:
        for c in clip:
            geom_clip = shape(c['geometry'])
        with fiona.open("/path/to/my_output.shp", "w", "ESRI Shapefile",
                        schema) as output:
            for f in input:
                if shape(f['geometry']).intersects(geom_clip):
                    clip = shape(f['geometry']).intersection(geom_clip)
                    output.write({
                            'properties': {
                                'ID': f['properties']['ID']
                            },
                            'geometry': mapping(clip)
                        })
                    clip = None

end = time()
elapsed = end - start
print "Time taken: ", elapsed, "seconds."

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>