Apache Parquet is a powerful column-oriented data format, built from the ground up as a modern alternative to
CSV files. GeoParquet is an incubating Open Geospatial Consortium (OGC) standard
that adds interoperable geospatial types (Point, Line, Polygon) to Parquet.
Following GeoParquet's structure enables interoperability between any systems that read or write spatial data in Parquet.
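To make "GeoParquet's structure" a little more concrete, here is a minimal sketch using only the Python standard library. It assumes the layout described in the GeoParquet 1.0 specification: geometries stored as WKB in a binary column, plus a JSON document under a file-level "geo" metadata key. The helper names (point_to_wkb, wkb_to_point, build_geo_metadata) are hypothetical and exist only for illustration; real tools write the actual Parquet file for you.

```python
import json
import struct
from typing import Tuple


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB.

    Layout: 1 byte for byte order (1 = little endian), a uint32 geometry
    type code (1 = Point), then x and y as 64-bit floats.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(data: bytes) -> Tuple[float, float]:
    """Decode a 2D WKB point produced by any conforming writer."""
    fmt = "<Idd" if data[0] == 1 else ">Idd"
    geom_type, x, y = struct.unpack(fmt, data[1:21])
    if geom_type != 1:
        raise ValueError(f"expected a Point (type 1), got type {geom_type}")
    return x, y


def build_geo_metadata(column: str = "geometry") -> str:
    """Build the JSON stored under the "geo" file metadata key.

    Assumption: field names follow the GeoParquet 1.0 specification.
    """
    return json.dumps(
        {
            "version": "1.0.0",
            "primary_column": column,
            "columns": {
                column: {"encoding": "WKB", "geometry_types": ["Point"]},
            },
        }
    )
```

Because both the geometry encoding and the metadata are plain, documented structures, a reader written in one language can interpret a file produced by a writer in another.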
Columnar Data for Geo
Data science workflows benefit from columnar data formats, and geospatial analysis can tap into its
Cloud Data Warehouse Interoperability
Snowflake, BigQuery, Redshift, and Databricks can all work together seamlessly with the same geospatial data.
Who is involved in GeoParquet?
GeoParquet is rapidly maturing, with a number of new software libraries and tools coming online.
Browser-based converter: powered by the GPQ
library, it converts GeoJSON to GeoParquet and vice versa from within your browser.
GeoPandas (Python) extends the datatypes used by pandas to allow spatial operations
on geometric types and supports reading and writing GeoParquet.
QGIS builds for Windows and Linux ship with GeoParquet support. On Mac, QGIS works when installed with conda: with conda activated, run 'conda install qgis libgdal-arrow-parquet' in the terminal, then type 'qgis' to launch it.
Scribble Maps is a full-featured web app that supports both import & export of GeoParquet.
BigQuery Converter provides Python scripts to read and write
GeoParquet files with Google BigQuery.
Apache Sedona is a cluster computing system for processing large-scale spatial data that extends existing cluster computing systems like Apache Spark and Apache Flink.
It can load and save GeoParquet with Scala, Java, Python or R.
Esri's ArcGIS GeoAnalytics Engine 'delivers spatial analysis to your big data by extending Apache Spark with ready-to-use
SQL functions and analysis tools'. It can load or save GeoParquet with the Python library or the Spark plugin; see their GeoParquet page
for more details.
SeerAI's Geodesic Platform is a cloud-native, planetary-scale Spatiotemporal Data Mesh and Data Fusion platform. Geodesic's Boson Service Mesh supports GeoParquet natively and can expose massive GeoParquet datasets as compatible formats to other analytical systems and geospatial software via APIs. All tabular and feature data outputs are written in Parquet/GeoParquet format.
Fiona (Python) supports GeoParquet as of version 1.9.4. Note that the GeoParquet driver is only available if your system's GDAL library links libarrow; Fiona wheels on PyPI do not include libarrow because it is rather large.