Geospatial data in Apache Parquet

Encoding geospatial data in Apache Parquet.


Apache Parquet is a powerful column-oriented data format, built from the ground up as a modern alternative to CSV files. GeoParquet is an incubating Open Geospatial Consortium (OGC) standard that adds interoperable geospatial types (Point, Line, Polygon) to Parquet.

Read the specification for the v1.1.0 release (or see the metadata schema). Find links to older releases on the release page.
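As a rough illustration of what the specification describes, the sketch below builds the two pieces a GeoParquet writer is responsible for: a geometry value encoded as WKB, and the JSON document stored under the file-level "geo" metadata key. It uses only the Python standard library and does not write an actual Parquet file. The helper names (point_to_wkb, build_geo_metadata) and the sample coordinates are hypothetical, and only the required metadata fields are shown; treat the specification itself as the authority.

```python
import json
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB.

    Layout: 1 byte for byte order (1 = little-endian), a 4-byte
    unsigned geometry type (1 = Point), then x and y as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


def build_geo_metadata(geometry_column: str = "geometry") -> str:
    """Return a JSON string for the file-level "geo" metadata key.

    Only the required fields are included here; optional fields such
    as "crs" or "bbox" are left out of this sketch.
    """
    metadata = {
        "version": "1.1.0",
        "primary_column": geometry_column,
        "columns": {
            geometry_column: {
                "encoding": "WKB",
                "geometry_types": ["Point"],
            }
        },
    }
    return json.dumps(metadata)


# Hypothetical sample values for illustration only.
wkb = point_to_wkb(-122.4194, 37.7749)
geo_json = build_geo_metadata()
```

In practice a library handles both steps when writing a file, so this is mainly useful for understanding what ends up in the geometry column and in the file metadata.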

For more information, see the goals and features section of the readme in the GeoParquet repository. There is also a nice deep dive on Parquet and GeoParquet in the blog post "Introducing the GeoParquet data format", and we'll soon be expanding this website with more details.

Why GeoParquet?

Who is involved in GeoParquet?


GeoParquet is rapidly maturing, with a number of new software libraries and tools coming online.



Data Providers & Sample Data

There are many sources of GeoParquet data, with more coming online all the time. If you have or know of a good source of GeoParquet data, please let us know!