Overview of Cloud Optimized GeoTIFF (COG)

Background

The TIFF file format (Tagged Image File Format) is an old format: it was first released in 1986, and the current 6.0 specification dates back to 1992. TIFFs are great for high-resolution verbatim raster images. They are still used a bit in high-end photography, but where the format has really found a second life is in digital cartography. The variation called GeoTIFF has been widely adopted as a way to share satellite images and other satellite data.

The GeoTIFF file format has long been thought of as suitable only for raw data: if you wanted to display it on a map, you’d convert it into tiles, and if you wanted a static image, you’d render it into a PNG or JPEG. Cloud Optimized GeoTIFF means that GeoTIFFs can be a bit more accessible than they used to be.

Relationship of COGs to other cloud native formats

Much of the material in this workshop is interdependent: you need to know about GeoJSON to understand STAC and to work with tools that query COG, COPC, and Zarr data (for example, Xarray).

Built to support efficient tile-by-tile access to large collections of geospatial imagery, the COG has provided an excellent template for the development of other cloud-optimized data formats (e.g. Zarr).

Like many open source projects, the development and production of COGs has led to innovation in other areas as well. One example of such innovation is the development of the SpatioTemporal Asset Catalog (STAC).

The SpatioTemporal Asset Catalog (STAC) specification provides a common language to describe a range of geospatial information, so it can more easily be indexed and discovered. A ‘spatiotemporal asset’ is any file that represents information about the Earth captured in a certain space and time.

COGs and STAC provide the building blocks for a flexible and accessible system for analyzing geospatial data, geospatial imagery in particular. STAC provides a system for describing large collections of geospatial data stored in cloud object storage, and COGs provide efficient access to pieces of those collections without the need to download the data first.

A Cloud Optimized GeoTIFF (COG) is a regular GeoTIFF file, aimed at being hosted on an HTTP file server, with an internal organization that enables more efficient workflows on the cloud. It does this by letting clients issue HTTP GET range requests to ask for just the parts of a file they need.
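To make the range-request idea concrete, here is a minimal Python sketch that uses only the standard library. The function names are hypothetical and chosen for illustration, and the sketch assumes the server honours the Range header (it then answers with status 206 instead of sending the whole file).

```python
import urllib.request


def range_header(offset: int, length: int) -> dict:
    """Build the HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends, hence the -1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_bytes(url: str, offset: int, length: int) -> bytes:
    """Fetch one slice of a remote file (assumes the server supports Range)."""
    request = urllib.request.Request(url, headers=range_header(offset, length))
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means only the requested slice was returned.
        if response.status != 206:
            raise RuntimeError(f"expected status 206, got {response.status}")
        return response.read()


# Header a client might send to read the first 16 KiB of a file.
print(range_header(0, 16384))  # {'Range': 'bytes=0-16383'}
```

Calling `fetch_bytes` with the https:// URL of a COG would download only that slice, which is how a reader can inspect a large file's metadata without transferring the rest.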

Cloud Optimized GeoTIFFs (COGs) are just like regular GeoTIFFs

COG Specification

COGs have three major features: internal tiling, internal overviews, and support for HTTP GET range requests.

Tiling

Internal tiling splits the GeoTIFF into square blocks (typically 128x128, 256x256, or 512x512 pixels), so a client can read only the tiles that cover its area of interest.
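The arithmetic behind tiling can be sketched in a few lines of Python. The function names, the image size, and the window below are made up for illustration; the sketch only shows how a pixel window maps onto tile indices, not how any particular library implements it.

```python
import math


def tile_grid(width: int, height: int, tile_size: int = 512) -> tuple:
    """Return (columns, rows) of tiles needed to cover the whole image."""
    return math.ceil(width / tile_size), math.ceil(height / tile_size)


def tiles_for_window(x_off, y_off, x_size, y_size, tile_size=512):
    """List the (col, row) indices of every tile touched by a pixel window."""
    first_col, first_row = x_off // tile_size, y_off // tile_size
    # The last pixel of the window is at offset + size - 1.
    last_col = (x_off + x_size - 1) // tile_size
    last_row = (y_off + y_size - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# Hypothetical 10980 x 10980 pixel image stored in 512 x 512 tiles.
cols, rows = tile_grid(10980, 10980)
needed = tiles_for_window(1000, 1000, 600, 600)
print(f"{cols * rows} tiles in the file, {len(needed)} needed for the window")
```

In this made-up case only 9 of the 484 tiles cover the requested window, which is why tiling pairs so well with partial reads.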

A related tool is the GDAL Virtual Format (VRT), an XML file that describes a virtual dataset. A VRT is not part of the COG itself, but GDAL can use one to mosaic many COGs into a single dataset, which makes large collections easier to load and view, e.g., with gdalbuildvrt.

Overviews

Overviews are downsampled copies of the full image, stored inside the same file. A COG usually holds several overviews, one for each zoom level.


GeoTIFF pyramid by Zoom Level
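As a rough illustration of how a pyramid is sized, the Python sketch below lists power-of-two reduction factors for an image. The stopping rule (halve until the reduced image fits inside one tile) is an assumption made for this example, and the function name is hypothetical; GDAL's own defaults may differ.

```python
def overview_factors(width: int, height: int, tile_size: int = 512) -> list:
    """Power-of-two reduction factors, halving until one tile holds the image."""
    factors = []
    factor = 1
    # -(-a // b) is integer ceiling division.
    while max(-(-width // factor), -(-height // factor)) > tile_size:
        factor *= 2
        factors.append(factor)
    return factors


# Hypothetical 8192 x 8192 pixel image with 512 x 512 tiles.
for factor in overview_factors(8192, 8192):
    print(f"factor {factor}: {8192 // factor} x {8192 // factor} pixels")
```

For this made-up image the factors come out as 2, 4, 8 and 16, the same kind of level list that is passed to gdaladdo on the command line.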

HTTP(s) GET Range Request

The HTTP GET Range Request allows a client to request specific chunks of the COG.
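When a server honours a range request, it replies with status 206 and a Content-Range header describing which bytes it sent. The Python sketch below parses that header; the function name and the sample values are made up for illustration.

```python
import re


def parse_content_range(value: str) -> tuple:
    """Parse 'bytes first-last/total' into (first, last, total).

    `total` is None when the server reports the full size as '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unexpected Content-Range value: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


# Made-up header for a 50 MiB file of which the first 16 KiB was requested.
first, last, total = parse_content_range("bytes 0-16383/52428800")
print(f"received {last - first + 1} of {total} bytes")
```

Comparing the received byte count with the total is a quick way to confirm that only a chunk of the COG was transferred.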

GDAL

GDAL 3.1 and later include the COG driver by default.

Most GIS software uses GDAL.

Install GDAL

GDAL installation can at times be difficult. When several older Python environments are installed on a desktop or laptop, GDAL can break, or incompatibility issues can come up when installing it.

USGS Windows GDAL Installation Guide

Official GDAL Install Guide

QGIS installs GDAL by default

Anaconda and its package management conda

Docker osgeo/gdal images are maintained on the Docker Hub

  • cogger is a rapid COG generator from GeoTIFF

Example COGs in WebGL

Open Layers COGs

Open Layers WebGLTile Pyramid from COG

Hands On

Step 1 Finding COGs on the internet

There are numerous cloud-based data stores hosting COGs. Take a look through a few of these:

CyberGIS with COG support

Google Earth Engine supports Image Collections in COG format.

Microsoft Planetary Computer Catalog - most of the imagery datasets in Planetary Computer are hosted as COGs

OpenAerialMap hosts drone imagery

Public Datasets

NASA

ESA Sentinel-2 COGs

AWS STAC Search

Step 2 Open the COG Viewer

Open the COG Viewer in your browser.

Alternate viewers:

COGEO.xyz

Try adding the https:// URL of the COG that you found on the internet into the viewer.

Step 3 Option 1: open a COG in QGIS

  • Open QGIS

  • In the "Layer" menu, select "Add Layer" and then "Add Raster Layer"

  • Under Source Type, select "Protocol: HTTP(S), cloud, etc." to load a remote file (the "File" option is for a file on your computer)

  • Enter a valid https:// URL in the URI field for a COG you found online
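Behind the scenes, QGIS hands the remote raster to GDAL, which reads HTTP(S) files through its /vsicurl/ virtual file system. The Python sketch below builds such a path from a URL; the function name and the example.com address are hypothetical.

```python
from urllib.parse import urlparse


def vsicurl_path(url: str) -> str:
    """Prefix an http(s) URL so that GDAL-based tools read it remotely."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return "/vsicurl/" + url


# Placeholder URL; replace it with the COG you found online.
print(vsicurl_path("https://example.com/data/scene_cog.tif"))
```

The resulting path can be passed to GDAL command-line tools, for example as the argument to gdalinfo, to inspect a remote COG without downloading it in full.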

Step 4 Optional - create a COG from a GeoTIFF

Open a console and check your GDAL installation:

gdalinfo --version

Make sure that you're running at least v3.1 of GDAL (the latest release when this lesson was written was v3.5.1).
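If you prefer to script this check, the Python sketch below parses the version string and compares it with 3.1. The function names are hypothetical, and the sketch assumes the output of gdalinfo --version begins with text of the form "GDAL 3.5.1, released ...".

```python
import re
import subprocess


def parse_gdal_version(text: str) -> tuple:
    """Extract (major, minor, patch) from text like 'GDAL 3.5.1, released ...'."""
    match = re.search(r"GDAL (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"could not find a GDAL version in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_cog_driver(version: tuple) -> bool:
    """The COG driver was added in GDAL 3.1."""
    return version >= (3, 1, 0)


def installed_gdal_version() -> tuple:
    """Run gdalinfo and parse its output (requires GDAL on the PATH)."""
    result = subprocess.run(
        ["gdalinfo", "--version"], capture_output=True, text=True, check=True
    )
    return parse_gdal_version(result.stdout)


version = parse_gdal_version("GDAL 3.5.1, released 2022/06/30")
print(version, supports_cog_driver(version))  # (3, 5, 1) True
```

On a machine with GDAL installed, `supports_cog_driver(installed_gdal_version())` performs the same check against the real installation.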

Sample USGS GeoTIFFs

gdal_translate p_ndvi_cor.tif p_ndvi_cor_cog.tif \
-b 1 -b 2 -b 3  \
-of COG \
-co TILING_SCHEME=GoogleMapsCompatible \
-co COMPRESS=JPEG \
-co OVERVIEW_QUALITY=100 \
-co QUALITY=100

Check the file size of your example file and your output file. Which is larger?
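One way to answer that question from a script is sketched below in Python. The helper names are hypothetical, and the file names are the ones used in the gdal_translate command above, so the comparison only runs if both files exist in the current directory.

```python
import os


def size_ratio(original_bytes: int, converted_bytes: int) -> float:
    """Return the converted size as a fraction of the original size."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return converted_bytes / original_bytes


def compare_files(original: str, converted: str) -> float:
    """Compare two files on disk using their sizes in bytes."""
    return size_ratio(os.path.getsize(original), os.path.getsize(converted))


# Only attempt the comparison when both files are present.
source, output = "p_ndvi_cor.tif", "p_ndvi_cor_cog.tif"
if os.path.exists(source) and os.path.exists(output):
    print(f"COG is {compare_files(source, output):.0%} of the original size")
```

A ratio below 100% means the COG is smaller than the input; compression, overviews, and the change of tiling scheme all influence the result.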

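The COG driver already builds internal overviews when it writes the file, so p_ndvi_cor_cog.tif should not need this step. If you do want to build overviews by hand with gdaladdo, the command looks like this (note that modifying an existing COG in place can break its cloud-optimized layout, so you would normally run it on a plain GeoTIFF before conversion):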

gdaladdo \
  --config COMPRESS_OVERVIEW JPEG \
  --config JPEG_QUALITY_OVERVIEW 100 \
  --config PHOTOMETRIC_OVERVIEW YCBCR \
  --config INTERLEAVE_OVERVIEW PIXEL \
  -r average \
  p_ndvi_cor_cog.tif \
  2 4 8 16

Step 5 Optional: upload the file to a public https:// endpoint

If you have your own web server, or public cloud bucket, you can upload your new COG and view it using one of the example viewers, or try loading it using QGIS.

Here is an example using the CyVerse Data Store and the CoGEO viewer

Additional Reading

GeoTIFF Compression for Dummies - suggests the best version is "a GeoTIFF, with JPEG compression, internally tiled, in the YCBCR color space, with internal overviews."

COGS in Production blog post by Sean Rennie


Last update: 2022-11-15