NASA Earth Observatory image modified by SkyTruth

Cerulean Methods



SkyTruth’s Terms of Service can be found here. By using Cerulean, you warrant that you understand and agree to the Terms of Service.

Cerulean is currently in beta release (v1.0-beta). All methods and algorithms are under active development and could change at any time. When the algorithms are finalized, we will provide a fuller, updated description of our methods. During the beta phase, this page should not be assumed to be accurate. For a full and current understanding of our methods, we recommend checking our GitHub repository.

Oil slick identification

Currently, oil slicks are detected using a Mask R-CNN model trained on a dataset of 600 expert-labeled oil slicks and applied to three data layers:

The VV polarization of Sentinel-1 GRD data from the European Space Agency’s Copernicus satellite program, accessed via Amazon Web Service’s (AWS) Open Data Catalog. Only Sentinel-1 scenes that intersect with the ocean or inland seas are processed by our model.

Proximity to offshore infrastructure, as identified from a circa-2021 global map of offshore infrastructure created by Global Fishing Watch (publication in press). We generated a set of rasters whose values correspond to the distance to the nearest piece of offshore oil and gas infrastructure identified in the Global Fishing Watch dataset.

Global maritime vessel traffic density, published by the Global Maritime Traffic Density Service and upscaled to match the resolution of the Sentinel-1 data inputs.
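As a rough illustration of the distance rasters described for the offshore infrastructure layer, the sketch below fills a small grid with the distance from each cell to the nearest infrastructure point. It is a brute-force, planar toy and not Cerulean's implementation: the function name, the grid coordinates, and the example platform cells are all hypothetical, and the 80 meter cell size is borrowed from the inference resolution mentioned on this page.

```python
import math


def distance_raster(width, height, points, cell_size_m=80.0):
    """Return a height x width grid of distances (meters) to the nearest point.

    `points` holds (col, row) cell coordinates of infrastructure locations.
    Brute force and planar: suitable for a small illustration only.
    """
    raster = []
    for row in range(height):
        raster.append([
            cell_size_m * min(math.hypot(col - pc, row - pr) for pc, pr in points)
            for col in range(width)
        ])
    return raster


# Hypothetical platform cells on a 5 x 4 grid.
platforms = [(0, 0), (4, 3)]
grid = distance_raster(5, 4, platforms)
```

Cells that contain a platform get a value of zero, and values grow with distance from the nearest platform. A production raster would use geodesic distances and a more efficient distance transform.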

Prior to inference, each data layer is scaled to 80 meter resolution and split into overlapping 512×512 tiles, which are then processed by the Mask R-CNN model. Inference results from overlapping image tiles are then composited using basic merging strategies that average the inference chain’s confidence score.
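The tiling and compositing step can be sketched as follows. This is a minimal illustration that assumes a simple per-pixel mean of confidence scores wherever tiles overlap. The function names, the stride, and the `predict` callback are hypothetical stand-ins, and the real pipeline merges instance predictions rather than plain score grids.

```python
def tile_origins(size, tile, stride):
    """Start offsets along one axis so tiles of length `tile` cover `size`."""
    if size < tile:
        raise ValueError("image is smaller than one tile; pad it first")
    origins = list(range(0, size - tile, stride))
    origins.append(size - tile)  # final tile sits flush with the edge
    return origins


def average_overlaps(height, width, tile, stride, predict):
    """Run `predict` on each tile and average scores where tiles overlap."""
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r0 in tile_origins(height, tile, stride):
        for c0 in tile_origins(width, tile, stride):
            scores = predict(r0, c0, tile)  # tile x tile grid of confidences
            for dr in range(tile):
                for dc in range(tile):
                    total[r0 + dr][c0 + dc] += scores[dr][dc]
                    count[r0 + dr][c0 + dc] += 1
    return [[total[r][c] / count[r][c] for c in range(width)]
            for r in range(height)]


# Toy stand-in for the model: every pixel in a tile scores the tile's column offset.
def fake_predict(r0, c0, tile):
    return [[float(c0)] * tile for _ in range(tile)]


merged = average_overlaps(height=4, width=6, tile=4, stride=2, predict=fake_predict)
```

In this toy case two tiles overlap in the middle two columns, so those pixels receive the mean of the two tile scores. With 512-pixel tiles the same helper would be called as `tile_origins(size, 512, stride)` for whatever stride is chosen.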

Predicted oil slick instances are then vectorized into polygons and inserted into a cloud-based Postgres database, where geometric properties such as polygon area, length, and perimeter are calculated. All oil slick detection results can be accessed via our online map interface or our OGC-compliant REST API. A graphical overview of our inference architecture can be seen in Figure 1.
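To make the geometric properties concrete, the sketch below computes planar area and perimeter for a single polygon ring using the shoelace formula. It is only an illustration: in Cerulean these values are calculated in the database, and the function name and the example coordinates (assumed to be in a projected, meter-based coordinate system) are hypothetical.

```python
import math


def ring_area_and_perimeter(ring):
    """Planar area and perimeter of a closed ring of (x, y) vertices in meters."""
    area2 = 0.0
    perimeter = 0.0
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        area2 += x1 * y2 - x2 * y1  # shoelace cross term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter


# Hypothetical 400 m x 80 m rectangular slick.
slick = [(0.0, 0.0), (400.0, 0.0), (400.0, 80.0), (0.0, 80.0)]
area_m2, perimeter_m = ring_area_and_perimeter(slick)
```

Polygons with holes would subtract the area of each interior ring, and geographic (longitude and latitude) coordinates would need a geodesic calculation instead.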

Disclaimer: It is not possible to definitively identify oil slicks using synthetic aperture radar (SAR) satellite data alone. All Cerulean detected oil slicks should be considered potential oil slicks, not definitive oil slicks.

Figure 1. High level flow diagram of Cerulean inference pipeline.

Vessel source candidate identification

Vessels near possible oil slicks are identified using vessel position histories from Automatic Identification System (AIS) broadcast systems. For each possible oil slick with a machine confidence > 0.5, we compute a proximity score to identify vessels whose path and timing best match the path and timing of identified oil slicks.

To calculate proximity scores, we download all available AIS data that intersects each Sentinel-1 scene for a time window from 12 hours preceding Sentinel-1 image collection time to 1 hour after Sentinel-1 image collection time. Then we calculate three metrics to get a combined proximity score (P), as follows. For complete information on methods, please refer directly to the code in the GitHub repository.

Frechet distance (F). The Frechet distance is calculated between the path of a vessel and the path of a slick polygon. The path of the vessel is defined by the linear trajectory of its AIS points. The path of the slick polygon is defined by approximating its central trajectory vector. To approximate this vector, we use Voronoi polygonization to split the polygon, compute the centroids of each resulting shape, then fit a spline curve through those centroids to generate a linestring that can be compared to the vessel path using the Frechet distance.

Spatial overlap (S). We define spatial overlap as the intersecting area between the oil slick and the buffered vessel path. We use conical buffers to account for the fact that slicks drift over time, meaning that older portions of slicks tend to move a greater distance from their point of origin than newer portions of slicks. To do this, we create a cone around each track, resulting in a cone-shaped buffer like that shown in Figure 2, with a narrower, more heavily weighted buffer at AIS positions closer to the time of Sentinel-1 image capture, and a wider, less heavily weighted buffer at older AIS positions.

Figure 2. Example of a rasterized conical buffer computed for an AIS track. The black circle indicates the vessel position at the time of Sentinel-1 image capture. The white dotted line represents the interpolated AIS track, and the color of the buffer corresponds to weights, with darker reds (nearer the vessel location at the time of image capture) corresponding to higher weights and lighter reds corresponding to lower weights.
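The two metrics described above can be sketched in simplified form. The first function computes the discrete Frechet distance between two polylines, which approximates the continuous Frechet distance and may differ from the variant Cerulean uses. The second shows one possible way to widen and down-weight a buffer as AIS positions get older. Every name and number here is an assumption made for illustration (the linear ramp and the radius limits in particular), except the 12 hour look-back, which comes from the AIS time window described earlier. The Voronoi and spline steps that produce the slick centerline are not shown.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines given as (x, y) lists."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]  # coupling distances, filled row by row
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


def cone_radius_and_weight(age_hours, max_age_hours=12.0,
                           min_radius_m=500.0, max_radius_m=5000.0):
    """Hypothetical linear cone: radius grows and weight falls with AIS age."""
    frac = min(max(age_hours / max_age_hours, 0.0), 1.0)
    radius = min_radius_m + frac * (max_radius_m - min_radius_m)
    weight = 1.0 - frac
    return radius, weight


# Toy vessel track and slick centerline, one unit apart and parallel.
vessel_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
slick_centerline = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
frechet = discrete_frechet(vessel_path, slick_centerline)

newest = cone_radius_and_weight(0.0)
oldest = cone_radius_and_weight(12.0)
```

A smaller Frechet distance means the vessel track follows the slick centerline more closely. The cone function returns the narrowest, highest-weight buffer at the image capture time and the widest, lowest-weight buffer at the start of the look-back window.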

Infrastructure source candidate identification

We also compute the proximity of oil slicks to nearby pieces of offshore oil and gas infrastructure, such as oil platforms. Infrastructure locations are identified using a circa-2021 global map of offshore infrastructure created by Global Fishing Watch (publication in press). This dataset used Sentinel-1 GRD data to find radar corner reflector objects in the ocean that remained in the same location for at least six months. It then classified those persistent objects into categories (wind, oil & gas, and other) using a deep learning model trained on Sentinel-2 data.

The relationship between oil slick polygons and offshore infrastructure points is computed by relating each infrastructure point to a 2D slick polygon using a mathematical formula called the moment of inertia. Essentially, the equation penalizes pieces of infrastructure located at the center of the slick and favors those at the far ends of a slick. Unfortunately, because we have not yet implemented the ability to predict the direction a slick is flowing, you will occasionally see pieces of infrastructure at the tail end of a slick get high scores. We also apply a strict proximity requirement so that pieces of infrastructure far from the terminus of a slick are not ranked too highly. A scaling factor brings this ranking into the same range as the vessel proximity score, though there is not a one-to-one correspondence.

This algorithm is designed to favor infrastructure points located near the terminus of an oil slick, indicating that oil could be flowing from it. For more detailed information on this algorithm, please see the corresponding code on our GitHub page.
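The idea behind a moment-of-inertia style score can be illustrated with a toy example. The sketch below is not Cerulean's formula. It assumes the score is the mean squared distance from sample points on the slick to a candidate infrastructure point, with a hypothetical distance cutoff standing in for the proximity requirement and no scaling factor applied. The function name, the cutoff value, and the sample coordinates are all invented for illustration.

```python
import math


def inertia_score(point, slick_points, max_distance_m=1000.0):
    """Mean squared distance from slick sample points to a candidate point.

    Returns 0.0 when the candidate is farther than `max_distance_m` from every
    slick sample (a stand-in for the proximity requirement).
    """
    nearest = min(math.dist(point, s) for s in slick_points)
    if nearest > max_distance_m:
        return 0.0
    total = sum(math.dist(point, s) ** 2 for s in slick_points)
    return total / len(slick_points)


# Hypothetical straight slick, 4 km long, sampled every 1 km (meters).
slick_samples = [(float(x), 0.0) for x in range(0, 5000, 1000)]

center_score = inertia_score((2000.0, 0.0), slick_samples)
head_score = inertia_score((0.0, 0.0), slick_samples)
tail_score = inertia_score((4000.0, 0.0), slick_samples)
far_score = inertia_score((10000.0, 0.0), slick_samples)
```

In this toy setup a point at either end of the slick scores higher than a point at its center, and both ends score the same because no flow direction is modeled. That symmetry matches the limitation noted above, where infrastructure at the tail end of a slick can also receive a high score.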

Sources of data used by Cerulean

as of 2024-01-01

Dataset | Use | Source
Sentinel-1 GRD | Oil slick detection training and inference | European Space Agency, via Registry of Open Data on AWS
Vessel density data | Oil slick detection model training and inference | Global Maritime Traffic Density Service
Offshore infrastructure locations | Assessing proximity of potential sources and slicks | Global Fishing Watch
AIS vessel location data | Assessing proximity of potential sources and slicks | Spire Maritime, via Global Fishing Watch
Marine Protected Areas | Map display and slick intersection | Protected Planet, World Database of Protected Areas
Exclusive Economic Zone boundaries | Map display and slick |
Ocean current and wind | Map display | NOAA GFS 0.25 degree data, via WeatherLayers
Bathymetry | Map display | Mapbox
Global political boundaries | Map display | Mapbox