How does Cerulean work?
Potential oil slick identification
Currently, potential oil slicks are detected using a U-Net model trained on a dataset of expert-labeled oil slicks and applied to a single data layer:
The VV polarization of Sentinel-1 Ground Range Detected (GRD) data from the European Space Agency’s Copernicus satellite program, accessed via the Registry of Open Data on Amazon Web Services (AWS). Only Sentinel-1 scenes that intersect the ocean are processed by our model.
Prior to inference, the Sentinel-1 VV polarization imagery is resampled to 80-meter resolution and split into overlapping 512×512 tiles, which are then processed by a ResNet34-based U-Net model. Predictions from overlapping tiles are merged using a simple strategy that averages the model’s confidence scores.
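The tiling and merging step can be illustrated with a minimal sketch. This is not Cerulean's actual code: the function names are hypothetical, the 64-pixel overlap is an assumed value (the text does not state one), and plain Python lists stand in for raster arrays.

```python
def tile_offsets(size, tile=512, overlap=64):
    """Start offsets of tiles along one axis so that the whole axis is covered.

    `overlap` is an assumed value; the source text does not specify it.
    """
    if size <= tile:
        return [0]
    stride = tile - overlap
    offsets = list(range(0, size - tile, stride))
    offsets.append(size - tile)  # final tile sits flush with the far edge
    return offsets


def merge_tiles(height, width, tiles):
    """Average confidence scores wherever tiles overlap.

    `tiles` is an iterable of (row_offset, col_offset, scores), where
    `scores` is a 2D list of per-pixel confidences for that tile.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for row0, col0, scores in tiles:
        for r, line in enumerate(scores):
            for c, value in enumerate(line):
                total[row0 + r][col0 + c] += value
                count[row0 + r][col0 + c] += 1
    # Pixels covered by no tile keep a confidence of 0.0.
    return [
        [t / n if n else 0.0 for t, n in zip(total_row, count_row)]
        for total_row, count_row in zip(total, count)
    ]


# Two 1x4 tiles that overlap by two pixels on a 1x6 raster.
merged = merge_tiles(1, 6, [(0, 0, [[0.25] * 4]), (0, 2, [[0.75] * 4])])
```

In the overlap region each pixel receives the mean of the two tile predictions, which smooths out disagreements near tile borders.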
These merged oil slick rasters are then converted into individual vectorized polygons, which are inserted into a cloud-based Postgres database. Additional geometric properties, such as polygon area, length, and perimeter, are calculated within the database. All oil slick detection results are accessible via our online map interface or our OGC-compliant REST API. A graphical overview of our inference architecture can be seen in Figure 1.
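For readers unfamiliar with the geometric properties mentioned above, the sketch below computes area and perimeter for a single polygon ring. In Cerulean these calculations happen inside the database; this standalone version is only an illustration, with a hypothetical function name, and it assumes vertices are given in a projected coordinate system measured in meters.

```python
import math


def polygon_area_perimeter(ring):
    """Return (area, perimeter) of a simple polygon.

    `ring` is a list of (x, y) vertices in meters, without repeating the
    first vertex at the end. Area uses the shoelace formula.
    """
    twice_area = 0.0
    perimeter = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


# A 160 m x 80 m rectangle, i.e. two 80-meter pixels side by side.
area, perimeter = polygon_area_perimeter([(0, 0), (160, 0), (160, 80), (0, 80)])
```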
Disclaimer: It is not possible to definitively identify oil slicks using synthetic aperture radar (SAR) satellite data alone. All Cerulean-detected oil slicks should be considered potential oil slicks, not definitive oil slicks.
Vessel source association
Potential oil slicks captured by Cerulean are processed to determine which nearby broadcasting vessels could be responsible for the pollution event. Only long, linear oil detections are considered for evaluation. Vessels near oil detections are identified using vessel position histories from Automatic Identification System (AIS) broadcasts. We download all available AIS data that intersects each Sentinel-1 scene for a time window from 8 hours before to 6 hours after the Sentinel-1 image collection time. The path and timing of each AIS track are analyzed to compute three metrics with respect to the centerline of a slick detection: parity, proximity, and temporality.
Parity. The parity score is computed by comparing the length of the slick to its projected length along the AIS track. This gives us a measurement of how parallel the AIS path is to the path of the slick. Greater parallelism results in a higher parity score.
Proximity. The proximity score is computed by considering the distance from the head of the slick to its nearest point along the AIS track. Closer distance results in a higher proximity score.
Temporality. The temporal score is computed by considering the timestamp of the AIS broadcast that is spatially closest to the head of the slick. This gives us an estimate of when the vessel was last polluting. The closer this estimated pollution time is to the time the slick was imaged, the higher the temporal score.
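The three metrics can be sketched as follows. The text does not give the exact formulas, so everything here is an assumption made for illustration: the function names, the exponential decay shapes, the `dist_scale` and `time_scale` constants, and the simplification that the slick centerline is a straight line from tail to head. Coordinates are planar meters and times are seconds.

```python
import math


def nearest_on_track(point, track):
    """Closest point to `point` on an AIS track.

    `track` is a list of (x, y, t_seconds) positions. Returns
    (distance, interpolated_time, (x, y)) for the nearest track location.
    """
    best = None
    for (x0, y0, t0), (x1, y1, t1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len_sq = dx * dx + dy * dy
        frac = 0.0
        if seg_len_sq > 0:
            frac = ((point[0] - x0) * dx + (point[1] - y0) * dy) / seg_len_sq
            frac = max(0.0, min(1.0, frac))  # clamp to the segment
        cx, cy = x0 + frac * dx, y0 + frac * dy
        dist = math.hypot(point[0] - cx, point[1] - cy)
        if best is None or dist < best[0]:
            best = (dist, t0 + frac * (t1 - t0), (cx, cy))
    return best


def association_scores(head, tail, track, image_time,
                       dist_scale=5_000.0, time_scale=3 * 3600.0):
    """Return (parity, proximity, temporality), each in [0, 1].

    The decay scales are hypothetical tuning constants.
    """
    d_head, t_head, p_head = nearest_on_track(head, track)
    _, _, p_tail = nearest_on_track(tail, track)
    slick_len = math.dist(head, tail)
    projected_len = math.dist(p_head, p_tail)
    # Parity: ratio of projected length to slick length (1.0 when parallel).
    parity = min(1.0, projected_len / slick_len) if slick_len else 0.0
    # Proximity: decays with distance from the slick head to the track.
    proximity = math.exp(-d_head / dist_scale)
    # Temporality: decays with the gap between pass time and image time.
    temporality = math.exp(-abs(image_time - t_head) / time_scale)
    return parity, proximity, temporality


# A vessel steaming east at 10 m/s, with a slick lying 100 m off its track.
track = [(0.0, 0.0, 0.0), (10_000.0, 0.0, 1_000.0)]
scores = association_scores((8_000.0, 100.0), (2_000.0, 100.0), track, 1_000.0)
```

A slick running perpendicular to the track projects onto almost a single point, so its parity falls toward zero even if it lies close to the vessel's path.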
Infrastructure source association
We compute the proximity of oil slicks to nearby pieces of offshore oil and gas infrastructure, such as oil platforms. Infrastructure locations come from the SAR Fixed Infrastructure dataset, which is made available through the Global Fishing Watch API.
Cerulean detections are automatically attributed to potential infrastructure sources using an algorithm which calculates the probability that a nearby point is the terminus or origin of the slick. The algorithm finds points along the perimeter of the slick which are far enough from the centerpoint to be considered a potential terminus. It then applies a distance decay to assign higher probabilities to points closer to the potential terminus, and lower probabilities to points which are further away. This algorithm is run on points obtained from the SAR Fixed Infrastructure Dataset to identify and assign probabilities to the most likely culprits.
Essentially, the algorithm is designed to reveal known infrastructure located near the terminus of an oil slick, indicating that oil could be flowing from it. For more detailed information on this algorithm, please see the corresponding code on our GitHub page.
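The idea can be illustrated with a simplified sketch. It is not the code from the GitHub repository: the function names, the `min_fraction` cutoff for terminus candidates, and the exponential decay with a `decay_m` scale are all assumptions, and the returned values are unnormalized scores rather than true probabilities. Coordinates are planar meters.

```python
import math


def terminus_candidates(perimeter, center, min_fraction=0.8):
    """Perimeter points far enough from the center to be a possible terminus.

    A point qualifies if it lies at least `min_fraction` of the maximum
    perimeter-to-center distance away (an assumed rule).
    """
    dists = [math.dist(p, center) for p in perimeter]
    cutoff = min_fraction * max(dists)
    return [p for p, d in zip(perimeter, dists) if d >= cutoff]


def infrastructure_scores(perimeter, center, platforms, decay_m=2_000.0):
    """Score each platform by its distance to the nearest terminus candidate.

    `platforms` maps a name to an (x, y) location. Scores decay
    exponentially with distance; `decay_m` is a hypothetical scale.
    """
    termini = terminus_candidates(perimeter, center)
    scores = {}
    for name, location in platforms.items():
        nearest = min(math.dist(location, t) for t in termini)
        scores[name] = math.exp(-nearest / decay_m)
    return scores


# A long, thin slick running east-west, with one platform just off its
# eastern end and another beside its middle.
slick = [(-5_000.0, 0.0), (0.0, 200.0), (5_000.0, 0.0), (0.0, -200.0)]
result = infrastructure_scores(
    slick, (0.0, 0.0), {"end": (5_500.0, 0.0), "side": (0.0, 3_000.0)}
)
```

The platform near the slick's end scores higher than the one beside its middle, even though the latter is closer to the slick's center.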
Dark vessel source association
In cases where the potential polluter is neither a stationary piece of infrastructure nor an AIS-broadcasting vessel, we can still automatically associate it with a slick using Global Fishing Watch (GFW) dark object detections. These data can reveal vessels that are visible in the imagery but aren’t reporting their location, also known as dark vessels, which can then be evaluated as potential culprits. We limit our search to objects larger than 30 meters in estimated length that were detected by GFW with high confidence within a 50-kilometer radius of the oil.
We predict the path of the vessel at each end of the slick using interpolated points selected from the edges of the slick’s centerline. We then assign scores to objects based on their distance from the slick and their angular deviation from the predicted path. This allows us to capture only the dark objects which reasonably align with the direction and orientation of the pollution.
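A minimal sketch of this scoring follows. The 30-meter and 50-kilometer limits come from the text above; everything else is an assumption for illustration, including the function name, the use of two centerline points to extrapolate a heading, measuring distance from the slick end rather than from the whole slick, and the exponential scoring with an `angle_scale_deg` constant. Coordinates are planar meters.

```python
import math


def dark_object_score(end_point, inner_point, obj_xy, obj_length_m,
                      max_radius_m=50_000.0, min_length_m=30.0,
                      angle_scale_deg=30.0):
    """Score one dark object against the path extrapolated past a slick end.

    `end_point` is the end of the slick centerline and `inner_point` is a
    distinct nearby centerline point; together they define the predicted
    heading. Returns 0.0 for objects that fail the size or radius filter.
    """
    if obj_length_m <= min_length_m:
        return 0.0
    dx, dy = obj_xy[0] - end_point[0], obj_xy[1] - end_point[1]
    distance = math.hypot(dx, dy)
    if distance > max_radius_m:
        return 0.0
    if distance == 0.0:
        return 1.0
    # Predicted heading: continue the centerline beyond its end.
    hx, hy = end_point[0] - inner_point[0], end_point[1] - inner_point[1]
    cos_dev = (dx * hx + dy * hy) / (distance * math.hypot(hx, hy))
    deviation_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_dev))))
    # Combine distance and angular deviation; both penalties are assumed.
    return (math.exp(-distance / max_radius_m)
            * math.exp(-deviation_deg / angle_scale_deg))


# Slick centerline ends at the origin, heading east.
aligned = dark_object_score((0.0, 0.0), (-1_000.0, 0.0), (10_000.0, 0.0), 80.0)
off_axis = dark_object_score((0.0, 0.0), (-1_000.0, 0.0), (0.0, 10_000.0), 80.0)
```

Two objects at the same distance receive different scores here: the one directly ahead of the slick's end outranks the one lying at a right angle to it.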
Sources of data used by Cerulean
as of 2025-05-01
Data | Purpose | Source |
---|---|---|
Sentinel-1 GRD | Oil slick detection training and inference | European Space Agency, via Registry of Open Data on AWS |
Offshore infrastructure locations | Assessing proximity of potential sources and slicks | Global Fishing Watch |
AIS vessel location data | Assessing proximity of potential sources and slicks | Spire Maritime, via Global Fishing Watch |
Dark object detections | Assessing proximity of potential dark vessel sources and slicks | Global Fishing Watch |
Marine Protected Areas | Map display and slick intersection | Protected Planet, World Database of Protected Areas |
Exclusive Economic Zone boundaries | Map display and slick intersection | Marineregions.org |
Ocean current and wind | Map display | NOAA GFS 0.25 degree data, via WeatherLayers |
Bathymetry | Map display | Mapbox |
Global political boundaries | Map display | Mapbox |