Drilling Detection with Machine Learning Part 3: Making and mapping predictions

SkyTruth Technical Program Director Ry Covington, PhD explains challenges to generating meaningful predictions from the machine learning model, and outlines solutions.

[This is the final post in a 3-part blog series describing SkyTruth’s effort to automate the detection of oil and gas well pads around the world using machine learning. We hope that this series – and the resources we’ve provided throughout – will help other developers and scientists working on conservation issues to learn from our experience and build their own exciting tools.  You can read the two previous posts in the series here and here.]

SkyTruth Intern Sasha Bylsma and Geospatial Analyst Brendan Jarrell explained how we create training data and implement a machine learning model to detect oil and gas well pads. So, now that we’ve got a trained model, we just need to run it on a few satellite images and put the predictions on a map.  Seems easy enough…  

We started with some Sentinel-2 imagery collected over the Neuquén Basin in central Argentina. This is one of the most heavily drilled areas in Argentina, and we’ve used the Google Earth Engine platform to export a few Sentinel-2 images that we could work with.  

Screenshot of the Neuquén basin in Argentina. 

The images come out of Earth Engine as GeoTIFFs – a pretty standard file format for overhead imagery. We’ve used some Earth Engine magic to reduce the file size of each image so that they’re easier to handle, but they’re still a bit big for the model. The model expects simple, small patches of images: 256 pixels high, 256 pixels wide, and three channels (e.g., Red, Green, Blue) deep. Our Sentinel-2 GeoTIFFs are about 11,000 pixels high by 11,000 pixels wide, so that left us with a few things to figure out:

  • First, the model is expecting small, simple patches of images – no frills, just pixel values. That means that we have to take the geographic information that’s embedded in the original GeoTIFF and set it aside. So, how do we do that? 
  • Second, how can we evenly slice up the full image into the small patches that the model is expecting?
  • Third, for every small image patch that the model sees, it returns a small, single channel prediction image with values between zero and one. The closer a pixel is to one, the more likely it is that pixel belongs to a drilling site.  But once the model makes predictions on all of the small images, how do we put them back together in the right order to get the original image’s size and shape?
  • Lastly, how do we take the prediction image and convert it into polygons that we can overlay on a map?

These were all things that we’d never done before, so it’s taken us quite a bit of trial and error to figure out how to make things work. In fact, we’re still working on them – we’ve got a workflow in place, but we’re always trying to refine and improve it. For now, let’s just look at what we’ve got working. 

Step 1. Converting our GeoTIFF to a NumPy array

We started off with a pretty small subset of a Sentinel-2 image that we could experiment with. It’s 1,634 pixels high, 1,932 pixels wide, and 3 channels deep. In developer’s parlance, its shape is: (1634, 1932, 3). The image is of Mari Menuco Lake in Argentina. There are a few dozen drilling sites along the southern portion of the lake that seemed like an ideal place to test out our workflow. Once we had everything working like expected, we’d run the larger Sentinel-2 images through.  

First, we used the GDAL Python API to load our image and collect its (a) geotransform and (b) projection. So, what are these two things? Well, basically, the geotransform is the formula that GDAL uses to go from pixel space (think rows and columns) to geographic space (think x [longitude] and y [latitude]), and the projection is just the coordinate reference system of the image. After we had those two pieces of information set aside for later, we pushed all of the image bands into a NumPy array. 

 

# Imports for the GDAL Python API and NumPy.
from osgeo import gdal, ogr, osr
import numpy as np

# Open the GeoTIFF (the path here is illustrative).
tiff = gdal.Open('mari_menuco_subset.tif')

# Get geographic information.
geotransform = tiff.GetGeoTransform()    # Formula from pixel space to geographic space.
projection = tiff.GetProjection()           

# Set spatial reference.
spatial_ref = osr.SpatialReference()     # Create empty spatial reference.
spatial_ref.ImportFromWkt(projection)    # Read the "projection" string.

# Collect all the bands in the .tif image.
bands = [tiff.GetRasterBand(band+1) for band in range(tiff.RasterCount)]

# Read each band as an array.
arrays = [band.ReadAsArray() for band in bands] 

# Combine into a single image. 
image = np.array(arrays)

# Format as (height, width, channels).
image = image.transpose(1, 2, 0)

GDAL reads and writes images differently than NumPy, so the last thing we did was transpose the axes to put our image in the format that we needed: height, width, channels.   

Step 2. Slicing our NumPy array and running predictions 

The next bit was tricky for us to figure out. We needed to take our image – 1634 by 1932 by 3 (1634, 1932, 3) – and slice it into even squares of (256, 256, 3). Our first problem: neither 1634 nor 1932 divides by 256 evenly, so we needed to figure out how to make the image patches overlap just enough to get a clean division.  

Our second problem: we also needed to keep track of where each patch lived in the larger image so that we could put the predictions back together in the right order later. We ended up giving each patch an ID and collecting the coordinates of their top-left pixel (their minimum x and minimum y). We pushed that information into a pandas dataframe – basically, a 2-D matrix of rows and columns – that we could set aside to rebuild our prediction image later.  

Many thanks to CosmiQ Works and all of the participants in the SpaceNet Challenges; the code snippets and GitHub repos that they’ve made available were immensely helpful for us as we tried to figure out how to implement this step.

 

# pandas is used below to build the position dataframe.
import pandas as pd

# Set some variables.
patchSize = 256
overlap = 0.2
height, width, bands = image.shape
imgs, positions = [], []
columns = ['xmin', 'ymin']

# Work through the image and bin it up.
for x in range(0, width-1, int(patchSize*(1-overlap))):    
   for y in range(0, height-1, int(patchSize*(1-overlap))):
      
       # Get top-left pixel.
       xmin = min(x, width-patchSize)
       ymin = min(y, height-patchSize)

       # Get image patch.
       patch = image[ymin:ymin+patchSize, xmin:xmin+patchSize]

       # Set position.
       pos = [xmin, ymin]

       # Add to array.
       imgs.append(patch)
       positions.append(pos)

# Convert to NumPy array.
imageArray = np.array(imgs) / 255.0

# Create position dataframe.
df = pd.DataFrame(positions, columns=columns)
df.index = np.arange(len(positions))

Once we had the new array of patches – 80 patches of 256 by 256 by 3 – it was easy to run the model and generate some predictions.

# And, go. Don't forget the batch dimension.
predictions = model.predict(imageArray, batch_size=20, steps=4)

Step 3. Rebuilding our image

The model returns an array of predictions – (80, 256, 256, 1). The prediction values range from zero to one. So, a pixel value of .82 means that the model is 82% confident that pixel belongs to a drilling site.  

Side by side comparison of an image and its prediction.

We used the pandas dataframe that we made earlier to put all of these predictions back together in the right order and get the original image’s size and shape. The dataframe was where we recorded the ID and top-left pixel (their minimum x and minimum y) of each patch. First, we created an empty image that is the same size and shape as the original. Next, we went through the dataframe, took out each patch, and added it to the new empty image in the right spot (its top-left pixel).  

 

# Create numpy zeros of appropriate shape.
empty_img = np.zeros((height, width, 1))

# Create another zero array to record how many patches overlap each pixel.
overlays = np.zeros((height, width, 1))

# Iterate through the patch positions recorded in the dataframe.
for index, item in df.iterrows():

   # Grab values for each row / patch.
   [xmin, ymin] = item

   # Grab the matching prediction patch.
   pred_patch = predictions[index]
  
   x0, x1 = xmin, xmin + patchSize
   y0, y1 = ymin, ymin + patchSize

   # Add the prediction patch to empty_img.
   empty_img[y0:y1, x0:x1] += pred_patch

   # Update count of overlapping pixels.
   overlays[y0:y1, x0:x1] += np.ones((patchSize, patchSize, 1))            

# Normalize the image to get our values between 0 and 1. 
rebuilt_img = np.divide(empty_img, overlays)

Most of our team are visual thinkers, so the easiest way for us to imagine rebuilding the image is like covering a blank sheet of white paper in pink sticky-notes, and then smoothing them all down to get a new, blank sheet of pink paper.   

Step 4. Converting our predictions to polygons

After rebuilding our prediction array to be the same size and shape as the original satellite image, we used the GDAL Python API to convert it into polygons that could go on a map. To try and clean things up a bit, we started by selecting only those pixels where the model was more than 70% confident they belonged to a drilling site. We set anything under that threshold to zero. This just helped us to eliminate some noise and clean up the predictions a bit. With that done, we used GDAL to convert the cleaned up prediction image into polygons and reattach the spatial information that we set aside at the beginning (i.e., the geotransform and the projection).  
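The polygonizing snippet below expects a GDAL raster called output that already holds the thresholded predictions with the spatial information reattached. Here’s a minimal sketch of how that raster can be built – note that we binarize the thresholded predictions here for simplicity, and GDAL’s in-memory MEM driver stands in for wherever you would actually write the file:

# Keep only pixels where the model is at least 70% confident (a 0/1 mask).
confident = (rebuilt_img[:, :, 0] >= 0.7).astype(np.uint8)

# Create a GDAL raster and reattach the geotransform and projection from Step 1.
memDrv = gdal.GetDriverByName("MEM")
output = memDrv.Create("", width, height, 1, gdal.GDT_Byte)
output.SetGeoTransform(geotransform)
output.SetProjection(projection)
output.GetRasterBand(1).WriteArray(confident)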

  

# Band to use.
sourceBand = output.GetRasterBand(1)

# Set up shapefile.
shpDrv = ogr.GetDriverByName("ESRI Shapefile")                                
outFile = shpDrv.CreateDataSource("/gdrive/My Drive/detections.shp")      
layer = outFile.CreateLayer("detections", srs=spatial_ref)                    

# Add field.
idField = ogr.FieldDefn("id", ogr.OFTInteger)               
layer.CreateField(idField)

# And, convert. 
gdal.Polygonize(sourceBand, sourceBand, layer, 0, [], None)

And at this point we had our shapefile. From there, it was easy to upload that shapefile as an asset into our Earth Engine account and have a look at our predictions over satellite imagery. We did a bit of clean up and editing to make sure that all of the polygons look good — what’s sometimes called “human-in-the-loop” modeling — but, for the most part, we were all done.

Screenshot of the polygons in EE.

Lastly, we did a quick assessment to see how well our from-scratch workflow functioned. In the image above, red points are drilling sites that we got correct (true positives), green points are drilling sites that we missed (false negatives), and blue points are places where the model thought there was a drilling site when there wasn’t (false positives). Here are the numbers: 

Total number of validated ground truth points: 239
True Positives: 107
False Positives: 50
False Negatives: 82
Precision: 0.6815286624203821
Recall: 0.5661375661375662
F1-score: 0.6184971098265896

Precision, recall, and F1-score are all just metrics for understanding how a model performs. Your spam folder offers a good example. Imagine that your email’s spam model makes 30 predictions. Precision would measure the percentage of emails flagged as spam that actually are spam. Recall would measure the percentage of actual spam emails that it correctly flagged. Often, these two things are in tension – if the model is more aggressive (i.e., lowers the threshold for an email to be classified as spam), recall will go up since it’s capturing more of the actual spam. But precision will go down, because it’s also flagging more emails overall, and it’s pretty likely that many of those won’t be spam. The F1-score combines precision and recall (it’s their harmonic mean), and it’s probably easiest to think of it as a measure of overall accuracy. In the context of the drilling site work, our precision and recall numbers mean that about 68% of the things we’re predicting as drilling sites actually are drilling sites, but we’re only picking up about 57% of the drilling sites that are actually out there in the world. 
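If you would like to check the arithmetic, all three scores come straight from the true positive, false positive, and false negative counts listed above:

# Recompute the metrics from the raw counts.
tp, fp, fn = 107, 50, 82

precision = tp / (tp + fp)                            # 0.6815...
recall = tp / (tp + fn)                               # 0.5661...
f1 = 2 * precision * recall / (precision + recall)    # 0.6184...

print(precision, recall, f1)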

We hope that this series has helped to demystify the machine learning workflow for others. Figuring out how to build a machine learning data processing pipeline from scratch was an exciting challenge for us. We’re encouraged by our progress so far, but as the metrics above indicate, there’s still lots of room for improvement. We will keep at it because this technology stack — integrating remote sensing with cloud computing and machine learning — is at the heart of our Conservation Vision initiative to automate the analysis of imagery, to solve challenging conservation problems. 

So please stay tuned for updates in the future.  And don’t hesitate to send us a note if you have questions about what we’ve done or ideas about how we can improve things going forward.  We welcome the collaboration! 

 

Drilling Detection with Machine Learning Part 2: Segmentation Starter Kit

Geospatial Analyst Brendan Jarrell explains, step-by-step, how to develop a machine learning model to detect oil and gas well pads from satellite imagery.

[This is the second post in a 3-part blog series describing SkyTruth’s effort to automate the detection of oil and gas well pads around the world using machine learning. This tool will allow local communities, conservationists, researchers, policymakers and journalists to see for themselves the growth of drilling in the areas they care about. This is a central part of SkyTruth’s work: to share our expertise with others so that anyone can help protect the planet, their communities, and the places they care about. You can read the first post in the series here. All of the code that will be covered in this post can be found here. Our training dataset is also available here.]

SkyTruth Intern Sasha Bylsma explained in our first post in this series how we create training data for a machine learning workflow that will be used to detect oil and gas well pads around the world. In this post, I’m going to walk through how we apply a machine learning model to satellite imagery, covering all the tools we use and the steps we take to make this happen, so that anyone can create similar models on their own.

Once we have created a robust set of training data, we want to feed a satellite image into the machine learning model and have the model scan the image in search of well pads. We then look to the model to tell us where the well pads are located and give us the predicted boundary of each of the well pads. This is known as segmentation, as shown in Figure 1. 

Figure 1: An example of our current work on well pad segmentation. The original image is seen on the left; what the ML model predicts as a well pad can be seen on the right. Notice that the algorithm is not only returning the drilling site’s location, but also its predicted boundaries.

We want the model to identify well pad locations because of the crucial context that location data provides. For example, location can tell us if there is a high density of drilling in the area, helping nearby communities track increasing threats to their health. It can also calculate the total area of disturbed land in the area of interest, helping researchers, advocates and others determine how severely wildlife habitat or other land characteristics are diminished.  

In the past, SkyTruth did this work manually, with an analyst or volunteer viewing individual images to search for well pads and laboriously drawing their boundaries. Projects like FrackFinder, for example, may have taken staff and volunteers weeks to complete. Now, with the help of a machine learning model, we can come in on a Monday morning, let the model do its thing, and have that same dataset compiled and placed on a map in an hour or two. The benefits of leveraging this capability are obvious: we can scan thousands of images quickly and consistently, increasing the likelihood of finding well pads and areas with high levels of drilling.

Formatting the data

So how do we do this? The first thing we need to do is get our data into a format that will be acceptable for the machine learning model. We decided that we would use the TensorFlow API as our framework for approaching this task. TensorFlow is an open-source (i.e. “free-to-use”) software package that was developed by Google to give users access to a powerful math library specifically designed for machine learning. We exported data from Google Earth Engine in the TFRecord format; TFRecords are convenient packages for exporting information from Earth Engine for later use in TensorFlow. In our code under the section labeled “Get Training, Validation Data ready for UNET,” we see that there are a few steps we must follow to extract the TFRecords from their zipped-up packages and get them into a usable format (see Figure 2). 

import tensorflow as tf

# Bands included in our input Feature Collection and S2 imagery.
bands = ['R', 'G', 'B']
label = 'Label'
featureNames = bands + [label]

# Convert band names into fixed-length tf.io features (one 256 x 256 tensor each).
cols = [
         tf.io.FixedLenFeature(shape=[256, 256], dtype=tf.float32) for band in featureNames
       ]

# Pass these new feature descriptions into a dictionary, used to describe pieces of the input dataset.
featsDict = dict(zip(featureNames, cols))

Figure 2:  Preprocessing code

Second, we create Tensorflow representations of the information we are interested in drawing out of each of our examples from the Google Earth Engine workflow (see the first post in this series for more explanation on how we made these samples). Each of the samples has a Red, Green, and Blue channel associated with it, as well as a mask band, called “label” in our code. As such, we create Tensorflow representations for each of these different channels that data will be plugged into. Think of the representations we create for each channel name as sorting bins; when a TFRecord is unpacked, the corresponding channel values from the record will be placed into the bin that represents it. After loading in all of our TFRecords, we push them into a TFRecord Dataset. A TFRecord Dataset is a dataset which is populated by several TFRecords. We then apply a few functions to the TFRecord Dataset that make the records interpretable by the model later on.
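As a rough illustration of that unpacking step (the file pattern and GZIP compression here are placeholders rather than our exact export settings), the parsing looks something like this:

# Sketch: turn the exported TFRecords into a dataset of (image, mask) pairs.
# featsDict, bands, and label come from the snippet in Figure 2.
def parse_example(example_proto):
  # Unpack one serialized example into a dictionary of 256 x 256 tensors.
  parsed = tf.io.parse_single_example(example_proto, featsDict)
  # Stack the R, G, B bands into a (256, 256, 3) image tensor.
  image = tf.stack([parsed[b] for b in bands], axis=-1)
  # The label becomes a (256, 256, 1) mask tensor.
  mask = tf.expand_dims(parsed[label], axis=-1)
  return image, mask

files = tf.data.Dataset.list_files('training_patches_*.tfrecord.gz')
data = tf.data.TFRecordDataset(files, compression_type='GZIP').map(parse_example)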

Validation dataset

Once the dataset is loaded in, we split the dataset into two. This is an important part of machine learning, where we set aside a small amount of the whole dataset. When the model is being trained on the larger portion of the dataset, known as the training data, it will not see this smaller subset, which we call the validation set. As its name suggests, the model uses this smaller fraction of information to perform a sanity check of sorts. It’s asking itself, “Okay, I think that a well pad looks like this. Am I close to the mark, or am I way off?” All of this is put in place to help the model learn the minute details and intricacies of the data we’ve provided it. Typically, we will reserve 15-30% of our total dataset for the validation set. The code necessary for splitting the dataset is shown in Figure 3 below.

import math

# Get the full size of the dataset.
full_size = len(list(data))
print(f'Full size of the dataset: {full_size}','\n')

# Define a split for the dataset.
train_pct = 0.8
batch_size = 16
split = int(full_size * train_pct)

# Split it up.
training = data.take(split)
evaluation = data.skip(split)

# Get the data ready for training.
training = training.shuffle(split).batch(batch_size).repeat()
evaluation = evaluation.batch(batch_size)

# Define the steps taken per epoch for both training and evaluation.
TRAIN_STEPS = math.ceil(split / batch_size)
EVAL_STEPS = math.ceil((full_size - split)  / batch_size)

print(f'Number of training steps: {TRAIN_STEPS}')
print(f'Number of evaluation steps: {EVAL_STEPS}')

Figure 3: Validation split code snippet

Implementation in U-Net

Now it’s time for the fun stuff! We’re finally ready to begin setting up the model that we will be using for our segmentation task. We will be leveraging a model called a U-Net. Our implementation of the U-Net in TensorFlow follows a very similar structure to the one seen in the example here. In a nutshell, here is what’s happening in our U-Net code:

1.) The machine learning model is expecting a 256 pixel by 256 pixel by 3 band input. This is the reason why we exported our image samples in this manner from Earth Engine. Also, by chopping up the images into patches, we reduce the amount of information that needs to be stored in temporary memory at any given point. This allows our code to run without crashing.

2.) The computer scans the input through a set of encoders. An encoder’s job is to learn every little detail of the thing we’re instructing it to learn. So in our case, we want it to learn all of the intricacies that define a well pad in satellite imagery. We want it to learn that well pads are typically squares or rectangles, have well defined edges, and may or may not be in close proximity to other well pads. As the number of encoders increases further down the “U” shape of the model, it is learning and retaining more of these features that make well pads unique.

3.) As the computer creates these pixel-by-pixel classifications sliding down the “U,” it sacrifices the spatial information that the input once held. That is to say, the image no longer appears as a bunch of well pads scattered across a landscape. It appears more so as a big stack of cards. All of the pixels in the original image are now classified with their newly minted predictions (i.e. “I am a well pad” or “I am not a well pad”), but they don’t have any clue where in the world they belong. The task of the upper slope of the “U” is to stitch the spatial information onto the classified predictions generated by our model. In this light, the upward slope of the “U” is made up of filters known as decoders. The cool thing about the U-Net is that as we go further up the “U”, it will grab the spatial pattern associated with the same location on the downward slope of the U-Net. In short, the model gives its best shot at taking these classified predictions and making them back into an image. To see a visual representation of the U-Net model, refer to Figure 4 below.

Figure 4: A graphic representing the U-Net architecture, courtesy of Ronneberger, et al.

At the end of the trip through the model, we are left with an output image from the model. This image is the model’s best guess at whether or not what we’ve fed it shows well pads or not. Of course, the model’s best guess will not be absolute for each and every pixel in the image. Given what it has learned about well pads, (how they’re shaped, what color palette usually describes a well pad, etc.), the model returns values on a spectrum from 0 to 1. Wherever the values land in between these two numbers can be called the model’s confidence in its prediction. So for example, forested areas in the image would ideally show a confidence value near zero; conversely, drilling sites picked up in the image would have confidence values close to one. Ambiguous features in the image, like parking lots or agricultural fields, might have a value somewhere in the middle of zero and one. Depending on how well the model did when compared to the mask associated with the three band input, it will be reprimanded for mistakes or errors it makes using what’s known as a loss function. To read more about loss functions and how they can be used, be sure to check out this helpful blog. Now that we have the model set up, we are ready to gear up for training!
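Just to make the shape of the “U” concrete, here is a heavily condensed Keras sketch. It is much shallower than the model we actually trained (which follows the TensorFlow example linked above), and the layer widths are purely illustrative, but it shows the encoder slope, the decoder slope with its skip connections, the one-channel sigmoid output, and a loss function:

from tensorflow.keras import layers

def build_unet(input_shape=(256, 256, 3)):
  inputs = tf.keras.Input(shape=input_shape)

  # Downward slope: encoders learn features while the image shrinks.
  c1 = layers.Conv2D(16, 3, activation='relu', padding='same')(inputs)
  p1 = layers.MaxPooling2D()(c1)                          # 128 x 128
  c2 = layers.Conv2D(32, 3, activation='relu', padding='same')(p1)
  p2 = layers.MaxPooling2D()(c2)                          # 64 x 64

  # Bottom of the "U".
  b = layers.Conv2D(64, 3, activation='relu', padding='same')(p2)

  # Upward slope: decoders rebuild the spatial layout, grabbing the matching
  # encoder output from the same level on the way up (the skip connection).
  u2 = layers.Conv2DTranspose(32, 2, strides=2, padding='same')(b)
  c3 = layers.Conv2D(32, 3, activation='relu', padding='same')(layers.concatenate([u2, c2]))
  u1 = layers.Conv2DTranspose(16, 2, strides=2, padding='same')(c3)
  c4 = layers.Conv2D(16, 3, activation='relu', padding='same')(layers.concatenate([u1, c1]))

  # One-channel sigmoid output: per-pixel "well pad" confidence between 0 and 1.
  outputs = layers.Conv2D(1, 1, activation='sigmoid')(c4)

  model = tf.keras.Model(inputs, outputs)
  model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
  return model

UNet = build_unet()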

Data augmentation

Before we start to train, we define a function which serves the purpose of tweaking the inputs slightly every time they are seen by the model. This is a process known as data augmentation. The reason why we make these small changes is because we don’t have a large dataset. If we give the model a small dataset without making these tweaks, each time the model sees the image, it will essentially memorize the images as opposed to learning the characteristics of a well pad. It’s a pretty neat trick, because we can make a small dataset seem way larger than it actually is simply by mirroring the image on the y-axis or by rotating the image 90 degrees, for example. Our augmentation workflow is shown in Figure 5.

# Augmentation function to pass to Callback class.
def augment(image, mask):
  rand = np.random.randint(100)
  if rand < 25:
    image = tf.image.flip_left_right(image)
    mask = tf.image.flip_left_right(mask)

  elif rand >= 25 and rand < 50:
    image = tf.image.rot90(image)
    mask = tf.image.rot90(mask)

  elif rand >= 50 and rand < 75:
    image = tf.image.flip_up_down(image)
    mask = tf.image.flip_up_down(mask)

  else:
    pass

  return (image, mask)

# Callback for data augmentation. on_train_batch_begin is the hook that
# Keras calls at the start of each training batch.
class aug(tf.keras.callbacks.Callback):
  def on_train_batch_begin(self, batch, logs=None):
    batch.map(augment, num_parallel_calls=5)
    batch.shuffle(10)

Figure 5: Augmentation function and checkpoints cell

Fitting the model to the dataset

Now it’s time to put this model to the test! We do this in a TensorFlow call known as .fit(). As the name suggests, it is going to “fit” the model to our input dataset. Let’s go ahead and take a look at the code from Figure 6, shown below. 

history = UNet.fit(
     x = training,
     epochs = model_epochs,
     steps_per_epoch = TRAIN_STEPS,
     validation_data = evaluation,
     validation_steps = EVAL_STEPS,
     callbacks = [aug(),cp,csv])

Figure 6: Fitting the model to the input dataset

It’s important to conceptually understand what each of the values passed into this function call represents. We start with the variable “x”: this expects us to pass in our training dataset, which was created earlier. The next argument is called epochs. Epochs describe how many times the model will see the entire dataset during the fitting process. This is somewhat of an arbitrary number, as some models can learn the desired information more quickly, thus requiring less training. Conversely, training a model for too long can become redundant or potentially lead to overfitting. Overfitting is when a model learns to memorize the images it’s trained on, but it doesn’t learn to generalize. Think of overfitting like memorizing a review sheet the night before a test; you memorize what is covered in the review, but any minor changes in the way questions are asked on the actual test could trip you up. For this reason, it is generally up to the user to determine how many epochs are deemed necessary based on the application. 

The next argument, steps_per_epoch (also validation_steps) describes how many batches of data should be taken from our training and validation sets respectively through each epoch. Batches are small chunks of the dataset; it is useful to divide up the dataset into batches to make the training process more computationally efficient. One would typically want to go through the whole dataset every epoch, so it’s best to set the steps as such. Validation_data is where we would specify the data we set aside during training to validate our model’s predictions. Remember, that data will not be seen by the model during its training cycle. The last argument is called callbacks. This is where we pass in the augmentation function. This function is instructed by our callback to run at the beginning of each new batch, therefore constantly changing the data during training. We also optionally pass in other callbacks which might be useful for later reference to our training session. Such callbacks might export the loss and metrics to our Google Drive in a comma-separated values format or might save checkpoints throughout the model, keeping track of which training epoch produces the lowest loss. There are many other pre-packaged callbacks which can be used; a full list of these callbacks can be found here. Now that we have all of that covered, it’s time to start learning! By running this code, we begin the training process and will continue until the model has finished running through all of the epochs we specified.
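For reference, here is what minimal versions of those two extra callbacks (the cp checkpoint saver and the csv logger passed into .fit() above) might look like – the file paths are just placeholders:

# Save the weights from the epoch with the lowest validation loss (path is illustrative).
cp = tf.keras.callbacks.ModelCheckpoint(
      filepath='/gdrive/My Drive/unet_checkpoint.h5',
      monitor='val_loss',
      save_best_only=True)

# Write the loss and metrics for every epoch to a CSV file (path is illustrative).
csv = tf.keras.callbacks.CSVLogger('/gdrive/My Drive/training_log.csv')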

Once that has finished, we save the model and plot its metrics and its loss, as shown in Figure 7. Based upon how these plots look, we can tell how well we did during our training.

Figure 7: An example chart, showing plotted metrics (top) and loss (bottom). Metrics are used to evaluate the performance of our model, while loss is directly used during training to optimize the model. As such, a good model will have a greatly reduced loss by the time we reach the end of training.

And voila! You have made it through the second installment in our series. The next entry will cover post-processing steps of our machine learning workflow. Questions we will answer include:

– How do we make predictions on an image we’ve never seen before?

– How do we take a large image and chop it into smaller, more manageable pieces? 

– How do we take some new predictions and make them into polygons?

Stay tuned for our next entry, brought to you by Dr. Ry Covington, SkyTruth’s Technical Program Director. In case you missed it, be sure to check out the first post in this series. Happy skytruthing!

Drilling Detection with Machine Learning Part 1: Getting to Know The Training Data

Intern Sasha Bylsma explains the first steps in teaching computers how to detect oil and gas well pads from satellite imagery.

[This blog post is the first in a three-part series describing SkyTruth’s effort to automate the detection of oil and gas well pads around the world using machine learning. This tool will allow local communities, conservationists, researchers, policymakers, and journalists to see for themselves the growth of drilling in the areas they care about. By the end of the series, we will have demystified the basic principles of machine learning applied to satellite imagery, and will have a technical step-by-step guide for others to follow. At SkyTruth, we aim to inspire people to protect the environment as well as educate those who want to learn from our work to develop applications themselves that protect people and the planet.]

Detecting environmental disturbances and land use change requires two things: the right set of technological tools to observe the Earth and a dedicated team to discover, analyze, and report these changes. SkyTruth has both. It is this process of discovery, analysis, and publication — this form of indisputable transparency that SkyTruth offers by bringing to light the when, the where, and hopefully the who of environmental wrongdoings — that appealed to me most about this organization, and what ultimately led me to apply for their internship program this summer.

In my first weeks as an intern, I was tasked with analyzing dozens of ocean satellite images, searching for oil slicks on the sea surface left behind by vessels, which show up as long black streaks. As a student with an emerging passion for Geographic Information System science (GIS), I was eager to find a more efficient way to scan the oceans for pollution. I wished I could simply press a “Next” button and have tiles of imagery presented to me, so I could search for patterns of oil quickly. It was a relief to find out that my coworkers were developing such a solution. Instead of relying on me and others to scan imagery and recognize the patterns of oil, they were training a computer to do it for us, a project called Cerulean. They were using machine learning and computer vision to teach a model to learn the visual characteristics of oil spills in images, pixel by pixel, after giving it many examples. I was really interested in getting involved in this work, so I asked if I could join a different project using machine learning: detecting new oil and gas drilling sites being built in important habitat areas in Argentina. 

Creating the training data

One of my first tasks was to organize a handful of existing polygons that we would use to create training data, which is what we call the information that is used to teach the model. Using Google Earth Engine, I placed the polygons over imagery collected from the European Space Agency’s Sentinel-2 satellites. The imagery is pretty coarse – these satellites collect images with a 10 meter spatial resolution. This means that every pixel in the image covers 100 square meters on the ground. The imagery can also be pretty cloudy, so one of the first things that I had to do was remove the clouds by creating cloud-free composite images. Basically, this combines several images of the same place, but at different times, and only uses the pixels that aren’t cloudy. This allowed me to create a single, cloud-free image of each of our sample areas. Once we’d done that, we were ready to make examples for the model to take in. 

Figure 1 shows a visual representation of the process that my colleagues and I developed to create the training data. On the left, we have a view of two well pads in Colorado from the default satellite base map in Google Earth Engine. This is the same imagery that you would see in the “Satellite” view of Google Maps; it’s very high resolution commercial satellite imagery, so it’s easy to see objects like these drilling sites in great detail. In the middle is a Sentinel-2 image of the same well pads. Sentinel-2 imagery is publicly available for free, and it is the imagery source that we use for our model. On the right, we have the Sentinel-2 image overlain with the well pad polygons that I’ve manually drawn.

Figure 1: Overlaying well pad polygons onto Sentinel-2 images

From here, we want to be able to select an area that captures each well pad and its surroundings. To accomplish this, we take the center of each blue well pad polygon, create a buffer of 200 meters, and then select a random spot within that circular buffer zone to drop a point, which appears below as a red dot. Figure 2 illustrates this step.
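For anyone following along in code, here is a rough sketch of that buffer-and-sample step using the Earth Engine Python API – wellpads stands in for our collection of hand-drawn polygons, and the names are placeholders rather than our production script:

import ee
ee.Initialize()

def random_center(feature):
  # Buffer the polygon's centroid by 200 meters and drop one random point inside it.
  zone = feature.geometry().centroid(1).buffer(200)
  return ee.Feature(ee.FeatureCollection.randomPoints(zone, 1).first())

# wellpads is assumed to be an ee.FeatureCollection of well pad polygons.
sample_points = wellpads.map(random_center)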

Figure 2: Area capturing well pads

These red dots are then used as the center of a 256 pixel by 256 pixel square – what we call a patch – that will house the well pad and its surroundings. I’ve illustrated what this box would look like in Figure 3, just using the left well pad for simplicity.

Figure 3: A “patch” housing a well pad and its surroundings

Next, we need to classify the image into “well pad” and “no well pad” labels. We create a binary mask with white representing well pad areas and black representing no well pad areas. This mask covers the entire image, and Figure 4 is a closeup of the two well pads we’ve been looking at with the boundary box visible as well.

Figure 4: A mask with white representing well pad, and black representing everything else

Finally, let’s zoom into the extent of the white boundary and put it all together.

This pair of small pictures in Figure 5 – the image patch on the left and the image’s label on the right – is what the model sees. Well, almost. Let’s break it down. Every image is made up of pixels that have numerical values for the amount of red, green, and blue in that pixel — three colors. If you can imagine each pixel as being three numbers deep, you can then imagine that the colored image on the left is a matrix with the dimensions 256 x 256 x 3. The right image is a matrix with the dimensions 256 x 256 x 1, since it only has one channel storing the label: 0’s for black pixels and 1’s for white pixels. One of these pairs – an image and its label – constitutes a single example that will go into the model.   

Thousands of examples

In order for the model to learn, we needed to create thousands of examples. So, we mapped well pads in Colorado, New Mexico, Utah, Nevada, Texas, Pennsylvania, West Virginia, and Argentina to use for training examples. My team and I tried to keep a couple of additional things in mind. First, we created a dataset of “look-alike” well pads, meaning that we found areas that the model could easily mistake for a well pad (such as square parking lots, plots of farmland, housing developments, etc.) and made labels for them. This indicates to the model that although these examples share similar features, they are not well pads. It strengthens the model by refining its definition of what a well pad is – by showing it what a well pad isn’t.

Second, we made sure to capture some variation in the appearance of well pads. While some are very bright in contrast to their landscape, others were darker than their surroundings, and some, especially in the American West, are essentially a dirt patch on desert land. By collecting training data of both the obvious well pads and the harder-to-distinguish ones, we added variance and complexity to the model. Since in the real world some well pads are old, some have been overgrown by vegetation, and some are covered with equipment, it’s important to include several examples of these special cases in the training data so that the model can recognize well pads regardless of the condition that they might be in. 

Prepping the training data for the model

To complete the process of preparing the training data, I packaged up the examples as TFRecords, a data format that is ideal for working with TensorFlow, a popular machine learning platform. These TFRecords will be fed into the model so that it can learn the visual characteristics of well pads well enough to be able to detect drilling sites in previously unseen imagery. 
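In our pipeline that packaging happens through Earth Engine’s TFRecord export, but as a rough illustration (a sketch, not our actual export code), a single image/label pair serialized directly with TensorFlow looks something like this:

import numpy as np
import tensorflow as tf

def to_example(image, mask):
  # image: (256, 256, 3) array of R, G, B values; mask: (256, 256) array of 0s and 1s.
  feats = {band: tf.train.Feature(float_list=tf.train.FloatList(value=image[:, :, i].ravel()))
           for i, band in enumerate(['R', 'G', 'B'])}
  feats['Label'] = tf.train.Feature(float_list=tf.train.FloatList(value=mask.ravel()))
  return tf.train.Example(features=tf.train.Features(feature=feats))

# Write one (all-zero) example to disk, just to show the mechanics.
with tf.io.TFRecordWriter('example_patch.tfrecord') as writer:
  writer.write(to_example(np.zeros((256, 256, 3)), np.zeros((256, 256))).SerializeToString())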

Now that we’ve discussed how to develop the training data, Geospatial Analyst Brendan Jarrell will explain how we developed our model in the second post in this series. 

 

 

Right-sizing Our Data Pipeline to Detect Polluters

How does SkyTruth’s new project Cerulean reduce the time and cost of processing enormous volumes of satellite information?

Project Cerulean is SkyTruth’s systematic endeavor to curb the widespread practice of bilge dumping, in which moving vessels empty their oily wastewater directly into the ocean. (We recently highlighted the scope and scale of the problem in this blog series.) Our goal is to stop oil pollution at sea, and this particular project aims to do that by automating what has historically been a laborious manual process for our team. SkyTruthers have spent days scrolling through satellite radar imagery looking to identify the telltale black streaks of an oil slick stretching for dozens of kilometers on the sea’s surface. We are finally able to do this automatically thanks to recent developments in the field of machine learning (ML). Machine learning “teaches” computers to identify certain traits, such as bilge slicks. To do so efficiently, however, requires a lot of computation that costs both time and money. In this article, we explore a few tricks that allow us to reduce that load without sacrificing accuracy.

The sheer volume of data being collected by the thousands of satellites currently in orbit can be overwhelming, and over time, those numbers will only continue to increase. However, SkyTruth’s bilge dump detection relies primarily on one particular pair of satellites called Sentinel-1, whose data are made available to the public by the European Space Agency’s Copernicus program. Sentinel-1 satellites beam radar energy down at the surface and gather the signal that bounces back up, compiling that information into bird’s-eye images of Earth. To get a sense of how much data is being collected, Figure 1 shows a composite of all the footprints of these images from a single day of collecting. If you spent 60 seconds looking at each image, it would take you 21 hours to comb through them all. Fortunately, the repetitive nature of the task makes it ripe for automation via machine learning.

Figure 1. One day’s worth of radar imagery collected by Sentinel-1 satellites. Each blue quadrilateral represents the location of a single image. You can see the diagonal swath that the satellites cut across the equator. (Note the scenes near the poles are distorted on this map because it uses the Mercator projection.) Image compiled by Jona Raphael, SkyTruth.

Figure 2 illustrates a typical satellite radar image: just one of those blue polygons above. These images are so big — this one captures 50,000 square kilometers (more than 19,000 square miles) — that it can be tough to spot the thin oil slicks that appear as a slightly blacker patch on the dark sea surface. In this image, if you look closely, you can see a slick just south of the bright white landmass. (It’s a bit clearer in the zoomed in detail that follows.)

Figure 2. Satellite imagery, from January 1, 2019, capturing a difficult to see bilge dump just south of the tip of Papua New Guinea (Copernicus Sentinel data 2019).

Contrary to intuition, you are not seeing any clouds in this picture. Radar consists of electromagnetic radiation with very long wavelengths, so it travels from satellites through the atmosphere largely undisturbed until it hits a surface on the Earth. Once it hits, it is either absorbed by the surface or reflected back into space (much like bats using echolocation). The lightest section of the image in the top right corner is part of a mountainous island. This jagged terrain scatters the radar diffusely in all directions and reflects much of it back to the satellite. The rest of the image is much darker, and shows us the ocean where less of the radar energy bounces back to the satellite receiver. The muddled, medium-gray area along the bottom of the image shows us where strong gusty winds blowing across the ocean surface have made the water choppy and less mirror-like. Figure 3 shows us more clearly the oil slick just offshore. 

 

Figure 3: Detail from Sentinel-1 radar image shown above. The narrow oil slick identified by a dark gray streak along the bottom of this image is roughly 60 kilometers (40 miles) long, and only 15 kilometers (roughly nine miles) offshore. (Contains modified Copernicus Sentinel data 2019).

Although each image covers a large area, we need to process many images each day to monitor the entire Earth. How many? Here’s an approximation:

  • 510,000,000 square kilometers  = The total surface of the earth
  • 90,000,000 square kilometers  = Total area of images captured in one day (1,300 scenes)

This means that we expect the whole Earth to be imaged somewhere between every six to 12 days (because many of the images overlap each other), or roughly 10,000 images. 

Every year more satellite constellations are being launched, so if a new constellation were to capture the whole earth’s surface in a single day, then we would need to spend an order of magnitude more processing time to ingest it. We care about this because each image scanned will cost time, money, and computational power. To enable appropriate allocation of resources for automation, it’s critical to understand the scale of the data. For now, we can size our processing pipeline to the current number, but we must take measures to ensure the system is scalable for the increasing numbers of satellite images we anticipate in coming years.

So does that mean we need to look at 1,300 images every day? Thankfully not. We have a few techniques that we’ll use to make the computations manageable:

  • First off, we’ve found that radar satellite images near the poles are typically captured using a particular configuration called HH polarization — great for mapping sea ice, but not ideal for detecting oil slicks. And the presence of the sea ice itself makes oil slick detection difficult. If we remove those images from the 1,300, we have about 880 that suit our needs (using VV polarization).
  • Next, we won’t be looking for oil slicks on land, so we can further filter out any images that don’t have at least some ocean in them. That reduces our set of images to about 500. (Note: Sentinel-1 coverage of the open ocean is generally poor, as discussed in our previous blog post, but we anticipate future radar constellations will fill that gap.)
  • Those 500 images represent roughly 25,000,000 square kilometers of area, but we can reduce that even further by finding images on the shoreline, and eliminating any pixels in those images we know belong to land. That drops the total area by roughly another 40%, to about 15,000,000 square kilometers.

So at this point, we’ve reduced our load to about 17% of the data that our satellite source is actively sending each day. Figure 4 illustrates the effect that these filters have on our data load:

Figure 4. [Gallery] Filtering images. a) All Sentinel-1 images in one day. b) Filtered by polarization. c) Filtered by intersection with the ocean. Images compiled by Jona Raphael, SkyTruth.

Can we do better? Let’s hope so — remember that each of those 500 images is almost a gigabyte of data. That’s like filling up the memory of a brand new Apple laptop every day: an expensive proposition. Here are some ways that we make it easier to process that much information:

  • First, we don’t need to store all that data at SkyTruth. Just like you don’t need to store all of Wikipedia just to read one article, it is more efficient to load an article one at a time, read it, and then close the window to release it from memory before opening another one. That means if we’re clever, we’ll never have to load all 500 images at once.
  • Each image is originally created as 32-bit, but we can easily convert it to 16-bit, thereby halving the data size. You can think of the bit-depth as the number of digits after the decimal place: Instead of storing the value ‘1.2345678’, we would store ‘1.235’, which is almost the same value, but takes a lot less effort to store in active memory.
  • We can further reduce the amount of required memory by reducing the resolution of the image. We find it works just fine to reduce each dimension (height and width) by a factor of eight each, which means the number of pixels actually decreases by a factor of 64. This has the effect of averaging 64 small pixels into one large one.
  • Finally, we don’t need to process the whole image at once. Just like it was easier for you to spot the oil slick when you were looking at a smaller zoomed in portion of the satellite image above, we can divide up each picture into 32 smaller square ‘chips’.

Taken together, we can now work on single chips that are only 250 kilobytes  — instead of 1 gigabyte images — effectively a 99.975% reduction in memory load. That translates directly to speeding up the two core parts of machine learning: training the model and making predictions. Because training an ML model requires performing complex mathematics on thousands of examples, this speed up represents the difference between training for 10 minutes versus 20 hours.
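Here is the back-of-the-envelope arithmetic behind that 250 kilobyte figure, using the same rough numbers as above:

# Rough check of the memory reduction described above (numbers are approximate).
original = 1_000_000                    # one ~1 gigabyte scene, in kilobytes
after_16bit = original / 2              # 32-bit -> 16-bit halves the size
after_downsample = after_16bit / 64     # 8x smaller in each dimension = 64x fewer pixels
per_chip = after_downsample / 32        # split into 32 square chips

print(per_chip)                         # ~244 KB, roughly the 250 KB quoted above
print(1 - per_chip / original)          # ~0.99976, i.e. about a 99.975% reduction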

But, wait! What about the step in which we threw away data by averaging 64 pixels into one large pixel? Is that the same as losing 63 pieces of information and saving only one? It is important to address this idea in the context of machine learning and what a machine is capable of learning. It hinges on the difference between information and data. Generally speaking, we want resolution to be as low as possible, while retaining the critical information in the image. That’s because the lower the resolution, the faster our model can be trained and make predictions. To figure out how much resolution is necessary, a convenient rule of thumb is to ask whether a human expert could still accurately make a proper assessment at the lower resolution. Let’s take a quick look at an example. Each of the following images in Figure 5 has one-fourth as many pixels as the previous:

Figure 5: [Gallery]  Reducing resolution by factor of 4: 512×512, 256×256, 128×128, 64×64, 32×32. (Contains modified Copernicus Sentinel data 2019).

If your objective is to find all the pixels that represent oil, you could easily do so in the first two images (zooming in is allowed). The next two are a bit harder, and the final one is impossible. Note that the fourth image is a factor of 64, or 98.5%, smaller than the first, but the oil is still visible. If we think critically about it, we realize that the number of pixels in a picture is irrelevant — what actually matters is that the pixels must be much smaller than the object you are identifying. In the first image, the oil slick is many pixels wide and so it has sufficient information for an ML model to learn. However, in the last image the pixels are so large that the tiny slick all but disappears in the averaging. If, instead, your objective is to find all the pixels that represent land, then the final image (32×32) is more than sufficient and could probably be reduced even further.
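The averaging itself is simple. Here is a small sketch of the 8x-per-dimension reduction described above, assuming img is a 2-D array whose height and width divide evenly by eight:

# Shrink an image 8x in each dimension by averaging each 8 x 8 block of pixels
# (64 small pixels become one large pixel).
import numpy as np

def downsample_8x(img):
  h, w = img.shape
  return img.reshape(h // 8, 8, w // 8, 8).mean(axis=(1, 3))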

However, reducing resolution also comes with downsides. For instance, the reduction in visual information means there are fewer context clues, and therefore less certainty, about which pixels are oil and which are not. This results in more pixels being labeled with the wrong prediction, and a corresponding reduction in accuracy. Furthermore, even if it were possible to perfectly label all of the pixels that are oil, the pixels are so big that it’s difficult to get a good sense for the exact shape, outline, or path of the oil itself, which results in lower quality estimates derived from this work (for instance, estimating the volume of oil spilled).

If only we could use the resolution of the first image somehow… 

Good news — we can! It turns out that the first image above, the 512×512, already had its resolution reduced from the original satellite image by a factor of 64. To give you a sense, Figure 6 shows the original resolution satellite image side by side with the version we use in training for our ML model. The second image has lower resolution, but because the pixels are still much smaller than the features we care about, we are able to avoid most of the downsides described above. The oil spill is still unambiguously identifiable, so we are willing to trade the resolution for gains in training and prediction speed. This is a subjective tradeoff that each machine learning project must make based on the resources available.

Figure 6:  Original satellite image at full resolution (left), compared to reduced resolution used for ML prediction (right). (Contains modified Copernicus Sentinel data 2019).

So what are the takeaways from this exploration? We first learned that it takes a lot of imagery to monitor the surface of the Earth. However, by thoughtfully excluding imagery that doesn’t capture the information we care about, it’s possible to reduce the volume of data we need to process significantly. Then, we discussed a few different tricks that make it much faster to train and predict on that pared down dataset. In particular, the most important technique is to reduce the resolution as much as possible, while maintaining a pixel size that is substantially smaller than any features we are attempting to identify. All told, there is a lot of data that still needs to be processed, but if we can achieve a total process-time for each satellite image that is under three minutes, then the whole pipeline can be run on a single computer. Right now, we are down to about 10 minutes per image, and still have a few tricks up our sleeve. So we’re hopeful that in the coming months we’ll hit our target and have a robust and scalable mechanism for identifying oil slicks automatically. 

Until then, we continue to tip our hats to the SkyTruthers that regularly identify and expose these egregious acts, and we look forward to the day that the illegal practice of bilge dumping comes to an end.

AIS Ship Tracking Data Shows False Vessel Tracks Circling Above Point Reyes, Near San Francisco

Analysis from SkyTruth and Global Fishing Watch shows ship tracks jumping thousands of miles from their true locations.

Bjorn Bergman works with SkyTruth and with the Global Fishing Watch research team to track vessels broadcasting false automatic identification system (AIS) locations and to investigate new sources of satellite data for vessel tracking and for detecting dark targets at sea. In this blog post, Bjorn spots an unusual pattern of false AIS broadcasts concentrated at one location, Point Reyes, northwest of San Francisco on the California coast. Why would vessels thousands of miles away be suddenly popping up in circles over Point Reyes? Could this reflect an intentional disruption of the underlying global positioning system (GPS) that AIS relies on, or is there some other explanation for this pattern?

In December 2019, SkyTruth reported on a number of locations on the Chinese coast (mostly oil terminals) where ship tracking positions from the automatic identification system (AIS)  became scrambled as soon as ships approached within a few miles of a point on shore. Importantly, we knew that this was actual disruption of the underlying global positioning system (GPS) — or more broadly the Global Navigation Satellite System — and not just a shipboard AIS malfunction. We determined this because another source of GPS data, Strava’s heat map of fitness trackers, showed the same ring pattern. A quick recent check of the data showed that this GPS manipulation is ongoing at oil terminals in four of the cities (Shanghai, Dalian, Fuzhou, and Quanzhou) where we had detected it last year. We still don’t know if this manipulation is specifically intended to mask ship traffic or if there is some other reason for disrupting GPS.

Following the findings last year on the Chinese coast, I began looking globally for any similar patterns in AIS tracking data around the world. While I haven’t found the precise pattern observed at the Chinese oil terminals outside of China, I did find a somewhat different false AIS broadcast pattern which, strangely enough, appears concentrated above Point Reyes northwest of San Francisco, California in the United States. Although the circling tracks look similar in both locations, the vessels on the Chinese coast were at most a few miles from the circling tracks, while the vessels broadcasting tracks above Point Reyes are actually thousands of miles away. So far I’ve found vessels in nine locations affected. Some of these locations are near oil terminals or where GPS disruption has been reported before, but there is no clear pattern linking all of the affected areas.  

Image 1: AIS tracks from a number of vessels have appeared circling over Point Reyes near San Francisco even though the ships can be confirmed to be thousands of miles away. False circling tracks from five vessels are shown here. AIS data courtesy of Global Fishing Watch / Orbcomm / Spire.

The AIS positions being broadcast over Point Reyes are obviously false (some of them are over land and they show a constant speed and oval pattern we wouldn’t see with a real ship track). But how can we be sure where the ship really is? The most important indication is the location broadcast just prior to the jump to Point Reyes and then where the vessel reappears after the apparent circling finishes. The duration of the circling pattern varies, from less than an hour for one ship in the Indian Ocean, to as much as two weeks for some of the other vessels. However, besides seeing the true locations before and after the jump to Point Reyes, it’s also possible to look at where the AIS receiving satellites were while the vessels were broadcasting positions around Point Reyes.

Image 2: The colored lines show AIS tracks from five of the ships whose broadcast positions jumped suddenly to Point Reyes, California, northwest of San Francisco. The time of the tracking disruption varies from less than one hour for one vessel to about two weeks for some others. Two of the vessels (Princess Janice and Alkahfi Maryam) also have tracks appearing over land in North America. The reason for this displacement is unknown although some of the vessels are in areas where GPS disruption has been reported (Eastern Mediterranean and Sea of Azov). AIS data courtesy of Global Fishing Watch / Orbcomm / Spire.

To get an approximate location for one vessel’s real position during the two weeks it broadcast over Point Reyes and the Western United States, SkyTruth analyst Christian Thomas and I analyzed the footprints of the satellites receiving the AIS positions. This was possible thanks to data Spire Global, Inc. provided to Global Fishing Watch. Spire’s data gives the identity of the receiving satellite with each AIS position. This allowed the Global Fishing Watch research team to access orbit information, which they used to calculate exactly the point above the surface of the earth where each satellite was when it received an AIS position and then calculate the distance from the satellite position to the ship’s broadcast AIS position. Because AIS broadcasts are only received within an approximately 5,000 kilometer (3,100 mile) diameter footprint, we know that the vessel was somewhere within this area. We can even narrow down the location further based on successive passes of AIS receiving satellites. 
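As a rough sketch of that footprint analysis (names here are placeholders, not the actual Global Fishing Watch script), the idea in the Earth Engine Python API looks something like this:

import ee
ee.Initialize()

# subpoints is assumed to be a list of (lon, lat) sub-satellite points recorded
# each time a satellite received one of the vessel's false position broadcasts.
footprints = [ee.Geometry.Point(lon, lat).buffer(2500000, 1000) for lon, lat in subpoints]

# Intersect the ~5,000 km diameter reception footprints; the vessel's true
# position must lie somewhere inside the area where they all overlap.
region = footprints[0]
for fp in footprints[1:]:
  region = region.intersection(fp, 1000)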

Image 3: Broadcast AIS positions from Princess Janice. The track makes multiple jumps between a real location at an oil terminal on the coast of Nigeria (inset lower right) and false positions over the United States. Over two weeks in June 2019 the false track initially circles over Point Reyes northwest of San Francisco before veering over the Pacific and the interior of the United States. More circling is seen around Salt Lake City, Utah (inset upper right). AIS data courtesy of Global Fishing Watch / Orbcomm / Spire.

This vessel, the Princess Janice, is a crew boat that travels to offshore oil installations. It broadcast a normal track out of a Nigerian oil terminal until June 5, 2019. For the following two weeks the vessel broadcast a false track circling above Point Reyes and eventually veering off above Utah (during this time the track occasionally jumped back briefly to the Nigerian oil terminal). Unlike other false AIS broadcasts we have documented, which have a constant location offset or flipped coordinate values (producing a mirror image of the actual position), these circling tracks appear not to reflect the true movements of the vessel in any way.

When we looked at the footprints of the satellites receiving AIS positions from Princess Janice, it was clear that the vessel remained on a stretch of the central Nigerian coast or in nearby waters in the Gulf of Guinea (see Image 4) throughout the two-week period when false locations were being broadcast.

Image 4: Princess Janice broadcasts an AIS track over Point Reyes near San Francisco and over the Western United States from June 5 – 21, 2019 (see Image 3). Analysis of the footprints of the satellites receiving these positions demonstrates that the vessel was actually within a region on the central Nigerian coast and the adjacent Gulf of Guinea. Frame 1: Locations over the Earth's surface (red dots) of satellites receiving false position messages. Frame 2: Extent of satellite footprints for AIS reception (large red circles). Frame 3: Density of satellite coverage overlap, with areas of increasing density shown as Blue → Green → Yellow → White. Frame 4: Area where all satellite footprints overlap (maximum coverage), shown in white. The white shaded region on the central Nigerian coast contains the true location of the Princess Janice during the period when the vessel was broadcasting a false location track. Analysis was done in Google Earth Engine using approximate satellite footprints of 5,000 km (3,100 miles) diameter.
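For readers who want to reproduce something like Frames 2 through 4, here is a simplified sketch using the Earth Engine Python API. The sub-satellite coordinates are placeholders rather than the actual points from the Princess Janice analysis, and the 2,500 km radius is an assumed half-width of the ~5,000 km footprint.

```python
# Simplified Earth Engine sketch of the footprint overlap analysis.
# Coordinates below are hypothetical placeholders, not the real passes.
import ee
ee.Initialize()

FOOTPRINT_RADIUS_M = 2_500_000  # ~2,500 km in meters (assumed)

# Hypothetical sub-satellite points as [lon, lat].
subpoints = [[3.0, 4.0], [12.0, 8.0], [-1.0, 10.0]]

# One constant image per footprint, masked outside the buffered circle.
footprints = [
    ee.Image.constant(1).clip(ee.Geometry.Point(p).buffer(FOOTPRINT_RADIUS_M))
    for p in subpoints
]

# Per-pixel count of overlapping footprints (the Blue -> White density ramp).
overlap_density = ee.ImageCollection(footprints).sum()

# Region covered by every footprint: must contain the vessel's true location.
max_overlap = overlap_density.eq(len(subpoints)).selfMask()
```

Visualizing overlap_density with a blue-to-white palette and max_overlap in white reproduces the general look of Frames 3 and 4.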

Both the manipulated GPS positions seen on the Chinese coast and these new examples over Point Reyes are characterized by rings of positions. The rings have similar shapes, somewhat wider east-to-west than north-to-south. However, the circles appearing over Point Reyes vary greatly in size, and the broadcast vessel courses may run clockwise or counterclockwise around the ring. All broadcast speeds are exactly 20 knots. In contrast, the rings from the Chinese vessels last year had broadcast speeds of 21 or 31 knots, with the 31-knot positions always oriented counterclockwise. Critically, while we could confirm that GPS interference caused the rings of AIS positions on the Chinese coast, we don't yet know whether that is the case for the positions over Point Reyes. An alternative is that this is simply a malfunction affecting the individual ships' AIS systems. We were able to confirm that the false circling positions over Point Reyes appear in data from all available AIS providers (Orbcomm, Spire, and ExactEarth) and in AIS positions received by both satellites and terrestrial receivers.

The list of affected vessels below (Table 1) shows that many types of vessels in many different geographic locations have displayed this same pattern of AIS disruption. Some were in areas where GPS problems have been reported by others (the Eastern Mediterranean, the Sea of Azov, the Libyan coast); other locations seem random. A number of the vessels, though not all, appear near oil terminals and are involved in supporting offshore platforms.

TABLE 1.

Table 1: Vessels showing a pattern of false circling AIS positions. Reported locations are where circling tracks appeared (mainly at Point Reyes near San Francisco). Real locations are where the vessel was determined to be while broadcasting the false circling AIS track. AIS data courtesy of Global Fishing Watch / Orbcomm / Spire.

The presence of three of these vessels in areas of documented GPS interference is intriguing. The cargo ship Berezovets, shown below, was operating in one such area in the Sea of Azov, north of the Black Sea. Following the Russian annexation of Crimea in 2014 and the takeover of Eastern Ukraine by Russian-backed separatists, the front line in the ongoing civil war has cut through Eastern Ukraine north of the Sea of Azov. There have also been conflicts on the water and a Russian blockade of the Kerch Strait, which leads north from the Black Sea.

Image 5: The Russian-flagged cargo ship Berezovets transits through the Sea of Azov in June 2019 and has its AIS track jump suddenly to Point Reyes near San Francisco (inset). Incidents of documented GPS disruption occurred in March 2019 east of the Bilosarai Spit and in July 2019 in the city of Starohnativka. AIS data courtesy of Global Fishing Watch / Orbcomm / Spire.

The Russian-flagged Berezovets transited the Kerch Strait on June 3, 2019, then headed northeast towards Russian ports, passing south of the conflict zone. As the vessel entered Russian waters (location 1 in Image 5) and anchored, its June 4-8 AIS positions were scrambled, some appearing scattered as far as 20 miles from the vessel's anchor point. The track then moved east towards port before jumping 20 miles north to a point on land (2), and then about 11,000 miles west to circle above Point Reyes (3). This circling continued for about 60 hours, from June 11 – 14, including some irregular positions extending about 40 miles into the Pacific. As with the Princess Janice track, it's unclear why the false track would jump to California, or what accounts for the individual variations among the different tracks appearing at this location. On June 14, 2019 the Berezovets AIS track jumped back to the vessel's real location, now in the Russian port of Azov (4), and the ship can then be seen proceeding eastward up the River Don.

The unusual disruption in the Berezovets AIS track was both preceded and followed by similar reported GPS disruptions in the same region. On March 7, 2019, a Ukrainian military website reported that three vessels on the Sea of Azov had experienced failures in their navigation systems. One of these failures occurred the day before, east of the Bilosarai Spit (see Image 5); the other two reported disruptions occurred in the preceding month at other locations in the Sea of Azov. And on July 23, 2019, according to a report from the Organization for Security and Co-operation in Europe's Special Monitoring Mission to Ukraine, a UAV (unmanned aerial vehicle) flying over the city of Starohnativka was one of several UAVs that experienced GPS interference assessed as likely caused by jamming. While not conclusive, the proximity of these other reported incidents makes it possible that the disruption seen in the Berezovets track resulted from the GPS interference known to be occurring in the area.

Two other vessels were also in areas with documented GPS disruptions: the Suha Queen II, approaching the coast of Libya, and the Haj Sayed I, transiting from the Suez Canal to eastern Turkey. However, in searching for vessels showing the same circling pattern seen over Point Reyes, I have not yet found multiple vessels in areas like the Sea of Azov that were similarly affected. Global AIS data does show a few vessels with tracks circling over other locations. Two pilot vessels on the Chilean coast had their broadcast positions suddenly jump to circling tracks over Madrid, and the Suha Queen II had its track jump to the Chinese city of Shanwei. The most recent vessel to appear circling over Point Reyes is the Ting Yuk, a tugboat operating in Hong Kong, which had its AIS track disrupted for a few hours at the end of March.

So far it remains a mystery why these circling AIS tracks are appearing specifically at Point Reyes and a handful of other locations. It's tempting to speculate that there might be some connection to a major U.S. Coast Guard communication station at Point Reyes, an important historic site in the development of maritime communications technology. While the Coast Guard left the area several years ago, volunteers at Point Reyes continue to maintain the only operational ship-to-shore maritime radio station. Still, it's unclear why this location would show up in false AIS tracks. The fact that individual vessels in so many different locations have been affected is puzzling, and it's unknown whether any of these examples reflect actual disruptions of the GPS system. However, some studies, such as a yearlong cruise by researchers at the German Aerospace Center that measured instances of GPS interference even during high-seas transits, suggest that we may still have a great deal to learn about the true extent of global disruptions to this critical navigation system.