Training Computers to See What We See

To analyze satellite data for environmental impacts, computers need to be trained to recognize objects.

The vast quantities of satellite image data available these days provide tremendous opportunities for identifying environmental impacts from space. But for mere humans, there’s simply too much — there are only so many hours in the day. So at SkyTruth, we’re teaching computers to analyze many of these images for us, a process called machine learning. The potential for advancing conservation with machine learning is enormous. Once trained, computers can potentially detect features such as roads infiltrating protected areas, logging decimating wildlife habitat, mining operations growing beyond permit boundaries, and other landscape changes that reveal threats to biodiversity and human health. Interestingly, the cues we use to train computers are the same ones people rely on to identify objects.

Common Strategies for Detecting Objects

When people look at a photograph, they find it quite easy to identify shapes, features, and objects based on a combination of previous experience and context clues in the image itself. When a computer program is asked to describe a picture, it relies on the same two strategies. In the image above, both humans and computers attempting to extract meaning and identify object boundaries would use similar visual cues:

  • Colors (the bedrock is red)
  • Shapes (the balloon is oval)
  • Lines (the concrete has seams)
  • Textures (the balloon is smooth)
  • Sizes (the feet are smaller than the balloon)
  • Locations (the ground is at the bottom)
  • Adjacency (the feet are attached to legs)
  • Gradients (the bedrock has shadows)

While none of the observations in parentheses capture universal truths, they are useful heuristics: if you have enough of them, you can have some confidence that you’ve interpreted a given image correctly.

Pixel Mask

If our objective is to make a computer program that can find the balloon in the picture above as well as a human can, then we first need a way to compare the performance of computers and humans. One solution is to ask both a person and a computer to identify, or “segment,” all the pixels that are part of the balloon. If the results from the computer agree with those from the person, then it is fair to say that the computer has found the balloon. The results are captured in an image called a “mask,” in which every pixel is either black (not balloon) or white (balloon), like the following image.

However, unlike humans, most computers don’t wander around and collect experiences on their own. Computers require datasets of explicitly annotated examples, called “training data,” to learn to identify and distinguish specific objects within data. The black and white mask above is one such example. After seeing enough examples of an object, a computer will have embedded some knowledge about what differentiates balloons from their surroundings.
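To make these ideas concrete, here is a minimal sketch in Python (using NumPy and a toy 4×4 image; the arrays are made up for illustration) showing how a mask can be stored and how a computer’s guess can be scored against a person’s:

```python
import numpy as np

# A "mask" is just an array with one entry per pixel:
# 1 means "balloon", 0 means "not balloon".
human_mask = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])

# A hypothetical mask predicted by a computer for the same image.
computer_mask = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])

# Fraction of pixels on which the computer agrees with the person.
agreement = (human_mask == computer_mask).mean()
print(f"Pixel agreement: {agreement:.0%}")  # 15 of 16 pixels match, about 94%
```

In practice the masks are far larger, but the idea is the same: count the pixels on which the computer and the person agree.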

Well Pad Discovery

At SkyTruth, we are starting our machine learning process with oil and gas well pads. Well pads are the base of operations for most active oil and gas drilling sites in the United States, and we are identifying them as a way to quantify the impact of these extractive industries on the natural environment and neighboring communities. Well pads vary greatly in how they appear. Just take a look at how different these three are from each other.

Given this diversity, we need to provide the computer with many examples, so that the machine learning model we are creating can distinguish between important features that characterize well pads (e.g., having an access road) and unimportant ones that are allowed to vary (e.g., the shape of the well pad, or the color of its surroundings). Our team generates masks (the black and white pixel labels) for these images by hand and feeds them to the computer as “training data.” We provide both the image and its mask separately to the machine learning model, but for the rest of this post we will superimpose the mask in blue.
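For the curious, the blue overlay itself is straightforward to produce. Here is a minimal sketch using NumPy and Pillow; the file names are placeholders rather than files from our actual pipeline:

```python
import numpy as np
from PIL import Image

# Placeholder file names; in practice these are an aerial image chip
# and the hand-drawn well pad mask that goes with it.
image = np.array(Image.open("well_pad.png").convert("RGB"), dtype=np.float32)
mask = np.array(Image.open("well_pad_mask.png").convert("L")) > 127  # True where "well pad"

# Blend the original image with solid blue wherever the mask is set.
blue = np.array([0.0, 0.0, 255.0])
overlay = image.copy()
overlay[mask] = 0.5 * image[mask] + 0.5 * blue

Image.fromarray(overlay.astype(np.uint8)).save("well_pad_overlay.png")
```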

Finally, our machine learning model looks at each image (about 120 of them), learns a little bit from the mask provided with it, and then moves on to the next image. After looking at each picture once, it has already reached 92% accuracy. But we can then tell it to go back and look at each one again (about 30 times), adding a little more detail to its learning on each pass, until it reaches almost 98% accuracy.
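This post doesn’t go into the software details, but for readers who want a feel for what that training loop looks like in code, here is a heavily simplified sketch. It assumes a Keras-style convolutional model and random placeholder arrays standing in for our images and masks; our real model and data are more involved.

```python
import numpy as np
import tensorflow as tf

# Placeholder arrays standing in for ~120 image chips and their hand-drawn masks.
images = np.random.rand(120, 128, 128, 3).astype("float32")
masks = np.random.randint(0, 2, size=(120, 128, 128, 1)).astype("float32")

# A deliberately tiny fully convolutional model. A real segmentation model
# (e.g., a U-Net) has many more layers; this is only a sketch.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(1, 1, activation="sigmoid"),  # per-pixel "well pad" probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# One pass over every image is an "epoch"; roughly 30 passes refine the model
# a little more each time.
model.fit(images, masks, epochs=30, batch_size=8)
model.save("well_pad_model.keras")  # saved so it can be reused for prediction later
```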

After the model is trained, we can feed it raw satellite images and ask it to create a mask that identifies all the pixels belonging to any well pads in the picture. Here are some actual outputs from our trained machine learning model:

The top three images show well pads that were correctly identified, and fairly well masked — note the blue mask overlaying the well pads. The bottom three images do not contain well pads, and you can see that our model ignores forests, fields, and houses very well in the first two images, but is a little confused by parking lots — it has masked the parking lot in the third image in blue (incorrectly), as if it were a well pad. This is reasonable, as parking lots share many features with well pads — they are usually rectangular, gray, contain vehicles, and have an access road. This is not the end of the machine learning process; rather, it is a first pass that tells us we need to capture more images of parking lots and train the model further on these negative examples.
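For completeness, here is how the prediction step described above might look in code, continuing the hypothetical model from the earlier training sketch (the file name and threshold are illustrative):

```python
import numpy as np
import tensorflow as tf

# Reload the model saved in the training sketch above (the path is a placeholder).
model = tf.keras.models.load_model("well_pad_model.keras")

# Placeholder arrays standing in for new satellite image chips.
new_images = np.random.rand(6, 128, 128, 3).astype("float32")

probabilities = model.predict(new_images)  # per-pixel well pad probability in [0, 1]
predicted_masks = probabilities > 0.5      # threshold into a binary yes/no mask

# Pixels marked True are the ones painted blue in the figures above.
```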

Image segmentation poses a number of challenges that we need to mitigate.

Biased Training Data

Predictions that the computer makes are based solely on training data, so it is possible for idiosyncrasies in the training dataset to be encoded (unintentionally) as meaningful. For instance, imagine a model that detects a person’s happiness from a picture of their face. If it is only shown open-mouth smiles in the training data, then when presented with real-world images it may classify closed-mouth smiles as unhappy.

This challenge often affects a model in unanticipated ways because the biases are frequently ones the data scientist holds without realizing it. We try to mitigate this by making sure that our training dataset comes from the same pool of images as those we need the model to classify automatically. Two examples of how biased data might creep into our work: training a machine learning model on well pads in Pennsylvania and then asking it to identify pads in California (bias in the data source), or training a model on well pads centered in the picture and then asking it to identify them when they are halfway out of the image (bias in the data preprocessing).
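One common way to guard against that second, preprocessing kind of bias is data augmentation: randomly cropping or shifting the training chips so that well pads do not always sit dead center. A minimal sketch, assuming NumPy arrays and made-up sizes:

```python
import numpy as np

rng = np.random.default_rng()

def random_crop(image, mask, crop_size=128):
    """Take the same random crop from an image tile and its mask, so that
    well pads land at varying positions (including partly off the edge)."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    return (image[top:top + crop_size, left:left + crop_size],
            mask[top:top + crop_size, left:left + crop_size])

# Placeholder arrays standing in for a larger tile and its hand-drawn mask.
tile = np.random.rand(256, 256, 3)
tile_mask = np.random.randint(0, 2, size=(256, 256))
image_chip, mask_chip = random_crop(tile, tile_mask)
```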

Garbage In, Garbage Out

The predictions that the computer makes can only be as good as the samples that we provide in the training data. For instance, if the person responsible for training accidentally includes the string of a balloon in half of the images created for the training dataset and leaves it out of the other half, then the model will be confused about whether or not to mask the string in its predictions. We try to mitigate this by adhering to strict, explicit guidelines about what constitutes the boundary of a well pad.

Measuring Success

In many machine learning systems, it is useful to measure success with two questions: was the guess right or wrong, and how confident was the guess? In image segmentation, however, simple pixel accuracy is not a great metric, because the model can be overwhelmed by an imbalance between the number of pixels in each class. For instance, imagine the task is to find a single ant on a large white wall. Out of 1,000 pixels, only one is gray. If your model works hard and labels that one gray pixel correctly (along with all the white ones), it earns 100% accuracy. However, a much simpler model could say there is no ant at all, that every pixel is white wall, and still be rewarded with 99.9% accuracy. This second model is practically unusable, but it is very easy for a training algorithm to achieve.

We mitigate this issue by using a metric known as the F-beta score, which for our purposes keeps very small objects from being ignored in favor of very large ones. If you’re hungry for a more technical explanation of this metric, check out the Wikipedia page.
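To see the difference concretely, here is a small sketch that scores the ant-on-a-wall example with both plain accuracy and the F-beta score (the beta value of 2 is purely illustrative; it weights recall more heavily than precision):

```python
import numpy as np

def f_beta(true_mask, predicted_mask, beta=2.0):
    """F-beta score for binary masks; beta > 1 weights recall more than precision."""
    tp = np.sum(predicted_mask & true_mask)
    fp = np.sum(predicted_mask & ~true_mask)
    fn = np.sum(~predicted_mask & true_mask)
    numerator = (1 + beta**2) * tp
    denominator = (1 + beta**2) * tp + beta**2 * fn + fp
    return numerator / denominator if denominator else 0.0

# The ant-on-a-wall example: 1,000 pixels, exactly one of them is the ant.
true_mask = np.zeros(1000, dtype=bool)
true_mask[0] = True

lazy_model = np.zeros(1000, dtype=bool)  # predicts "all wall, no ant"
good_model = true_mask.copy()            # finds the single ant pixel

for name, prediction in [("lazy", lazy_model), ("good", good_model)]:
    accuracy = np.mean(prediction == true_mask)
    print(f"{name}: accuracy = {accuracy:.1%}, F-beta = {f_beta(true_mask, prediction):.2f}")
# lazy: accuracy = 99.9%, F-beta = 0.00
# good: accuracy = 100.0%, F-beta = 1.00
```

The lazy model’s near-perfect accuracy evaporates under F-beta, which is exactly the behavior we want.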

Next Steps

In the coming weeks we will be creating an online environment in which our machine learning model can be installed and fed images with minimal human guidance. Our objective is to create two pipelines: one that feeds training data into the model so it can keep learning, and another that feeds new satellite images into the model so it can perform image segmentation and tell us the acreage occupied by these well pads.
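As a preview of what that second pipeline computes at the very end, here is a rough sketch of turning a predicted mask into acreage. The 10-meter pixel size is an assumption for illustration (it matches imagery such as Sentinel-2), not a detail of our actual pipeline:

```python
import numpy as np

def mask_to_acres(predicted_mask, pixel_size_m=10.0):
    """Convert a binary well pad mask into acreage.
    Assumes square pixels, pixel_size_m meters on a side."""
    well_pad_pixels = np.count_nonzero(predicted_mask)
    square_meters = well_pad_pixels * pixel_size_m ** 2
    return square_meters / 4046.86  # square meters per acre

# Example: a 128x128 chip in which 500 pixels were classified as well pad.
mask = np.zeros((128, 128), dtype=bool)
mask.ravel()[:500] = True
print(f"{mask_to_acres(mask):.1f} acres")  # about 12.4 acres at 10 m pixels
```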

We’ll keep you posted as our machine learning technology progresses. 

Note: Title was updated 10/2/19

New Writer–Editor Amy Mathews Joins SkyTruth Team

Telling SkyTruth’s stories

SkyTruth is both an intensely local and vibrantly global organization. We are based in Shepherdstown, West Virginia, and many of our highly talented staff are long-time residents (some were even born and raised here). That makes our work on Appalachian issues such as mountaintop mining and fracking personal – it’s happening in our backyard, typically with little oversight from government agencies. But confronting global environmental challenges sometimes means reaching beyond local borders and finding the right people to take on a task that no one else has tackled before. And so SkyTruth’s family of staff and consultants includes programmers and others from around the world, plus top-notch research partners at universities and other institutions.

The SkyTruth team in Shepherdstown, West Virginia. Photo by Hali Taylor.

As SkyTruth’s new Writer–Editor, I plan to bring you their stories in coming months, to add to the remarkable findings and tools the staff regularly share through this blog. We’ll learn more about the people whose passion propels our cutting-edge work. And we’ll learn more about all of you – the SkyTruthers who use these tools and information to make a difference in the world. We’ll share your impact stories: how you’ve made a difference in your neighborhood, state, nation, or the world at large.

To start, I’ll share a little bit about myself. As a kid, stories hooked me on conservation. I used to watch every National Geographic special I could find and never missed an episode of Wild Kingdom (remember that?). My fascination with all things wild led me to major in wildlife biology at Cornell University. But I quickly realized that I wasn’t a scientist at heart – I was more interested in saving creatures than studying them. I spent spring semester of my junior year in Washington, D.C. and shifted my focus to environmental policy. That decision led to dual graduate degrees in environmental science and public affairs at Indiana University and a long career in environmental policy analysis, program evaluation, and advocacy in Washington.

Urban life and policy gridlock eventually pushed me to Shepherdstown, where nature was closer at hand. I became involved in Shepherdstown’s American Conservation Film Festival, which reignited my passion for storytelling and the inspiration it can trigger. And so, after years of working and consulting for the federal government, conservation groups and charitable foundations, I returned to my conservation roots. I completed my M.A. in nonfiction writing at Johns Hopkins University in May 2013 and left my policy work behind.

Radio-collared Mexican wolf. Photo courtesy of U.S. Fish & Wildlife Service.

Since then, my writing has appeared in The Washington Post, Pacific Standard, Scientific American, High Country News, Wonderful West Virginia, and other outlets. In fact, my 2018 story on the endangered Mexican wolf for Earth Touch News recently won a Genesis Award from the Humane Society of the United States for Outstanding Online News. I was thrilled to be able to observe a family of wolves as part of my reporting for that story, and I always welcome new opportunities to go out in the field and learn about the important work conservationists are doing.

During my time as a freelance journalist, I also led workshops for the nonprofit science communication organization COMPASS, teaching environmental (and other) scientists how to communicate their work more effectively to journalists, policymakers, and others.

There’s one more thing I’d like to share: Although my official role at SkyTruth as Writer–Editor is new, I’ve known SkyTruth since its very beginning. I still remember the day SkyTruth founder John Amos and I sat down at our dining room table and he told me his vision for a new nonprofit. His goal was to level the playing field by giving those trying to protect the planet the same satellite tools used by industries exploiting the planet. John is my husband, and SkyTruth’s journey has been exciting, frightening, gratifying, and sometimes frustrating, with many highs and the occasional low. But it has never been boring.

I’m looking forward to sharing SkyTruth stories with all of you, making sure they move beyond the dining room table to your homes and offices, inspiring you, your colleagues, your friends and families to make the most of what SkyTruth has to offer. Feel free to reach out to me at info@skytruth.org if you’d like to share how you’ve used SkyTruth tools and materials. Just include my name in the subject line and the words “impact story.” Let’s talk!

Note: Portions of this text first appeared on the website amymathewsamos.com.