Systematic GPS Manipulation Occurring at Chinese Oil Terminals and Government Installations

Analysis reveals precise location and timing of GPS interference but purpose remains unclear.

Last month, an article in MIT Technology Review described strange GPS anomalies in Shanghai. I began investigating, and have now found evidence of a novel form of GPS manipulation occurring at 20 or more sites on the Chinese coast during the past year. The majority of these sites are oil terminals, but government installations in Shanghai and Qingdao also show the same striking pattern of interference in GPS positioning. We don’t know the reason for this interference. It may simply be a general security or anti-surveillance system, but it is also possible that it is intended to avoid scrutiny of imports of Iranian crude, which have recently come under U.S. sanctions. Whatever the intention, we are able to demonstrate here, through analysis of vessel tracking data, that this GPS interference can be pinpointed very precisely in both time and location.

According to the MIT Technology Review article, this phenomenon was first documented by the U.S. flagged container ship Manukai when the vessel entered the port of Shanghai in July. The captain noticed that the vessel’s AIS (Automatic Identification System) appeared to malfunction — vessels on the navigation screen appeared and disappeared without explanation and appeared to move when they were in fact stationary. AIS, originally designed for collision avoidance, transmits vessels’ GPS locations, courses, and speed every few seconds via VHF (very high frequency) radio. These signals are not only picked up by nearby vessels and terrestrial antennas, but some private companies have also launched satellites able to receive these signals. For this analysis we were able to use data made available by two of these companies, Spire and Orbcomm, through our research partnership with Global Fishing Watch.

An investigation by the non-profit C4ADS (Center for Advanced Defense Studies) showed that AIS vessel locations from hundreds of ships navigating Shanghai’s Huangpu river were coming up at false locations. Strangely, vessels on the river would have their GPS location jump to a ring of positions appearing on land. And this was not just affecting ships; looking at the cycling and running app Strava’s tracking map of cyclists, C4ADS also confirmed that this strange pattern of interference was affecting all GPS receivers.

To further investigate the GPS manipulation documented in Shanghai, I examined AIS position broadcasts from ships in the area. A distinct pattern emerged. Upon approaching the area of interference, a vessel’s broadcast position jumps from the vessel’s true location to a point on land where false AIS broadcasts occur in a ring approximately 200 meters in diameter. Many of the positions within the ring had speeds of precisely 31 knots or 21 knots (much faster than vessels would be moving near dock) and showed a course varying depending on the position within the ring. The GPS anomaly appears to affect vessels once they are a few kilometers out from the center of the ring. Once affected, vessels begin broadcasting seemingly random positions within the ring or from other high speed positions scattered around it.

Image 1. The Chinese cargo ship Huai Hia Ji 1 Hao (yellow) transits southeast on the Huangpu river. Upon nearing the center of GPS interference area the track jumps to the ring on land and to other random positions nearby. Positions from other affected vessels are shown in red. AIS data courtesy Global Fishing Watch / Orbcomm / Spire.

Image 2. GPS interference can be pinpointed based on this ring of false AIS positions. Approximately 200 meters in diameter, many of the positions in the ring had reported speeds near 31 knots (much faster than a normal vessel speed) and a course going counterclockwise around the circle. AIS data courtesy Global Fishing Watch / Orbcomm / Spire.

Because the ring of false AIS broadcasts follows this very specific pattern, I was able to query AIS tracking data to check whether these rings also occur at other locations. The results are striking. This GPS manipulation is occurring not only in Shanghai but has occurred in at least 20 locations in six Chinese cities within the past year. The focus of these apparent GPS manipulation devices is clearly oil terminals (where 16 of the 20 detected locations were observed). But three prominent office buildings in Shanghai and Qingdao are also affected: the Industrial and Commercial Bank of China in Shanghai, the Qingdao tax administration office, and the Qingdao headquarters of the Qingjian industrial group.
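The kind of query described above can be sketched roughly as follows. This is a minimal illustration, not the query actually run for this analysis: it assumes AIS positions are available as (latitude, longitude, speed) tuples, and the speed tolerance, grid cell size, and minimum count are hypothetical values chosen for the example. It also skips the check that positions fall on land, which would need a land mask.

```python
import math
from collections import defaultdict

RING_SPEEDS_KNOTS = (21.0, 31.0)  # speeds reported for positions in the rings
SPEED_TOLERANCE = 0.5             # assumed tolerance in knots (not from the article)
CELL_DEG = 0.01                   # assumed grid cell size, roughly 1 km


def is_ring_speed(speed_knots):
    """True if the reported speed is close to one of the characteristic ring speeds."""
    return any(abs(speed_knots - s) <= SPEED_TOLERANCE for s in RING_SPEEDS_KNOTS)


def find_ring_candidates(positions, min_count=20):
    """Group suspect positions into grid cells and return busy cells.

    positions: iterable of (lat, lon, speed_knots) tuples.
    Returns a list of (center_lat, center_lon, count), where the center is the
    mean of the suspect positions in a cell with at least min_count of them.
    """
    cells = defaultdict(list)
    for lat, lon, speed in positions:
        if is_ring_speed(speed):
            key = (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))
            cells[key].append((lat, lon))

    candidates = []
    for points in cells.values():
        if len(points) >= min_count:
            center_lat = sum(p[0] for p in points) / len(points)
            center_lon = sum(p[1] for p in points) / len(points)
            candidates.append((center_lat, center_lon, len(points)))
    return candidates
```

A fixed grid is the simplest possible grouping; a ring that straddles a cell boundary would be split across cells, so a real search would likely use a proper clustering step instead.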

Image 3. A ring of false AIS positions marks an apparent GPS interference device deployed in an office building identified as the Qingdao tax administration office. AIS data courtesy Global Fishing Watch / Orbcomm / Spire.

Image 4. Locations of detected GPS manipulation occurring in six Chinese cities in 2019. Interference following this pattern was not found beyond the Chinese coast.

It seems likely that the centers of these rings of false AIS positions actually mark the physical location of some sort of GPS disrupting device. A device having precisely this effect on GPS receivers, including shipborne AIS systems, has not been previously documented, though there have been other cases of GPS blocking and manipulation. Earlier this year C4ADS published a report with details on GPS manipulation clearly being carried out by the Russian government. These Russian systems appeared to have the effect of making all receiving devices within range show some particular location, such as a nearby airport, rather than the true location of the device. This was seen in one striking example of vessels approaching Putin’s alleged palace on the Black Sea coast.

This Chinese system is clearly being deployed both at central government offices and at the much more remote locations of oil terminals. In the case of the government office buildings it seems likely that these GPS disrupting devices were activated as a security measure. Some are only active for a few days, perhaps to coincide with the visit of an important official. However, the AIS manipulation occurring at oil terminals particularly interests us at SkyTruth. One possible motive for deploying GPS manipulation devices at oil terminals could be recent U.S. sanctions on Chinese companies importing Iranian crude. And the intentional disruption of a navigation safety system, in close proximity to crude oil storage, is a serious concern.

Almost half of the specific locations where these presumed GPS disrupting devices have been deployed are at oil terminals near Dalian in northeast China. In an August analysis, The New York Times matched Planet satellite imagery from June and July with AIS tracking data to show Iranian tankers delivering oil to China in violation of U.S. sanctions. The Financial Times also documented Chinese flagged tankers importing Iranian crude after ship to ship transfers with Iranian tankers.

I took a closer look at exactly how this GPS disruption is affecting vessel tracking in one oil terminal east of Dalian. Here I identified four locations where GPS disrupting devices appear to have been deployed in 2019. I compared AIS vessel position data from March 1, 2019 and September 5, 2019. The differences were dramatic.

These two days showed similar numbers of AIS positions in the area. But on September 5 approximately two-thirds of the vessel positions at dock disappeared and appeared to be replaced by positions orbiting the GPS disrupting devices or scattered randomly in the region. At the same time, it does appear that some normal AIS broadcasts are coming through and that the GPS disruption does not entirely mask all vessel movements in the area.

Image 5. On March 1, 2019 AIS vessel position data around an oil terminal east of Dalian China shows accurate vessel positions and speeds. On that date, none of the four locations of GPS interference were active. Consequently no vessel positions appear on land and stationary vessels are accurately shown with near 0 speeds (green). AIS data courtesy Global Fishing Watch / Orbcomm / Spire.

Image 6. On September 5, 2019 two GPS interference locations were active and this had a dramatic effect on scrambling vessel positions in the area. Many positions now appear orbiting the presumed GPS interference devices and others appear scattered on land. On the water many positions are appearing with very high speeds (over 25 knots, red) and it’s not possible to distinguish true and false locations. However some slow speed positions (green) are appearing at dock where they would be expected, so some AIS broadcasts appear to be unaffected. AIS data courtesy Global Fishing Watch / Orbcomm / Spire.

Image 7. The distribution of AIS speeds in the area is significantly altered by the activation of the GPS interference devices. Above, AIS speed distributions are compared between March 1 (left, no GPS interference) and September 5 (right, active GPS interference). On September 5 the total number of slow speed positions from docked vessels is greatly reduced and spikes now appear at 21 and 31 knots from positions orbiting the presumed GPS interference devices.
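A comparison of speed distributions like the one in Image 7 could be sketched along these lines. This is only an illustration under stated assumptions: the input is a plain list of reported speeds in knots for each day, and the 1-knot bin width and the `min_gain` threshold are hypothetical choices, not values from the analysis.

```python
import math
from collections import Counter


def speed_histogram(speeds_knots, bin_width=1.0):
    """Count AIS positions per speed bin; each key is the lower edge of a bin."""
    counts = Counter()
    for speed in speeds_knots:
        counts[math.floor(speed / bin_width) * bin_width] += 1
    return counts


def spike_bins(before, after, min_gain=10):
    """Return the bins whose count grew by at least min_gain between two days."""
    return sorted(b for b in after if after[b] - before.get(b, 0) >= min_gain)
```

On a day with active interference, one would expect the bins starting at 21 and 31 knots to stand out while the near-zero bin shrinks.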

I also examined one individual vessel track to see how it was affected by GPS interference. This is the Chinese flagged tanker Jin Niu Zuo, which entered the Dalian oil terminal on September 5. Initially a normal track is seen as the vessel approaches the terminal from the southeast. With closer proximity to the presumed interference device, scrambled positions (often with very high speeds) start to appear. Eventually almost all of the vessel’s AIS positions appear in the ring orbiting the interference device.

Image 8. The tanker Jin Niu Zuo approaches an oil terminal east of Dalian on September 5. Initially, positions with normal transit speeds appear (yellow). With closer proximity, scattered high speed positions begin to emerge (red) and eventually most positions appear in the ring surrounding the presumed AIS interference device. AIS data courtesy Global Fishing Watch / Orbcomm / Spire.

The timing of GPS interference at different sites on the Chinese coast can be inferred based on the appearance of AIS positions on land with 21 and 31 knot speeds. Of the 20 locations identified, interference appears earliest at office buildings in Qingdao but only over a couple of days (April 17 – 18, 2019). The first GPS interference at oil terminals appears in June and has continued until recently, but timing varies by location. Activation of interference at different terminals is intermittent and may be in response to specific events. For instance, at an oil terminal near Quanzhou GPS interference appears to have been activated only between September 25th and 27th, 2019.
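One way to infer these activation windows is sketched below. It assumes each AIS record has already been tagged with a nearby site label and an on-land flag (both hypothetical preprocessing steps that would require site coordinates and a land mask), and it uses an assumed 0.5 knot tolerance around the two characteristic speeds.

```python
from datetime import date

RING_SPEEDS_KNOTS = (21.0, 31.0)  # characteristic speeds of false positions
SPEED_TOLERANCE = 0.5             # assumed tolerance in knots


def interference_windows(records):
    """Summarize when each site shows the interference signature.

    records: iterable of (site, day, speed_knots, on_land) tuples, where
    site is a hypothetical label and on_land is a precomputed boolean.
    Returns {site: (first_day, last_day, number_of_active_days)}.
    """
    days_by_site = {}
    for site, day, speed, on_land in records:
        ring_speed = any(abs(speed - s) <= SPEED_TOLERANCE for s in RING_SPEEDS_KNOTS)
        if on_land and ring_speed:
            days_by_site.setdefault(site, set()).add(day)
    return {
        site: (min(days), max(days), len(days))
        for site, days in days_by_site.items()
    }
```

Reporting the number of active days alongside the first and last day helps distinguish continuous interference from intermittent activation within the same window.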

At the Dalian oil terminals GPS interference appears to have begun in late June 2019. It is possible that this was a reaction to increased scrutiny of crude imports after the U.S. ended exemptions for purchase of Iranian oil on May 2nd. In fact, Dalian is the headquarters of two subsidiaries of Cosco shipping which were sanctioned on September 25 for importing Iranian crude. Based on what can be seen with vessel activity in Dalian, it is clear that GPS interference is not able to entirely mask vessels approaching the terminal. However, it likely would make it impossible to reliably link a vessel’s AIS track with satellite imagery of a vessel discharging crude at dock. While it is not at all clear that GPS interference was intended to obscure shipping activity, we do see that it had a significant impact on AIS tracking and that the interference was specifically concentrated at oil terminals.

In the November article first documenting the strange GPS anomaly in Shanghai, the question was posed whether this was the work of the Chinese state or some other actor like a mafia engaged in smuggling river sand. Based on the very specific characteristics of the GPS manipulation observed and its deployment at high level installations, it seems very likely that the Chinese state is responsible. It remains to be seen whether this is simply a security measure or if GPS manipulation is also being deployed specifically to prevent monitoring of oil imports.

Fracking in Suburbia

What do you do when big oil moves in next door?

Karen Speed’s new house in Windsor, Colorado was supposed to be a peaceful retirement home. Now she plans to move.

Patricia Nelson wanted her son Diego to grow up the way she did – far from the petrochemical plants surrounding their home in Louisiana. So she moved back to Greeley, Colorado to be close to her family. Then she learned about the drilling behind Diego’s school.

Shirley Smithson had enjoyed her quiet community for years, riding her horse through her neighbor’s pastures, watching the wildlife, and teaching at local schools. When she learned that oil wells would be popping up down the street, she was in denial at first, she says. Then she took action. 

These women shared their stories with a group of journalists and others attending the Society of Environmental Journalists (SEJ) 2019 meeting in Fort Collins, Colorado last month. Fort Collins sits right next to Weld County – the most prolific county in Colorado for oil and gas production and among the most prolific in the entire United States. There, hydraulic fracturing (mostly for oil) has boomed, along with a population surge that is gobbling up farmland and converting open space into subdivisions. Often, these two very different types of development occur side-by-side. 

“We moved [into our house] in September, 2014,” Karen Speed told me, “and by the third week of January 2015, boy, I regretted building that house.” That was the week she learned that Great Western Oil and Gas Company, LLC, was proposing to put a well pad between two neighborhoods, and one of those neighborhoods was hers. When residents complained, she said, the company moved the site across a road and into a valley. “Which really isn’t the right answer,” Speed said. “Not in my backyard attitude? No – not in my town.” The well pad now sits next to the Poudre River and a bike path, according to Speed. “People I know no longer ride there. They get sick,” she said. “One guy I know gets nosebleeds. He had asthma already and gets asthma attacks after riding.”

Well pads in neighborhoods are not uncommon throughout parts of Colorado’s Front Range. Weld County alone has an estimated 21,800 well pads and produces roughly 88% of Colorado’s oil. SkyTruth’s Flaring Map reveals a high concentration of flaring sites occurring in that region. This industrial activity occurs within residential areas and farmland despite the fact that people living near fracking sites in Colorado complain of bloody noses, migraines, sore throats, difficulty breathing, and other health problems according to Nathalie Eddy, a Field Advocate with the nonprofit environmental group Earthworks.   

Image 1. Methane flaring locations from oil and gas wells in Weld County, CO. Image from SkyTruth’s Annual Flaring Volume Estimates from Earth Observation Group.

And then there was the explosion. Two years after Speed moved into her new home, on December 22, 2017, her house shook when a tank exploded at Extraction Energy’s Stromberger well pad four miles away. “When it exploded it really rocked the town,” she said. More than a dozen fire departments responded to the 30-foot high flames. “It went from 8:45 in the evening until the following morning before they could recover and get out of that space,” Speed recalls. According to a High Country News story, workers raced around shutting down operations throughout the site — 19 wells in all plus pipelines, tanks, trucks and other industrial infrastructure  — to prevent oil, gas, and other chemicals from triggering more explosions. Roughly 350 houses sat within one mile of the site and many more were within shaking range. One worker was injured. Dispatcher recordings released by High Country News reveal how dangerous the situation was, and how local fire departments were unprepared for an industrial fire of that magnitude.

That explosion occurred the very night Patricia Nelson returned home from a long day at the District Court in Denver. Nelson has been part of a coalition of public interest groups – including the NAACP, the Sierra Club, Wall of Women, and Weld Air and Water – that sued the Colorado agency responsible for overseeing oil and gas production in the state, the Colorado Oil and Gas Conservation Commission, for approving permits for 24 wells behind her son Diego’s school.  The company that would drill those wells was the same company overseeing the site that exploded – Extraction Energy.

Under Colorado law, oil and gas wells can be as close as 500 feet from a home and 1,000 feet from a school. Extraction’s new wells would be just over that limit and less than 1,000 feet from the school’s playing fields. Although the court hadn’t yet ruled, the company began construction on the site a few months later, in February 2018, and began drilling the wells that May. Ultimately, the District Court and the Appeals Court upheld the permits. Oil wells now tower over the Bella Romero Academy’s playing fields and the surrounding neighborhood of modest homes.

Smithson once taught at Bella Romero and worries about the kids. “When you have noise pollution and light pollution and dust and methane and all the things that come with having oil and gas production going on, kids are impacted physically. Their lungs aren’t developed…their immune systems aren’t totally developed and they are picking all this up,” she said. She has tried to mobilize the community but has been frustrated by the intimidation many parents feel. “This is a community without a voice,” she said. Bella Romero Academy is roughly 87% students of color, most of whom qualify for free or reduced lunch. “There are kids from Somalia, from war camps” attending the school, Smithson said. “They have trauma from the top of their head to their toes. They’re not going to speak up.” Both Smithson and Nelson pointed out that immigrants – whether from Somalia or Latin America – are unlikely to speak out because they fear retaliation from Immigration and Customs Enforcement. Moreover, some parents work for energy companies. They fear losing their jobs if they oppose an oil site near the school.

In fact, according to Smithson, Nelson, and Speed, Extraction Energy came to Bella Romero because it expected few parents would resist: the company originally proposed these wells adjacent to the wealthier Frontier Academy on the other side of town, where the student body is 77% white. Extraction moved the wells to Bella Romero after an outcry from the school community. This kind of environmental injustice isn’t unusual, and it generated attention from major media outlets, including the New York Times and Mother Jones. You can see how close the wells are to the school in this clip from The Daily Show (and on the SkyTruth image below).

Image 2. Extraction Energy’s fracking site near Bella Romero Academy in Greeley, CO. Image by SkyTruth.

SkyTruth has resources to help residents, activists, and researchers address potential threats from residential fracking. SkyTruth’s Flaring Map covers the entire world, and users can see flaring hotspots in their region (where energy companies burn off excess methane from drilling operations into the air) and document trends in the volume of methane burned over time. The SkyTruth Alerts system can keep people in Colorado, New Mexico, Wyoming, Montana, Utah, Pennsylvania, and West Virginia up-to-date on new oil and gas permits and new activities in their area of interest.

We know that residents and researchers using these kinds of tracking tools can have major impact. Johns Hopkins University researchers used SkyTruth’s FrackFinder program, which identified the location of fracking sites in Pennsylvania, to document health impacts in nearby communities. Those impacts included increases in premature births and asthma attacks. Maryland Governor Larry Hogan cited this information in his decision to ban fracking in his state. Those interested in collaborating with SkyTruth on similar projects should contact us.

Photo 1. Pump jacks at Extraction Energy’s Rubyanna site in Greeley, CO. Photo by Amy Mathews.

Although Colorado activists have had limited success so far, this past year did bring some positive changes. The Colorado General Assembly passed SB 181, which directs the Colorado Oil and Gas Conservation Commission to prioritize public health, safety, welfare, and the environment over oil and gas development. The new law also allows local governments to regulate the siting of oil and gas facilities in their communities and set stricter standards for oil and gas development than the state. Colorado agencies are still developing regulations to implement these new provisions.

Improvements in technology could help as well. The same day the SEJ crew met with concerned residents, a spokeswoman with SRC Energy explained the state-of-the-art operations at their Golden Eagle pad in Eaton, Colorado. That technology is designed to mitigate impacts on the surrounding community and includes a 40-foot high sound wall, a water tank on site to pump water from a nearby farm (which reduces truck traffic), and electric pumps (to reduce emissions), among other features. Still, the fear of being surrounded by industrial sites remains for many residents.

Photo 2. SRC Energy’s Golden Eagle Pad, Eaton, CO. Photo by Amy Mathews.

In the meantime, Karen Speed is starting to look elsewhere for a new home. Shirley Smithson has decided she’s not going to let an oil company ruin her life. And Patricia Nelson will continue to fight for her family.

“I think about moving all the time,” Nelson told the group of journalists, her voice cracking. “But my whole family lives here and I don’t feel I can leave them behind… My sister has five children and drives to Denver for work every day…. I have cousins with kids at this school and family friends. Really, moving isn’t an option for me.”

New Oil and Gas Flaring Data Available

Updated data means anyone can see where, and how much, natural gas is being flared in their area.

SkyTruth has updated its Annual Flare Volume map to include 2017 and 2018 data. We first launched the map in 2017 to provide site-specific estimates of the annual volume of gas flared during oil and gas production worldwide.

What is flaring?

Flaring is the act of burning off excess natural gas from oil wells when it can’t economically be stored and sent elsewhere. Flaring is also used to burn gases that would otherwise present a safety problem. But flaring from oil wells is a significant source of greenhouse gases. The World Bank estimated that 145 billion cubic meters of natural gas were flared in 2018, the equivalent of the entire gas consumption of Central and South America combined. Gas flaring also can negatively affect wildlife, public health, and even agriculture.

What can I do?

SkyTruth’s map allows users to search the data by virtually any geographic area they’re interested in, then easily compare and download flare volume totals from 2012 through 2018 to observe trends. In addition, it separates flaring into upstream (flaring of natural gas that emerges when crude oil is brought to the Earth’s surface), downstream oil (refineries) and downstream gas (natural gas processing facilities). Residents, researchers, journalists and others concerned about gas emissions in their city or study area can easily determine the sources of the problem using the latest data available, and how much gas has been flared.

VIIRS Satellite Instrument and the Earth Observation Group

The data we use in the SkyTruth map is a product of the Visible Infrared Imaging Radiometer Suite (VIIRS) satellite instrument, which produces the most comprehensive listing of gas flares worldwide. VIIRS data has moved to a new home this year at the Earth Observation Group in the Colorado School of Mines’ Payne Institute for Public Policy. SkyTruth also uses the VIIRS nightfire data in its popular flaring visualization map.

Thanks to the Earth Observation Group for continuing to make the nightfire data freely available to the public! They have authored the following papers for those interested in the VIIRS instrument and how the flare volume is calculated.

Elvidge, C. D., Zhizhin, M., Hsu, F.-C., & Baugh, K. (2013). VIIRS nightfire: Satellite pyrometry at night. Remote Sensing, 5(9), 4423-4449.

Elvidge, C. D., Zhizhin, M., Baugh, K. E., Hsu, F.-C., & Ghosh, T. (2015). Methods for global survey of natural gas flaring from Visible Infrared Imaging Radiometer Suite data. Energies, 9(1), 1-15.

Elvidge, C. D., Bazilian, M. D., Zhizhin, M., Ghosh, T., Baugh, K., & Hsu, F.-C. (2018). The potential role of natural gas flaring in meeting greenhouse gas mitigation targets. Energy Strategy Reviews, 20, 156-162.

What About the Oceans? Mapping Offshore Infrastructure

Mapping stationary structures in the ocean helps us track fishing vessels and monitor pollution more effectively.

We’re all accustomed to seeing maps of the terrestrial spaces we occupy. We expect to see cities, roads and more well labeled, whether in an atlas on our coffee table or Google Maps on our smartphone. SkyTruthers even expect to access information about where coal mines are located or where forests are experiencing regrowth. We can now see incredibly detailed satellite imagery of our planet. Try looking for your house in Google Earth. Can you see your car in the driveway?

In comparison, our oceans are much more mysterious places. Over seventy percent of our planet is ocean, yet vast areas are described with only a handful of labels: the Pacific Ocean, Coral Sea, Strait of Hormuz, or Chukchi Sea for example. And while we do have imagery of our oceans, its resolution decreases drastically the farther out from shore you look. It can be easy to forget that humans have a permanent and substantial footprint across the waters of our planet. At SkyTruth, we’re working to change that.

Former SkyTruth senior intern Brian Wong and I are working to create a dataset of offshore infrastructure to help SkyTruth and others more effectively monitor our oceans. If we know where oil platforms, aquaculture facilities, wind farms and more are located, we can keep an eye on them more easily. As technological improvements fuel the growth of the ocean economy, allowing industry to extract resources far out at sea, this dataset will become increasingly valuable. It can help researchers examine the effects of humanity’s expanding presence in marine spaces, and allow activists, the media, and other watchdogs to hold industry accountable for activities taking place beyond the horizon.

What We’re Doing

Brian is now an employee at the Marine Geospatial Ecology Lab (MGEL) at Duke University. But nearly two years ago, at a Global Fishing Watch research workshop in Oakland, he and I discussed the feasibility of creating an algorithm that could identify vessel locations using Synthetic Aperture Radar (SAR) imagery. It was something I’d been working on, off and on, for a few weeks, and the approach seemed fairly simple.

Image 1. SkyTruth and Global Fishing Watch team members meet for a brainstorming session at the Global Fishing Watch Research Workshop, September 2017. Photo credit: David Kroodsma, Global Fishing Watch.

Readers who have been following SkyTruth’s work are probably used to seeing SAR images from the European Space Agency’s Sentinel-1 satellites in our posts. They are our go-to tools for monitoring marine pollution events, thanks to SAR’s ability to pierce clouds and provide high contrast between slicks and sea water. SAR imagery provides data about the relative roughness of surfaces. With radar imagery, the satellite sends pulses to the earth’s surface. Flat surfaces, like calm water (or oil slicks), reflect less of this energy back to the satellite sensor than vessels or structures do, and appear dark. Vessels and infrastructure appear bright in SAR imagery because they experience a double-bounce effect: because such structures are three-dimensional, the radar pulse typically bounces off more than one surface before returning to the satellite, producing a strong return. If you’re interested in reading more about how to interpret SAR imagery this tutorial is an excellent starting point.

Image 2. The long, dark line bisecting this image is a likely bilge dump from a vessel captured by Sentinel-1 on July 2, 2019. The bright point at its end is the suspected source. Read more here.

Image 3. The bright area located in the center of this Sentinel-1 image is Neft Daşları, a massive collection of offshore oil platforms and related infrastructure in the Caspian Sea.

Given the high contrast between water and the bright areas that correspond to land, vessels, and structures (see the vessel at the end of the slick in Image 2 and Neft Daşları in Image 3), we thought that if we could mask out the land, picking out the bright spots should be relatively straightforward. But in order to determine which points were vessels, we first needed to identify the location of all the world’s stationary offshore infrastructure, since it is virtually impossible to differentiate structures from vessels when looking at a single SAR image. Our simple task was turning out to be not so simple.

While the United States has publicly available data detailing the locations of offshore oil platforms (see Image 4), this is not the case for other countries around the world. Even when data is available, it is often hosted across multiple webpages, hidden behind paywalls, or provided in formats which are not broadly accessible or usable. To our knowledge, no one has ever published a comprehensive, global dataset of offshore infrastructure that is publicly available (or affordable).

Image 4. Two versions of a single Sentinel-1 image collected over the Gulf of Mexico, in which both oil platforms and vessels are visible. On the left, an unlabelled version which illustrates how similar infrastructure and vessels appear. On the right, oil platforms have been identified using the BOEM Platform dataset.

As we began to explore the potential of SAR imagery for automated vessel and infrastructure detection, we quickly realized that methods existed to create the data we desired. The Constant False Alarm Rate algorithm has been used to detect vessels in SAR imagery since at least 1988, but thanks to Google Earth Engine we are able to scale up the analysis and run it across every Sentinel-1 scene collected to date (something which simply would not have been possible even 10 years ago). To apply the algorithm to our dataset, we had to (among other steps) mask out the land and then set the threshold level of brightness that indicated the presence of a structure or vessel. Because both structures and vessels have high levels of reflectance, we then had to separate the stationary structures from vessels. We did this by compiling a composite of all images for the year 2017. Infrastructure remains stationary throughout the year, while vessels move. This allowed us to clearly identify the infrastructure.
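The two ideas in this paragraph, an adaptive brightness threshold and a yearly composite, can be illustrated with a small pure-Python sketch. It is not our Earth Engine implementation: it assumes each scene is a small 2D grid of backscatter values with land already masked out, uses a per-pixel median as the composite (the compositing statistic is an assumption), and picks hypothetical window sizes and a hypothetical threshold factor `k` for a simple cell-averaging form of Constant False Alarm Rate detection.

```python
from statistics import mean, median, pstdev


def median_composite(scenes):
    """Per-pixel median across a list of equally sized 2D grids.

    A target present in only a minority of scenes (a moving vessel)
    drops out of the median; a stationary structure remains.
    """
    rows, cols = len(scenes[0]), len(scenes[0][0])
    return [
        [median(scene[r][c] for scene in scenes) for c in range(cols)]
        for r in range(rows)
    ]


def cfar_detect(image, guard=1, background=3, k=3.0):
    """Flag pixels brighter than the local background mean plus k standard deviations.

    The background sample is the window of half-width `background` around the
    pixel, excluding the inner guard window of half-width `guard`.
    Returns a list of (row, col) detections in row-major order.
    """
    rows, cols = len(image), len(image[0])
    hits = []
    for r in range(rows):
        for c in range(cols):
            ring = [
                image[rr][cc]
                for rr in range(max(0, r - background), min(rows, r + background + 1))
                for cc in range(max(0, c - background), min(cols, c + background + 1))
                if abs(rr - r) > guard or abs(cc - c) > guard
            ]
            if len(ring) < 2:
                continue  # not enough background pixels to estimate a threshold
            threshold = mean(ring) + k * pstdev(ring)
            if image[r][c] > threshold:
                hits.append((r, c))
    return hits
```

Running the detector on a single scene returns both vessels and structures, while running it on the composite should leave only the targets that stayed in place across scenes.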

Image 5. An early version of our workflow for processing radar imagery to identify vessel locations. While the project shifted to focus on infrastructure detection first, many of the processing steps remained.

Where We Are Now

Our next step in creating the infrastructure dataset was testing the approach in areas where infrastructure locations were known. We tested the algorithm’s ability to detect oil platforms in the Gulf of Mexico, where the US Bureau of Ocean Energy Management (BOEM) maintains a dataset. We also tested the algorithm’s ability to identify wind turbines. We used a wind farm boundary dataset provided by the United Kingdom Hydrographic Office to validate our dataset, as well as information about offshore wind farms in Chinese waters verified in media reports, with their latitude and longitude available on Wikipedia.

Image 6. Wind farms in the Irish Sea, west of Liverpool.

Our results in these test areas have been very promising, with an overall accuracy of 96.1%. The methodology and data have been published by the journal Remote Sensing of Environment. Moving beyond these areas, we are continuing to work with our colleagues at MGEL to develop a full global dataset. What started as a project to identify vessels for GFW has turned into an entirely different, yet complementary, project identifying offshore infrastructure around the world.

Image 7. This animated map shows the results of our offshore infrastructure detection algorithm (red) compared to the publicly available BOEM Platform dataset (yellow).

In addition to helping our partners at Global Fishing Watch identify fishing vessels, mapping the world’s offshore infrastructure will help SkyTruth more effectively target our daily oil pollution monitoring work on areas throughout the ocean that are at high risk for pollution events from oil and gas drilling and shipping (such as bilge dumping). This is also the first step towards one of SkyTruth’s major multi-year goals: automating the detection of marine oil pollution, so we can create and publish a global map of offshore pollution events, updated on a routine basis.

Be sure to keep an eye out for more updates, as we will be publishing the full datasets once we complete the publication cycles.

Training Computers to See What We See

To analyze satellite data for environmental impacts, computers need to be trained to recognize objects.

The vast quantities of satellite image data available these days provide tremendous opportunities for identifying environmental impacts from space. But for mere humans, there’s simply too much — there are only so many hours in the day. So at SkyTruth, we’re teaching computers to analyze many of these images for us, a process called machine learning. The potential for advancing conservation with machine learning is tremendous. Once taught, computers potentially can detect features such as roads infiltrating protected areas, logging decimating wildlife habitat, mining operations growing beyond permit boundaries, and other landscape changes that reveal threats to biodiversity and human health. Interestingly, the techniques we use to train computers rely on the same techniques used by people to identify objects.

Common Strategies for Detecting Objects

When people look at a photograph, they find it quite easy to identify shapes, features, and objects based on a combination of previous experience and context clues in the image itself. When a computer program is asked to describe a picture, it relies on the same two strategies. In the image above, both humans and computers attempting to extract meaning and identify object boundaries would use similar visual cues:

  • Colors (the bedrock is red)
  • Shapes (the balloon is oval)
  • Lines (the concrete has seams)
  • Textures (the balloon is smooth)
  • Sizes (the feet are smaller than the balloon)
  • Locations (the ground is at the bottom)
  • Adjacency (the feet are attached to legs)
  • Gradients (the bedrock has shadows)

While none of the observations in parentheses capture universal truths, they are useful heuristics: if you have enough of them, you can have some confidence that you’ve interpreted a given image correctly.

Pixel Mask

If our objective is to make a computer program that can find the balloon in the picture above as well as a human can, then we first need a way to compare the performance of computers and humans. One solution is to ask both a person and a computer to identify, or “segment,” all the pixels that are part of the balloon. If the results from the computer agree with those from the person, then it is fair to say that the computer has found the balloon. The results are captured in an image called a “mask,” in which every pixel is either black (not balloon) or white (balloon), like the following image.

However, unlike humans, most computers don’t wander around and collect experiences on their own. Computers require datasets of explicitly annotated examples, called “training data,” to learn to identify and distinguish specific objects within data. The black and white mask above is one such example. After seeing enough examples of an object, a computer will have embedded some knowledge about what differentiates balloons from their surroundings.
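To make the idea of a mask concrete, here is a minimal sketch that compares a mask drawn by a person with one produced by a computer and reports the fraction of pixels on which they agree. The 4x4 masks and the function name agreement are hypothetical, made up for this illustration.

```python
def agreement(human_mask, computer_mask):
    """Fraction of pixels on which two same-sized masks agree."""
    pairs = [
        (h, c)
        for h_row, c_row in zip(human_mask, computer_mask)
        for h, c in zip(h_row, c_row)
    ]
    return sum(h == c for h, c in pairs) / len(pairs)


# Hypothetical 4x4 masks: 1 = balloon (white), 0 = not balloon (black).
human = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
computer = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],  # the computer missed one balloon pixel
    [0, 0, 0, 0],
]

print(agreement(human, computer))  # 0.9375 (15 of 16 pixels match)
```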

Well Pad Discovery

At SkyTruth, we are starting our machine learning process with oil and gas well pads. Well pads are the base of operations for most active oil and gas drilling sites in the United States, and we are identifying them as a way to quantify the impact of these extractive industries on the natural environment and neighboring communities. Well pads vary greatly in how they appear. Just take a look at how different these three are from each other.

Given this diversity, we need to provide the computer many examples, so that the machine learning model we are creating can distinguish between important features that characterize well pads (e.g. having an access road) and unimportant ones that are allowed to vary (e.g. the shape of the well pad, or the color of its surroundings). Our team generates masks (the black and white pixel labels) for these images by hand, and inputs them as “training data” into the computer. We provide both the image and its mask separately to the machine learning model, but for the rest of this post we will superimpose the mask in blue.

Finally, our machine learning model looks at each image (about 120 of them), learns a little bit from the mask provided with it, and then moves on to the next image. After looking at each picture once, it has already reached 92% accuracy. But we can then tell it to go back and look at each one again (about 30 times), and add a little more detail to its learning, until it reaches almost 98% accuracy.
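One full pass over the training images is commonly called an epoch. The loop below is a generic sketch of epoch-based training and not the well pad model: it swaps the image network for a one-feature logistic model that labels a pixel from its brightness alone, and the learning rate, sample values, and function name are assumptions for illustration.

```python
import math


def train(samples, epochs, lr=0.5):
    """Train a one-feature logistic model with per-sample gradient updates.

    samples is a list of (brightness, label) pairs with label 0 or 1.
    Returns the weight, the bias, and the training accuracy after each epoch.
    """
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        for x, y in samples:
            # Predicted probability that this sample belongs to class 1.
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            # Nudge the parameters a little toward the labeled answer.
            w -= lr * (p - y) * x
            b -= lr * (p - y)
        correct = sum((w * x + b > 0) == bool(y) for x, y in samples)
        history.append(correct / len(samples))
    return w, b, history


# Hypothetical training data: dark pixels are background, bright pixels are pad.
samples = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
           (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
w, b, history = train(samples, epochs=30)
print(history[0], history[-1])  # accuracy after the first and the last epoch
```

Each entry in history corresponds to one complete look at every sample, which is the same bookkeeping behind the "once" and "about 30 times" figures above.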

After the model is trained, we can feed it raw satellite images and ask it to create a mask that identifies all the pixels belonging to any well pads in the picture. Here are some actual outputs from our trained machine learning model:

The top three images show well pads that were correctly identified, and fairly well masked — note the blue mask overlaying the well pads. The bottom three images do not contain well pads, and you can see that our model ignores forests, fields, and houses very well in the first two images, but is a little confused by parking lots — it has masked the parking lot in the third image in blue (incorrectly), as if it were a well pad. This is reasonable, as parking lots share many features with well pads — they are usually rectangular, gray, contain vehicles, and have an access road. This is not the end of the machine learning process; rather it is a first pass through that informs us of a need to capture more images of parking lots and further train the model that those are negative examples.

Image segmentation comes with a number of challenges that we need to mitigate.

Biased Training Data

Predictions that the computer makes are based solely on training data, so it is possible for idiosyncrasies in the training data set to be encoded (unintentionally) as meaningful. For instance, imagine a model that detects a person’s happiness from a picture of their face. If it is only shown open-mouth smiles in the training data, then it is possible that when presented with real world images, it classifies closed-mouth smiles as unhappy.

This challenge often affects a model in unanticipated ways because those biases can originate with the data scientist. We try to mitigate this by making sure that our training dataset comes from the same set of images as those we need the model to classify automatically. Two examples of how biased data might creep into our work are: training a machine learning model on well pads in Pennsylvania and then asking it to identify pads in California (bias in the data source), or training a model on well pads centered in the picture and then asking it to identify them when they are halfway out of the image (bias in the data preprocessing).

Garbage In, Garbage Out

The predictions that the computer makes can only be as good as the samples that we provide in the training data. For instance, if the person responsible for training accidentally includes the string of a balloon in half of the images created for the training dataset and excludes it in the other half, then the model will be confused about whether or not to mask the string in its predictions. We try to mitigate this by adhering to strict, explicit guidelines about what constitutes the boundary of a well pad.

Measuring Success

In most other machine learning systems, it is useful to measure success as a product of two factors. First, was the guess right or wrong? And second, how confident was the guess? However, in image segmentation, that is not a great metric, because the model can be overwhelmed by an imbalance between the number of pixels in each class. For instance, imagine the task is to find a single ant on a large white wall. Out of 1000 pixels, only 1 is gray. If your model searches long and hard and masks that one pixel correctly, then it gets 100% accuracy. However, a much simpler model would say there is no ant, that every pixel is white wall, and get rewarded with 99.9% accuracy. This second model is practically unusable, but is very easy for a training algorithm to achieve.
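The arithmetic of the ant example is easy to check. In this sketch the wall is a flat list of 1000 pixel labels, with 1 marking the ant and 0 marking white wall; the function name pixel_accuracy is hypothetical.

```python
def pixel_accuracy(truth, predicted):
    """Fraction of pixels whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


truth = [0] * 999 + [1]   # 999 wall pixels and a single ant pixel
all_wall = [0] * 1000     # a model that never predicts "ant"

print(pixel_accuracy(truth, all_wall))  # 0.999

# The same model finds none of the ant pixels.
ants_found = sum(t == 1 and p == 1 for t, p in zip(truth, all_wall))
print(ants_found)  # 0
```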

We mitigate this issue by using a metric known as the F-beta score, which for our purposes keeps very small objects from being ignored in favor of very large ones. If you’re hungry for a more technical explanation of this metric, check out the Wikipedia page.
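For readers who prefer code to formulas, the standard definition of the F-beta score can be written in terms of true positives (TP), false positives (FP), and false negatives (FN) as (1 + beta²)·TP / ((1 + beta²)·TP + beta²·FN + FP). The sketch below is a minimal illustration: the function name is hypothetical, and returning 0.0 when there are no positive pixels at all is a convention chosen here, since the score is undefined in that case.

```python
def f_beta(truth, predicted, beta=1.0):
    """F-beta score for binary labels, where 1 marks the object class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    b2 = beta ** 2
    denominator = (1 + b2) * tp + b2 * fn + fp
    # Convention for this sketch: undefined scores are reported as 0.0.
    return (1 + b2) * tp / denominator if denominator else 0.0


truth = [0] * 999 + [1]   # one object pixel among 1000
all_background = [0] * 1000

print(f_beta(truth, all_background))  # 0.0
print(f_beta(truth, truth))           # 1.0
```

Because true negatives never enter the formula, a model that labels everything as background scores 0 here, no matter how many background pixels it gets right.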

Next Steps

In the coming weeks we will be creating an online environment in which our machine learning model can be installed and fed images with minimal human guidance. Our objective is to create two pipelines. The first allows training data to flow into the model, so it can learn. The second allows new images from satellites to flow into the model, so it can perform image segmentation and tell us the acreage dedicated to these well pads.
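Turning a predicted mask into acreage is a matter of counting masked pixels and multiplying by the ground area each pixel covers. The sketch below assumes square pixels, and the 10 meter pixel size in the example is a placeholder rather than the resolution of any particular imagery; the function name is hypothetical.

```python
SQUARE_METERS_PER_ACRE = 4046.8564224


def mask_acreage(mask, pixel_size_m):
    """Area in acres covered by the 1-valued pixels of a binary mask.

    pixel_size_m is the side length of one square pixel in meters
    (an assumption that depends on the imagery being used).
    """
    masked_pixels = sum(sum(row) for row in mask)
    return masked_pixels * pixel_size_m ** 2 / SQUARE_METERS_PER_ACRE


# Hypothetical 10 x 10 pixel well pad at 10 m per pixel: one hectare.
pad = [[1] * 10 for _ in range(10)]
print(round(mask_acreage(pad, pixel_size_m=10), 2))  # 2.47
```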

We’ll keep you posted as our machine learning technology progresses.

Update 2019-12-13:

In a major step forward, we set up Google SQL and Google Storage environments to house a larger database of training data, containing over 2000 uniquely generated polygons that cover multiple states in the Colorado River Basin. The GeoJSON is publicly available for download at https://skytruth-org.carto.com/maps. These data were used as fodder for a deep learning neural network, which was trained in this iPython notebook. We reached Dice scores of up to 86.3%. The trained models were then used to run inference on sites that were permitted for drilling to identify the extent of the well pads in this second iPython notebook.
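The Dice score mentioned above measures the overlap between a predicted mask and a hand-drawn one: twice the number of pixels masked in both, divided by the total number of pixels masked in each. For binary masks it is equivalent to the F-beta score with beta set to 1. The sketch below is a minimal illustration with a hypothetical function name and made-up masks, not the code from the notebooks; treating two empty masks as a perfect match is a convention chosen here.

```python
def dice_score(truth, predicted):
    """Dice coefficient between two binary masks given as flat lists."""
    overlap = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    total = sum(truth) + sum(predicted)
    # Convention for this sketch: two empty masks count as a perfect match.
    return 2 * overlap / total if total else 1.0


# Hypothetical masks: the prediction covers 3 of the 4 true pad pixels
# and adds 1 pixel that is not part of the pad.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

print(dice_score(truth, predicted))  # 0.75
```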