PA and WV Drilling Alerts have Moved to SkyTruth Alerts

If you’ve been on the Pennsylvania Drilling Alerts or West Virginia Drilling Alerts pages lately, you know that they’ve been semi-broken for a while. The technology we’re using on the Drilling Alerts pages is pretty old and will be retired soon. However, you can now do the same county monitoring in SkyTruth Alerts. We’d love it if you’d take it for a spin and tell us what you think.

The PA and WV Drilling Alerts pages have been semi-broken for a while.

We’ve set up two public accounts at SkyTruth Alerts — one for Pennsylvania counties and one for West Virginia counties — that will let you view county alerts in pretty much the same way you did on the Drilling Alerts pages, and with some extra features that we use in-house and hope you’ll find useful too.

To view Drilling Alerts at SkyTruth Alerts:

  1. Go to https://alerts.skytruth.org
  2. Select Login from the top right of the map.  Log in using the UserID and Password information below.

    UserID: Pennsylvania or WestVirginia (no spaces)
    Password: skytruth
  3. Select the My AOIs tab from the left sidebar and choose a county.
  4. Select the Alerts tab from the left sidebar and choose which alerts you want to see.
  5. You can opt to view only alerts within the county you selected and view alerts for a particular date range (Alerts tab).
  6. You can also view near-real-time satellite imagery to help you assess what’s happening on the ground (My AOIs tab).

If you plan to keep using SkyTruth Alerts, consider creating your own account. You’ll be able to keep your settings instead of having to select them every time you log in, and you can optionally receive email notifications when new alerts come in. If you have comments, suggestions, questions, etc., contact us at feedback@skytruth.org.

Allegheny County Drilling App Receives Its First Update

The SkyTruth app that maps potential drillout scenarios across the landscape of Allegheny County, PA has officially received its first update! In an effort to make the experience more user-friendly, explanatory text and tips have been added. Our app has also been updated to remove from the drillout scenario areas such as major highways and the Pittsburgh International Airport, where drilling would obviously not take place.

A screenshot of the app when first initialized.

At the request of some users, we’ve also tabulated the results for the potential drillout scenarios by municipality.  See the results in this table showing the number of occupied structures within two miles of a hypothetical drilling site, based on a given setback distance (in feet) and drilling site spacing (in acres), for every township and borough.  

We were also asked to calculate the number of occupied structures located within 500 feet, and within two miles, of existing Marcellus Shale drilling and fracking sites. According to our analysis, 78 occupied structures fall within 500 feet of an active drilling site in Allegheny County, and 67,673 occupied structures sit within two miles of one. Recent scientific research has found human health impacts for people living within two miles of a drilling site.

Be sure to check out these insightful new updates for yourself.  Give the app a try and let us know what you think by contacting Brendan at info@skytruth.org with any feedback you might have!

Using machine learning to map the footprint of fracking in central Appalachia

Fossil fuel production has left a lasting imprint on the landscapes and communities of central and northern Appalachia. Mountaintop mining operations, pipeline rights-of-way, oil and gas well pads, and hydraulic fracturing wastewater retention ponds dot the landscapes of West Virginia and Pennsylvania. And although advocacy groups have made progress pressuring regulated industries and state agencies for greater transparency, many communities in central and northern Appalachia are unaware of, or unclear about, the extent of the human health risks they face from exposure to these facilities.

A key challenge is the discrepancy that often exists between what is on paper and what is on the landscape.  It takes time, money, and staff (three rarities for state agencies always under pressure to do more with less) to map energy infrastructure, and to keep those records updated and accessible for the public.  But with advancements in deep learning, and with the increasing amount of satellite imagery available from governments and commercial providers, it might be possible to track the expansion of energy infrastructure—as well as the public health risks that accompany it—in near real-time.

Figure 1.  Oil and gas well pad locations, 2005 – 2015.

Mapping the footprint of oil and gas drilling, especially unconventional drilling or “fracking,” is a critical piece of SkyTruth’s work. Since 2013, we’ve conducted collaborative image analysis projects called “FrackFinder” to fill the gaps in publicly available information about the location of fracking operations in the Marcellus and Utica Shale. In the past, we relied on several hundred volunteers to identify and map oil and gas well pads throughout Ohio, Pennsylvania, and West Virginia. But we’ve been working on a new approach: automating the detection of oil and gas well pads with machine learning. Rather than train several hundred volunteers to identify well pads in satellite imagery, we developed a machine learning model that could be deployed across thousands of computers simultaneously. Machine learning is at the heart of many of today’s technology products. It’s the technology that enables Netflix to recommend new shows that you might like, and that allows digital assistants like Google Assistant, Siri, or Alexa to understand requests like, “Hey Google, text Mom I’ll be there in 20 minutes.”

Examples are at the core of machine learning.  Rather than try to “hard code” all of the characteristics that define a modern well pad (they are generally square, generally gravel, and generally littered with industrial equipment), we teach computers what they look like by using examples.  Lots of examples. Like, thousands or even millions of them, if we can find them. It’s just like with humans: the more examples of something that you see, the easier it is to recognize that thing later. So, where did we get a few thousand images of well pads in Pennsylvania?  

We started with SkyTruth’s Pennsylvania oil and gas well pad dataset. The dataset contains well pad locations identified in National Agriculture Imagery Program (NAIP) aerial imagery from 2005, 2008, 2010, 2013, and 2015 (Figure 1).  We uploaded this dataset to Google Earth Engine, and used it to create a collection of 10,000 aerial images in two classes: “well pad” and “non-well pad.” We created the training images by buffering each well pad by 100 meters, clipping the NAIP imagery to the bounding box, and exporting each image.
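The buffer-and-clip step can be illustrated in plain Python. This is only a minimal sketch, not the Earth Engine script itself: it assumes each well pad is represented by a single point in a projected coordinate system measured in meters, and the names `BBox` and `buffered_bbox` are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned bounding box in projected coordinates (meters)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def buffered_bbox(x: float, y: float, buffer_m: float = 100.0) -> BBox:
    """Return the bounding box of a circular buffer around a point.

    Assumes x and y are in a projected CRS whose units are meters.
    """
    if buffer_m <= 0:
        raise ValueError("buffer_m must be positive")
    return BBox(x - buffer_m, y - buffer_m, x + buffer_m, y + buffer_m)


# Made-up well pad location (meters), used only to show the arithmetic.
box = buffered_bbox(500_000.0, 4_500_000.0)
width_m = box.max_x - box.min_x    # the clipped chip is 200 m on a side
height_m = box.max_y - box.min_y
```

With a 100 meter buffer, every training image covers a 200 m by 200 m square centered on the well pad, which is why the access road and surrounding clearing usually end up in the frame.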

 

The images above show three training examples from our “well pad” class. The images below show three training examples taken from our “non-well pad” class.

 

We divided the dataset into three subsets: a training set with 4,000 images of each class, a validation set with 500 images of each class, and a test set with 500 images of each class.  We combined this work in Google Earth Engine with Google’s powerful TensorFlow deep learning library.  We used our 8,000 training images (4,000 from each class, remember) and TensorFlow’s high-level Keras API to train our machine learning model.  So what, exactly, does that mean? Well, basically, it means that we showed the model thousands and thousands of examples of what well pads are (i.e., images from our “well pad” class) and what well pads aren’t (i.e., images from our “non-well pad” class).  We trained the model for twenty epochs, meaning that we showed the model the entire training set (8,000 images, remember) twenty times.  So, basically, the model saw 160,000 examples, and over time, it “learned” what well pads look like.
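The split and the epoch arithmetic can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than our training code: the file names are placeholders, the function name `split_class` is hypothetical, and the random shuffle before splitting is an assumption (the post does not say how images were assigned to each subset).

```python
import random


def split_class(items, n_train=4000, n_val=500, n_test=500, seed=0):
    """Shuffle one class's images and cut them into train/val/test lists."""
    needed = n_train + n_val + n_test
    if len(items) < needed:
        raise ValueError(f"need {needed} items, got {len(items)}")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:needed]
    return train, val, test


# Placeholder file names standing in for the exported images.
well_pads = [f"well_pad_{i}.png" for i in range(5000)]
non_well_pads = [f"non_well_pad_{i}.png" for i in range(5000)]

pad_train, pad_val, pad_test = split_class(well_pads)
non_train, non_val, non_test = split_class(non_well_pads)

train_size = len(pad_train) + len(non_train)  # 4,000 per class = 8,000 images
epochs = 20
examples_seen = train_size * epochs           # 8,000 x 20 = 160,000
```

Keeping the validation and test images out of the training lists is what makes the reported accuracy meaningful: the model is scored on images it never saw during training.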

Our best model run returned an accuracy of 84%, precision and recall measures of 87% and 81%, respectively, and a false positive rate and false negative rate of 0.116 and 0.193, respectively.  We’ve been pleased with our initial model runs, but there is plenty of room for improvement. We started with the VGG16 model architecture that comes prepackaged with Keras (Simonyan and Zisserman 2014, Chollet 2018).  The VGG16 model architecture is no longer state-of-the-art, but it is easy to understand, and it was a great place to begin.  
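For readers less familiar with these measures, here is a minimal sketch of how each one is derived from the four counts in a confusion matrix. The counts below are invented for illustration on a 1,000-image test set (500 per class); they are not our model's actual confusion matrix, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from raw counts."""
    positives = tp + fn        # images that really are well pads
    negatives = tn + fp        # images that really are not
    predicted_pos = tp + fp    # images the model labeled as well pads
    if min(positives, negatives, predicted_pos) == 0:
        raise ValueError("each class and the predicted-positive set must be non-empty")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "precision": tp / predicted_pos,
        "recall": tp / positives,
        "false_positive_rate": fp / negatives,
        "false_negative_rate": fn / positives,
    }


# Invented counts, for illustration only.
metrics = classification_metrics(tp=404, fp=58, tn=442, fn=96)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that recall and the false negative rate always sum to one, so a recall near 81% and a false negative rate near 0.19 are two views of the same result.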

After training, we ran the model on a few NAIP images to compare its performance against well pads collected by SkyTruth volunteers for our 2015 Pennsylvania FrackFinder project.  Figures 4 and 6 depict the model’s performance on two NAIP images near Williamsport, PA. White bounding boxes indicate landscape features that the model predicted to be well pads.  Figures 5 and 7 depict those same images with well pads (shown in red) delineated by SkyTruth volunteers.

Figure 4.  Well pads detected by our machine learning algorithm in NAIP imagery from 2015.
Figure 5.  Well pads detected by SkyTruth volunteers in NAIP imagery from 2015.
Figure 6.  Well pads detected by our machine learning algorithm in NAIP imagery from 2015.
Figure 7.  Well pads detected by SkyTruth volunteers in NAIP imagery from 2015.

One of the first things that stood out to us was that our model is overly sensitive to strong linear features. In nearly every training example, there is a clearly defined access road that connects to the well pad. As a result, the model regularly classified large patches of cleared land or isolated developments (e.g., warehouses) at the end of a linear feature as well pads. Another major weakness is that our model is overly sensitive to active well pads. Active well pads tend to be large gravel squares with clearly defined edges. Although these well pads may be the biggest concern, there are many “reclaimed” and abandoned well pads that lack such clearly defined edges. Regrettably, our model is overfit to highly visible active well pads, and it performs poorly on lower-visibility drilling sites that have lost their square shape or that have been revegetated by grasses.

Nevertheless, we think this is a good start. Despite a number of false detections, our model was able to detect all of the well pads previously identified by volunteers in Figures 5 and 7 above. In several instances, false detections consisted of energy infrastructure that, although not active well pads, remains of high interest to environmental and public health advocates as well as state regulators: abandoned well pads, wastewater impoundments, and recent land clearings. NAIP imagery is only collected every two or three years, depending on funding. So, tracking the expansion of oil and gas drilling activities in near real-time will require access to a high-resolution, near real-time imagery stream (like Planet, for instance). For now, we’re experimenting with more current model architectures and with reconfiguring the model for semantic segmentation: extracting polygons that delineate the boundaries of well pads, which researchers and our partners working on the ground can then analyze in mapping software.

Keep checking back for updates.  We’ll be posting the training data that we created, along with our initial models, as soon as we can.

 

From Trial to Triumph: Learning to Code at SkyTruth

When I was brought on to intern at SkyTruth, one of our analysts, Ry Covington, presented me with a rather open-ended challenge: “Here at SkyTruth, we want our internships to be a learning experience first and foremost. With that said, you can learn almost anything you want while working here. Think about your main goals, what you hope to get out of these next few months and let me know so that we can decide together how best to help you achieve your goal.” This isn’t verbatim, but it hits the mark with regard to the message he was conveying. I already knew the answer to this question, because it was something that I had thought about even before I started at SkyTruth: I had a strong desire to learn programming, especially as it pertains to the field of geospatial analysis. Now mind you, this is coming from someone who is brand new to programming in any respect. I had one class in C++ my sophomore year of college and all of that information has since slipped my mind, as most unimportant (or at least I regarded it as unimportant at the time) college course material does. I had also attempted to teach myself Python the summer before I started at SkyTruth while rehabilitating a nasty toe fracture, but what I picked up didn’t quite cover what I hoped to achieve with coding. I only knew the foundational pieces of what it means to communicate with a computer through coding, such as iterators, variables, and Boolean operators.

Around a month or so into the internship I received my opportunity to work towards my goal. Ry had a project he was doing with the Heinz Foundation that involved creating a map which detailed potential well pad sites in Allegheny County, PA (to see my blog post detailing the final results of this study, follow this link). We worked out a research approach and came up with a plan to map out these locations that seemed logical to both of us. Great. Now here’s an idea: let’s take all of those things that we want to incorporate into this project and write that into a Google Earth Engine (EE) script. On top of that, let’s also publish an app which uses this same Earth Engine script. These suggestions made by Ry seemed all well and good, until the moment he delegated the task of writing this script to me. In his words, “It’s a steep learning curve, but you’re a smart guy. You’ll figure it out!” He left the room as he was making this statement, leaving me stunned like a deer in headlights. Yes, I wanted to learn to program, but this seemed excessive! After a few moments of processing, I collected myself and turned to the most trustworthy source of them all for guidance: Google Search. Looking up “Google Earth Engine tutorial” led me to a website which detailed the functions you can use to make neat little UI plugins in your EE code. After a day of playing with them and applying some of the sample code they offered, I figured out how to display the elements of interest on the map. Following this discovery, I excitedly called to Ry to come and look at what I had made. He took one look at what I had done and said, “Good job Brendan, that’s an excellent start. Keep going.” Wait, this wasn’t what I was trying to get? It was an excellent start, which meant I was nowhere near the end. So it was time to expand my search for information.


An example of the guides found on Google’s official Earth Engine site.

At Ry’s suggestion, I looked at the code that one of SkyTruth’s other analysts, Christian Thomas, had written in EE for global ocean monitoring. This turned out to be an invaluable source of information for me but it was intimidating at first glance. The script itself is composed of over 500 lines of intensely in-depth code that made no sense to me whatsoever. On more than one occasion, I found myself becoming lost in the script’s intricacies while trying to find answers to my own predicament. Nevertheless, I remained determined to figure out how to compose my script, so I proceeded forward with the examples Christian’s code offered me. My program would involve creating a panel on which I’d load on a few buttons with different unique functions that built off of each other to display different text and layers on the panel and the map, respectively. Needless to say, my first few stabs at all of this led me to fall flat on my face. I could create the buttons that I wanted, but I couldn’t for the life of me figure out how to put the buttons on the side panel that I’d made. I spent roughly two weeks trying to get over this hurdle.

It was nice to have the encouragement of my mentors to keep me motivated to seek the answer to this question. Without Christian and Ry’s calm demeanors and calculated approach to solving these programming issues, I probably would’ve lost hope quickly. Days of work would go by and I’d spend a lot of time carefully combing through my code trying to discover the answer to my plight. Even when I was at home at night, I’d think about ways I could get to the result I desired. Then one day, all at once, it hit me. It was like the roadblock in my mind was slammed into by a Mack truck and just like that I figured out how to get the buttons to display on the side panel. To describe the feeling as a whole is difficult, but I think I could compare it to the way I felt as a little kid on Christmas morning when I rushed down the stairs and saw presents underneath the Christmas tree. I jumped up out of my seat, did some silent yelling, and added in a few fist pumps for good measure. I was elated! The cool part about it was that once I broke through that wall, it was like the rest came naturally. My fingers fluttered across my keyboard as I coded exactly what I had in mind for this project. Before I knew it I had an EE script that was logical and effective. I remember when Ry and Christian first looked at my latest script and they had nothing but good things to say about what I had accomplished. That positivity made me feel great; it was a huge payoff after a great deal of tribulation. I had finally done it!

A screenshot from my finished app.

Now that I reflect on the path I followed to create this code, I’m astounded by how much I learned. I learned more in those few chaotic weeks than I would typically learn over the course of a semester-long college course! It was a chaos that you can’t quite replicate in the classroom because there aren’t people depending on you when you’re doing homework or taking a test. The urgency to make your product the finest ever conceived is palpable. You do the best that you can for your coworkers, who end up being more like your friends here at SkyTruth. I think that hits the nail on the head when considering the mission of SkyTruth overall; we’re all here to provide information for others so that they can do meaningful work in defending our precious environment. Every task you work on here could have an effect on your world, so you do everything you can to give every project your all. I definitely got way more than I bargained for when I said that I wanted to learn to program at SkyTruth. Despite that, I couldn’t be more satisfied with the work I was able to do over my first few months here.

If you would like to view my completed app, please click here!

Mapping potential “drill out” scenarios in Allegheny County, Pennsylvania

SkyTruth has just launched its first Google Earth Engine app, detailing potential natural gas drilling scenarios in Allegheny County, Pennsylvania.  If you’re interested, you can view the app here.

Hydraulic fracturing (fracking) has unlocked natural gas resources from formations like the Utica Shale and Marcellus Shale, resulting in an explosion of gas-drilling activity across the Mid-Atlantic states. One of the states sitting above this hot commodity is Pennsylvania; the state boasts a massive reserve of nearly 89.5 trillion cubic feet of dry natural gas, according to the US Energy Information Administration. In the thick of it all, Allegheny County, in the southwestern portion of the state, is one of the few counties where drilling activity has been relatively light. The county’s main defense against well drilling has been zoning regulations which require a “setback” between unconventional natural gas drilling sites and “occupied buildings.” At present, the minimum distance required between a well pad and a building is 500 feet (unless consent has been received from the building’s owner). However, this distance may not adequately protect human health, especially in communities surrounded by drilling. Municipal officials might want to consider alternative setbacks, based on the latest scientific research on the impacts of drilling on the health of nearby residents. This analysis evaluates a range of setback scenarios, and illustrates the likely drilling density and distribution of drilling sites across the county for each scenario.

To better understand the potential impact of drilling in Allegheny County, I analyzed several different “drill out” scenarios (Figure 1).  I developed our first Google Earth Engine app to give users a glimpse of how different setback distances and different well spacing intervals might impact the number of homes at risk from drilling impacts in the future.  Check out the analysis here.

Figure 1. A screenshot of the app when first launched.

To begin this analysis, I downloaded building footprint data for Allegheny County from the Pennsylvania Geospatial Data Clearinghouse.  Next, I downloaded shapefiles representing the centerlines of major rivers passing through the county, other hydrological features in Allegheny County, and county-owned roads from the Allegheny County GIS Open Data Portal. I also downloaded a TIGER shapefile representing Allegheny County’s Major Highways as of 2014, courtesy of the US Census Bureau. Setback distances of 500 feet, 1,000 feet, 1,500 feet, and 2,500 feet were used to buffer the center points of “occupied buildings” in the county. I selected the minimum and maximum setback distances based upon the current Pennsylvania setback laws (500 ft.) and a recently proposed and defeated setback distance from Colorado (2,500 ft.). The latter regulation, if passed, would have been the most restrictive regulation on fracking of any state.  The 1,000 and 1,500 foot setbacks are meant to serve as intervals between these two demonstrated extremes of zoning regulation. I also created buffers around rivers and streams as well as roads. I applied a 300 foot buffer to the centerlines of all rivers and streams in the county (based upon the current regulations). I also applied a 328 foot buffer to all major highways and a 40 foot buffer to all county roads. These three buffer zones remained constant throughout the project.  
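To make the setback idea concrete, here is a minimal sketch of the test a candidate site has to pass against building center points. It is not the Earth Engine code behind the app: it assumes made-up coordinates in feet on a flat local grid, treats a site exactly at the setback distance as allowed (an assumption), ignores the river and road buffers, and uses a hypothetical function name, `outside_setback`.

```python
import math

# Setback distances evaluated in the analysis, in feet.
SETBACKS_FT = (500, 1000, 1500, 2500)


def outside_setback(site, buildings, setback_ft):
    """True if `site` is at least `setback_ft` from every building point."""
    return all(math.dist(site, b) >= setback_ft for b in buildings)


# Made-up building center points and one candidate site (feet).
buildings = [(0.0, 0.0), (3000.0, 0.0)]
site = (1400.0, 0.0)  # 1,400 ft from the first building, 1,600 ft from the second

allowed = {s: outside_setback(site, buildings, s) for s in SETBACKS_FT}
for setback, ok in allowed.items():
    print(f"{setback:>5} ft setback: {'open' if ok else 'excluded'}")
```

The same location can be open to drilling under one setback and excluded under another, which is why the available acreage shrinks so quickly as the setback grows.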

After applying these buffer distances to rivers, roads, and buildings, I calculated how many acres of Allegheny County were potentially open to drilling. Using the currently required distance of 500 feet, there are approximately 53,000 acres potentially available for drilling in Allegheny County, PA (see Figure 2).

Figure 2: Screenshot from the app showing the available drilling area in Allegheny County (shown in grey) when considering the 500-foot setback from occupied structures.  Current well pad locations are denoted by red points on the map.

Using the setback distances that we identified (e.g., 500 feet, 1,000 feet, 1,500 feet, 2,500 feet), I wanted to visualize what different potential “drill out” scenarios might look like.  To do that, I had to decide how much space to leave between potential well sites. I chose to space out the potential drilling sites according to three different intervals: 40 acres per well, 80 acres per well, and 640 acres per well.  Calculating different setback distances and different spacing intervals allowed me to investigate the range of possible “drill out” in Allegheny County.  I calculated the number of new drilling sites that each “drill out” scenario could potentially support. I’ve summarized the results below:
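One way to picture these spacing intervals is to convert acres per well into the distance between neighboring sites. The sketch below assumes the sites sit on a regular square grid, which is an illustrative assumption and not necessarily how the app places them; the function name `grid_spacing_ft` is hypothetical.

```python
import math

SQ_FT_PER_ACRE = 43_560  # one acre is 43,560 square feet


def grid_spacing_ft(acres_per_well: float) -> float:
    """Side length, in feet, of a square cell covering `acres_per_well` acres."""
    if acres_per_well <= 0:
        raise ValueError("acres_per_well must be positive")
    return math.sqrt(acres_per_well * SQ_FT_PER_ACRE)


spacing = {acres: grid_spacing_ft(acres) for acres in (40, 80, 640)}
for acres, feet in spacing.items():
    print(f"{acres:>3} acres per well: about {feet:,.0f} ft between sites")
```

Under that assumption, 40-acre spacing puts sites a quarter mile (1,320 feet) apart, while 640-acre spacing puts them a full mile (5,280 feet) apart.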

Setback distance     40 acre well spacing   80 acre well spacing   640 acre well spacing
500 ft. setback      928                    465                    52
1,000 ft. setback    257                    156                    14
1,500 ft. setback    84                     48                     8
2,500 ft. setback    18                     10                     3

So, for example, a setback distance of 500 feet coupled with a spacing between well pads of 40 acres would allow for 928 new potential drilling locations. Taking into consideration the approximately 3 to 5 acres required to develop a well pad, this suggests that roughly 2,800 to 4,600 acres of land in Allegheny County (928 pads at 3 to 5 acres each) could be subjected to surface well development.

For each “drill out” scenario, I mapped the number of potentially supported wells, and I put a two-mile buffer around each point to simulate the potential zone of adverse health impacts (see Figure 3). I used the buffered points to calculate the number of “occupied structures” that would be at risk of exposure if a drilling site were built. The number of occupied structures at risk when considering each of the different scenarios is summarized in the table below:

Setback distance     40 acre well spacing   80 acre well spacing   640 acre well spacing
500 ft. setback      446,901                380,284                194,053
1,000 ft. setback    222,481                215,415                43,256
1,500 ft. setback    90,046                 60,919                 26,722
2,500 ft. setback    4,816                  4,524                  3,626

Figure 3. Screenshot from the app showing potential drill-out locations (shown in yellow), considering a 500-ft setback from occupied structures and a separation between potential drilling operations of 40 acres. Notice the area of the county potentially subjected to adverse health consequences considering a two-mile buffer zone (shown in black) around each of these locations.
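The counting step behind the table above can be sketched as follows. This is a simplified illustration with made-up coordinates in feet, not the analysis itself: it checks every structure against every site by brute force, and the function name `structures_at_risk` is hypothetical. A county-scale analysis would more likely rely on GIS tools or a spatial index.

```python
import math

FEET_PER_MILE = 5280


def structures_at_risk(structures, sites, radius_ft=2 * FEET_PER_MILE):
    """Count structures within `radius_ft` of at least one site.

    Each structure is counted once, even if several buffers overlap it.
    """
    return sum(
        1
        for s in structures
        if any(math.dist(s, site) <= radius_ft for site in sites)
    )


# Made-up drilling sites and occupied structures (feet on a flat grid).
sites = [(0.0, 0.0), (15000.0, 0.0)]
structures = [
    (1000.0, 0.0),    # near the first site
    (7500.0, 0.0),    # inside both two-mile buffers, still counted once
    (12000.0, 0.0),   # near the second site only
    (30000.0, 0.0),   # outside both buffers
]

at_risk = structures_at_risk(structures, sites)
print(f"{at_risk} of {len(structures)} structures fall within two miles of a site")
```

Counting each structure only once matters here because two-mile buffers around closely spaced sites overlap heavily, so summing per-site counts would overstate the number of homes at risk.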

Setback distances can be an important tool for municipal governments looking to rein in drilling to protect the health, safety, and quality of life of local residents. My analysis demonstrates how setback distances can help protect the public from the adverse impacts of oil and gas drilling in Allegheny County, Pennsylvania. Please be sure to check out the app here.