Mined the Overstep
Intern Rachel Pierce reports on her work at SkyTruth.
Prospecting
I first heard of SkyTruth at least a decade ago. I was amazed! Like the Eye of Sauron (but for doing good), they monitored polluters and assessed biodiversity. All right around the corner from my classes in that tiny place called Shepherdstown. I just knew: one day I had to get in. Well, my fellow earthlings, this eagle has landed. With a Remote Sensing course under my belt, among other graduate-level GIS coursework, I felt more driven than ever to be a part of this eclectic crew. The unique personalities and semi-casual, remote work atmosphere contrast sharply with the grim reality of why we do what we do.
We can’t do anything to return Appalachia to its original splendor, but we are doing everything we can to improve how we monitor the extent of surface mining as well as post-mining performance. I was excited to collaborate with SkyTruth’s Geospatial Engineer, Christian, and fellow intern Ethan, on how to approach automated detection methodologies.
As you may already know from our previous work, surface mining, and Mountain Top Mining (MTM) specifically, is the most prevalent method of coal extraction in Central Appalachia. It comes with its own history that’s as twisted as the local bedrock. Over the past two years we worked with Appalachian Voices to examine the complex environmental legacy of MTM, which we explored further in our latest publication in Restoration Ecology. We also continue to publish annual updates to our MTM dataset to highlight the ongoing impacts mining has on Appalachia.
As a part of our commitment to highlighting the impacts of MTM, we are developing new approaches to improve the overall performance and accuracy of our detection model. We also want to find out to what extent mines may operate outside of their permitted boundaries. With these issues in mind, we are exploring how available permit data can improve our detections while also highlighting areas of concern.
Digging Into the Issues
To identify areas of concern, we can frame the problem around one objective: to find out where coal mines may be overstepping designated permit boundaries and to evaluate the area of overstep against the accepted tolerance. This complex situation involves a combination of legal, environmental, and geographical assessments:
- What are considered allowable tolerances to mining outside of a permit boundary?
- Who are the mine operators and where are the suspected facilities operating out of tolerance?
- Are the relevant permit documents up to date?
- Do the state agencies update their spatial information regularly?
Reviewing permit documentation and using spatial data in concert with satellite imagery is a practical method for evaluating the extent of any overstep against the accepted tolerance; a similar methodology was implemented in the aforementioned post-mining reforestation assessment. Because permits can be filtered by status (active, inactive, or other), we will continue to use coal surface mine permit boundaries to bolster what we hope will become an ML pixel classification training dataset.
False positives are an issue because they are detections of areas whose features are easily mistaken for an active surface mine. Types of false positives include unreclaimed sites, such as where mining activity has recently ceased; artifacts that appear in imagery despite efforts to filter them out completely; and barren landscapes, like open rock face, eroded soils, or even seasonal changes in greenness.
Shortfalls in spatial data collection are every data hunter’s nightmare and are ever-present in our world, despite modern marvels such as the Internet of Things and Software as a Service geospatially linking us all. Completeness, accuracy, and even accessibility are still very real issues that limit the ability of analysis to have trustworthy and impactful results. This is one of the biggest hurdles in conservation work, often paired with time, funding, and ground-truthing constraints.
Alas, we persevere.
Finding out the accepted overstep tolerance for each state is not so easy either. Without any standard, quantifiable limits, any surface mine detections that occur outside permitted bounds will need to be manually scrutinized. The area of overstep can be totaled and we can look for potential trends or patterns in the data. Opening lines of communication with relevant organizations and agency representatives can help bridge this knowledge gap, provided the information is available and accurate.
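To make "totaling the area of overstep" a bit more concrete, here is a minimal Python sketch. It assumes the detections and the permit footprint have already been rasterized onto the same pixel grid. The function name, the 30 m pixel size, and the tolerance value are all placeholders for illustration, not SkyTruth's actual parameters or any state's legal limit.

```python
def overstep_area_m2(detected, permitted, pixel_size_m=30.0):
    """Return the area (square meters) of detected mine pixels outside the permit.

    detected, permitted: sets of (row, col) pixel coordinates on a shared grid.
    pixel_size_m: edge length of one pixel (30 m is an assumption).
    """
    outside = detected - permitted  # detected as mine, but not inside the permit
    return len(outside) * pixel_size_m ** 2


# Toy example: a 3x3 permit footprint and a detection shifted one column east.
permitted = {(r, c) for r in range(3) for c in range(3)}
detected = {(r, c) for r in range(3) for c in range(1, 4)}

area = overstep_area_m2(detected, permitted)

# Placeholder tolerance: no standard, quantifiable limit is known,
# so anything above it is only flagged for manual review.
tolerance_m2 = 2000.0
needs_review = area > tolerance_m2
print(area, needs_review)
```

In this toy case three pixels fall outside the permit, so the sketch only flags the site for a closer manual look; it says nothing about whether a violation occurred.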
Maintaining data integrity is important at every step, so it’s paramount that we preserve the original data. While permit data can be sourced from a single entity, such as the Geomine application from the United States Office of Surface Mining Reclamation and Enforcement (OSMRE), it is better to go directly to each source agency, even though that means running into a lack of data standards and continuity across the board. The reasoning is that each Source Agency will have the most up-to-date records, likely with regularly scheduled updates. For example, Virginia’s Department of Energy makes daily updates to their online datastores, and mine permit data is updated regularly throughout a mine’s life cycle (Daniel Kestner, Personal Communication, Dec 2023). The Geomine data hub happens to contain Virginia’s coal surface mine permit boundaries in their composite dataset; however, during our comparison of current permit bounds to prior detections, we found a considerable gap in this data.
To ensure our newly acquired data maintains its integrity, we must create a new ‘normalized’ dataset of all the permit boundaries within the study area. In doing so, we will be able to refer back to the original data should something come up while extracting and transforming the permit boundary datasets. This can easily be done by selecting relevant records and exporting them to a new file within a database or directory. This process, together with the normalization, is known as Extract, Transform, and Load (ETL). We select and export (extract), add new attribute fields and populate them (transform) and, finally, export the data into a repository (load).
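For readers who like to see the three ETL steps spelled out, here is a small sketch using only Python's standard library. The raw column names (PERMIT_NO, STATUS, COMMODITY) and the sample records are invented for illustration; real agency exports look different, and real permit boundaries carry geometry that this toy example leaves out.

```python
import csv
import io

# Hypothetical raw export; column names and records are invented.
RAW_CSV = """PERMIT_NO,STATUS,COMMODITY
S-1001,Active,Coal
S-1002,Released,Coal
Q-2001,Active,Limestone
"""


def extract(raw_text):
    """Select only the coal records, leaving the raw text untouched."""
    rows = csv.DictReader(io.StringIO(raw_text))
    return [row for row in rows if row["COMMODITY"] == "Coal"]


def transform(rows, state_name, source_agency):
    """Copy each record and add the new attribute fields."""
    return [
        {**row, "state_name": state_name, "source_agency": source_agency}
        for row in rows
    ]


def load(rows, out_file):
    """Write the transformed records to a new file-like object."""
    writer = csv.DictWriter(out_file, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


out = io.StringIO()  # stands in for a new file in a database or directory
records = transform(extract(RAW_CSV), "Virginia", "Virginia Department of Energy")
load(records, out)
print(out.getvalue())
```

Because each step returns new records instead of editing the input, the original export stays available if something goes wrong along the way.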
But what does ‘normalized’ mean, you ask?
Chipping Away at the Goods
Each state has its own set of procedures and naming conventions when it comes to generating spatial data such as permit boundaries. As a mine’s life cycle is dynamic, there are likewise many attributes describing the mine, its status, its type, and so on. But because there are no standardized methods of attributing a mine or its permit boundary, we need to create a dataset with consistent attributes. This is called Normalizing the Data. This means we will look at all the separate permit data (each state’s permit boundary polygons) and create new attributes that describe the permits: State Name, Permit ID, Permit Status, Permittee, Mine Name, Mine Operator, and Source Agency. To populate these new attributes, we can fill in things like State Name and Source Agency, then we’ll use Structured Query Language (SQL) to write expressions that reference the raw data and fill in the corresponding blanks of our new attributes.
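Here is roughly what populating the new attributes with SQL might look like, sketched with Python's built-in sqlite3 module. The raw table name, its columns, the sample records, and the source agency label are all hypothetical stand-ins; only the seven normalized attribute names come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical raw table for one state; real column names will differ.
cur.execute(
    "CREATE TABLE raw_wv (PERMIT_ID TEXT, PERMITTEE_NAME TEXT, "
    "FACILITY TEXT, MSTATUS TEXT)"
)
cur.executemany(
    "INSERT INTO raw_wv VALUES (?, ?, ?, ?)",
    [
        ("S000101", "Example Coal LLC", "Ridge No. 1", "A1"),
        ("S000102", "Sample Mining Co", "Hollow Surface", "RC"),
    ],
)

# Normalized table holding the seven consistent attributes.
cur.execute(
    """
    CREATE TABLE permits (
        state_name TEXT, permit_id TEXT, permit_status TEXT,
        permittee TEXT, mine_name TEXT, mine_operator TEXT, source_agency TEXT
    )
    """
)

# Constants fill state_name and source_agency; the other expressions
# reference the raw columns. mine_operator is left NULL because this
# made-up raw table has no matching column.
cur.execute(
    """
    INSERT INTO permits
    SELECT 'West Virginia', PERMIT_ID, MSTATUS,
           PERMITTEE_NAME, FACILITY, NULL, 'WV source agency (placeholder)'
    FROM raw_wv
    """
)
conn.commit()

rows = cur.execute(
    "SELECT state_name, permit_id, permittee FROM permits ORDER BY permit_id"
).fetchall()
print(rows)
```

The raw table is never modified, so the state's original attribute values remain on hand for comparison.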
Each state’s department of mining uses different naming conventions for permit statuses; some states, such as WV, go into excruciating detail while others maintain simpler designations. Virginia, by contrast, maintains separate and distinct datasets on permits with released status, bond forfeitures, and bond statuses. All considered, simplifying permit status to ‘Active,’ ‘Inactive,’ and ‘Other’ will suffice. Along the way there are bound to be bottlenecks and roadblocks. For instance, the question arose regarding whether a mine’s status has the same technical and legal meaning as permit status. How does one approach this without losing valuable resources? Sometimes an approach is best taken while noting these considerations for future work and continuing as planned. Other times, it means starting over at square one. Careful assessment of each attribute and its definition is necessary early in the process here.
Once we have our new attribute columns in place, and relevant data entered or copied into the empty fields, we can finally merge the cleaned, normalized data into a single dataset, where a user can filter permit boundaries by any of the new attributes and our new, beautiful, and clean dataset will remain shiny and intact.
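Collapsing many status labels into three buckets can be as simple as a lookup table. The sketch below is only an illustration: the raw labels in the mapping are made up, and each state's actual status vocabulary would need to be reviewed before deciding where it belongs.

```python
# Hypothetical raw status labels; each state's real vocabulary must be
# reviewed before it is mapped.
STATUS_MAP = {
    "active": "Active",
    "active, moving coal": "Active",
    "inactive": "Inactive",
    "temporarily idle": "Inactive",
}


def simplify_status(raw_status):
    """Collapse a raw permit status into 'Active', 'Inactive', or 'Other'."""
    if raw_status is None:
        return "Other"
    key = raw_status.strip().lower()
    # Anything not explicitly mapped falls through to 'Other'.
    return STATUS_MAP.get(key, "Other")


for status in ["Active", " Temporarily Idle ", "Bond Forfeited", None]:
    print(repr(status), "->", simplify_status(status))
```

Sending unknown labels to ‘Other’ keeps unfamiliar statuses from quietly being treated as active or inactive before someone has checked what they mean.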
The Haul
The hope is this cleaned permit dataset will feed into the work we aim to do in assessing post-mining performance. If we’re to improve the accuracy of future detections, we must assess the false positives in prior ones. Inactive mines will be classified as non-mine pixels. The detections will be passed through a vegetation index (the well-known Normalized Difference Vegetation Index, or NDVI) to confirm a lack of greenness, as we expect zero vegetation in an active surface mine. By manually reviewing the prior detections for any of the false-positive scenarios discussed in the section above, we can further train an ML model to evaluate detections based on Permit Status (Active, Inactive, Other).
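For anyone unfamiliar with NDVI, it is computed per pixel as (NIR - Red) / (NIR + Red), where NIR and Red are the near-infrared and red reflectance values. The sketch below shows the calculation in plain Python. The 0.2 cutoff for "lacks greenness" and the helper name looks_bare are assumptions for illustration, not the threshold or code used in our detection pipeline.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid dividing by zero on empty or no-data pixels
    return (nir - red) / total


def looks_bare(nir, red, threshold=0.2):
    """True when NDVI falls below a placeholder greenness threshold."""
    return ndvi(nir, red) < threshold


# Made-up reflectance values: a vegetated pixel and a bare-ground pixel.
print(ndvi(0.5, 0.1), looks_bare(0.5, 0.1))
print(ndvi(0.25, 0.2), looks_bare(0.25, 0.2))
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so it scores high, while bare rock and soil score close to zero.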
Stewardship
The effort put into cleaning this permit data, and that of a future ML model, will benefit ongoing work in assessing post-mining performance. By improving our detections while monitoring these mining activities, we continue to keep the spotlight on operations that overstep their bounds, raise awareness of the impacts and dangers of MTM, advocate for those most impacted by MTM, and educate the public on how MTM affects those of us downstream.
The work doesn’t stop here. Mining activity is staged in bonds, and while the staging isn’t as important for improving our detections of active mining sites, bond release status will be important for reclamation assessment. As we move toward new approaches that leverage this new mine permit boundary dataset, it will be made publicly available.
Working with and learning from the incredibly talented, intelligent, diverse group of driven people at SkyTruth is an extraordinary experience, one I’ll carry with me onward and upward. I am proud to yell it from any mountain top, removed or not.