Pennsylvania Marcellus Fines Data

Usually, I try to let readers know the state of affairs related to oil and gas extraction by taking a hard look at publicly available data.  Sometimes, however, it seems like the simplest questions have an answer that starts off with, “Well, it’s complicated…”  Such is the case when it comes to fines issued by the Pennsylvania Department of Environmental Protection’s (PADEP) Office of Oil and Gas Management.  Luckily, PADEP releases data about fines issued to operators in its compliance report, but unfortunately, it can be confusing to interpret.  Let’s take a closer look:

The first point of confusion is the compliance report itself.  Specifically, there are more rows of data than there are violations, as counted by PADEP.  The answer seems to be that PADEP counts the number of unique Violation ID numbers; however, sometimes (but not always) the same Violation ID is used for multiple rows of data on the violation report.  When I downloaded Marcellus violation data from January 1, 2005 through May 2, 2012, there were 4,293 rows of violation data, but only 3,689 unique Violation IDs.


Distribution of Violation ID frequency on the PADEP Marcellus Shale compliance report, January 1, 2005 through May 2, 2012.

If violations are counted by unique Violation ID numbers, what then are we to make of the other 604 items on the list? They all have Violation ID numbers, but share them with between one and three other incidents. My perception is that this is one of those decisions made in the field that has unanticipated consequences with respect to database maintenance. That is just a guess though–I’ve contacted PADEP for clarification on this point, and will be sure to relay that information when I receive it.
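For readers who want to run the same check on their own download, here is a minimal sketch of the row count versus unique Violation ID comparison.  The `violation_id` field name and the sample rows are made up for illustration; the real compliance report export may label its columns differently.

```python
from collections import Counter

# Hypothetical sample rows; the IDs and the "violation_id" field name
# are placeholders, not values from the PADEP compliance report.
sample_rows = [
    {"violation_id": "V-001"},
    {"violation_id": "V-002"},
    {"violation_id": "V-002"},
    {"violation_id": "V-003"},
    {"violation_id": "V-003"},
    {"violation_id": "V-003"},
    {"violation_id": "V-004"},
]


def summarize_violation_ids(rows):
    """Compare the number of rows with the number of unique Violation IDs."""
    # How many rows carry each Violation ID.
    rows_per_id = Counter(row["violation_id"] for row in rows)
    # How many IDs appear once, twice, three times, and so on.
    frequency = Counter(rows_per_id.values())
    return {
        "rows": len(rows),
        "unique_ids": len(rows_per_id),
        "extra_rows": len(rows) - len(rows_per_id),
        "frequency": dict(sorted(frequency.items())),
    }


summary = summarize_violation_ids(sample_rows)
print(summary)
```

On the full download, `extra_rows` would correspond to the 604 rows that share a Violation ID with an earlier row (4,293 rows minus 3,689 unique IDs).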

The second confusing aspect of the fines data is fairly similar in nature, in that identical fine amounts will often appear for multiple violations (or rows of violation data, as the case may be).  This is an ostensibly reasonable thing to do; if PADEP can deal with a suite of related violations all at once, why not do so?  But it does raise the question of whether the full fine is posted for each item on the compliance report, or whether it has been prorated between them.

I believe the former case to be correct.  Take, for example, the recent announcement of a fine issued by PADEP to Ultra Resources for improper storage of flowback water at a Potter County site.  The announcement mentions a $40,000 fine, but the data reflect three fines assessed to Ultra for that amount on March 23, 2012 for three incidents with unique Violation ID numbers.

The implication is that if you go through the dataset and add up the value of all the fines, the result will almost certainly be wildly inflated.  The same is true for the number of fines that have been assessed.   However, there is something linking the three records together in the Ultra example:  they all share the same Enforcement ID number.  So perhaps that is the key?  Let’s take a look at the total number and value of Marcellus fines assessed, with the data organized in three different ways:


Aggregated number and value of Marcellus-related fines in PA from 1-1-2005 to 5-2-2012, by method.
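As a rough sketch of what aggregating "by method" could look like, the snippet below counts and sums fines three ways: every row, one fine per unique Violation ID, and one fine per unique Enforcement ID.  That these are the three methods is an assumption on my part, as are the field names, the placeholder ID values, and the choice to keep the first fine seen for each ID.  The three sample rows mirror the Ultra example ($40,000 on each of three rows with distinct Violation IDs and a shared Enforcement ID).

```python
def aggregate_fines(rows, key=None):
    """Return (number of fines, total value) for one way of organizing the data.

    key=None counts every row. Otherwise only the first fine seen for each
    distinct value of `key` is kept (an assumption made for this sketch).
    """
    if key is None:
        amounts = [row["fine"] for row in rows]
    else:
        first_fine = {}
        for row in rows:
            first_fine.setdefault(row[key], row["fine"])
        amounts = list(first_fine.values())
    return len(amounts), sum(amounts)


# Placeholder IDs; only the $40,000 amounts and the sharing pattern
# come from the Ultra Resources example.
ultra_rows = [
    {"violation_id": "V-A", "enforcement_id": "E-1", "fine": 40000},
    {"violation_id": "V-B", "enforcement_id": "E-1", "fine": 40000},
    {"violation_id": "V-C", "enforcement_id": "E-1", "fine": 40000},
]

by_row = aggregate_fines(ultra_rows)
by_violation = aggregate_fines(ultra_rows, key="violation_id")
by_enforcement = aggregate_fines(ultra_rows, key="enforcement_id")

print("by row:           ", by_row)
print("by Violation ID:  ", by_violation)
print("by Enforcement ID:", by_enforcement)
```

For these three rows alone, only the Enforcement ID grouping reproduces the announced $40,000.  The sketch shows the mechanics of each grouping, not which one is right for the dataset as a whole.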

Given that the three Ultra violations in the example above all had unique Violation ID numbers but shared the same Enforcement ID, my expectation was that aggregating the data by unique Enforcement IDs would yield the smallest (and most accurate) statewide totals.  Clearly, that hypothesis needs to be relegated to the scrap heap, based on the table above.

And to be honest, I don’t have a better hypothesis on deck.  I have also asked PADEP for clarity on this point, and will be happy to share that information when I receive a reply.  But for now, I’m not even sure if it is possible to tease the correct answers from the data that have been provided.  Which is a shame, because if we knew a reliable methodology for doing so, it would be possible to explore the topic in much more interesting detail, finding answers for questions like:  Which company gets fined the most?  What’s the ratio of violations to fines assessed?  How many days pass between a violation being issued and a fine?  (For this last one, I can tell you that the maximum amount of time so far is 755 days–I just can’t provide a reliable distribution of the results).
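The days-between calculation itself is simple; the hard part is knowing which rows to feed it.  A minimal sketch, assuming each record carries a violation date and a fine date (the field names and the dates below are invented for illustration):

```python
from datetime import date

# Hypothetical records; these dates are made up and do not come from
# the compliance report.
records = [
    {"violation_date": date(2011, 1, 10), "fine_date": date(2011, 3, 1)},
    {"violation_date": date(2010, 6, 1), "fine_date": date(2010, 6, 15)},
]


def days_to_fine(rows):
    """Days elapsed between each violation and its fine."""
    return [(row["fine_date"] - row["violation_date"]).days for row in rows]


gaps = days_to_fine(records)
print("gaps in days:", sorted(gaps))
print("longest gap: ", max(gaps))
```

The maximum is unaffected by duplicated rows, which is why it can be reported with some confidence.  The distribution is not, since every repeated row adds another copy of the same gap.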

I like to give the DEP credit where it is due:  they are making tremendous progress in their dissemination of oil and gas data.  Two years ago, there were no compliance, production, or waste reports.  Drilled well data was available, but much of it didn’t have location data, and you had to copy and paste from web tables to a spreadsheet, which didn’t always work very well.  And some of the location data for permits were miles away from the actual well site with the corresponding API number.  PADEP has come a very long way in the reliability and accessibility of their oil and gas data.  Here’s hoping that trend continues.

2 replies
  1. Sue DeVito

    Hi Matt,

    Thank you for your post. I am struggling with these numbers right now. I am not very encouraged by what you say. I just placed a call to the number provided on the O&G web site and got their voice mail. I pulled all of the O&G reports for 4 counties which I am looking at for 1-1-08 to 8-13-2012. I found 3 rows with data for April 22, 2011 with penalties of $190,000 each for Chesapeake Energy in my Bradford County report. Since it looked from news accounts like there was a $565,000 fine around this time, I thought this was a possible match and that each $190,000 was separate. I am hoping to get a response from the DEP help desk to confirm this.
    However, there are also news accounts of Chesapeake being fined $900,000 for methane migration in Bradford County, and I can find NO evidence of this in my dataset. http://www.propublica.org/article/pa-officials-issue-largest-fine-ever-to-gas-driller
    Have you done any other work with this data since your post in May?

  2. Jen Gregory

    Thanks Matt:
    You are so helpful to present the data and discuss it. There is so much in the database, and DEP has come a long way.
    Thanks for your insight!
