Adding Data Categories to Blog and DataTool
In our early discussions with stakeholders and users of the DataTool, it was often suggested that CHEC introduce new data categories instead of the standard ones that define data on the DataTool, and then list those groups on the blog for people less familiar with navigating FracTracker’s DataTool.
Matt Kelso, our new data manager, has quickly assessed the data currently found on the DataTool and made some recommendations as to how all of the datasets can be defined and posted on the blog. The metadata (descriptions of the data’s origins, keywords, timeliness, etc.) associated with each dataset is an important feature within the DataTool that also needed some attention. In addition to taking an inventory of the datasets that have been posted to data.fractracker.org (the DataTool), Kelso attempted to bring some clarity to the various categories and made notes on the quality of the metadata that was provided. The 79 datasets that currently exist on the DataTool can each fit into one of the following categories (the number of datasets in each is in parentheses):
Geologic formations – gas fields (3)
Geologic formations – other (2)
Physical geography (2)
Political boundaries (6)
Wildlife habitat (4)
Air quality (6)
Land quality (1)
Water quality (3)
Drilling permits (28)
Gas well sites (6)
Incident reports and regulations (7)
Community health data (2)
Interview data (3)
Other
In fact, the two datasets best described by “Other” are tests that have been scheduled for deletion. This does not rule out further legitimate categories later on; perhaps an agricultural or economic dataset will eventually be uploaded to the site. Until one is, it is probably best to keep the number of categories to a minimum. Currently, users do not need to choose one of the above categories to define their datasets, but we are considering adding that as a requirement, with perhaps an option for a secondary category. We would appreciate your feedback on that issue and the proposed categories.
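To make the proposed requirement concrete, here is a minimal Python sketch of how an upload form might check a required primary category and an optional secondary one. The function name `validate_categories` and the decision to leave “Other” out of the allowed list are assumptions made for illustration; this is not a description of how the DataTool works.

```python
from typing import Optional, Tuple

# Category names copied from the proposed list above.
# Assumption: "Other" is left out, since its only datasets are tests
# scheduled for deletion. Add it back if it should stay selectable.
CATEGORIES = (
    "Geologic formations – gas fields",
    "Geologic formations – other",
    "Physical geography",
    "Political boundaries",
    "Wildlife habitat",
    "Air quality",
    "Land quality",
    "Water quality",
    "Drilling permits",
    "Gas well sites",
    "Incident reports and regulations",
    "Community health data",
    "Interview data",
)


def validate_categories(
    primary: str, secondary: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Return (primary, secondary) if both are acceptable, else raise ValueError.

    Hypothetical helper: the primary category is required, the secondary
    one is optional and must differ from the primary.
    """
    if primary not in CATEGORIES:
        raise ValueError(f"Unknown primary category: {primary!r}")
    if secondary is not None:
        if secondary not in CATEGORIES:
            raise ValueError(f"Unknown secondary category: {secondary!r}")
        if secondary == primary:
            raise ValueError("Secondary category must differ from the primary one")
    return primary, secondary
```

With this sketch, `validate_categories("Drilling permits", "Gas well sites")` would be accepted, while a category that is not on the list, such as an agricultural one, would be rejected until the list is extended.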
Some users have experienced difficulty using the geographic search tool located on the Explore page. Kelso suggests that rather than drawing a rectangle on the screen to define a geographic location (as it stands now), it might be better to allow users to narrow their searches by a specific state or region. In reality, any such search is only as reliable as the data that’s been provided. For example, there are five datasets that relate to Marcellus drilling permits in Ohio, but if you look up the word “Ohio” there will not be any results, since the information was entered as “oh”. For this reason, Kelso also suggests that the data uploader be required to select a geographic location from a drop-down box.
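The “Ohio” versus “oh” mismatch can be illustrated with a short Python sketch. It assumes each dataset record is a dictionary with hypothetical `title` and `state` keys, and it uses a deliberately tiny lookup table of four states; none of these names come from the DataTool itself. The idea is simply that both the stored value and the search term are mapped to one canonical state name before they are compared.

```python
from typing import Dict, List, Optional

# Illustrative lookup table only; a real drop-down would list every state.
STATE_NAMES: Dict[str, str] = {
    "OH": "Ohio",
    "PA": "Pennsylvania",
    "NY": "New York",
    "WV": "West Virginia",
}
# Lower-cased full names, so "ohio", "Ohio" and "OHIO" all match.
_NAMES_BY_LOWER = {name.lower(): name for name in STATE_NAMES.values()}


def normalize_state(value: str) -> Optional[str]:
    """Map an abbreviation or full name to a canonical state name, or None."""
    cleaned = value.strip()
    if cleaned.upper() in STATE_NAMES:
        return STATE_NAMES[cleaned.upper()]
    return _NAMES_BY_LOWER.get(cleaned.lower())


def search_by_state(datasets: List[dict], query: str) -> List[dict]:
    """Return datasets whose 'state' field refers to the same state as the query."""
    wanted = normalize_state(query)
    if wanted is None:
        return []
    return [d for d in datasets if normalize_state(d.get("state", "")) == wanted]
```

Under these assumptions, a search for “Ohio” would also find records whose state was typed in as “oh”. A drop-down box at upload time would avoid the problem at the source, because only canonical values could be stored.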
A feature somewhat similar to the one Kelso is suggesting already exists for geographic search. When you refine the area of the search, you can choose to either draw a bounding box or pick a dataset to use as the bounding box. The latter option only lets you choose from datasets that are in your library, though; it was designed with the idea that a particular user would often want to find data associated with a regional dataset that they were already interested in. It doesn't work so well for a new user who is just starting to explore the system.
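For readers unfamiliar with the idea of using a dataset as a bounding box, the following Python sketch shows one way it could work. It assumes each dataset is just a list of (longitude, latitude) points, and the names `BoundingBox`, `bounding_box_of`, and `find_overlapping` are hypothetical; the DataTool's actual search may behave differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass(frozen=True)
class BoundingBox:
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BoundingBox") -> bool:
        """True if the two rectangles overlap or touch."""
        return not (
            self.east < other.west
            or other.east < self.west
            or self.north < other.south
            or other.north < self.south
        )


def bounding_box_of(points: Sequence[Point]) -> BoundingBox:
    """Smallest rectangle containing every point (points must be non-empty)."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return BoundingBox(min(lons), min(lats), max(lons), max(lats))


def find_overlapping(
    region_points: Sequence[Point], candidates: Dict[str, Sequence[Point]]
) -> List[str]:
    """Names of candidate datasets whose bounding box overlaps the region's."""
    region = bounding_box_of(region_points)
    return [
        name
        for name, points in candidates.items()
        if region.intersects(bounding_box_of(points))
    ]
```

This simple version ignores rectangles that cross the 180° meridian. A rectangle is also a coarse stand-in for an irregular region, so a search like this can return datasets that fall just outside the area of interest.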