Active funded projects
CHANGES: Collections, Heterogeneous data, And Next Generation Ecological Studies (funded through MIDAS PODS) With Karen Alofs (PI), Hernan Lopez-Fernandez, Randy Singer, Kevin Wehrly, and Justin Schell
Ecological systems are facing unprecedented and rapid environmental change. Understanding and predicting the impacts of environmental change requires novel data science methods that make use of historical data in ways never imagined when originally collected. Such data include long-term ecological surveys, natural resource monitoring, and museum specimens with ancillary field notes. Though potentially transformative, leveraging these heterogeneous data with varied structures and sampling biases remains challenging. We are developing interdisciplinary protocols for integrating heterogeneous natural science datasets to investigate the impacts of environmental changes on species distributions across a range of taxa and ecosystems, thereby making critical contributions to the fields of ecology, environmental science, and data science.
Specific Aims. 1: To develop novel, reusable protocols and “recipes” for the curation and integration of heterogeneous, longitudinal ecological datasets in accessible and useable databases. 2: To develop methods to analyze heterogeneous and biased ecological data. 3: To examine changing environmental drivers of fish distributions through time to test the validity of space-time substitutions.
Measuring the Impact of Curatorial Actions (IMLS LG-37-19-0134-19; NSF SCI-SIP 1930645) Co-Principal Investigator with Libby Hemphill (PI), Amy Pienta, Dharma Akmon & Beth Yakel. IMLS National Leadership Grant, $480,637; NSF Science of Science and Innovation Policy, $498,643. Awarded 2019.
Empirical studies are sorely needed to evaluate the impact of specific aspects of digital data curation. In this project we are developing curatorial metrics to evaluate the impact and efficacy of specific data curation processes. Curatorial metrics are statistical measures similar to bibliometrics but designed to assess the impact of curatorial work over time on the use of collections. Using curation logs and other records, we will develop and analyze a range of curatorial metrics from the last five years of data curation at the Inter-university Consortium for Political and Social Research (ICPSR), a highly impactful social science data repository.
Press release IMLS award NSF Award
Reducing Time-To-Science in the Earth Sciences: Annotations to foster convergence, inclusion, and credit (NSF ICER 1928366) Principal Investigator, with Simon Goring, Stephen Kuehn, Nicholas McKay, Kerstin Lehnert, Anders Noren; Co-PIs: John Williams, Shanan Peters, Amy Myrbo
The long-term sustainability of federally funded research depends on the discovery, accessibility, and reuse of data. However, data and research products are often stored in different locations, which makes it challenging to find and integrate related data. In this project, we're developing and populating an annotation database that will support the discovery of related but distributed research products for Earth science and natural history data. Researchers will have a way to link data resources, add context, or provide additional information about data, software, and publications. They will use this system to create annotations that link resources using unique identifiers. Over time, these links connect to create a network of data resources.
More about the Throughput project
Completed funded projects
Migrating Research Data Collections (IMLS RE-07-18-0118-18) Principal Investigator. Institute of Museum and Library Services, Laura Bush 21st Century Librarian Program, Curating Collections Early Career Research Grant. $428,934, awarded 2018.
In this project, I investigate how research data collections migrate between data management and preservation platforms over the lifespan of specific datasets. I am also exploring the factors driving (and tensions arising from) the growing adoption by libraries of "off-the-shelf," sometimes proprietary, collections management systems. The project will develop a model of migration patterns in digital collections; best practices to better support the migration of research data collections; prototypes of tools and techniques to support migration processes, particularly focusing on migration audits, metadata creation, and data schema alignment; and open access course modules based on this work.
Paper describing pilot work: Supporting the long-term curation and migration of natural history museum collections databases
ASIS&T 2022 paper presenting literature review: Maintaining Repositories, Databases, and Digital Collections in Memory Institutions: An Integrative Review.
Open Badge Researcher Credentials for Secure Access to Restricted and Sensitive Data (NSF CICI 1839868) Co-Principal Investigator, with Margaret Levenstein (PI), Libby Hemphill, and Florian Schaub. National Science Foundation, Office of Advanced Cyberinfrastructure, Cybersecurity Innovation for Cyberinfrastructure, Research Data Protection. $881,342, awarded 2018.
We are building a system that uses visible tokens, or “badges,” to help safeguard the integrity and provenance of research data, and the conclusions drawn from them. This project has three main objectives:
Develop an open badge system for managing researcher credentials.
Articulate levels of data sensitivity and risk that indicate criteria for access for each level.
Identify the right balance between openness and privacy for data users in a restricted data access system.