Active funded projects

FAIROS RCN: Ethical Open Science for Past Global Change Data (NSF #2226371)

Our aim is to build technical and social capacity among community-curated data repositories in the Quaternary sciences. We are doing this by supporting technical implementation of ethical open science principles and developing communities of practice focused on CARE and FAIR principles.


Internet of Samples (iSamples): Toward an Interdisciplinary Cyberinfrastructure for Material Samples (NSF #2004562)

The Internet of Samples (iSamples) is a multi-disciplinary and multi-institutional project to design, develop, and promote service infrastructure to uniquely, consistently, and conveniently identify material samples, record metadata about them, and persistently link them to other samples and derived digital content, including images, data, and publications. 

NSF award 

Next Generation Interoperable Data Infrastructure for Geoscience Sample Data (EarthChem, LEPR/traceDs, SESAR): IEDA Re-invented (sub-award from NSF #2148939; PI: Kerstin Lehnert)

The Interdisciplinary Earth Data Alliance is a collaborative data infrastructure of three complementary data systems – EarthChem, LEPR/traceDs, SESAR – that jointly support researchers in the Geosciences to share and access sample data following the FAIR data principles. In this grant, I am funded as the product manager of SESAR, and am guiding our transition to a new, independent, multi-disciplinary infrastructure for sample registration.


Completed funded projects

CHANGES: Collections, Heterogeneous data, And Next Generation Ecological Studies (funded through MIDAS PODS)

With Karen Alofs (PI), Hernan Lopez-Fernandez, Randy Singer, Kevin Wehrly, and Justin Schell

Ecological systems are facing unprecedented and rapid environmental change. Understanding and predicting the impacts of environmental change requires novel data science methods that make use of historical data in ways never imagined when originally collected. Such data include long-term ecological surveys, natural resource monitoring, and museum specimens with ancillary field notes. Though potentially transformative, leveraging these heterogeneous data with varied structures and sampling biases remains challenging. We are developing interdisciplinary protocols for integrating heterogeneous natural science datasets to investigate the impacts of environmental changes on species distributions across a range of taxa and ecosystems, thereby making critical contributions to the fields of ecology, environmental science, and data science. 

Measuring the Impact of Curatorial Actions (IMLS LG-37-19-0134-19; NSF SCI-SIP 1930645)

Co-Principal Investigator with Libby Hemphill (PI), Amy Pienta, Dharma Akmon & Beth Yakel. IMLS National Leadership Grant, $480,637; NSF Science of Science and Innovation Policy, $498,643. Awarded 2019.

Empirical studies are sorely needed to evaluate the impact of specific aspects of digital data curation. In this project we are developing curatorial metrics to evaluate the impact and efficacy of specific data curation processes. Curatorial metrics are statistical measures similar to bibliometrics but designed to assess the impact of curatorial work over time on the use of collections. Using curation logs and other records, we will develop and analyze a range of curatorial metrics from the last five years of data curation at the Inter-university Consortium for Political and Social Research (ICPSR), a highly impactful social science data repository.

Press release  IMLS award  NSF Award

Reducing Time-To-Science in the Earth Sciences: Annotations to foster convergence, inclusion, and credit (NSF ICER 1928366)

Principal Investigator, with Simon Goring, Stephen Kuehn, Nicholas McKay, Kerstin Lehnert, Anders Noren; Co-PIs: John Williams, Shanan Peters, Amy Myrbo
Awarded 2019.

The long-term sustainability of federally funded research depends on the discovery, accessibility, and reuse of data. However, data and research products are often stored in different locations, which makes it challenging to find and integrate related data. In this project, we're developing and populating an annotation database that will support the discovery of related but distributed research products for Earth science and natural history data. Researchers will have a way to link data resources, add context, or provide additional information about data, software, and publications. Researchers use this system to create annotations that link resources using unique identifiers. Over time, these links connect to create a network of data resources.


More about the Throughput project

Migrating Research Data Collections (IMLS RE-07-18-0118-18)

Principal Investigator. Institute of Museum and Library Services, Laura Bush 21st Century Librarian Program, Curating Collections Early Career Research Grant. $428,934, awarded 2018.

In this project, I investigate questions related to the migration of research data collections between data management and preservation platforms over the course of the lifespan of specific datasets. I am also exploring the factors driving (and tensions arising from) the growing adoption by libraries of "off-the-shelf," sometimes proprietary, collections management systems. The project will develop a model of migration patterns in digital collections; best practices to better support the migration of research data collections; prototypes of tools and techniques to support migration processes, particularly focusing on migration audits, metadata creation, and data schema alignment; and open access course modules based on this work.  


Paper describing pilot work: Supporting the long-term curation and migration of natural history museum collections databases

ASIS&T 2022 paper presenting literature review: Maintaining Repositories, Databases, and Digital Collections in Memory Institutions: An Integrative Review. 

Open Badge Researcher Credentials for Secure Access to Restricted and Sensitive Data (NSF CICI 1839868)

Co-Principal Investigator, with Margaret Levenstein (PI), Libby Hemphill, and Florian Schaub. National Science Foundation, Office of Advanced Cyberinfrastructure, Cybersecurity Innovation for Cyberinfrastructure, Research Data Protection. $881,342, awarded 2018.

We are building a system that uses visible tokens, or “badges,” to help safeguard the integrity and provenance of research data, and the conclusions drawn from them. This project has three main objectives: