Research is inherently dynamic, and so are the data it produces. This raises numerous questions for information management professionals at research institutions, libraries, and computing centers. Although citing research data through persistent identifiers (PIDs) is conceptually established in some communities, practical challenges remain, particularly in the practices, guidelines, and standards of repositories and journals.
Our team at the Research Group Information Management at the Berlin School of Library and Information Science is dedicated to understanding research data versioning as part of information management processes at research-performing organizations (RPOs). With the project “PID Reference Model for Research Data Versioning” (#PIDsBUA) funded by the “Advancing Research Quality and Value” fellowship program of the Berlin University Alliance (BUA), we address questions of dynamism and persistence in the context of the persistent identification of research data.
Our project benefits from the expertise of Jens Klump, Senior Principal Research Officer at Australia’s national science agency CSIRO, who is working as a BUA Fellow with our research group. Jens has extensive experience in this field, including contributions to the Research Data Alliance (RDA) Data Versioning Working Group (Klump et al. 2020; Klump et al. 2021).
The dynamic nature of data is in tension with the persistence that infrastructures require. One question that arises when dynamic research data is published in repositories is how these datasets should be recorded in publication management and research information documentation. A central concern of our project is ensuring proper attribution of research performance for published dynamic datasets while addressing the practical issues of recording them.
To facilitate a broad discussion with experts from academia, infrastructure, and administration, and to enhance the transfer of our work within Berlin’s research landscape, we hosted a workshop on June 28, 2024, at the Einstein Center Digital Future (ECDF) in Berlin. The event attracted around 40 experts from diverse scientific domains.
After an introductory session, discussions focused on four key themes and the associated questions:
Revision and Release:
- When do we differentiate between a revision and a release in research data management?
Time Series and Dynamic Data:
- How can we address the unique characteristics of time series and dynamic data in research data management?
Granularity and Collections:
- How does versioning support recording the provenance of an object?
Provenance, Citation, and Metadata:
- How should we handle provenance, citation, and metadata in the context of research data versioning?
These themes were discussed within the framework of the RDA’s work.
Lessons Learned
In a fruitful and lively discussion, we identified the following points:
Revision and Release:
- While many technological aspects of versioning are clear, definitional and coordination activities are still crucial. Establishing whether a change is a revision or a release requires developing conventions within user groups.
Time Series and Dynamic Data:
- Different practices are necessary depending on the use case, such as ensuring traceability or allowing ongoing machine access to a dataset. Solutions should be tailored to the specific user groups.
Granularity and Collections:
- Professional requirements significantly influence granularity and collections, posing challenges for infrastructures that organize and provide access to these collections.
Provenance, Citation, and Metadata:
- Discussions highlighted the importance of metadata and evolving citation practices in facilitating dataset provenance. These practices are often shaped by disciplinary publication cultures and require infrastructure-level support.
We thank all participants for their enthusiasm and engaging discussions. The workshop offered a great platform for cross-institutional and cross-disciplinary dialogue, promoting valuable mutual learning.
Many thanks to the BUA for their funding and to the ECDF for their hospitality.
A workshop report in German is available (Klump 2024). The slides and posters are also published (Klump et al. 2024).
Further information about the research group can be found on our official website.
This text, excluding quotes and otherwise labeled parts, is licensed under CC BY 4.0.
References
Citation
@online{klump2024,
author = {Klump, Jens and Pampel, Heinz and Rothfritz, Laura and
Strecker, Dorothea},
title = {Research {Data} {Publications} {Between} {Dynamism} and
{Persistence} - {Insights} from the {\#PIDsBUA} {Workshop}},
date = {2024-07-26},
url = {https://doi.org/10.59350/6f7sw-etd43},
langid = {en}
}