Connecting research data and software communities: sharing experiences and creating opportunities

Tuesday 10th February 2026, 10:30–16:30, Cambridge University Press & Assessment, CB2 8EA

Join STEP-UP, the University of Cambridge Data Service and Reproducible Research Cambridge on 10 February 2026 during Love Data Week for an event focused on bridging research software and research data communities.


As recognition grows of the increasingly vital role that “digital Research Technical Professionals” (dRTPs) play in undertaking research, we will bring together representatives of two key communities within the dRTP space. Whether you consider yourself to be a Research Software Engineer, a Data Steward or Champion, a researcher or just someone interested in software or data, join us for keynotes, lightning talks and discussion to share our experiences, build connections and identify new opportunities for supporting, recognising and rewarding research software and research data practitioners.

Schedule

10:30-10:35 Welcome and introduction
10:35-11:45 Keynote speakers

  • Dr Gita Moghaddam, Principal Investigator, Department of Clinical Neurosciences, University of Cambridge
  • Dr Martin O’Reilly, Director of Research Engineering, Alan Turing Institute

11:45-12:00 Coffee/tea and networking
12:00-13:00 Lightning talk session 1
13:00-14:00 Lunch
14:00-15:00 Lightning talk session 2
15:00-15:15 Break and networking
15:15-16:15 World Café session
16:15-16:30 Wrap-up

Lightning talk abstracts

CODECHECK - a system for independently verifying the results of computations reported in scientific articles
Stephen Eglen, University of Cambridge
CODECHECK is a system by which we take an author’s paper, code, and data and independently verify if the results in the paper can be generated from the code and data. I will present how this works in practice and highlight the advantages and limitations of this approach. Further details can be found at https://codecheck.org.uk

Decentralized materials research data management, curation & dissemination for accelerated discovery
Matthew Evans, Department of Chemistry, University of Cambridge
This talk will briefly introduce the open-source research data management platform, datalab, which targets the materials and chemical sciences and allows users to track the provenance of samples, characterisation data and more. We will briefly discuss our federated approach to data management and infrastructure and present a call to arms to provide software solutions for other underserved scientific communities.

COBLE: Reproducible Research-Code Environments
Rachel Alcraft, Institute of Cancer Research
COBLE is a multi-package-manager tool designed to create reproducible research code environments across Python and R, with Bioconductor, CRAN, R-Forge, pip, conda and GitHub all in the mix. It seamlessly allows the creation of conda, Docker or Singularity containers with a single command and tracks environment changes. The tool is designed to make methods code sections in publications more easily reproducible, and to be used internally for a lab’s own environment management.

Contributing to the Computational Modelling Community through BioModels Curation
Rahuman S Malik Sheriff, EMBL-EBI
BioModels is a two-decade-old, internationally used repository of curated computational models of biological and medical systems. I will show how software engineers and AI/ML researchers can contribute to BioModels by reproducing and curating published models, turning technical validation and benchmarking work into reusable community assets. Contributors receive formal, citable credit for this work through persistent identifiers and credit-tracking systems.

Building and Sustaining a Research Software Platform: Lessons from BioModels
Tung Nguyen, EMBL-EBI
Behind every successful research data platform is sustained software engineering and community engagement. This talk shares practical lessons from building, operating, collaborating on and maintaining BioModels as a trusted, scalable research infrastructure. It focuses on how software engineers can support scientific communities while gaining recognition and long-term impact.

Interconnected software and data management at EMBL - a grassroots story
Renato Alves, European Molecular Biology Laboratory
I will present a very brief success story of a (Git) project under the grassroots initiative Bio-IT at EMBL. This effort has since evolved into a mature and critical piece of the computational infrastructure for research, software and data management at EMBL that is now embedded into the Data Science support structure.

A case of an in-house data engineering workflow for longitudinal cohort data
Vilma Agalioti-Sgompou, University College London
This talk describes the data management approaches followed by the research data managers/engineers at the UCL Centre for Longitudinal Studies to curate survey data of four major UK longitudinal cohort studies. It outlines the key challenges we faced and the design solutions we implemented. It also shows how strong teamwork and shared processes made these developments possible.

Common ground, community, and belonging: an important dRTP technique
Yo Yehudi, OLS
“The scientific method” is often portrayed on a pedestal of pseudo-objectivity, with research professionals being expected to perform as emotion-free vessels creating scientific knowledge. Contrary to the “objective” ideal, teams with real psychological safety, trust, and emotional intelligence tend to perform more effectively and make fewer errors than teams that do not focus on interpersonal needs. The OLS Open Seeds and Nebula training programmes (https://we-are-ols.org/) use an open science framework to build research teams that are consciously designed to encourage psychological safety, researcher wellbeing, and scientific robustness.

Green DiSC: open-access community-driven Digital Sustainability Certification scheme
Loïc Lannelongue, University of Cambridge
Scientific computing has enabled amazing discoveries and there is no doubt it will continue to do so. However, the corresponding environmental impact is a growing concern in light of the urgency of the climate crisis, so what can we all do about it? We will discuss how the Green DiSC framework can support more sustainable research computing. Coordinating sustainability initiatives across teams and departments in an institution brings its own challenges and this will be a chance to discuss how a scheme like Green DiSC addresses that, and what opportunities it presents for collaborations between different communities.

Yan He, Cambridge University Libraries and Archives
Abstract and title not received yet

Federated Research Data Movement API
Piper Fowler-Wright, The Rosalind Franklin Institute
We are exploring requirements and technologies for a data centre API that enables federated research data movement across digital research infrastructure. This NFCS-funded project will survey user needs and existing solutions to develop a roadmap for shared protocols that facilitate moving data into, out of, and between research facilities.