Connecting research data and software communities

What are the common challenges faced by research software and data communities in 2026?

This event was co-organised by STEP-UP, the University of Cambridge Research Data Team and Reproducible Research Cambridge on 10 February 2026 during Love Data Week. It brought together 38 participants from across the UK and Europe to examine how research data and software practices intersect across UK digital research. The programme combined two keynote talks, two lightning talk sessions, and a structured World Café discussion. This blog post summarises the experiences people shared and the themes that emerged.

Event hosts Keira McNeice (CUP&A), Alexia Cardona (RRC), Clair Castle (University of Cambridge Research Data Team) and Jeremy Cohen (STEP-UP) welcoming participants.

Keynotes: two very different dRTP worlds

Our two keynote speakers offered a glimpse into two very different digital research technical professional (dRTP) worlds:

Martin O’Reilly, Director of Research Engineering, Alan Turing Institute, reviewed the historical development of Research Software Engineering (RSE) and its communities. He talked about ongoing technical and professional challenges in RSE roles nationally and internationally, and gave an overview of the different types of roles at the Turing Institute and their career development pathways. He used polls during his talk to ask attendees about their own roles and challenges.

Gita Moghaddam, Principal Investigator, Department of Clinical Neurosciences, University of Cambridge, discussed her work on automated, data-driven decision-making processes. She described how interpretations of what makes a system "decision-ready" diverge: they cover not only the scientific robustness of the model, but also transparent decision processes, data stability and representativeness, real-world operational reliability, and overall data quality. She highlighted the need for predefined actions when failures occur and emphasised treating data and software as a unified system.

Lightning talks

We heard from 10 speakers representing 9 organisations, who spoke about projects they are running to improve research software and data workflows, including:

Building tools to verify published code, create reproducible code environments and manage data.
Exploring requirements for federated research data movement between organisations.
Contributing to platforms of curated computational models.
Building in-house computational infrastructure and workflows.
Developing common ground, community and belonging in dRTP teams.
Developing a certification scheme for green computing.

A number of common themes emerged in discussions, including:

A lot of this work is attempting to solve long-term problems without long-term funding.
There need to be mechanisms to recognise the work that goes into these projects (including behind the scenes).
There is huge variability in institutional tech stacks, priorities and policies.

Event participants listening to keynote speaker Martin O'Reilly.

World Café

We held a World Café session to draw out common challenges and support needs across a range of themes. The session included 7 themed tables and 10-minute discussion blocks, giving participants an opportunity to engage with several of the topics. Here is a summary of some of the discussion points at each table:

Training

Required skills: co-leadership, database querying, data management and metadata, documentation, version control, testing, AI/ML, Trusted Research Environments, project management.
Missing elements: training embedded within research teams, mentoring.
Enablers: management support, protected time, recognition frameworks.

Careers

Blockers: undervalued technical roles, unclear/non-existent progression pathways, misalignment with industry standards, leadership disengagement, need for constant self-advocacy.
Needed support: structured career routes, community support mechanisms, dedicated skill development time.

Infrastructure

Critical components: cloud compute, HPC, data centres, TREs, repositories, collaboration platforms.
Issues: data sovereignty questions, fragmentation, slow storage systems, understaffing, legal complexity, lack of standards.
Support gaps: legal guidance, documentation, governance clarity, structured knowledge transfer.

AI

Observed changes: automation of routine tasks, improvements in search and metadata workflows, emerging agent-based methods.
Risks: loss of skills, opacity, reproducibility challenges, security concerns, bias, unclear accountability.
Requirements: governance frameworks, secure deployment policies, theoretical understanding, higher quality metadata.

Sustainability

Actions discussed: assessing compute carbon cost, clarifying preservation policies, recognising that not all research outputs require long-term retention.
Risks: persistent “zombie” results, incomplete metadata.
Needs: sustainability certification, consistent policies for data/software retirement.

Publishing outputs

Challenges: licensing confusion, TRE restrictions, fear of exposing errors, lack of standardised processes, poor incentives.
Support: training in software/data publication, clearer institutional policies on licensing and rights retention.

Commercialisation

Opportunities: reusable software, data products, service-based models, PhD students’ work that helps PIs explore the commercial value of research.
Barriers: balancing open source norms with commercial needs, navigating data restrictions, unclear business structures.
Support needed: business training, funding for professional developers, stronger industry engagement.

Conclusions

Several overarching patterns emerged from the day:

Data management, software development, and decision systems are typically handled by separate teams in research organisations, but they need to be much more closely linked.
There is a lot of fragmentation and inconsistency in infrastructure, governance, and career structures.
Although AI/ML, data stewardship, and software engineering practices require continuous learning, organisational structures and workloads often limit the protected time available to develop skills.
Peer networks, grassroots initiatives and open-source software are core shared practices in dRTP spaces, but these community contributions are rarely recognised formally in research organisations, leading to demotivation.

Feedback from attendees was positive and highlighted the benefit of bringing research data and software communities together to network and to share knowledge, tools and challenges. We hope to hold follow-up events to work on some of the challenges raised during the session. Thank you to all our speakers and attendees, and to Cambridge University Press & Assessment for hosting our event.

Get involved

Find out about our next events by joining our mailing list and following us on LinkedIn or Bluesky. And feel free to get in touch if you have questions, suggestions or ideas.
