Dataspaces for Collaborative Research

Soo-Yon Kim; Liam Tirpitz; Max Wagels; Benedikt T. Arnold; Christian Rennert; István Koren; Jannik Rapp; Mario Moser; Wil Van Der Aalst; Bernhard Rumpe; Robert H. Schmitt; Jan Pennekamp; Sandra Geisler

December 2025

Abstract

Data-driven collaborative research is a key driver of innovation. However, effective data exchange across institutions remains difficult in practice, often relying on ad hoc mechanisms that provide little support for discoverability, sovereignty, or the dynamic sharing of datasets during ongoing collaboration. Dataspaces have been proposed to address these shortcomings, but their suitability for collaborative research remains largely unexplored with respect to the needs that academic collaborations impose, practical deployment experiences, and the potential for concrete use cases. To address this gap, we derive requirements for infrastructures in data-driven collaborative research, which provide the basis for assessing dataspaces. We further report on the deployment of a pilot dataspace for a real-world, large-scale research project, focusing on the onboarding of four institutes with diverse data types, interests, and disciplinary backgrounds. The deployment highlights the practical steps required to select and prepare a dataspace technology stack and to establish connectors, and it lets us assess the challenges posed by heterogeneous environments and the level of effort involved in integration. Beyond deployment, we explore the dual role of research dataspaces as both a generic data sharing infrastructure and a testbed for practical research on data sharing technologies. A federated process mining use case for data-driven production, in which distributed process data are analyzed collaboratively, demonstrates the latter. Our findings indicate that dataspaces are a viable option for collaborative research if supported by adequate expertise. By deriving requirements, reporting deployment experiences, and demonstrating use cases, we contribute guidance for research practitioners. Future work should focus on sustainability and scalability needs, such as lowering entry barriers, developing trust mechanisms, and extending use case scenarios.

Type

Conference paper

Publication

Proceedings of the 2025 IEEE International Conference on Big Data (BigData '25)

Event

2025 IEEE International Conference on Big Data, Dec 8 - Dec 11, 2025, Macau, China

Dataspaces; Data Ecosystems; Data Exchange; Collaborative Research; Research Data Management
