IntelComp STI Data Space
- 26 September 2023
- 16:00
- Session 3
- Sala Nouvel - Reina Sofia Museum
The IntelComp STI Data Space stores structured and unstructured data at any scale so that IntelComp services can run on them. We store relational and non-relational data and apply different types of analytics and machine learning algorithms to uncover insights for policy makers, focusing on three domains of Science, Technology and Innovation: health, AI, and climate change.
The IntelComp STI Data Lake allows us to scale to data of any size. It complies with OpenAIRE guidelines and EOSC, and is built upon FAIR principles.
The major data resource of the IntelComp STI Data Lake is the OpenAIRE Graph (an Open Science scholarly communication service, available on EOSC as a Resource Catalogue), which provides metadata records for more than 160M publications, 58M research data items, 330K research software items and 6.7M other research products. We add private data from research funders (HFRI) and public organisations (accessible to and analysed only by them), along with third-party policy reports (EC) and data from industry.
In this lightning talk we will present how we addressed the problem of combining such information and data while hiding the complexity from users. This allows policy makers and research funders to save time and gain insights through the rest of the IntelComp tools about the three domains of focus, while respecting GDPR and FAIR principles. It also lets us retain a large pool of data that is accessible as a whole or as a partial "data slice" via the IntelComp catalogue. The innovations built on the STI Data Lake enable and empower the consumption of data by the IntelComp STI Viewer, STI Participation portal and Evaluation Workbench.