Demo

Matilda: a bibliographic/metric tool for open science

September 26 | 14:15 | Demo Session | Sala de Protocolo Nouvel

The Open Access movement has long insisted on the availability and reusability of academic texts as a way to achieve knowledge dissemination, without paying specific attention to the question of metadata. The fact that the main OA declarations (Budapest, Berlin, Bethesda) made no reference to metadata has led to a paradoxical situation: the more publications became accessible and reusable, the more research communities searched for and found their content through privately owned and often costly bibliographic/metric tools.

In recent years, the I4OC and I4OA coalitions have advocated for the opening of metadata, and databases such as OpenAlex and OpenCitations have enabled the sharing of metadata. However, these achievements do not currently meet the needs of researchers, for two reasons. On the one hand, they are databases to be used via APIs or dumps, not tools that untrained users can easily adopt. On the other hand, their update rhythm does not support services for those who wish to follow the evolution of the literature on a day-to-day basis.

To fill this gap, Matilda builds on open source software and multiple open data sources. It aims to constitute an open science infrastructure for all research domains that do not currently have well-designed, community-based search and alert services.

It currently provides at least four services: an easy-to-use multi-criteria search; citation tracking by text, author or multiple criteria; the creation of associated alerts via RSS feeds; and DOI, HTML and PDF links so that researchers can read texts of interest outside the platform. Matilda currently draws on four documentary sources (arXiv, Crossref, PubMed Central, RePEc), which continuously feed deduplication/aggregation operations, followed by metadata enrichment through ORCID and the production of reference links, notably through GROBID. As of September 1st, 2023, it displays 112 million “works” (each an aggregation of “identical” documents), 200 million documents and more than 9.2 million authors, with a mean daily update of around 200,000 documents.
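As a rough illustration of what aggregating "identical" documents into a "work" can mean, the sketch below groups records from different sources by a normalized title and publication year. This is not Matilda's actual method, which the abstract does not describe; the matching key, the record fields (`source`, `title`, `year`) and the function names are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def group_into_works(documents):
    """Group records sharing a normalized title and year (hypothetical key)."""
    works = defaultdict(list)
    for doc in documents:
        key = (normalize_title(doc["title"]), doc["year"])
        works[key].append(doc)
    # Dicts preserve insertion order, so works appear in first-seen order.
    return list(works.values())


# Hypothetical sample records, not real data from any of the four sources.
documents = [
    {"source": "arXiv", "title": "Open Metadata for Science", "year": 2023},
    {"source": "Crossref", "title": "Open metadata for science.", "year": 2023},
    {"source": "RePEc", "title": "A Different Paper", "year": 2022},
]

works = group_into_works(documents)
for work in works:
    print(len(work), [doc["source"] for doc in work])
```

A title-and-year key is only the simplest possible choice: a production pipeline would presumably also compare identifiers such as DOIs and author lists, and tolerate small differences between titles.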

The current version relies only on metadata, including abstracts, reconstructed references and author identification. Ongoing development will enable full-text search by the end of 2023, making Matilda a real alternative to Google Scholar and commercial databases, especially for citation-tracking services. It will be fully available to researchers from autumn onwards, both to gather feedback and to better understand how researchers use these search tools, as there is almost no literature on these uses.

Presenters

Didier Torny

Didier Torny, trained as a sociologist, is a senior researcher at CNRS in Paris. At CSI (UMR 9217, Mines Paris), he currently works on the political economy of academic publishing, including the funding models of open access publications. A member of the French Open Science Committee, he has worked on research evaluation, bibliometrics, peer review and the measurement of openness through the French Open Science Monitor. He is also a scientific delegate at the Open Data Division at CNRS headquarters, and is currently part of the European-funded DIAMAS project on institutional publishing and open access. Didier Torny is the scientific director of Matilda.
