Identifying, naming and interoperating data in a phenotyping platform network: the good, the bad and the ugly
Romain David, Jean-Eudes Hollebecq - MISTEA, INRA, Montpellier SupAgro, Université de Montpellier, Llorenç Cabrera-Bosquet - LEPSE, INRA, Montpellier SupAgro, Université de Montpellier, Hanna Ćwiek-Kupczyńska - Institute of Plant Genetics, Polish Academy of Sciences, François Tardieu - LEPSE, INRA, Montpellier SupAgro, Université de Montpellier and Pascal Neveu - MISTEA, INRA, Montpellier SupAgro, Université de Montpellier
EPPN2020 is a research project funded by the Horizon 2020 Programme of the EU that will provide European public and private scientific sectors with access to a wide range of state-of-the-art plant phenotyping installations, techniques and methods. Specifically, EPPN2020 includes access to 31 plant phenotyping installations, together with joint research activities to develop novel technologies and methods for environmental and plant measurements.
Here we present the results of the discussions held at the 2019 annual project meeting, whose aim was to adopt community-approved architectural choices. The discussions focused on the persistent identification of data and real objects, the naming of variables, and the priorities for increasing interoperability among phenotyping installations. We describe the main elements to prioritize (the good) in order to make each data management system more Findable, Accessible, Interoperable and Reusable (FAIR), while remaining pragmatic for all partners.
The plant phenotyping community brings together actors with varied means and practices. Among the recommendations, which include practices to avoid (the bad), the community asks for identification methods, including the use of ontologies, that remain compatible with the pre-existing 'local' ones.
The identification scheme being adopted is based on Uniform Resource Identifiers (URIs) with independent left and right parts for each identifier. It focuses on the objects common to all EPPN2020 members, namely the experimental units (which can be a plant in a pot or a plot), sensors and variables. A common architecture for identifiers and variable names is presented in order to enable a first level of interoperation between information systems.
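As an illustration only, the idea of a URI built from two independent parts can be sketched in Python. The abstract does not specify what the left and right parts contain, so this sketch assumes that the left part is a base address chosen by each installation and that the right part is a local path naming the object. The class name `Identifier`, the function `split_uri`, the `example.org` hosts and the path layout are all hypothetical and are not taken from EPPN2020.

```python
from dataclasses import dataclass
from urllib.parse import quote, unquote


@dataclass(frozen=True)
class Identifier:
    """Hypothetical two-part identifier (not the EPPN2020 specification)."""
    left: str   # assumed: base address managed by the installation
    right: str  # assumed: local path naming a unit, sensor or variable

    def uri(self) -> str:
        # Join the two parts with a single slash and percent-encode
        # any unsafe characters in the right part only.
        return f"{self.left.rstrip('/')}/{quote(self.right, safe='/')}"


def split_uri(uri: str, left: str) -> Identifier:
    """Recover the two parts of a URI, given a known left part."""
    base = left.rstrip("/")
    prefix = base + "/"
    if not uri.startswith(prefix):
        raise ValueError(f"{uri!r} does not start with {prefix!r}")
    return Identifier(base, unquote(uri[len(prefix):]))


# Hypothetical experimental unit: a plant in a pot at an installation.
plant = Identifier("http://example.org/platform-a", "plant/2019/p0042")

# Because the parts are independent, the same right part can be
# served from another base address without renaming the object.
archived = Identifier("http://example.net/archive", plant.right)
```

Under these assumptions, `plant.uri()` and `archived.uri()` differ only in their left part, which is one way a pre-existing local name could be kept unchanged while the hosting address evolves.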
In conclusion, we present some of the next challenges (the ugly) that the EPPN2020 community needs to address, related to (i) the partial reuse of pre-existing ontologies, (ii) the persistence of long-term access to data, and (iii) interoperation between all potential users of the phenotyping data.