2022
Marcer, Arnald; Chapman, Arthur D.; Wieczorek, John R.; Picó, F. Xavier; Uribe, Francesc; Waller, John; Ariño, Arturo H. Uncertainty matters: ascertaining where specimens in natural history collections come from and its implications for predicting species distributions Journal Article Ecography, 2022, 2022, ISSN: 0906-7590, 1600-0587.
Tags: ecological niche modelling (ENM), ecological research, GBIF, georeferencing, natural history collections, preserved specimens, species distribution modelling (SDM), Uncertainty
@article{Marcer2022b,
  title = {Uncertainty matters: ascertaining where specimens in natural history collections come from and its implications for predicting species distributions},
  author = {Arnald Marcer and Arthur D. Chapman and John R. Wieczorek and F. Xavier Picó and Francesc Uribe and John Waller and Arturo H. Ariño},
  url = {https://onlinelibrary.wiley.com/doi/epdf/10.1111/ecog.06025},
  doi = {10.1111/ecog.06025},
  issn = {0906-7590, 1600-0587},
  year = {2022},
  date = {2022-05-09},
  journal = {Ecography},
  volume = {2022},
  abstract = {Natural history collections (NHCs) represent an enormous and largely untapped wealth of information on the Earth’s biota, made available through GBIF as digital preserved specimen records. Precise knowledge of where the specimens were collected is paramount to rigorous ecological studies, especially in the field of species distribution modelling. Here, we present a first comprehensive analysis of georeferencing quality for all preserved specimen records served by GBIF, and illustrate the impact that coordinate uncertainty may have on predicted potential distributions. We used all GBIF preserved specimen records to analyse the availability of coordinates and associated spatial uncertainty across geography, spatial resolution, taxonomy, publishing institutions and collection time. We used three plant species across their native ranges in different parts of the world to show the impact of uncertainty on predicted potential distributions. We found that 38% of the 180+ million records provide coordinates only and 18% coordinates and uncertainty. Georeferencing quality is determined more by country of collection and publishing than by taxonomic group. Distinct georeferencing practices are more determinant than implicit characteristics and georeferencing difficulty of specimens. Availability and quality of records contrasts across world regions. Uncertainty values are not normally distributed but peak at very distinct values, which can be traced back to specific regions of the world. Uncertainty leads to a wide spectrum of range sizes when modelling species distributions, potentially affecting conclusions in biogeographical and climate change studies. In summary, the digitised fraction of the world’s NHCs are far from optimal in terms of georeferencing and quality mainly depends on where the collections are hosted. A collective effort between communities around NHC institutions, ecological research and data infrastructure is needed to bring the data on a par with its importance and relevance for ecological research.},
  keywords = {ecological niche modelling (ENM), ecological research, GBIF, georeferencing, natural history collections, preserved specimens, species distribution modelling (SDM), Uncertainty},
  pubstate = {published},
  tppubtype = {article}
}
Marcer, Arnald; Escobar, Agustí; Garcia-Font, Víctor; Uribe, Francesc Ali-Bey - an open collaborative georeferencing web application Journal Article 2022.
Tags: collaborative database, digital specimens, georeferencing, natural history collections, site name versioning, traceability, web application
@article{Marcer2022,
  title = {Ali-Bey - an open collaborative georeferencing web application},
  author = {Arnald Marcer and Agustí Escobar and Víctor Garcia-Font and Francesc Uribe},
  url = {https://doi.org/10.3897/BDJ.10.e81282},
  doi = {10.3897/BDJ.10.e81282},
  year = {2022},
  date = {2022-04-28},
  keywords = {collaborative database, digital specimens, georeferencing, natural history collections, site name versioning, traceability, web application},
  pubstate = {published},
  tppubtype = {article}
}
2020
Marcer, Arnald; Haston, Elspeth; Groom, Quentin; Ariño, Arturo H.; Chapman, Arthur D.; Bakken, Torkild; Braun, Paul; Dillen, Mathias; Ernst, Marcus; Escobar, Agustí; Fichtmüller, David; Livermore, Laurence; Nicolson, Nicky; Paragamian, Kaloust; Paul, Deborah; Pettersson, Lars B.; Phillips, Sarah; Plummer, Jack; Rainer, Heimo; Rey, Isabel; Robertson, Tim; Röpert, Dominik; Santos, Joaquim; Uribe, Francesc; Waller, John; Wieczorek, John R. Quality issues in georeferencing: From physical collections to digital data repositories for ecological research Journal Article 2020.
Tags: eco-evolutionary research, georeferencing, global biodiversity information facility, natural history collections, uncertainty workshop
@article{Marcer2020,
  title = {Quality issues in georeferencing: From physical collections to digital data repositories for ecological research},
  author = {Arnald Marcer and Elspeth Haston and Quentin Groom and Arturo H. Ariño and Arthur D. Chapman and Torkild Bakken and Paul Braun and Mathias Dillen and Marcus Ernst and Agustí Escobar and David Fichtmüller and Laurence Livermore and Nicky Nicolson and Kaloust Paragamian and Deborah Paul and Lars B. Pettersson and Sarah Phillips and Jack Plummer and Heimo Rainer and Isabel Rey and Tim Robertson and Dominik Röpert and Joaquim Santos and Francesc Uribe and John Waller and John R. Wieczorek},
  url = {https://doi.org/10.1111/ddi.13208},
  doi = {10.1111/ddi.13208},
  year = {2020},
  date = {2020-12-03},
  keywords = {eco-evolutionary research, georeferencing, global biodiversity information facility, natural history collections, uncertainty workshop},
  pubstate = {published},
  tppubtype = {article}
}
Hardy, Helen; Knapp, Sandra; Allan, Louise E; Berger, Frederik; Dixey, Katherine; Döme, Bernadette; Gagnier, Pierre-Yves; Frank, Jiri; Haston, Elspeth Margaret; Holstein, Joachim; Kiel, Steffen; Marschler, Maria; Mergen, Patricia; Phillips, Sarah; Rabinovich, Rivka; Sanchez Chillón, Begoña; Sorensen, Martin V; Thines, Marco; Trekels, Maarten; Vogt, Robert; Wilson, Scott; Wiltschke-Schrotta, Karin SYNTHESYS+ Virtual Access - Report on the Ideas Call (October to November 2019) Journal Article Research Ideas and Outcomes, 6, pp. e50354, 2020.
Tags: access, collaboration, digital data, digitisation, digitization, natural history collections, virtual data
@article{10.3897/rio.6.e50354,
  title = {SYNTHESYS+ Virtual Access - Report on the Ideas Call (October to November 2019)},
  author = {Helen Hardy and Sandra Knapp and Louise E Allan and Frederik Berger and Katherine Dixey and Bernadette Döme and Pierre-Yves Gagnier and Jiri Frank and Elspeth Margaret Haston and Joachim Holstein and Steffen Kiel and Maria Marschler and Patricia Mergen and Sarah Phillips and Rivka Rabinovich and Begoña Sanchez Chillón and Martin V Sorensen and Marco Thines and Maarten Trekels and Robert Vogt and Scott Wilson and Karin Wiltschke-Schrotta},
  url = {https://doi.org/10.3897/rio.6.e50354},
  doi = {10.3897/rio.6.e50354},
  year = {2020},
  date = {2020-01-01},
  journal = {Research Ideas and Outcomes},
  volume = {6},
  pages = {e50354},
  publisher = {Pensoft Publishers},
  abstract = {The SYNTHESYS consortium has been operational since 2004, and has facilitated physical access by individual researchers to European natural history collections through its Transnational Access programme (TA). For the first time, SYNTHESYS+ will be offering virtual access to collections through digitisation, with two calls for the programme, the first in 2020 and the second in 2021. The Virtual Access (VA) programme is not a direct digital parallel of Transnational Access - proposals for collections digitisation will be prioritised and carried out based on community demand, and data must be made openly available immediately. A key feature of Virtual Access is that, unlike TA, it does not select the researchers to whom access is provided. Because Virtual Access in this way is new to the community and to the collections-holding institutions, the SYNTHESYS+ consortium invited ideas through an Ideas Call, that opened on 7th October 2019 and closed on 22nd November 2019, in order to assess interest and to trial procedures. This report is intended to provide feedback to those who participated in the Ideas Call and to help all applicants to the first SYNTHESYS+ Virtual Access Call that will be launched on 20th of February 2020.},
  keywords = {access, collaboration, digital data, digitisation, digitization, natural history collections, virtual data},
  pubstate = {published},
  tppubtype = {article}
}
2019
Georgiev, Boyko B; Casino, Ana; Voreadou, Catherina Training Taxonomists for the Digital World: Are we prepared? Journal Article Biodiversity Information Science and Standards, 3, pp. e36106, 2019.
Tags: capacity building, Digital knowledge, digital skills, DiSSCo, education, MOBILISE, natural history collections, research, taxonomy, training, young researchers
@article{10.3897/biss.3.36106,
  title = {Training Taxonomists for the Digital World: Are we prepared?},
  author = {Boyko B Georgiev and Ana Casino and Catherina Voreadou},
  url = {https://doi.org/10.3897/biss.3.36106},
  doi = {10.3897/biss.3.36106},
  year = {2019},
  date = {2019-01-01},
  journal = {Biodiversity Information Science and Standards},
  volume = {3},
  pages = {e36106},
  publisher = {Pensoft Publishers},
  abstract = {Digital knowledge and skills are rapidly becoming integral part of the work of the modern taxonomist. Their importance is further increased with the recent recognition of DiSSCo (Distributed System of Scientific Collections, https://dissco.eu). This new pan-European research infrastructure envisions placing European natural science collections at the centre of data-intensive scientific excellence and innovation for taxonomic and environmental research, food security, health and the bioeconomy. The mission of this ambitious project is to mobilise, unify and deliver bio- and geo-diversity information at the scale, form and precision required by scientific communities as well as to transform a fragmented landscape into a coherent and responsive research infrastructure. An important step in improving the capacity of the research community underpinning DiSSCo is the COST Action MOBILISE (Mobilising Data, Policies and Experts in Scientific Collections, https://www.mobilise-action.eu). One of major capacity-building objectives is to facilitate implementation of common standards and newly-developed techniques by training and education. Its achievement is envisaged by standardised training modules such as training courses, workshops, webinars, online tutorials and short-term visits to other research units. The first impression from surveying interests of candidates to be included into training events, demonstrates an uneven distribution of digital knowledge and skills across countries, institutions and generations. We advocate that a massive coordinated training programme may result in more efficient establishment of common standards and, consequently, better implementation of the forthcoming joint efforts in the development of the new pan-European research infrastructure.},
  keywords = {capacity building, Digital knowledge, digital skills, DiSSCo, education, MOBILISE, natural history collections, research, taxonomy, training, young researchers},
  pubstate = {published},
  tppubtype = {article}
}
Theeten, Franck; Adam, Marielle; Vandenberghe, Thomas; Dillen, Mathias; Semal, Patrick; Scory, Serge; Herpers, Jean-Marc; Van den Spiegel, Didier; Mergen, Patricia; Smirnova, Larissa; Engledow, Henry; Casino, Ana; Gödderz, Karsten NaturalHeritage: Bridging Belgian natural history collections Journal Article Biodiversity Information Science and Standards, 3, pp. e37854, 2019.
Tags: data analysis, data quality and cleaning, interoperable databases, natural history collections, search portal, standardisation, webservices
@article{10.3897/biss.3.37854,
  title = {NaturalHeritage: Bridging Belgian natural history collections},
  author = {Franck Theeten and Marielle Adam and Thomas Vandenberghe and Mathias Dillen and Patrick Semal and Serge Scory and Jean-Marc Herpers and Didier Van den Spiegel and Patricia Mergen and Larissa Smirnova and Henry Engledow and Ana Casino and Karsten Gödderz},
  url = {https://doi.org/10.3897/biss.3.37854},
  doi = {10.3897/biss.3.37854},
  year = {2019},
  date = {2019-01-01},
  journal = {Biodiversity Information Science and Standards},
  volume = {3},
  pages = {e37854},
  publisher = {Pensoft Publishers},
  abstract = {The Royal Belgian Institute of Natural Sciences (RBINS), the Royal Museum for Central Africa (RMCA) and Meise Botanic Garden house more than 50 million specimens covering all fields of natural history. While many different research topics have their own specificities, throughout the years it became apparent that with regards to collection data management, data publication and exchange via community standards, collection holding institutions face similar challenges (James et al. 2018, Rocha et al. 2014). In the past, these have been tackled in different ways by Belgian natural history institutions. In addition to local and national collaborations, there is a great need for a joint structure to share data between scientific institutions in Europe and beyond. It is the aim of large networks and infrastructures such as the Global Biodiversity Information Facility (GBIF), the Biodiversity Information Standards (TDWG), the Distributed System of Scientific Collections (DiSSCo) and the Consortium of European Taxonomic Facilities (CETAF) to further implement and improve these efforts, thereby gaining ever increasing efficiencies. In this context, the three institutions mentioned above, submitted the NaturalHeritage project (http://www.belspo.be/belspo/brain-be/themes_3_HebrHistoScien_en.stm) granted in 2017 by the Belgian Science Policy Service, which runs from 2017 to 2020. The project provides links among databases and services. The unique qualities of each database are maintained, while the information can be concentrated and exposed in a structured way via one access point. This approach aims also to link data that are unconnected at present (e.g. relationship between soil/substrate, vegetation and associated fauna) and to improve the cross-validation of data. (1) The NaturalHeritage prototype (http://www.naturalheritage.be) is a shared research portal with an open access infrastructure, which is still in the development phase. Its backbone is an ElasticSearch catalogue, with Kibana, and a Python aggregator gathering several types of (re)sources: relational databases, REpresentational State Transfer (REST) services of objects databases and bibliographical data, collections metadata and the GBIF Internet Publishing Toolkit (IPT) for observational and taxonomical data. Semi-structured data in English are semantically analysed and linked to a rich autocomplete mechanism. Keywords and identifiers are indexed and grouped in four categories (“what”, “who”, “where”, “when”). The portal can act also as an Open Archives Initiatives Protocol for Metadata Harvesting (OAI-PMH) service and ease indexing of the original webpage on the internet with microdata enrichment. (2) The collection data management system of DaRWIN (Data Research Warehouse Information Network) of RBINS and RMCA has been improved as well. External (meta)data requirements, i.e. foremost publication into or according to the practices and standards of GBIF and OBIS (Ocean Biogeographic Information System: https://obis.org) for biodiversity data, and INSPIRE (https://inspire.ec.europa.eu) for geological data, have been identified and evaluated. New and extended data structures have been created to be compliant with these standards, as well as the necessary procedures developed to expose the data. Quality control tools for taxonomic and geographic names have been developed. Geographic names can be hard to confirm as their lack of context often requires human validation. To address this a similarity measure is used to help map the result. Species, locations, sampling devices and other properties have been mapped to the World Register of Marine Species and DarwinCore (http://www.marinespecies.org), Marine Regions and GeoNames, the AGRO Agronomy and Vertebrate trait ontologies and the British Oceanographic Data Centre (BODC) vocabularies (http://www.obofoundry.org/ontology/agro.html). Extensive mapping is necessary to make use of the ExtendedMeasurementOrFact Extension of DarwinCore (https://tools.gbif.org/dwca-validator/extensions.do).},
  keywords = {data analysis, data quality and cleaning, interoperable databases, natural history collections, search portal, standardisation, webservices},
  pubstate = {published},
  tppubtype = {article}
}
ResearchGate link: https://www.researchgate.net/project/MOBILISE-COST-Action-CA17106-Mobilising-Data-Policies-and-Experts-in-Scientific-Collections