2020
Hardy, Helen; Knapp, Sandra; Allan, Louise E; Berger, Frederik; Dixey, Katherine; Döme, Bernadette; Gagnier, Pierre-Yves; Frank, Jiri; Haston, Elspeth Margaret; Holstein, Joachim; Kiel, Steffen; Marschler, Maria; Mergen, Patricia; Phillips, Sarah; Rabinovich, Rivka; Chillón, Begoña Sanchez; Sorensen, Martin V; Thines, Marco; Trekels, Maarten; Vogt, Robert; Wilson, Scott; Wiltschke-Schrotta, Karin. SYNTHESYS+ Virtual Access - Report on the Ideas Call (October to November 2019). Journal Article. Research Ideas and Outcomes, 6, pp. e50354, 2020.

@article{10.3897/rio.6.e50354,
  title = {SYNTHESYS+ Virtual Access - Report on the Ideas Call (October to November 2019)},
  author = {Helen Hardy and Sandra Knapp and Louise E Allan and Frederik Berger and Katherine Dixey and Bernadette Döme and Pierre-Yves Gagnier and Jiri Frank and Elspeth Margaret Haston and Joachim Holstein and Steffen Kiel and Maria Marschler and Patricia Mergen and Sarah Phillips and Rivka Rabinovich and Begoña Sanchez Chillón and Martin V Sorensen and Marco Thines and Maarten Trekels and Robert Vogt and Scott Wilson and Karin Wiltschke-Schrotta},
  url = {https://doi.org/10.3897/rio.6.e50354},
  doi = {10.3897/rio.6.e50354},
  year = {2020},
  date = {2020-01-01},
  journal = {Research Ideas and Outcomes},
  volume = {6},
  pages = {e50354},
  publisher = {Pensoft Publishers},
  abstract = {The SYNTHESYS consortium has been operational since 2004, and has facilitated physical access by individual researchers to European natural history collections through its Transnational Access programme (TA). For the first time, SYNTHESYS+ will be offering virtual access to collections through digitisation, with two calls for the programme, the first in 2020 and the second in 2021. The Virtual Access (VA) programme is not a direct digital parallel of Transnational Access - proposals for collections digitisation will be prioritised and carried out based on community demand, and data must be made openly available immediately. A key feature of Virtual Access is that, unlike TA, it does not select the researchers to whom access is provided. Because Virtual Access in this way is new to the community and to the collections-holding institutions, the SYNTHESYS+ consortium invited ideas through an Ideas Call, that opened on 7th October 2019 and closed on 22nd November 2019, in order to assess interest and to trial procedures. This report is intended to provide feedback to those who participated in the Ideas Call and to help all applicants to the first SYNTHESYS+ Virtual Access Call that will be launched on 20th of February 2020.},
  keywords = {access, collaboration, digital data, digitisation, digitization, natural history collections, virtual data},
  pubstate = {published},
  tppubtype = {article}
}
2019
Casino, Ana; Raes, Niels; Addink, Wouter; Woodburn, Matt. Collections Digitization and Assessment Dashboard, a Tool for Supporting Informed Decisions. Journal Article. Biodiversity Information Science and Standards, 3, pp. e37505, 2019.

@article{10.3897/biss.3.37505,
  title = {Collections Digitization and Assessment Dashboard, a Tool for Supporting Informed Decisions},
  author = {Ana Casino and Niels Raes and Wouter Addink and Matt Woodburn},
  url = {https://doi.org/10.3897/biss.3.37505},
  doi = {10.3897/biss.3.37505},
  year = {2019},
  date = {2019-01-01},
  journal = {Biodiversity Information Science and Standards},
  volume = {3},
  pages = {e37505},
  publisher = {Pensoft Publishers},
  abstract = {Natural Science Collections (NSCs) contain specimen-related data from which we extract valuable information for science and policy. Openness of those collections facilitates development of science. Moreover, virtual accessibility to physical containers by means of their digitization will allow an exponential increase in the level of available information. Digitization of collections will allow us to set a comprehensive registry of reliable, accurate, updated, comparable and interconnected information. Equally, the scope of interested potential users will largely expand and so will the different levels of granularity required by researchers, institutions and governmental bodies. Meeting diverse needs entails a special effort in data management and data analysis to extract, digest and present information on a compressed but still precise and objective-oriented format. The Collections Digitisation Dashboard (CDD) underpins such an attempt. The CDD stands as a practical tool that specifically aims to support high-level decisions with a wide coverage of data, by providing a visual, simplified and structured arrangement that will allow discovery of key indicators concerning digitization of bio- and geodiversity collections. The realm of possible approaches to the CDD covers levels of digitization, collection exceptionality, resource availability and many others. Still all those different angles need to be aligned and processed at once to provide an overall overview of the status of NSCs in the digitization process and analyse its further development. The CDD is a powerful mechanism to identify priorities, specialisation lines together with regional development, gaps and niches and future capabilities as well, and strengths and weaknesses across collections, institutions, countries and regions. It can perfectly underpin measurable and comparable assessments, with evolution indexes and progress indicators, all under an overarching homogenous approach. The Distributed System of Scientific Collections (DiSSCo) Research Infrastructure, currently in its preparatory phase, is built on top of the largest ever community of collections-related institutions across Europe and anchored on the Consortium of European Taxonomic Facilities (CETAF). It aims to provide a unique virtual access point to NSCs by facilitating a large and massive digitisation effort throughout Europe. Setting up priorities and specialization areas is pivotal to its success. To that end, the DiSSCo CDD will provide a valuation tool to summarize and showcase NSC's digitization status on a first-hand visualization. Different projects and initiatives will contribute, jointly and on a synergetic basis, to the production of the DiSSCo CDD. The ICEDIG project will address its basic features, terms of classification and tiers of information, and will produce a prototype and a set of recommendations on how to better attempt a massive dashboard by collating specific collections-based information and defining global strategic representations. CETAF working groups on collections and digitization will provide the desired homogeneity in describing and capturing the different implementation requirements from the users’ perspectives, which will be complemented by the contributions made under the umbrella of the COST Action MOBILISE. The Action will use networking activities to identify the right standards and policies to enable enlarging the scope of the DiSSCo CDD and its broader implementation by linking to the TDWG criteria and adopted standards. Complementarily, the ELViS platform to be developed under the SYNTHESYS+ project will provide the right virtual environment. Furthermore, SYNTHESYS+ will address the assessment capabilities of the CDD to enable the visual representation becoming a practical assessment mechanism and endow it with a dynamic feature for analysis over the time. The DiSSCo CDD will thus become an instrumental mechanism for decision-taking that will be embedded into the clustering initiative of products and services provided to the EOSC by the ENVRI-FAIR project in the environmental domain.},
  keywords = {alignment, biodiversity and geodiversity, dashboard, digitization, DiSSCo, high-level information, informed decision-making, institutional description, mechanisms, natural science collections, research infrastructure, tools, visualization},
  pubstate = {published},
  tppubtype = {article}
}
2018
Ariño, Arturo H. Putting your Finger upon the Simplest Data. Journal Article. Biodiversity Information Science and Standards, 2, pp. e26300, 2018.

@article{10.3897/biss.2.26300,
  title = {Putting your Finger upon the Simplest Data},
  author = {Arturo H. Ariño},
  url = {https://doi.org/10.3897/biss.2.26300},
  doi = {10.3897/biss.2.26300},
  year = {2018},
  date = {2018-01-01},
  journal = {Biodiversity Information Science and Standards},
  volume = {2},
  pages = {e26300},
  publisher = {Pensoft Publishers},
  abstract = {Over the past decades, digitization endeavors across many institutions holding natural history collections (NHCs) have multiplied with three broad aims: first, to facilitate collection management by moving existing analog catalogues into digital form; second, to efficiently document and inventory specimens in collections, including imaging them as taxonomical surrogates; and third, to enable discovery of, and access to, the resulting collection data. NHCs contain a unique wealth of potential knowledge in the form of primary biodiversity data records (PBR): at its most basic level, the “what, where and when” of occurrences of the specimens in the collections. But as T.S. Eliot famously said, “knowledge is invariably a matter of degree”. For such data to be transformed into digitally accessible knowledge (DAK) that is conducive to an understanding about how the natural world works, release of digitized data (the “this we know”) is necessary. At least two billion specimens are estimated to exist in NHCs already, but only a small fraction can be considered properly DAK: most have either not been digitized yet, or not released through a discovery facility. Digitizing is relatively costly as it often entails manually processing each specimen unit (e.g. a herbarium sheet, a pinned insect, or a vial full of invertebrates). How long could it take us to transform all NHCs into DAK? Can we keep up with the natural growth in collections? The Global Biodiversity Information Facility (GBIF) has become the de facto main index of PBR, both originated in NHCs or as field observations. Digitized NHC that are standards-compliant and can be connected to, or harvested by, GBIF, effectively become DAK. I have examined GBIF growth data looking for a pattern of DAK generation. I found that the rate of NHC-based PBR accrual is remarkably constant: the total DAK shows a strongly linear growth, as opposed to the exponential growth exhibited by cumulative observation data. Projecting the trend to the estimated holdings shoots the completion many decades ahead. In addition, digitized data appear to be taxonomically biased. Digitization efforts must therefore step up qualitatively in order to enable processing the backlog, let alone newly-acquired accessions, within one generation. Among several possible solutions, emerging, industrial-scale mass-digitization techniques may help harnessing this otherwise daunting task - but there’s also a risk that DAK becomes even more uneven across taxon groups because of the narrow application specificity of such techniques, thus potentially biasing our knowledge of nature.},
  keywords = {bias, digitally accessible knowledge (DAK), digitization, Natural History Collections (NHC), primary biodiversity data records (PBR), trends},
  pubstate = {published},
  tppubtype = {article}
}
2016
Dauby, Gilles; Zaiss, Rainer; Blach-Overgaard, Anne; Catarino, Luís; Damen, Theo; Deblauwe, Vincent; Dessein, Steven; Dransfield, John; Droissart, Vincent; Duarte, Maria Cristina; Engledow, Henry; Fadeur, Geoffrey; Figueira, Rui; Gereau, Roy E; Hardy, Olivier J; Harris, David J; de Heij, Janneke; Janssens, Steven; Klomberg, Yannick; Ley, Alexandra C; MacKinder, Barbara A; Meerts, Pierre; van de Poel, Jeike L; Sonké, Bonaventure; Sosef, Marc S M; Stévart, Tariq; Stoffelen, Piet; Svenning, Jens-Christian; Sepulchre, Pierre; van der Burgt, Xander; Wieringa, Jan J; Couvreur, Thomas L P. RAINBIO: a mega-database of tropical African vascular plants distributions. Journal Article. PhytoKeys, 74, pp. 1-18, 2016, ISSN: 1314-2011.

@article{10.3897/phytokeys.74.9723,
  title = {RAINBIO: a mega-database of tropical African vascular plants distributions},
  author = {Gilles Dauby and Rainer Zaiss and Anne Blach-Overgaard and Luís Catarino and Theo Damen and Vincent Deblauwe and Steven Dessein and John Dransfield and Vincent Droissart and Maria Cristina Duarte and Henry Engledow and Geoffrey Fadeur and Rui Figueira and Roy E Gereau and Olivier J Hardy and David J Harris and Janneke de Heij and Steven Janssens and Yannick Klomberg and Alexandra C Ley and Barbara A MacKinder and Pierre Meerts and Jeike L van de Poel and Bonaventure Sonké and Marc S M Sosef and Tariq Stévart and Piet Stoffelen and Jens-Christian Svenning and Pierre Sepulchre and Xander van der Burgt and Jan J Wieringa and Thomas L P Couvreur},
  url = {https://doi.org/10.3897/phytokeys.74.9723},
  doi = {10.3897/phytokeys.74.9723},
  issn = {1314-2011},
  year = {2016},
  date = {2016-01-01},
  journal = {PhytoKeys},
  volume = {74},
  pages = {1--18},
  publisher = {Pensoft Publishers},
  abstract = {The tropical vegetation of Africa is characterized by high levels of species diversity but is undergoing important shifts in response to ongoing climate change and increasing anthropogenic pressures. Although our knowledge of plant species distribution patterns in the African tropics has been improving over the years, it remains limited. Here we present RAINBIO, a unique comprehensive mega-database of georeferenced records for vascular plants in continental tropical Africa. The geographic focus of the database is the region south of the Sahel and north of Southern Africa, and the majority of data originate from tropical forest regions. RAINBIO is a compilation of 13 datasets either publicly available or personal ones. Numerous in-depth data quality checks, automatic and manual via several African flora experts, were undertaken for georeferencing, standardization of taxonomic names and identification and merging of duplicated records. The resulting RAINBIO data allows exploration and extraction of distribution data for 25,356 native tropical African vascular plant species, which represents ca. 89% of all known plant species in the area of interest. Habit information is also provided for 91% of these species.},
  keywords = {biodiversity assessment, cultivated species, digitization, georeferencing, habit, Herbarium specimens, native species, taxonomic backbone, tropical forests},
  pubstate = {published},
  tppubtype = {article}
}
ResearchGate link: https://www.researchgate.net/project/MOBILISE-COST-Action-CA17106-Mobilising-Data-Policies-and-Experts-in-Scientific-Collections