The Alien Species VRE provides users with data on 13,500 species from 500 different terrestrial, freshwater, lagoon and marine sites. The data cover several taxonomic groups, from cyanobacteria to fish, amphibians, reptiles and mammals, inhabiting different habitats and their surroundings. The dataset was quality-checked in two ways: data cleaning and validation for taxonomic reliability and consistency against several zoological and botanical taxonomic indexes and databases (PESI, WoRMS and Catalogue of Life), and semi-automatic data cleaning with the tools available on the LifeWatch portal.
The Alien Species VRE also provides users with an implementation of the scientific workflow used by the research team. In keeping with Open Science, the analysis is reproducible through a workflow management system that orchestrates the services composed for the Alien Species Showcase. In the Alien Species VRE, the software that executes the workflow is Taverna Workbench (see www.taverna.org.uk). The ICT working group and the modelling working group jointly composed the workflow by developing and combining a series of executable stand-alone modules. These modules can also be recombined to perform further analyses.
The picture presents the original workflow. The details of the sub-modules follow.
Taverna services (sub-modules)
Dataportal interaction module: interactive selection of a dataset from the LifeWatch Service Centre data portal (the module runs when the workflow is executed). This is a first example of an interactive module and advanced user interface developed by the LifeWatch Italia ICT and modelling working groups to reduce the effort non-ICT scientists need to use Taverna and to enable run-time user interaction;
REST service retrieve dataset: downloads the selected dataset;
Reshape dataset module: lets users reorganize the dataset to fit the requirements of the analyses;
Modelling module: the core of the analysis, a two-step module. First, it selects the best model on the basis of the Akaike Information Criterion (AIC) and outputs the resulting AIC table. Then, the best-fitting model is computed, and tables of estimated values as well as plots of predictors are reported;
Rarefaction curve plot module: provides a rarefaction plot that allows users to judge whether the dataset is sufficient to feed a model and, where needed, to better interpret the results of the modelling module.
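The logic behind the last two modules can be illustrated with a minimal Python sketch. It is not the code run by the Taverna workflow, whose implementation is not shown here. All function names and sample values are hypothetical, and the candidate models are assumed to be simple least-squares regressions (an intercept-only model and one single-predictor model per variable), since the model family used by the research team is not stated. The rarefaction part uses the standard expected-richness formula for a random subsample of individuals.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def gaussian_aic(rss, n, n_coefficients):
    """AIC of a least-squares model, up to a constant shared by all candidates.

    The parameter count is the regression coefficients plus the residual variance.
    """
    k = n_coefficients + 1
    return n * math.log(rss / n) + 2 * k


def aic_table(response, predictors):
    """Step 1: score every candidate model and sort by AIC (lowest first)."""
    n = len(response)
    mean_y = sum(response) / n
    null_rss = sum((yi - mean_y) ** 2 for yi in response)
    rows = [("intercept only", gaussian_aic(null_rss, n, 1))]
    for name, values in predictors.items():
        _, _, rss = fit_line(values, response)
        rows.append((name, gaussian_aic(rss, n, 2)))
    return sorted(rows, key=lambda row: row[1])


def rarefaction_curve(abundances):
    """Expected number of species in a random subsample of n individuals, n = 1..N."""
    total = sum(abundances)
    curve = []
    for n in range(1, total + 1):
        denom = math.comb(total, n)
        # A species is missed only if all n individuals come from the other species.
        expected = sum(1 - math.comb(total - a, n) / denom for a in abundances)
        curve.append(expected)
    return curve


# Hypothetical sample data, invented for illustration only.
richness = [12, 15, 11, 20, 24, 19, 30, 27]
predictors = {
    "temperature": [10, 12, 9, 15, 18, 14, 22, 20],
    "depth": [5, 3, 8, 2, 7, 4, 6, 1],
}

# Step 1: AIC table; step 2: refit the best-ranked model and report its estimates.
table = aic_table(richness, predictors)
for name, aic in table:
    print(f"{name:15s} AIC = {aic:.2f}")

best_name = table[0][0]
if best_name != "intercept only":
    intercept, slope, _ = fit_line(predictors[best_name], richness)
    print(f"best model: richness = {intercept:.2f} + {slope:.2f} * {best_name}")

# Rarefaction curve for one hypothetical site with three species.
curve = rarefaction_curve([5, 3, 2])
print([round(value, 2) for value in curve])
```

In the AIC table, the model with the lowest value is preferred. In the rarefaction curve, a flattening toward the right-hand end suggests that additional sampling would add few new species, which is the kind of check the rarefaction module is meant to support.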