40 research outputs found
End-to-End Workflows for Climate Science: Integrating HPC Simulations, Big Data Processing, and Machine Learning
Current scientific workflow systems do not typically integrate simulation-centric and data-centric aspects due to their very different software/infrastructure requirements. A transparent integration of such components into a single end-to-end workflow would lead to a more efficient and automated way of generating insights from large simulation data. This work presents a complex case study on extreme events analysis of future climate data that integrates numerical simulations, Big Data analytics and Machine Learning models in the same workflow. The case study is being implemented in the context of the eFlows4HPC project, using the project’s software stack for deployment and orchestration of the workflow. The solution implemented in the project has been shown to simplify the development and execution of end-to-end climate workflows with heterogeneous software requirements. Moreover, such an approach can, in the long term, increase the reuse of workflows by scientists and their portability across different HPC infrastructures.
This work has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 955558. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and Spain, Germany, France, Italy, Poland, Switzerland and Norway. In Spain, it has received complementary funding from MCIN/AEI/10.13039/501100011033, Spain and the European Union NextGenerationEU/PRTR (contracts PCI2021-121957, PCI2021-121931, PCI2021-121944, and PCI2021-121927). In Italy, it has been preliminarily approved for complementary funding by Ministero dello Sviluppo Economico (MiSE) (ref. project prop. 2659). The authors also acknowledge financial support by MCIN/AEI/10.13039/501100011033, Spain through the "Severo Ochoa Programme for Centres of Excellence in R&D" under Grant CEX2021-001148-S, the Spanish Government (contract PID2019-107255 GB) and by Generalitat de Catalunya (contract 2021-SGR-00412). Peer Reviewed. Postprint (author's final draft).
A multi-service data management platform for scientific oceanographic products
Abstract. An efficient, secure and interoperable data platform solution has been developed in the TESSA project to provide fast navigation and access to the data stored in the data archive, as well as standards-based metadata management support. The platform mainly targets scientific users and high-level situational sea awareness services such as decision support systems (DSS). The datasets are accessible through three main components: the Data Access Service (DAS), the Metadata Service and the Complex Data Analysis Module (CDAM). The DAS allows access to data stored in the archive by providing interfaces for different protocols and services for downloading, variable selection, data subsetting or map generation. The Metadata Service is the heart of the information system of the TESSA products and completes the overall infrastructure for data and metadata management. This component enables data search and discovery and addresses interoperability by exploiting widely adopted standards for geospatial data. Finally, the CDAM represents the back-end of the TESSA DSS by performing on-demand complex data analysis tasks.
Big Data Analytics on Large-Scale Scientific Datasets in the INDIGO-DataCloud Project
In the context of the EU H2020 INDIGO-DataCloud project, several use cases on large-scale scientific data analysis regarding different research communities have been implemented. All of them require the availability of large amounts of data, related to either the output of simulations or observed data from sensors, and need scientific (big) data solutions to run data analysis experiments. More specifically, the paper presents the case studies related to the following research communities: (i) the European Multidisciplinary Seafloor and water column Observatory (INGV-EMSO), (ii) the Large Binocular Telescope, (iii) LifeWatch, and (iv) the European Network for Earth System Modelling (ENES). EGI Foundation, IBM Research. Published. University of Siena, Palazzo del Rettorato, Banchi di Sotto, 55, 53100 Siena (SI), Italy.
SeaConditions: a web and mobile service for safer professional and recreational activities in the Mediterranean Sea
Abstract. Reliable and timely information on the environmental conditions at sea is key to the safety of professional and recreational users as well as to the optimal execution of their activities. The possibility of users obtaining environmental information in due time and with adequate accuracy in the marine and coastal environment is defined as sea situational awareness (SSA). Without adequate information on the environmental meteorological and oceanographic conditions, users have a limited capacity to respond, which has led to loss of lives and to large environmental disasters with enormous consequent damage to the economy, society and ecosystems. Within the framework of the TESSA project, new SSA services for the Mediterranean Sea have been developed. In this paper we present SeaConditions, which is a web and mobile application for the provision of meteorological and oceanographic observation and forecasting products. Model forecasts and satellite products from operational services, such as ECMWF and CMEMS, can be visualized in SeaConditions. In addition, layers of information related to bathymetry, sea level and ocean-colour data (chl a and water transparency) are displayed. Ocean forecasts at high spatial resolutions are included in the version of SeaConditions presented here. SeaConditions provides a user-friendly experience with a fluid zoom capability, facilitating the appropriate display of data with different levels of detail. SeaConditions is a single point of access to interactive maps from different geophysical fields, providing high-quality information based on advanced oceanographic models. The SeaConditions services are available through both web and mobile applications. The web application is available at www.sea-conditions.com and is accessible and compatible with present-day browsers. Interoperability with GIS software is implemented. User feedback has been collected and taken into account in order to improve the service. 
The SeaConditions iOS and Android apps have been downloaded by more than 105 000 users to date (May 2016), and more than 100 000 users have visited the web version.
Common Pitfalls Coding a Parallel Model
The process of developing a climate model often involves a wide community of developers. All of the code releases can be classified into two groups: (i) improvements and updates related to modeling aspects (new parameterizations, new and more detailed equations, removal of model approximations, and so on); (ii) improvements related to computational aspects (performance enhancement, porting to new computing architectures, fixing of known bugs, and so on). The development process involves programmers, scientific experts and, only rarely, computer scientists. New improvements and developments focus mainly on the scientific aspects and only at a second stage on computing performance. Developments that improve the physical model often do not take into account their impact on computational performance. This poses some issues in the development process: after each new implementation, the code must be revised to address the resulting performance issues. In this work we analyze 5 different releases, starting from NEMO v3.2 (taken as our reference), and evaluate how new developments impact computational performance.
Nemo Benchmarking: V.3.3 vs V.3.3.1 Comparison
The report describes the activities carried out within the NEMO Consortium commitment. They concern the performance evaluation and comparison of NEMO version 3.3 and NEMO version 3.3.1. The main differences between the two versions relate to memory management and the allocation of data structures. NEMO ver. 3.3.1 replaces static memory array allocation with dynamic allocation. This approach brings some relevant benefits, such as the run-time evaluation of the best domain decomposition. On the other hand, dynamic array allocation introduces some loss of computational performance. The aim of this work is to evaluate the difference between the two versions from a computational point of view.
