D-VRE: From a Jupyter-enabled Private Research Environment to Decentralized Collaborative Research Ecosystem
Today, scientific research is increasingly data-centric and
compute-intensive, relying on data and models across distributed sources.
However, traditional modes of cooperation still face challenges due to high
storage and computing costs, geo-location barriers, and local confidentiality
regulations. The Jupyter environment has recently emerged and evolved into a
vital virtual research environment for scientific computing, which researchers
can use to scale computational analyses up to larger datasets and
high-performance computing resources. Nevertheless, existing approaches lack
robust support for a decentralized cooperation mode that would unlock the full
potential of decentralized collaborative scientific research, e.g., seamless
and secure data sharing. In this work, we change the basic structure and
legacy norms of current research environments by seamlessly integrating
Jupyter with Ethereum blockchain capabilities, creating a Decentralized
Virtual Research Environment (D-VRE) that evolves private computational
notebooks into a decentralized collaborative research ecosystem. We propose a novel architecture
for the D-VRE and prototype some essential D-VRE elements for enabling secure
data sharing with decentralized identity, user-centric agreement-making,
membership, and research asset management. To validate our method, we conducted
an experimental study to test all functionalities of D-VRE smart contracts and
their gas consumption. In addition, we deployed the D-VRE prototype on an
Ethereum testnet for demonstration. Feedback from these studies demonstrates
the current prototype's usability, ease of use, and potential, and suggests
further improvements.
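To make the data-sharing flow concrete, the following is a minimal sketch of how a notebook cell could talk to an agreement smart contract through web3.py. The contract address, ABI, and the proposeAgreement function are hypothetical placeholders, not the D-VRE's actual interface.

```python
# Minimal sketch: a notebook cell interacting with a hypothetical
# data-sharing agreement contract through web3.py. Contract address,
# ABI, and function names are illustrative placeholders.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # local testnet node
assert w3.is_connected(), "no Ethereum node reachable"

# Placeholders: in practice these come from the deployed D-VRE contracts.
AGREEMENT_ADDRESS = "0x0000000000000000000000000000000000000000"
AGREEMENT_ABI = []  # ABI emitted when compiling the contract

agreement = w3.eth.contract(address=AGREEMENT_ADDRESS, abi=AGREEMENT_ABI)
researcher = w3.eth.accounts[0]

# Propose sharing a dataset (identified by a content hash) with a peer.
tx_hash = agreement.functions.proposeAgreement(
    "0x1111111111111111111111111111111111111111",  # peer's address
    "QmDatasetContentHash",                        # dataset identifier
).transact({"from": researcher})

receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("gas used:", receipt.gasUsed)  # per-call gas, as measured in the study
```

Recording the gas used per receipt, as above, is one straightforward way to reproduce the kind of per-function gas-consumption measurements the study reports.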
Towards Seamless Serverless Computing Across an Edge-Cloud Continuum
Serverless computing has emerged as an attractive paradigm due to the
efficiency of development and the ease of deployment without managing any
underlying infrastructure. Nevertheless, serverless computing approaches still
face numerous challenges in unlocking their full potential in hybrid environments. To
gain a deeper understanding and firsthand knowledge of serverless computing in
edge-cloud deployments, we review the current state of open-source serverless
platforms and compare them based on predefined requirements. We then design and
implement a serverless computing platform with a novel edge orchestration
technique that seamlessly deploys serverless functions across the edge and
cloud environments on top of the Knative serverless platform. Moreover, we
propose an offloading strategy for edge environments, implement four different
functions for experimentation, and showcase the performance benefits of our
solution. Our results demonstrate that such an approach can efficiently utilize
both cloud and edge resources by dynamically offloading functions from the edge
to the cloud during high activity, while reducing the overall application
latency and increasing request throughput compared to an edge-only deployment.
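As a rough illustration of the offloading idea, the sketch below routes an invocation to the cloud once the edge node is saturated. The load signals and thresholds are assumptions for illustration, not the paper's actual policy.

```python
# Illustrative sketch of an edge-to-cloud offloading decision.
# Thresholds and the notion of "load" are assumptions, not the
# paper's actual orchestration logic.
from dataclasses import dataclass

@dataclass
class NodeStats:
    cpu_util: float     # utilization in [0.0, 1.0]
    queue_depth: int    # pending function invocations

EDGE_CPU_LIMIT = 0.8
EDGE_QUEUE_LIMIT = 50

def choose_target(edge: NodeStats) -> str:
    """Route an invocation to the edge unless it is saturated."""
    if edge.cpu_util > EDGE_CPU_LIMIT or edge.queue_depth > EDGE_QUEUE_LIMIT:
        return "cloud"  # offload during high activity
    return "edge"       # stay local to keep latency low

print(choose_target(NodeStats(cpu_util=0.95, queue_depth=10)))  # -> cloud
print(choose_target(NodeStats(cpu_util=0.40, queue_depth=5)))   # -> edge
```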
PriCE: Privacy-Preserving and Cost-Effective Scheduling for Parallelizing the Large Medical Image Processing Workflow over Hybrid Clouds
Running deep neural networks on large medical images is a resource-hungry
and time-consuming task for centralized computing. Outsourcing such medical
image processing tasks to hybrid clouds has benefits, such as a significant
reduction of execution time and monetary cost. However, due to privacy
concerns, it is still challenging to process sensitive medical images over
clouds, which hinders their deployment in many real-world applications. To
overcome this, we first formulate the overall optimization objectives of a
privacy-preserving distributed system model: minimizing the amount of
information about the private data that adversaries can learn throughout the
process, while reducing the maximum execution time and cost under the user's
budget constraint. We propose a novel privacy-preserving and cost-effective method
called PriCE to solve this multi-objective optimization problem. We performed
extensive simulation experiments for artifact detection tasks on medical images
using an ensemble of five deep convolutional neural network inferences as the
workflow task. Experimental results show that PriCE successfully splits a wide
range of input gigapixel medical images with graph-coloring-based strategies,
yielding the desired output utility while lowering privacy risk, makespan, and
monetary cost under the user's budget.
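To illustrate the graph-coloring-based splitting, the sketch below tiles a toy image grid, treats neighboring tiles as conflicting, and colors them so that conflicting tiles land on different clouds. networkx's greedy_color stands in for the paper's strategies, and the adjacency-based conflict model is an assumption.

```python
# Sketch: graph-coloring-based splitting of image tiles across clouds.
# Neighboring tiles are treated as "conflicting" (an assumption standing
# in for the paper's privacy conflict model) and must not be co-located.
import networkx as nx

ROWS, COLS = 4, 4  # toy 4x4 tile grid standing in for a gigapixel image

G = nx.grid_2d_graph(ROWS, COLS)  # nodes are tiles, edges link neighbors

# Greedy coloring: adjacent tiles receive different colors.
coloring = nx.coloring.greedy_color(G, strategy="largest_first")

# Map each color class to a (hypothetical) cloud provider.
clouds = ["cloud-A", "cloud-B", "cloud-C", "cloud-D"]
assignment = {tile: clouds[c % len(clouds)] for tile, c in coloring.items()}

for tile in sorted(assignment):
    print(tile, "->", assignment[tile])
```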
Towards Privacy-, Budget-, and Deadline-Aware Service Optimization for Large Medical Image Processing across Hybrid Clouds
Efficiently processing medical images, such as whole slide images in digital
pathology, is essential for the timely diagnosis of high-risk diseases. However, this
demands advanced computing infrastructure, e.g., GPU servers for deep learning
inference, and local processing is time-consuming and costly. Moreover,
privacy concerns further complicate the use of remote cloud
infrastructures. While previous research has explored privacy- and
security-aware workflow scheduling in hybrid clouds for distributed processing,
privacy-preserving data splitting, optimized service allocation for
outsourcing computation on split data to the cloud, and privacy evaluation for
large medical images still need to be addressed. This study focuses on
tailoring a virtual infrastructure within a hybrid cloud environment and
scheduling the image processing services while preserving privacy. We aim to
minimize the use of untrusted nodes, lower monetary costs, and reduce execution
time under privacy, budget, and deadline requirements. We consider a two-phase
solution and develop 1) a privacy-preserving data splitting algorithm and 2) a
greedy Pareto front-based algorithm for optimizing the service allocation. We
conducted experiments with real and simulated data to validate and compare our
method with a baseline. The results show that our privacy mechanism design
outperforms the baseline regarding the average lower bound on individual
privacy and the information gain used for privacy evaluation. In addition, our
approach can obtain various Pareto-optimal allocations reflecting users'
preferences on the maximum number of untrusted nodes, budget, and time
threshold. Our solutions often dominate the baseline's solution and are
superior on a tight budget. Specifically, our approach outperforms the
baseline by up to 85.2% and 6.8% in terms of total financial and time costs,
respectively.
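A minimal sketch of the Pareto-front machinery behind such an allocation step is shown below; the (untrusted nodes, cost, time) scores are invented, and the paper's greedy selection details are not reproduced.

```python
# Sketch: extracting the Pareto front over candidate service allocations.
# Each candidate is scored by (untrusted_nodes, monetary_cost, exec_time);
# all three objectives are minimized. Values are illustrative.

def dominates(a, b):
    """a dominates b if a is no worse in every objective, better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

allocations = [
    (2, 120.0, 30.0),  # (untrusted nodes, cost, time)
    (1, 150.0, 28.0),
    (3, 110.0, 35.0),
    (2, 130.0, 40.0),  # dominated by the first candidate
]
print(pareto_front(allocations))  # first three survive
```

A user preference (e.g., a budget cap or a maximum number of untrusted nodes) then simply filters this front before an allocation is picked.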
A Survey on Dataset Distillation: Approaches, Applications and Future Directions
Dataset distillation is attracting more attention in machine learning as
training sets continue to grow and the cost of training state-of-the-art models
becomes increasingly high. By synthesizing datasets with high information
density, dataset distillation offers a range of potential applications,
including support for continual learning, neural architecture search, and
privacy protection. Despite recent advances, we lack a holistic understanding
of the approaches and applications. Our survey aims to bridge this gap by first
proposing a taxonomy of dataset distillation, characterizing existing
approaches, and then systematically reviewing the data modalities and related
applications. In addition, we summarize the challenges and discuss future
directions for this field of research.
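As a toy illustration of the core idea, the sketch below distills a large regression set into a handful of synthetic labels by matching training gradients for a linear model; real dataset distillation methods also learn the synthetic inputs and target deep networks.

```python
# Toy sketch of dataset distillation by gradient matching for a linear
# model: adapt a few synthetic labels so the training gradient on the
# synthetic set tracks the gradient on the full real set.
import numpy as np

rng = np.random.default_rng(0)
X_real = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y_real = X_real @ w_true + 0.1 * rng.normal(size=1000)

X_syn = rng.normal(size=(10, 5))   # 10 synthetic points stand in for 1000
y_syn = rng.normal(size=10)        # synthetic labels to be learned

def grad(X, y, w):
    """Gradient of mean squared error with respect to weights w."""
    return 2 * X.T @ (X @ w - y) / len(y)

w, lr = np.zeros(5), 0.05
for _ in range(500):
    gap = grad(X_real, y_real, w) - grad(X_syn, y_syn, w)
    y_syn -= lr * X_syn @ gap          # descend on the gradient-matching gap
    w -= lr * grad(X_real, y_real, w)  # follow the real-data trajectory

print("final gradient gap:", np.linalg.norm(
    grad(X_real, y_real, w) - grad(X_syn, y_syn, w)))
```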
Federating Medical Deep Learning Models from Private Jupyter Notebooks to Distributed Institutions
Deep learning-based algorithms have led to tremendous progress over recent years, but they face a bottleneck because their optimal development highly relies on access to large datasets. To mitigate this limitation, cross-silo federated learning has emerged as a way to train collaborative models among multiple institutions without having to share the raw data used for model training. However, although artificial intelligence experts have the expertise to develop state-of-the-art models and actively share their code through notebook environments, implementing a federated learning system in real-world applications entails significant engineering and deployment efforts. To reduce the complexity of federation setups and bridge the gap between federated learning and notebook users, this paper introduces the Notebook Federator, a solution that leverages the Jupyter environment as part of the federated learning pipeline and simplifies its automation. The feasibility of this approach is demonstrated with a collaborative model solving a digital pathology image analysis task, in which the federated model reaches an accuracy of 0.8633 on the test set, compared to centralized configurations for each institution obtaining 0.7881, 0.6514, and 0.8096, respectively. As a fast and reproducible tool, the proposed solution enables the deployment of a cross-country federated environment in only a few minutes.
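The aggregation loop at the heart of such a cross-silo setup is, in essence, federated averaging. The sketch below shows generic FedAvg with numpy on toy logistic-regression silos; it is not the Notebook Federator's actual implementation.

```python
# Minimal FedAvg sketch: each institution trains locally, then a server
# averages the weights proportionally to local dataset size. Generic
# FedAvg, not the Notebook Federator's actual code.
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    """Logistic-regression local training via gradient descent."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w)))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

def fedavg(w_global, silos, rounds=10):
    for _ in range(rounds):
        updates, sizes = [], []
        for X, y in silos:  # each silo trains on its private data
            updates.append(local_update(w_global.copy(), X, y))
            sizes.append(len(y))
        # Weighted average of local models (only weights leave a silo).
        w_global = np.average(updates, axis=0, weights=sizes)
    return w_global

rng = np.random.default_rng(0)
w_true = rng.normal(size=3)
silos = []
for n in (200, 120, 80):  # three institutions with unequal data sizes
    X = rng.normal(size=(n, 3))
    y = (X @ w_true > 0).astype(float)
    silos.append((X, y))

w = fedavg(np.zeros(3), silos)
print("learned weights:", np.round(w, 2))
```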
Effect of cognitive-behavioral therapy with music therapy in reducing physics test anxiety among students as measured by generalized test anxiety scale
Abstract: Background: The study determined the effect of cognitive-behavioral therapy (CBT) with music in reducing physics test anxiety among secondary school students, as measured by a generalized test anxiety scale. Methods: A pre-test post-test randomized controlled trial design was adopted in this study. A total of 83 senior secondary students, male (n=46) and female (n=37), from sampled secondary schools in Enugu State, Nigeria, who met the inclusion criteria constituted the participants for the study. A demographic questionnaire and a 48-item generalized test anxiety scale were used for data collection. Subjects were randomized into treatment and control groups. The treatment group was exposed to a 12-week CBT-music program. Thereafter, the participants in the treatment group were evaluated at 3 time points. Data collected were analyzed using repeated measures analysis of variance. Results: The participants who were exposed to the CBT-music intervention program had significantly lower test anxiety scores at post-treatment than the participants in the control group. Furthermore, the test anxiety scores of the participants in the CBT-music group were significantly lower than those in the control group at the follow-up measure. Thus, the results showed a significant effect of CBT with music in reducing physics test anxiety among secondary school students. Conclusion: We concluded that the CBT-music program has a significant benefit in improving the management of physics test anxiety among secondary school students. Abbreviations: ΔR2 = adjusted R2, CBT = cognitive-behavioral therapy, CBT-music = CBT-based music group, CI = confidence interval, GTAI = Generalized Test Anxiety Inventory
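For readers who want to reproduce this kind of analysis, the sketch below runs a repeated measures ANOVA over three time points with statsmodels' AnovaRM; the data and column names are invented for illustration.

```python
# Sketch: repeated measures ANOVA on anxiety scores at three time points,
# mirroring the study's analysis. Data and column names are invented.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
subjects, times = 20, ["pre", "post", "follow_up"]
rows = []
for s in range(subjects):
    base = rng.normal(120, 10)  # simulated baseline anxiety score
    for i, t in enumerate(times):
        rows.append({"subject": s, "time": t,
                     "anxiety": base - 15 * i + rng.normal(0, 5)})
df = pd.DataFrame(rows)

# Within-subject factor "time" across the three measurement points.
res = AnovaRM(df, depvar="anxiety", subject="subject", within=["time"]).fit()
print(res)
```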
Workflows Community Summit 2024: Future Trends and Challenges in Scientific Workflows
The Workflows Community Summit gathered 111 participants from 18 countries to discuss emerging trends and challenges in scientific workflows, focusing on six key areas: time-sensitive workflows, AI-HPC convergence, multi-facility workflows, heterogeneous HPC environments, user experience, and FAIR computational workflows. The integration of AI and exascale computing has revolutionized scientific workflows, enabling higher-fidelity models and complex, time-sensitive processes, while introducing challenges in managing heterogeneous environments and multi-facility data dependencies. The rise of large language models is driving computational demands to zettaflop scales, necessitating modular, adaptable systems and cloud-service models to optimize resource utilization and ensure reproducibility. Multi-facility workflows present challenges in data movement, curation, and overcoming institutional silos, while diverse hardware architectures require integrating workflow considerations into early system design and developing standardized resource management tools. The summit emphasized improving user experience in workflow systems and ensuring FAIR workflows to enhance collaboration and accelerate scientific discovery. Key recommendations include developing standardized metrics for time-sensitive workflows, creating frameworks for cloud-HPC integration, implementing distributed-by-design workflow modeling, establishing multi-facility authentication protocols, and accelerating AI integration in HPC workflow management. The summit also called for comprehensive workflow benchmarks, workflow-specific UX principles, and a FAIR workflow maturity model, highlighting the need for continued collaboration in addressing the complex challenges posed by the convergence of AI, HPC, and multi-facility research environments.
