
    D-VRE: From a Jupyter-enabled Private Research Environment to Decentralized Collaborative Research Ecosystem

    Today, scientific research is increasingly data-centric and compute-intensive, relying on data and models across distributed sources. However, it still faces challenges under the traditional cooperation mode due to high storage and computing costs, geo-location barriers, and local confidentiality regulations. The Jupyter environment has recently emerged and evolved into a vital virtual research environment for scientific computing, which researchers can use to scale computational analyses up to larger datasets and high-performance computing resources. Nevertheless, existing approaches lack robust support for a decentralized cooperation mode that would unlock the full potential of decentralized collaborative scientific research, e.g., seamlessly secure data sharing. In this work, we change the basic structure and legacy norms of current research environments via the seamless integration of Jupyter with Ethereum blockchain capabilities, creating a Decentralized Virtual Research Environment (D-VRE) that turns private computational notebooks into a decentralized collaborative research ecosystem. We propose a novel architecture for the D-VRE and prototype its essential elements for enabling secure data sharing with decentralized identity, user-centric agreement-making, membership, and research asset management. To validate our method, we conducted an experimental study testing all functionalities of the D-VRE smart contracts and their gas consumption. In addition, we deployed the D-VRE prototype on an Ethereum testnet for demonstration. Feedback from these studies showcases the current prototype's usability, ease of use, and potential, and suggests further improvements.
    Comment: We revised the manuscript draft and submitted the revised manuscript to the journal Blockchain: Research and Applications
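
    To make the data-sharing flow concrete, the following minimal sketch shows how a client might register a research asset on-chain from Python using web3.py. The contract address, ABI, and function name (registerAsset) are illustrative assumptions rather than the paper's actual D-VRE interface, and the snippet assumes web3.py v6 connected to a local development node or testnet.

        # Hypothetical client-side sketch of registering a research asset with a
        # D-VRE-style smart contract; the address, ABI, and function name are
        # illustrative assumptions, not the paper's actual contract interface.
        from web3 import Web3

        w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # dev node or testnet RPC
        account = w3.eth.accounts[0]

        # Minimal ABI covering only the single illustrative function used below.
        ABI = [{
            "name": "registerAsset", "type": "function", "stateMutability": "nonpayable",
            "inputs": [{"name": "assetHash", "type": "bytes32"},
                       {"name": "uri", "type": "string"}],
            "outputs": [],
        }]
        registry = w3.eth.contract(
            address=Web3.to_checksum_address("0x" + "00" * 20),  # placeholder address
            abi=ABI,
        )

        # Register a dataset by content hash: provenance goes on-chain while the
        # data itself stays in the owner's private Jupyter workspace.
        asset_hash = Web3.keccak(text="sha256-of-dataset")
        tx = registry.functions.registerAsset(asset_hash, "ipfs://Qm...").transact({"from": account})
        receipt = w3.eth.wait_for_transaction_receipt(tx)
        print("gas used for registration:", receipt.gasUsed)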

    Towards Seamless Serverless Computing Across an Edge-Cloud Continuum

    Serverless computing has emerged as an attractive paradigm due to its development efficiency and ease of deployment without managing any underlying infrastructure. Nevertheless, serverless computing approaches face numerous challenges in unlocking their full potential in hybrid environments. To gain a deeper understanding and firsthand knowledge of serverless computing in edge-cloud deployments, we review the current state of open-source serverless platforms and compare them against predefined requirements. We then design and implement a serverless computing platform with a novel edge orchestration technique that seamlessly deploys serverless functions across edge and cloud environments on top of the Knative serverless platform. Moreover, we propose an offloading strategy for edge environments and four different functions for experimentation, and showcase the performance benefits of our solution. Our results demonstrate that such an approach can efficiently utilize both cloud and edge resources by dynamically offloading functions from the edge to the cloud during high activity, while reducing overall application latency and increasing request throughput compared to an edge-only deployment.
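
    As a concrete illustration of such an offloading decision, the sketch below routes each function invocation to the edge or the cloud based on simple load thresholds. The thresholds and load signals are assumptions for illustration, not the actual policy implemented on top of Knative.

        # Illustrative threshold-based edge-to-cloud offloading decision, in the
        # spirit of the strategy described above; thresholds and load signals are
        # assumed for illustration, not taken from the platform.
        from dataclasses import dataclass

        @dataclass
        class EdgeState:
            cpu_utilization: float   # fraction of edge CPU in use, 0.0-1.0
            requests_per_sec: float  # current arrival rate at the edge gateway

        def route_invocation(state: EdgeState,
                             cpu_threshold: float = 0.8,
                             rate_threshold: float = 100.0) -> str:
            """Return 'edge' or 'cloud' for the next function invocation.

            Keep traffic at the edge for low latency, but spill over to the
            cloud replica when the edge is saturated, so throughput keeps
            scaling instead of queueing locally.
            """
            if state.cpu_utilization > cpu_threshold or state.requests_per_sec > rate_threshold:
                return "cloud"
            return "edge"

        # Example: a saturated edge node sends the invocation to the cloud.
        print(route_invocation(EdgeState(cpu_utilization=0.92, requests_per_sec=40.0)))  # cloud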

    PriCE: Privacy-Preserving and Cost-Effective Scheduling for Parallelizing the Large Medical Image Processing Workflow over Hybrid Clouds

    Running deep neural networks on large medical images is a resource-hungry and time-consuming task with centralized computing. Outsourcing such medical image processing tasks to hybrid clouds has benefits, such as a significant reduction in execution time and monetary cost. However, due to privacy concerns, it is still challenging to process sensitive medical images over clouds, which hinders their deployment in many real-world applications. To overcome this, we first formulate the overall optimization objectives of the privacy-preserving distributed system model: minimizing the amount of information about the private data learned by adversaries throughout the process, and reducing the maximum execution time and cost under the user's budget constraint. We propose a novel privacy-preserving and cost-effective method called PriCE to solve this multi-objective optimization problem. We performed extensive simulation experiments for artifact detection tasks on medical images, using an ensemble of five deep convolutional neural network inferences as the workflow task. Experimental results show that PriCE successfully splits a wide range of input gigapixel medical images with graph-coloring-based strategies, yielding the desired output utility and lowering the privacy risk, makespan, and monetary cost under the user's budget.
    Comment: Accepted at Europar 202
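
    The following toy sketch illustrates the idea behind graph-coloring-based splitting: tiles cut from a large image form a conflict graph in which adjacent tiles must receive different colors, and each color class is sent to a different cloud so that no single provider holds neighbouring regions. The grid layout, adjacency rule, and provider names are simplifying assumptions, not PriCE's exact strategy.

        # Toy sketch of graph-coloring-based image splitting: adjacent tiles are
        # linked in a conflict graph and colored so no two neighbours share a
        # color; each color class goes to a different (hypothetical) cloud.
        import networkx as nx

        rows, cols = 4, 4  # a 4x4 grid of tiles cut from one gigapixel image
        G = nx.grid_2d_graph(rows, cols)  # edges connect adjacent tiles

        # Greedy coloring: neighbouring tiles always receive different colors.
        coloring = nx.greedy_color(G, strategy="largest_first")

        clouds = ["cloud-A", "cloud-B"]  # hypothetical providers
        assignment = {tile: clouds[color % len(clouds)] for tile, color in coloring.items()}
        for tile in sorted(assignment):
            print(tile, "->", assignment[tile])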

    Towards Privacy-, Budget-, and Deadline-Aware Service Optimization for Large Medical Image Processing across Hybrid Clouds

    Efficiently processing medical images, such as whole-slide images in digital pathology, is essential for timely diagnosis of high-risk diseases. However, this demands advanced computing infrastructure, e.g., GPU servers for deep learning inference, and local processing is time-consuming and costly. Besides, privacy concerns further complicate the use of remote cloud infrastructures. While previous research has explored privacy- and security-aware workflow scheduling in hybrid clouds for distributed processing, privacy-preserving data splitting, optimizing the allocation of outsourced computation on split data to the cloud, and privacy evaluation for large medical images still need to be addressed. This study focuses on tailoring a virtual infrastructure within a hybrid cloud environment and scheduling the image processing services while preserving privacy. We aim to minimize the use of untrusted nodes, lower monetary costs, and reduce execution time under privacy, budget, and deadline requirements. We consider a two-phase solution and develop 1) a privacy-preserving data splitting algorithm and 2) a greedy Pareto front-based algorithm for optimizing the service allocation. We conducted experiments with real and simulated data to validate and compare our method with a baseline. The results show that our privacy mechanism design outperforms the baseline regarding the average lower bound on individual privacy and information gain for privacy evaluation. In addition, our approach can produce various Pareto-optimal allocations reflecting users' preferences on the maximum number of untrusted nodes, budget, and time threshold. Our solutions often dominate the baseline's solution and are superior on a tight budget; specifically, our approach outperforms the baseline by up to 85.2% and 6.8% in total financial and time costs, respectively.
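
    To illustrate the second phase, the sketch below filters candidate allocations, each scored by (monetary cost, execution time, number of untrusted nodes), down to the Pareto front, then greedily picks the cheapest member satisfying the user's deadline and untrusted-node cap. The candidate values are invented for illustration and are not taken from the paper.

        # Sketch of a Pareto-front filter over candidate service allocations,
        # each scored by (cost, time, untrusted_nodes); values are made up.
        from typing import List, Tuple

        Allocation = Tuple[float, float, int]  # (cost, time, untrusted_nodes)

        def dominates(a: Allocation, b: Allocation) -> bool:
            """a dominates b if it is no worse in every objective and better in one."""
            return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

        def pareto_front(cands: List[Allocation]) -> List[Allocation]:
            return [c for c in cands if not any(dominates(o, c) for o in cands if o != c)]

        candidates = [(10.0, 120.0, 2), (14.0, 90.0, 1), (9.0, 200.0, 3), (15.0, 95.0, 1)]
        front = pareto_front(candidates)

        # Greedy final choice: cheapest front member meeting a deadline and a cap
        # on untrusted nodes (thresholds are user preferences, as in the paper).
        feasible = [c for c in front if c[1] <= 150.0 and c[2] <= 2]
        print(min(feasible, key=lambda c: c[0]))  # -> (10.0, 120.0, 2)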

    A Survey on Dataset Distillation: Approaches, Applications and Future Directions

    Dataset distillation is attracting growing attention in machine learning as training sets continue to grow and the cost of training state-of-the-art models becomes increasingly high. By synthesizing datasets with high information density, dataset distillation offers a range of potential applications, including support for continual learning, neural architecture search, and privacy protection. Despite recent advances, we lack a holistic understanding of the approaches and applications. Our survey aims to bridge this gap by first proposing a taxonomy of dataset distillation, characterizing existing approaches, and then systematically reviewing the data modalities and related applications. In addition, we summarize the challenges and discuss future directions for this field of research.
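
    As a minimal illustration of one family of approaches such a survey covers, the sketch below performs a toy form of distribution matching: a small synthetic set is optimized so that its mean embedding under a fixed random network matches that of the real data. The embedding and the tiny dataset sizes are illustrative assumptions, not a specific method from the survey.

        # Toy distribution-matching distillation: learn 10 synthetic samples whose
        # mean embedding matches the real data's under a fixed random network.
        import torch

        torch.manual_seed(0)
        real = torch.randn(1000, 32)                   # stand-in for a real dataset
        syn = torch.randn(10, 32, requires_grad=True)  # 10 synthetic samples to learn

        embed = torch.nn.Linear(32, 64)                # fixed random embedding
        for p in embed.parameters():
            p.requires_grad_(False)
        with torch.no_grad():
            real_mean = embed(real).mean(0)            # target statistics

        opt = torch.optim.Adam([syn], lr=0.05)
        for step in range(500):
            opt.zero_grad()
            loss = ((embed(syn).mean(0) - real_mean) ** 2).sum()  # match mean embeddings
            loss.backward()
            opt.step()

        print(f"final matching loss: {loss.item():.6f}")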

    Federating Medical Deep Learning Models from Private Jupyter Notebooks to Distributed Institutions

    Deep learning-based algorithms have led to tremendous progress over recent years, but they face a bottleneck as their optimal development relies heavily on access to large datasets. To mitigate this limitation, cross-silo federated learning has emerged as a way to train collaborative models among multiple institutions without having to share the raw data used for model training. However, although artificial intelligence experts have the expertise to develop state-of-the-art models and actively share their code through notebook environments, implementing a federated learning system in real-world applications entails significant engineering and deployment effort. To reduce the complexity of federation setups and bridge the gap between federated learning and notebook users, this paper introduces the Notebook Federator, a solution that leverages the Jupyter environment as part of the federated learning pipeline and simplifies its automation. The feasibility of this approach is demonstrated with a collaborative model solving a digital pathology image analysis task, in which the federated model reaches an accuracy of 0.8633 on the test set, compared to 0.7881, 0.6514, and 0.8096 for the centralized configurations of each institution, respectively. As a fast and reproducible tool, the proposed solution enables the deployment of a cross-country federated environment in only a few minutes.
    Acknowledgments: This work has been partially funded by the European Union's Horizon 2020 research and innovation programme through the projects CLARIFY under Marie Sklodowska-Curie (860627), ENVRI-FAIR (824068), BlueCloud (862409), and ARTICONF (825134). This work is also supported by LifeWatch ERIC, GVA through projects PROMETEO/2019/109 and INNEST/2021/321 (SAMUEL), and the Spanish Ministry of Economy and Competitiveness through project PID2019-105142RB-C21 (AI4SKIN). The work of Adrián Colomer has been supported by the ValgrAI Valencian Graduate School and Research Network for Artificial Intelligence & Generalitat Valenciana and Universitat Politècnica de València (PAID-PD-22).
    Launet, L. M.; Wang, Y.; Colomer, A.; Igual García, J.; Pulgarín-Ospina, C. C.; Koulouzis, S.; Bianchi, R.... (2023). Federating Medical Deep Learning Models from Private Jupyter Notebooks to Distributed Institutions. Applied Sciences, 13(2). https://doi.org/10.3390/app13020919
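
    For readers unfamiliar with cross-silo federated learning, the sketch below shows the FedAvg-style aggregation step at its core: each institution trains locally, and a coordinator averages the resulting weights, weighted by local dataset size. Whether the Notebook Federator uses exactly this aggregation rule is an assumption made here for illustration.

        # FedAvg-style aggregation: sample-size-weighted average of one weight
        # tensor across institutions; whether the Notebook Federator uses exactly
        # this rule is an assumption for illustration.
        import numpy as np

        def fedavg(local_weights: list, n_samples: list) -> np.ndarray:
            """Weighted average of one layer's weights across institutions."""
            total = sum(n_samples)
            return sum(w * (n / total) for w, n in zip(local_weights, n_samples))

        # Three institutions with differently sized local datasets (mirroring the
        # paper's three-site setup) contribute one layer's weights each round.
        weights = [np.full((2, 2), 1.0), np.full((2, 2), 2.0), np.full((2, 2), 4.0)]
        sizes = [100, 300, 600]
        print(fedavg(weights, sizes))  # closer to the largest site's weights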

    Effect of cognitive-behavioral therapy with music therapy in reducing physics test anxiety among students as measured by generalized test anxiety scale

    Abstract: Background: The study determined the effect of cognitive-behavioral therapy (CBT) with music in reducing physics test anxiety among secondary school students, as measured by a generalized test anxiety scale. Methods: A pre-test post-test randomized controlled trial design was adopted. A total of 83 senior secondary students, male (n=46) and female (n=37), from sampled secondary schools in Enugu State, Nigeria, who met the inclusion criteria participated in the study. A demographic questionnaire and a 48-item generalized test anxiety scale were used for data collection. Subjects were randomized into treatment and control groups. The treatment group was exposed to a 12-week CBT-music program. Thereafter, the participants in the treatment group were evaluated at three time points. The data collected were analyzed using repeated-measures analysis of variance. Results: Participants exposed to the CBT-music intervention program had significantly lower test anxiety scores at post-treatment than participants in the control group. Furthermore, the test anxiety scores of the participants in the CBT-music group were significantly lower than those of the control group at the follow-up measure. Thus, the results showed a significant effect of CBT with music in reducing physics test anxiety among secondary school students. Conclusion: We conclude that the CBT-music program has a significant benefit in improving the management of physics test anxiety among secondary school students. Abbreviations: ΔR2 = adjusted R2, CBT = cognitive-behavioral therapy, CBT-music = CBT-based music group, CI = confidence interval, GTAI = Generalized Test Anxiety Inventory
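
    A minimal sketch of the repeated-measures analysis the study describes is shown below, using statsmodels' AnovaRM on invented example scores for the three measurement points; it covers only the within-subject time factor, and none of the numbers come from the study.

        # Repeated-measures ANOVA on made-up anxiety scores across three time
        # points (pre, post, follow-up); illustrative data, not the study's.
        import pandas as pd
        from statsmodels.stats.anova import AnovaRM

        # Long-format table: one row per subject per time point.
        data = pd.DataFrame({
            "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
            "time": ["pre", "post", "follow"] * 4,
            "anxiety": [40, 28, 25, 44, 30, 27, 38, 26, 24, 42, 31, 26],
        })

        result = AnovaRM(data, depvar="anxiety", subject="subject", within=["time"]).fit()
        print(result)  # F-test for the within-subject effect of time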

    Workflows Community Summit 2024: Future Trends and Challenges in Scientific Workflows

    The Workflows Community Summit gathered 111 participants from 18 countries to discuss emerging trends and challenges in scientific workflows, focusing on six key areas: time-sensitive workflows, AI-HPC convergence, multi-facility workflows, heterogeneous HPC environments, user experience, and FAIR computational workflows. The integration of AI and exascale computing has revolutionized scientific workflows, enabling higher-fidelity models and complex, time-sensitive processes, while introducing challenges in managing heterogeneous environments and multi-facility data dependencies. The rise of large language models is driving computational demands to zettaflop scales, necessitating modular, adaptable systems and cloud-service models to optimize resource utilization and ensure reproducibility. Multi-facility workflows present challenges in data movement, curation, and overcoming institutional silos, while diverse hardware architectures require integrating workflow considerations into early system design and developing standardized resource management tools. The summit emphasized improving user experience in workflow systems and ensuring FAIR workflows to enhance collaboration and accelerate scientific discovery. Key recommendations include developing standardized metrics for time-sensitive workflows, creating frameworks for cloud-HPC integration, implementing distributed-by-design workflow modeling, establishing multi-facility authentication protocols, and accelerating AI integration in HPC workflow management. The summit also called for comprehensive workflow benchmarks, workflow-specific UX principles, and a FAIR workflow maturity model, highlighting the need for continued collaboration in addressing the complex challenges posed by the convergence of AI, HPC, and multi-facility research environments.