206 research outputs found

    Three-level performance optimization for heterogeneous systems based on software prefetching under power constraints

    High power consumption has become one of the critical problems restricting the development of high-performance computers. In recent years, numerous studies have addressed optimizing execution performance while satisfying a power constraint. However, these methods mainly focus on homogeneous systems without considering the power or speed differences of heterogeneous processors, so they are difficult to apply to heterogeneous systems with an accelerator. In this paper, by abstracting the current execution model of a heterogeneous system, we propose a new framework for managing system power consumption with a three-level power control mechanism. The three levels from top to bottom are: system-level power controller (SPC), group-level power controller (GPC) and unit-level power controller (UPC). The study establishes a power management method for software prefetching in the UPC that scales the frequency and voltage of programs, selects the optimal prefetch distance, and guides the optimization process to stay within the boundary set by the power constraints. A strategy for dividing power based on key threads is put forward in the GPC to preferentially allocate power to threads in key paths. In the SPC, a method for evaluating the performance of heterogeneous processing engines is designed for dividing power, in order to improve the overall execution performance of the system while sustaining fairness between concurrent applications. Finally, the proposed framework is verified on a central processing unit (CPU)-graphics processing unit (GPU) heterogeneous system.
    Submitted version. Publisher embargo until September 2020. (c) This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0
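
    The top-down budget flow described in this abstract (SPC to GPC to UPC) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the proportional weighting rule, the thread weights, and the frequency/power table are hypothetical and are not taken from the paper, and prefetch-distance selection is not modelled.

```python
def split_budget(total_watts, weights):
    """Divide total_watts among the keys of `weights` in proportion to their weight."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: total_watts * w / total_weight for name, w in weights.items()}


def pick_frequency(budget_watts, freq_power_table):
    """Return the highest frequency whose power draw fits the budget.

    Falls back to the lowest frequency when nothing fits.
    """
    feasible = [f for f, watts in freq_power_table.items() if watts <= budget_watts]
    return max(feasible) if feasible else min(freq_power_table)


# System level (SPC): split the system budget between processing engines
# using hypothetical performance scores.
system_budget = 200.0
groups = split_budget(system_budget, {"cpu": 1.0, "gpu": 3.0})

# Group level (GPC): give the thread on the key path a larger share.
cpu_threads = split_budget(groups["cpu"], {"t0_key": 2.0, "t1": 1.0, "t2": 1.0})

# Unit level (UPC): choose a frequency (GHz) from a hypothetical
# frequency -> power (W) table under the per-thread budget.
table = {1.0: 8.0, 1.5: 14.0, 2.0: 22.0, 2.5: 33.0}
key_thread_freq = pick_frequency(cpu_threads["t0_key"], table)
other_thread_freq = pick_frequency(cpu_threads["t1"], table)
```

    In this toy setup the key thread receives twice the budget of its siblings and can therefore run at a higher frequency; a real controller would replace the fixed weights with measured performance and power data.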

    Energy optimization of parallel programs in a heterogeneous system by combining processor core-shutdown and dynamic voltage scaling

    Reducing power consumption and improving efficiency are important aspects of the development of supercomputers into large-scale systems. As a result, heterogeneous systems have become an important development trend in high-performance computing. From the perspective of heterogeneous systems, this study establishes a model for energy optimization of parallel programs (EOPP) and puts forward a method of using it. By considering the energy overheads caused by re-synchronization, voltage switching, and operations in critical sections, the model effectively combines processor core-shutdown and dynamic voltage scaling technologies, and can be applied in a heterogeneous system to guide the optimization process. The results show that the proposed model can effectively reduce the energy consumption of parallel programs. Moreover, increasing the proportion of operations in the critical section enhances the optimal frequency of a processor while decreasing the probability of conflicts in the critical section. It can thus provide optimization space for reducing the frequency of a processor, which ultimately reduces the energy overhead of the system.
    Accepted version. Publisher embargo until March 2021. (c) This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0

    Stackelberg Game-Based Joint Computing Resource Allocation and Task Offloading Method in Edge Computing

    Edge computing (EC) has emerged as an important technology to support the low-delay requests of massive numbers of devices. Task offloading is an essential part of EC because it can dramatically influence the use of network resources and network performance. Most existing work on task offloading takes only the view of users. To effectively consider the features and objectives of both users and edge nodes from their different perspectives, a Stackelberg game-based joint computing resource allocation and task offloading method is proposed in this paper. Because edge nodes and users play different roles in EC, the problem is formulated as a bi-level optimization model with multiple leaders and multiple followers. The edge nodes are the leaders and the users are the followers. When jointly allocating computing resources and offloading tasks, edge nodes and users have different objectives. The objective of edge nodes is to achieve the most revenue at the least energy cost, and the objective of users is to obtain short delay, consume little energy and pay less. Further, considering the particular features of EC, and unlike existing Stackelberg game-based task offloading research, we focus on computing resource allocation rather than pricing. The edge nodes decide the amount of computing resources to be allocated to each user. The users then react to this allocation by deciding their task offloading strategies. Interference, delay, energy, and payoff are all considered. The evolutionary optimization method BLEAQ-II is applied to solve the designed Stackelberg game-based task offloading model. Numerical results show the effectiveness of the proposed method.

    Whole procedure heterogeneous multiprocessors low-power optimization at algorithm-level

    Power consumption reduction is the primary problem in the design and implementation of heterogeneous parallel systems. As it is difficult to make enough progress on low-power optimization in the hardware layer to meet the increasing need for power optimization, more attention has been paid to low-power optimization in the software layer. The relationship between the execution time and dynamic power consumption of programs divided between homogeneous and heterogeneous computing sections is analysed. In addition, the communication power consumption for data transmission and dynamic multi-task allocation are described. Afterwards, this study establishes a power model for the whole procedure of heterogeneous parallel systems. Using this model, three algorithms are designed at algorithm-level: a selection algorithm for the optimal processor frequency with optimal power consumption under time constraints, optimal descent-based time allocation algorithms for multiple computing sections, and profiling dynamic analysis-based integral linear programming. Finally, the validity of the power optimization algorithm is ascertained using typical applications.
    Submitted version. This is a pre-print of an article published in Cluster Comput (2018). The final authenticated version is available online at: https://doi.org/10.1007/s10586-018-1920-

    Energy Optimization by Software Prefetching for Task Granularity in GPU-based Embedded Systems

    Energy saving and optimization play an increasingly important role in industrial electronic systems. A heterogeneous embedded system is composed of a general-purpose central processing unit (CPU) with an enhanced module of graphics processing units (GPU). This paper explores effective strategies of task granularity and software prefetching for energy optimization. We propose a novel energy optimization model for GPU-based embedded systems by harnessing a communication-based pipeline spatial and temporal relation. We analyze the characteristics of multiple-thread execution on parallel GPUs. We present an effective algorithm for dynamic power optimization with an adaptively adjusted software prefetching distance. The experimental results show that dynamic energy consumption can be reduced by 22.1% and 21.8%, respectively, under two prefetching strategies (register and shared memory) without loss of performance. We demonstrate the effectiveness of the proposed methods for energy saving and consumption reduction of performance-driven computing in industrial scenarios.
    Accepted version. © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    Warp-Aware Adaptive Energy Efficiency Calibration for Multi-GPU Systems

    Massive GPU acceleration processors have been used in high-performance computing systems. The end of Dennard scaling has led to power and thermal constraints limiting the performance of such systems. Both increased performance and energy efficiency are in high demand. This paper presents a multi-layer low-power optimisation method for warp and task parallelism. We present a dynamic frequency regulation scheme for performance parameters under both load balance and load imbalance. The method monitors the energy parameters at runtime and adaptively adjusts the voltage level to ensure performance efficiency with energy reduction. The experimental results show that the multi-layer low-power optimisation with dynamic frequency regulation can achieve a 40% reduction in energy consumption with only 1.6% performance degradation, thus reducing maximum energy consumption by 59%. It can further save about 30% energy consumption in comparison with single-layer energy optimisation.

    Spectrum resource allocation method of maximizing transmission rate in cognitive heterogeneous wireless networks

    To address the difficulty of efficiently allocating spectrum resources to secondary users in cognitive heterogeneous wireless networks with heterogeneous spectrum attributes, dynamic channel conditions and diverse service requirements, a spectrum resource allocation strategy that maximizes the transmission rate was proposed. Firstly, the strategy took maximizing the total transmission rate as its objective, with the limited spectrum resources and user service requirements as constraints, to construct a non-linear multi-constrained 0-1 programming model for spectrum resource allocation. Then a simplification method with polynomial time complexity was designed. The benefit matrix was constructed and modified according to idle spectrum information, channel conditions, service requirements and allocation decision history to simplify the constraints, and the execution efficiency was improved by refining the coefficient matrix transformation strategy of the traditional Hungarian algorithm. Finally, the performance of the method was compared and analyzed experimentally. Experimental results show that the proposed method achieves a higher transmission rate and execution efficiency.
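
    The core assignment step, choosing one channel per secondary user so that the summed benefit is maximal, can be sketched as follows. This is not the paper's method: it uses plain exhaustive search rather than the (modified) Hungarian algorithm, which would reach the same optimum in polynomial time, and the benefit values and the name `best_assignment` are hypothetical. It assumes there are at least as many channels as users.

```python
from itertools import permutations


def best_assignment(benefit):
    """Return (total, assignment) that maximizes the summed benefit.

    benefit[u][c] is the rate user u would get on channel c;
    assignment[u] is the channel chosen for user u. Each channel is
    used by at most one user. Exhaustive search, so only for tiny inputs.
    """
    n_users = len(benefit)
    n_channels = len(benefit[0])
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(n_channels), n_users):
        total = sum(benefit[u][c] for u, c in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical benefit matrix: rows are users, columns are idle channels.
rates = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]
total, assignment = best_assignment(rates)
```

    For this matrix the best total is 11.0, with users 0, 1 and 2 placed on channels 0, 2 and 1. Service-requirement constraints could be encoded by setting infeasible entries of the benefit matrix to a large negative value before the search.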

    Model-agnostic meta-learning for fault diagnosis of industrial robots

    The success of deep learning in fault diagnosis depends on a large amount of training data, but achieving fault diagnosis of multi-axis industrial robots in few-shot settings remains a challenge. To address this issue, this paper proposes a Model-Agnostic Meta-Learning (MAML) method for fault diagnosis of industrial robots. Its goal is to train an effective industrial robot fault classifier using minimal training data. Additionally, it can learn to recognize faults in new scenarios with high accuracy based on the training data. Experimental results on a six-axis industrial robot dataset show that the proposed method is superior to a traditional convolutional neural network (CNN) and to transfer learning, and that, with the same amount of data in few-shot cases, its diagnostic results are better than those of existing intelligent fault diagnosis methods.
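
    The inner/outer loop structure of MAML can be shown on a deliberately tiny problem. The sketch below is an assumption-laden toy, not the paper's classifier: the model is a single scalar weight `w` fitting `y = w * x`, each task is a different slope, and `alpha`, `beta` and the task data are hypothetical. Because the model is scalar, the second-order term of the meta-gradient can be written out exactly.

```python
def grad(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def hess(xs):
    """Second derivative of the same loss with respect to w (independent of w)."""
    return sum(2 * x * x for x in xs) / len(xs)


def maml_step(w, tasks, alpha, beta):
    """One meta-update over tasks given as ((support_x, support_y), (query_x, query_y))."""
    meta_grad = 0.0
    for (sx, sy), (qx, qy) in tasks:
        # Inner loop: one gradient step on the task's support set.
        w_adapted = w - alpha * grad(w, sx, sy)
        # Outer gradient: chain rule through the inner step,
        # d(w_adapted)/dw = 1 - alpha * hess(support).
        meta_grad += grad(w_adapted, qx, qy) * (1 - alpha * hess(sx))
    return w - beta * meta_grad / len(tasks)


def make_task(slope):
    """Build a toy task: two support points and one query point on y = slope * x."""
    sx, qx = [1.0, 2.0], [3.0]
    return (sx, [slope * x for x in sx]), (qx, [slope * x for x in qx])


tasks = [make_task(1.0), make_task(3.0)]
w = 0.0
for _ in range(50):
    w = maml_step(w, tasks, alpha=0.1, beta=0.1)
```

    With these two tasks the meta-learned initialization settles midway between the task slopes, which is the starting point from which one inner step adapts best to either task. A fault classifier would replace the scalar model with a network and rely on automatic differentiation for the outer gradient.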