8 research outputs found

    High Performance Broadcast Support in LA-MPI over Quadrics

    LA-MPI is a unique MPI implementation that provides end-to-end reliable message passing between application processes. LA-MPI collective operations are implemented on top of its point-to-point operations, using generic spanning-tree-based collective algorithms. As a result, the cost of a collective operation grows logarithmically relative to that of a point-to-point operation. It is therefore desirable to provide more efficient and more scalable collective operations while maintaining end-to-end reliability. To this end, we investigate in this paper the feasibility of using Quadrics hardware broadcast. We explore several challenging issues, such as broadcast buffer management, broadcast over arbitrary processes, retransmission, and reliability. Accordingly, a low-latency, highly scalable, fault-tolerant broadcast algorithm is designed and implemented over Quadrics hardware broadcast. Our evaluation shows that this implementation reduces broadcast latency and achieves higher scalability relative to the generic version of this operation. In addition, we observe that the performance of our implementation is comparable to that of the high performance implementation by Quadric
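
    The logarithmic cost of a spanning-tree broadcast built from point-to-point sends can be illustrated with a small simulation. This is a minimal sketch assuming a binomial tree; the abstract does not say which tree shape LA-MPI uses, and the function names `binomial_broadcast_schedule` and `simulate` are hypothetical.

```python
import math


def binomial_broadcast_schedule(num_procs, root=0):
    """Return a list of rounds; each round is a list of (sender, receiver) ranks.

    Assumption: a binomial tree, chosen only for illustration. In every round,
    each rank that already holds the message forwards it to one new rank.
    """
    rounds = []
    have_data = 1  # number of ranks (counted relative to root) holding the message
    while have_data < num_procs:
        pairs = []
        for rel in range(have_data):
            peer = rel + have_data
            if peer < num_procs:
                sender = (rel + root) % num_procs
                receiver = (peer + root) % num_procs
                pairs.append((sender, receiver))
        rounds.append(pairs)
        have_data *= 2
    return rounds


def simulate(num_procs, root=0):
    """Replay the schedule and return (number of rounds, set of ranks reached)."""
    received = {root}
    schedule = binomial_broadcast_schedule(num_procs, root)
    for pairs in schedule:
        for sender, receiver in pairs:
            assert sender in received  # a rank may only forward what it holds
            received.add(receiver)
    return len(schedule), received


if __name__ == "__main__":
    for p in (2, 8, 64, 1000):
        n_rounds, reached = simulate(p)
        print(p, n_rounds, math.ceil(math.log2(p)), len(reached) == p)
```

    Each round costs roughly one point-to-point message time, so the software broadcast takes about ceil(log2 P) such steps, whereas a hardware broadcast can in principle reach all destinations in a single network operation.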

    Linking Material Models Between Codes: Establishing Thermodynamic Consistency

    One increasingly important workflow for multiphysics simulations is linking simulation codes that have different physics models and different regimes for which they have been optimized. The science question for this scoping work was evaluating the compatibility of physics models on both sides of a link to ensure a smooth simulation continuation was possible. The VVUQ aspects were establishing the most important physics aspects for a credible simulation. The most important aspect was determined to be thermodynamic consistency such that nothing unphysical would be encountered during the simulation. The second most important aspect is ensuring adequate handling of mechanical deformation. The specific problem was driving a Taylor cylinder into an infinitely hard wall. The material was cerium, which has a complicated enough phase diagram to show some interesting thermodynamic behavior during the deformation. The main software involved is Abaqus for the initial simulation, Zelda (a LANL code) for linking, and FLAG (a LANL Lagrangian finite volume code). The basic process is using nominally the same material models in both Abaqus and FLAG to:

    • perform a calculation in Abaqus
    • output an ODB file from Abaqus with fields (e.g., density, stress)
    • use Zelda to extract fields and remap them onto a new mesh suitable for FLAG
    • continue the simulation in FLAG

    The remapping of fields onto the new mesh is a negligible source of error. Thermodynamic consistency is a much larger source of overall error and can be large enough to prevent initialization in the receiving code. The situation arises because of the way that the two codes treat different fields. Both codes have interpolation processes for evaluating the thermodynamics. Differences in which variable is primary and which is interpolated lead to numerical errors that can be irrelevant in one code and unusably large in the other code. This paper will explain the VVUQ issues in linking the codes, even with nominally the same material models, and propose some activities to answer some important VVUQ questions.
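
    The effect of swapping the primary and the interpolated variable can be shown with a toy model. The sketch below is purely illustrative and is not how Abaqus, Zelda, or FLAG work internally: the equation of state `energy`, the two temperature grids, and all function names are hypothetical. "Code A" treats temperature as primary and interpolates energy; "code B" receives energy and interpolates temperature from its own table of the same model.

```python
def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("value lies outside the table")
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])


def energy(temperature):
    """Hypothetical toy model: specific energy as a nonlinear function of T."""
    return 1.0 * temperature + 0.002 * temperature * temperature


# Both codes tabulate the same model, but on different temperature grids.
T_A = [300.0, 600.0, 900.0, 1200.0]
E_A = [energy(T) for T in T_A]
T_B = [300.0, 450.0, 750.0, 1200.0]
E_B = [energy(T) for T in T_B]


def code_a_energy(temperature):
    """Code A: temperature is primary, energy is interpolated."""
    return linear_interp(temperature, T_A, E_A)


def code_a_temperature(e):
    """Code A inverting its own table."""
    return linear_interp(e, E_A, T_A)


def code_b_temperature(e):
    """Code B: energy is primary, temperature is interpolated from B's table."""
    return linear_interp(e, E_B, T_B)


T0 = 500.0
e_linked = code_a_energy(T0)                # field written out by code A
T_same = code_a_temperature(e_linked)       # round trip inside code A
T_other = code_b_temperature(e_linked)      # state as seen by code B

if __name__ == "__main__":
    print("round trip within A:", T_same)
    print("temperature seen by B:", T_other)
```

    Within code A the round trip reproduces the original temperature to floating-point precision, while code B recovers a temperature that differs by several kelvin even though both tables come from the same model. Near a phase boundary, a shift of that kind could plausibly place the state in the wrong phase.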

    Monte Carlo Application ToolKit (MCATK)

    The Monte Carlo Application ToolKit (MCATK) is a component-based software library designed to build specialized applications and to provide new functionality for existing general-purpose Monte Carlo radiation transport codes. We describe MCATK and its capabilities and present some verification and validation results.

    High Performance Broadcast Support in LA-MPI over Quadrics

    LA-MPI is a unique MPI implementation that provides network-level fault-tolerant message passing. This paper describes the efficient implementation of a scalable MPI broadcast algorithm. LA-MPI implements a generic version of the broadcast algorithm using a spanning tree method built on top of point-to-point messaging. However, the Quadrics network, with its hardware broadcast support, provides an opportunity for a much more efficient implementation of this collective. We describe the design challenges encountered while making use of the hardware broadcast capability, explore design alternatives, and describe the approach taken to design a low-latency, highly scalable, fault-tolerant broadcast algorithm. Our evaluation shows that this implementation reduces broadcast latency and achieves higher scalability relative to the generic version of this operation. In addition, we observe that the performance of the implementation is comparable to that of the high performance implementation by QSW [13] for MPICH, and HP’s for Alaska MPI, while providing fault tolerance to network errors not provided by these.

    Architecture of LA-MPI, a network-fault-tolerant MPI

    We discuss the unique architectural elements of the Los Alamos Message Passing Interface (LA-MPI), a high-performance, network-fault-tolerant, thread-safe MPI library. LA-MPI is designed for use on terascale clusters, which are inherently unreliable due to their sheer number of system components and tradeoffs between cost and performance. We examine in detail the design concepts used to implement LA-MPI. These include reliability features such as application-level checksumming, message retransmission, and automatic message re-routing. Other key performance-enhancing features, such as concurrent message routing over multiple, diverse network adapters and protocols, and communication-specific optimizations (e.g., shared memory), are also examined.
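
    The combination of application-level checksumming and message retransmission can be sketched in a few lines. This is a generic illustration, not LA-MPI's protocol: the frame layout, the use of CRC-32, the retry limit, and the names `make_frame`, `check_frame`, `send_reliably`, and `flaky_channel` are all assumptions made for the example.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte big-endian CRC-32 (assumed layout)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def check_frame(frame: bytes):
    """Return the payload if its checksum matches, otherwise None."""
    expected = int.from_bytes(frame[:4], "big")
    payload = frame[4:]
    return payload if zlib.crc32(payload) == expected else None


def send_reliably(payload: bytes, channel, max_attempts: int = 5):
    """Resend the frame until the receiver-side checksum passes.

    `channel` is any callable that takes the sent frame and returns the
    bytes that arrive at the receiver. Returns (payload, attempts used).
    """
    frame = make_frame(payload)
    for attempt in range(1, max_attempts + 1):
        delivered = check_frame(channel(frame))
        if delivered is not None:
            return delivered, attempt
    raise RuntimeError("delivery failed after %d attempts" % max_attempts)


def flaky_channel(corrupt_first_n: int):
    """Hypothetical test channel that corrupts the first N frames it carries."""
    state = {"calls": 0}

    def channel(frame: bytes) -> bytes:
        state["calls"] += 1
        if state["calls"] <= corrupt_first_n:
            damaged = bytearray(frame)
            damaged[-1] ^= 0xFF  # flip every bit of the last byte
            return bytes(damaged)
        return frame

    return channel


if __name__ == "__main__":
    data, attempts = send_reliably(b"hello", flaky_channel(2))
    print(data, attempts)
```

    Because the checksum is computed and verified by the library rather than by the network hardware, corruption introduced anywhere along the path is caught at the receiving end and triggers a resend.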