83 research outputs found
Interleaving and lock-step semantics for analysis and verification of GPU kernels
Graphics Processing Units (GPUs) from leading vendors employ predicated (or guarded) execution to eliminate branching and increase performance. Similarly, a recent GPU verification technique uses predication to reduce verification of GPU kernels (the massively parallel programs that run on GPUs) to verification of a sequential program. Prior work on the formal semantics of lock-step predicated execution for kernels focused on structured programs, where control is organised using if- and while-statements. We provide lock-step execution semantics for GPU kernels that are represented by arbitrary reducible control flow graphs. We present a traditional interleaving semantics and a novel lock-step semantics based on predication, and show that for terminating kernels either both semantics compute identical results or both behave erroneously. This result allows the method of reducing GPU kernel verification to the verification of a sequential, lock-step program to be applied to GPU kernels with arbitrary reducible control flow. We have implemented the method in the GPUVerify tool, and present an evaluation using a set of 163 open source and commercial GPU kernels. Among these kernels, 42 exhibit unstructured control flow, which our novel lock-step predication technique can handle fully automatically. This generality comes at a modest price: verification across our benchmark set was on average 2.25 times slower than using an existing approach that specifically targets structured kernels.
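The idea of predicated lock-step execution described in this abstract can be illustrated with a small sketch. It is not taken from the paper or from GPUVerify; the function names, the branch condition, and the toy kernel body are all hypothetical, and the example covers only a single structured if/else rather than arbitrary reducible control flow. It compares a per-thread branching execution with a lock-step execution in which every thread steps through both arms and a per-thread predicate decides whether each assignment takes effect.

```python
def run_branching(tid, x):
    """Reference semantics: each thread takes an ordinary branch (hypothetical kernel body)."""
    if tid % 2 == 0:
        x = x + 10
    else:
        x = x * 2
    return x


def run_predicated(tids, xs):
    """Lock-step semantics: all threads execute both arms together.

    A per-thread predicate guards each assignment, so no thread branches.
    """
    xs = list(xs)
    # Evaluate the branch condition once per thread to obtain the predicates.
    preds = [tid % 2 == 0 for tid in tids]
    # 'then' arm: takes effect only where the predicate holds.
    xs = [x + 10 if p else x for x, p in zip(xs, preds)]
    # 'else' arm: takes effect only where the predicate does not hold.
    xs = [x * 2 if not p else x for x, p in zip(xs, preds)]
    return xs


tids = [0, 1, 2, 3]
inputs = [1, 2, 3, 4]
branching = [run_branching(t, x) for t, x in zip(tids, inputs)]
predicated = run_predicated(tids, inputs)
print(branching == predicated)
```

For this toy kernel the two executions agree on every thread, which is the kind of equivalence the abstract states for terminating kernels in general.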
An empirical prediction method for non-linear normal force on thin wings at supersonic speeds
GPUVerify: A Verifier for GPU Kernels
We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream kernel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational semantics for GPU programming termed synchronous, delayed visibility (SDV) semantics. The SDV semantics provides a precise definition of barrier divergence in GPU kernels and allows kernel verification to be reduced to analysis of a sequential program, thereby completely avoiding the need to reason about thread interleavings, and allowing existing modular techniques for program verification to be leveraged. We describe an efficient encoding for data race detection and propose a method for automatically inferring loop invariants required for verification. We have implemented these techniques as a practical verification tool, GPUVerify, which can be applied directly to OpenCL and CUDA source code. We evaluate GPUVerify with respect to a set of 163 kernels drawn from public and commercial sources. Our evaluation demonstrates that GPUVerify is capable of efficient, automatic verification of a large number of real-world kernels.
Engineering a static verification tool for GPU kernels
We report on practical experiences over the last 2.5 years related to the engineering of GPUVerify, a static verification tool for OpenCL and CUDA GPU kernels, plotting the progress of GPUVerify from a prototype to a fully functional and relatively efficient analysis tool. Our hope is that this experience report will serve the verification community by helping to inform future tooling efforts. © 2014 Springer International Publishing
Conservation genetics of traditional and commercial pig breeds, and evaluation of their crossbreeding potential for productivity improvement
The Food and Agriculture Organization has emphasised the importance of farm animal genetic diversity for the assurance of future global food security. Modern pig production has concentrated on a small number of commercialised breeds. This has significantly contributed to genetic erosion and the loss of native breeds deemed productively inefficient. It has been recommended to conserve the unique traits of traditional breeds as genetic insurance against future challenges. In order to ascertain the commercial viability of traditional breeds, genetic and productivity analyses were completed, using the Large White (LW) and Landrace (LR) as the commercial comparison.
Genetic diversity was assessed using a D-loop fragment of mitochondrial DNA for comparison between three purebred traditional breeds: Gloucester Old Spot (GOS), British Lop (BL) and Welsh (W), and commercial LW x LR. The traditional breeds greatly differed from the commercial hybrid, and possessed high variability at this genetic region. The BL and W demonstrated the greatest potential for crossbreeding to increase the diversity of commercial populations.
The crossing of LW x LR dams with GOS, BL and W terminal sires produced traditional crossbreds for comparison with LW sired crossbreds. Nuclear DNA diversity was assessed using a region of the iodothyronine deiodinase type 3 (DIO3) gene. This demonstrated that crossbreeding could improve future productivity, by utilising traditional variation to maximise heterozygosity in the progeny. The productivity assessment established that the traditional and commercial crossbreds performed comparably for most of the growth variables measured; however, there were highly significant differences for birth weight, weaning weight, back fat and production length. The traditional crossbreds have shown potential for future application, with the W most suited for commercial production, due to its equivalence with the LW.
To conclude, the crossbreeding of traditional and commercial pig breeds is a viable genetic management strategy to conserve and genetically improve both groups.
Velocity distribution on thin tapered wings with fore-and-aft symmetry and spanwise constant thickness ratio at zero incidence
This report is a continuation of three earlier ones by the present authors (1947-9) and contains a theoretical investigation of subsonic flow past thin tapered unswept wings (of full or cropped-rhombus plan form), at zero incidence. Only the case of spanwise constant thickness ratio is considered in this first attempt, although alternative cases also merit attention. The first order method of linear perturbation based on continuous systems of sources and sinks is shown to be still applicable to tapered wings, although mathematical difficulties are greatly increased. These have been overcome, at least in the simple case of the biconvex parabolic profile, so as to give general solutions and computable formulae for the velocity distribution over the entire wing area. Complete detailed solutions for the mid-chord line have been worked out numerically, and two examples of complete numerical solutions, with corresponding isobar patterns, for the entire wing area are presented. These results are sufficient to illustrate the effect of uniform taper on the velocity field of unswept wings, and lead to a number of general conclusions. The most important of these is that, although taper brings about a noticeable decrease of supervelocities at the centre, higher values are encountered further outboard so that, for cropped plan forms, two symmetrically placed maximum suction areas arise inside the two half-wings. These are relevant for determining critical Mach numbers, and the effect of taper may be, according to choice of geometrical parameters, either beneficial or detrimental as to the values of M_crit, but practically never very considerable. The method will still be applicable to the more general, and more important, case of tapered swept-back wings, especially for delta wings, and a general solution for the velocity distribution in the central sections of such wings is given in Appendix I and shown to be consistent with the earlier solution for untapered swept wings.
However, for applying the method successfully (up to detailed numerical investigation) to the more general case, automatic high-speed integrating machinery seems indispensable, to replace the classical methods of transforming integrals and manual computing used in the past and in the present report.
