Detailed Heap Profiling
Modern software systems make heavy use of the memory heap. As systems grow more complex and compute with increasing amounts of data, it can be difficult for developers to understand how their programs actually use the bytes that they allocate on the heap and whether improvements are possible. To answer this question of heap usage efficiency, we have built a new, detailed heap profiler called Memoro. Memoro uses a combination of static instrumentation, subroutine interception, and runtime data collection to build a clear picture of exactly when and where a program performs heap allocation, and crucially how it actually uses that memory. Memoro also introduces a new visualization application that distills the collected data into scores and visual cues, allowing developers to quickly pinpoint and eliminate inefficient heap usage in their software. Our evaluation and experience with several applications demonstrate that Memoro can reduce heap usage and yield runtime improvements of 10%.
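The subroutine-interception part of this approach is typically built on allocator wrapping; below is a minimal sketch in C of an LD_PRELOAD-style malloc/free interceptor, where record_alloc is a hypothetical logging hook, not Memoro's actual API.

```c
/* Minimal sketch of allocator interception via LD_PRELOAD, in the
 * spirit of Memoro's runtime data collection; record_alloc is a
 * hypothetical hook, not Memoro's real API.
 * Build: gcc -shared -fPIC -o libtrace.so trace.c -ldl
 * Run:   LD_PRELOAD=./libtrace.so ./your_program */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

static void *(*real_malloc)(size_t) = NULL;
static void (*real_free)(void *) = NULL;

static void record_alloc(void *p, size_t size) {
    /* A real profiler would log a timestamped event plus a call-stack
     * ID; stderr is unbuffered, which avoids re-entering malloc here. */
    fprintf(stderr, "alloc %p %zu\n", p, size);
}

void *malloc(size_t size) {
    if (!real_malloc)
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    void *p = real_malloc(size);
    record_alloc(p, size);
    return p;
}

void free(void *p) {
    if (!real_free)
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    fprintf(stderr, "free %p\n", p);
    real_free(p);
}
```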
Semantic Fuzzing with Zest
Programs expecting structured inputs often consist of both a syntactic analysis stage, which parses raw input, and a semantic analysis stage, which conducts checks on the parsed input and executes the core logic of the program. Generator-based testing tools in the lineage of QuickCheck are a promising way to generate random syntactically valid test inputs for these programs. We present Zest, a technique which automatically guides QuickCheck-like random-input generators to better explore the semantic analysis stage of test programs. Zest converts random-input generators into deterministic parametric generators. We present the key insight that mutations in the untyped parameter domain map to structural mutations in the input domain. Zest leverages program feedback in the form of code coverage and input validity to perform feedback-directed parameter search. We evaluate Zest against AFL and QuickCheck on five Java programs: Maven, Ant, BCEL, Closure, and Rhino. Zest covers 1.03x-2.81x as many branches within the benchmarks' semantic analysis stages as baseline techniques. Further, we find 10 new bugs in the semantic analysis stages of these benchmarks. Zest is the most effective technique in finding these bugs reliably and quickly, requiring at most 10 minutes on average to find each bug.
Comment: To appear in Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'19).
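Although Zest itself targets QuickCheck-style Java generators, the core idea of a deterministic parametric generator can be sketched in a few lines of C: every "random" choice is drawn from an untyped parameter buffer, so mutating one byte of the buffer mutates the structure of the generated input. The names and the expression grammar below are illustrative, not Zest's.

```c
/* Minimal sketch of a deterministic parametric generator: an untyped
 * parameter buffer drives every choice, so a byte-level mutation of
 * params maps to a structural mutation of the generated input. */
#include <stdio.h>

typedef struct { const unsigned char *bytes; size_t len, pos; } Params;

static unsigned char next(Params *p) {          /* one untyped choice */
    return p->len ? p->bytes[p->pos++ % p->len] : 0;
}

/* Generate an arithmetic expression; depth bounds the recursion. */
static void gen_expr(Params *p, int depth) {
    if (depth == 0 || next(p) % 2 == 0) {       /* leaf: a digit */
        printf("%d", next(p) % 10);
    } else {                                    /* node: (e op e) */
        printf("(");
        gen_expr(p, depth - 1);
        printf(" %c ", "+-*/"[next(p) % 4]);
        gen_expr(p, depth - 1);
        printf(")");
    }
}

int main(void) {
    unsigned char params[] = {1, 3, 7, 2, 9, 4}; /* parameter domain */
    Params p = {params, sizeof params, 0};
    gen_expr(&p, 3);                  /* same bytes => same expression */
    printf("\n");
    return 0;
}
```

Re-running with the same parameter bytes regenerates the same expression; changing, say, the first byte flips the first leaf-versus-node decision and yields a structurally different but still syntactically valid input, which is the property the coverage-guided parameter search exploits.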
Automatic and Portable Mapping of Data Parallel Programs to OpenCL for GPU-Based Heterogeneous Systems
General-purpose GPU-based systems are highly attractive, as they offer potentially massive performance at little cost. Realizing this potential is challenging due to the complexity of programming. This article presents a compiler-based approach that automatically generates optimized OpenCL code from data-parallel OpenMP programs for GPUs. A key feature of our scheme is that it leverages existing transformations, especially data transformations, to improve performance on GPU architectures, and uses machine learning to build a predictive model that determines whether it is worthwhile to run the OpenCL code on the GPU or the OpenMP code on the multicore host. We applied our approach to the entire NAS parallel benchmark suite and evaluated it on two distinct GPU-based systems. Over a sequential baseline, we achieved average speedups of 4.51× and 4.20× (up to 143× and 67×) on Core i7/NVIDIA GeForce GTX580 and Core i7/AMD Radeon 7970 platforms, respectively. Our approach achieves, on average, greater than 10× speedups over two state-of-the-art automatic GPU code generators.
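The flavor of this translation can be seen on a simple data-parallel loop; the sketch below pairs an OpenMP vector addition with an OpenCL C kernel it could plausibly be lowered to. The actual generated code, the data transformations, and the learned host/GPU decision model are more involved.

```c
/* Sketch of the OpenMP-to-OpenCL mapping described above. */

/* Input: data-parallel OpenMP loop, runs on the multicore host. */
void vadd_omp(int n, const float *a, const float *b, float *c) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Output: equivalent OpenCL C kernel; each loop iteration becomes one
 * work-item, with the loop index replaced by the global work-item ID. */
const char *vadd_cl =
    "__kernel void vadd(int n, __global const float *a,\n"
    "                   __global const float *b, __global float *c) {\n"
    "    int i = get_global_id(0);\n"
    "    if (i < n) c[i] = a[i] + b[i];\n"
    "}\n";
```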
Assessing the State and Improving the Art of Parallel Testing for C
The execution latency of a test suite strongly depends on the degree of concurrency with which test cases are executed. However, if test cases are not designed for concurrent execution, they may interfere, causing result deviations compared to sequential execution. To prevent this, each test case can be provided with an isolated execution environment, but the resulting overheads diminish the merit of parallel testing. Our large-scale analysis of the Debian Buster package repository shows that existing test suites in C projects make limited use of parallelization. We present an approach to (a) analyze the potential of C test suites for safe concurrent execution, i.e., result invariance compared to sequential execution, and (b) execute tests concurrently with different parallelization strategies using processes or threads if it is found to be safe. Applying our approach to 9 C projects, we find that most of them cannot safely execute tests in parallel due to unsafe test code or unsafe usage of shared variables or files within the program code. Parallel test execution shows a significant acceleration over sequential execution for most projects. We find that multi-threading rarely outperforms multi-processing. Finally, we observe that the lack of a common test framework for C leaves make as the standard driver for running tests, which introduces unnecessary performance overheads for test execution.
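As a concrete illustration of the process-based strategy, the C sketch below forks one child per test case so that each test runs in its own address space, which keeps unsafe global state in one test from corrupting another. The test functions are placeholders, and a real runner would also bound the number of concurrent children.

```c
/* Minimal sketch of process-based parallel test execution; the test
 * functions are illustrative placeholders. */
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*test_fn)(void);              /* returns 0 on success */

static int test_alpha(void) { return 0; }  /* placeholder tests */
static int test_beta(void)  { return 0; }

int main(void) {
    test_fn tests[] = { test_alpha, test_beta };
    int n = sizeof tests / sizeof tests[0];
    pid_t pids[n];

    for (int i = 0; i < n; i++) {          /* launch all tests at once */
        pids[i] = fork();
        if (pids[i] == 0)
            _exit(tests[i]());             /* child: run test, report */
    }
    int failures = 0;
    for (int i = 0; i < n; i++) {          /* collect results */
        int status;
        waitpid(pids[i], &status, 0);
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failures++;
    }
    printf("%d/%d tests passed\n", n - failures, n);
    return failures ? 1 : 0;
}
```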
