Overcoming Language Dichotomies: Toward Effective Program Comprehension for Mobile App Development
Mobile devices and platforms have become an established target for modern
software developers due to performant hardware and a large and growing user
base numbering in the billions. Despite their popularity, the software
development process for mobile apps comes with a set of unique, domain-specific
challenges rooted in program comprehension. Many of these challenges stem from
developer difficulties in reasoning about different representations of a
program, a phenomenon we define as a "language dichotomy". In this paper, we
reflect upon the various language dichotomies that contribute to open problems
in program comprehension and development for mobile apps. Furthermore, to help
guide the research community towards effective solutions for these problems, we
provide a roadmap of directions for future work. Comment: Invited Keynote Paper for the 26th IEEE/ACM International Conference
on Program Comprehension (ICPC'18)
A Generic Approach to Fix Test Flakiness in Real-World Projects
Test flakiness, a non-deterministic behavior of builds unrelated to code
changes, is a major and continuing impediment to delivering reliable software.
The very few techniques for the automated repair of test flakiness are
specifically crafted to repair either Order-Dependent (OD) or
Implementation-Dependent (ID) flakiness. They are also all symbolic approaches,
i.e., leverage program analysis to detect and repair known test flakiness
patterns and root causes, failing to generalize. To bridge the gap, we propose
FlakyDoctor, a neuro-symbolic technique that combines the power of
LLMs (generalizability) and program analysis (soundness) to fix different types of
test flakiness. Our extensive evaluation using 873 confirmed flaky tests (332
OD and 541 ID) from 243 real-world projects demonstrates the ability of
FlakyDoctor in repairing flakiness, achieving 57% (OD) and 59% (ID) success
rate. Compared to three alternative flakiness repair approaches, FlakyDoctor
can repair 8% more ID tests than DexFix, 12% more OD flaky tests than ODRepair,
and 17% more OD flaky tests than iFixFlakies. Regardless of the underlying LLM, the
non-LLM components of FlakyDoctor contribute to 12-31% of the overall
performance, i.e., while part of the FlakyDoctor power is from using LLMs, they
are not good enough to repair flaky tests in real-world projects alone. What
makes the proposed technique superior to related research on test flakiness
mitigation specifically and program repair, in general, is repairing 79
previously unfixed flaky tests in real-world projects. We opened pull requests
for all cases with corresponding patches; 19 of them were accepted and merged
at the time of submission.
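To make the Order-Dependent (OD) category concrete, the following is a minimal sketch of an OD flaky test. It is not taken from FlakyDoctor or its benchmark; the names `shared_cache`, `writer_case`, `reader_case`, and `run_in_order` are hypothetical and chosen only for illustration.

```python
# Hypothetical shared state that leaks between test cases.
shared_cache = {}

def writer_case():
    # Populates the shared state as a side effect.
    shared_cache["user"] = "alice"
    assert shared_cache["user"] == "alice"

def reader_case():
    # Victim: passes only if writer_case has already run.
    assert shared_cache.get("user") == "alice"

def run_in_order(cases):
    """Run the cases in the given order from a clean state and record outcomes."""
    shared_cache.clear()
    outcomes = {}
    for case in cases:
        try:
            case()
            outcomes[case.__name__] = "pass"
        except AssertionError:
            outcomes[case.__name__] = "fail"
    return outcomes

print(run_in_order([writer_case, reader_case]))
print(run_in_order([reader_case, writer_case]))
```

The code under test never changes between the two runs; only the execution order does, which is what makes the outcome of `reader_case` flaky.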
Therapeutic Possibilities of Ceftazidime Nanoparticles in Devastating Pseudomonas Ophthalmic Infections; Keratitis and Endophthalmitis
As the number of contact-lens wearers rises worldwide, Pseudomonas aeruginosa (PA) keratitis is attracting more attention as a major public health issue. Corneal lesions caused by PA, the most intimidating complication of contact-lens wear, can progress rapidly in spite of local antibiotic treatment, and may result in perforation and the permanent loss of vision. One of the explanations proposed for the evasion of the pathogen from immune responses of the host as well as antibacterial treatment is the fact that invasive clinical isolates of PA have the unusual ability to invade and replicate within surface corneal epithelial cells. In this manner, PA is left with an intracellular sanctuary. Endophthalmitis, albeit rare, is another potentially catastrophic ophthalmic infection that poses a drug-delivery challenge. The present hypothesis is that nanoparticles can carry anti-pseudomonas antibiotics (e.g. ceftazidime) through the membranes, into the “hidden zone” of the pathogen, hence offering an effective and potent therapeutic approach against Pseudomonas keratitis and endophthalmitis.
Perfect is the enemy of test oracle
Automation of test oracles is one of the most challenging facets of software
testing, but it has received comparatively less attention than automated test
input generation. Test oracles rely on a ground-truth that can distinguish
between the correct and buggy behavior to determine whether a test fails
(detects a bug) or passes. What makes the oracle problem challenging and
undecidable is the assumption that the ground-truth should know the exact
expected, correct, or buggy behavior. However, we argue that one can still
build an accurate oracle without knowing the exact correct or buggy behavior,
knowing only how these two might differ. This paper presents SEER, a learning-based
approach that, in the absence of test assertions or other types of oracle, can
determine whether a unit test passes or fails on a given method under test
(MUT). To build the ground-truth, SEER jointly embeds unit tests and the
implementation of MUTs into a unified vector space, in such a way that the
neural representations of tests are similar to those of the MUTs they pass on,
but dissimilar to those of the MUTs they fail on. The classifier built on top of this
vector representation serves as the oracle, generating "fail" labels when test
inputs detect a bug in the MUT and "pass" labels otherwise. Our extensive
experiments on applying SEER to more than 5K unit tests from a diverse set of
open-source Java projects show that the produced oracle is (1) effective in
predicting the fail or pass labels, achieving an overall accuracy, precision,
recall, and F1 measure of 93%, 86%, 94%, and 90%, respectively, (2) generalizable, predicting
the labels for the unit tests of projects that were not in the training or
validation sets with negligible performance drop, and (3) efficient, detecting
the existence of bugs in only 6.5 milliseconds on average. Comment: Published in ESEC/FSE 202
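The idea of an oracle built on a joint embedding can be illustrated with a minimal sketch. This is not SEER's implementation: the vectors below are hand-picked toy values standing in for learned embeddings, the names `cosine_similarity` and `oracle_label` are hypothetical, and a fixed threshold of 0.5 is assumed here where the abstract describes a trained classifier.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def oracle_label(test_vec, mut_vec, threshold=0.5):
    """Label a (test, MUT) pair: similar embeddings -> "pass", dissimilar -> "fail".

    The threshold is an assumption made for this sketch only.
    """
    return "pass" if cosine_similarity(test_vec, mut_vec) >= threshold else "fail"

# Toy embeddings (assumed values, not produced by any model).
test_vec = [1.0, 0.0, 1.0]
correct_mut_vec = [0.9, 0.1, 1.1]   # close to the test embedding
buggy_mut_vec = [-1.0, 0.5, -0.8]   # far from the test embedding

print(oracle_label(test_vec, correct_mut_vec))
print(oracle_label(test_vec, buggy_mut_vec))
```

The sketch only shows why no assertion is needed at prediction time: the label comes from the geometric relationship between the two embeddings rather than from an expected output value.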
CodeMind: A Framework to Challenge Large Language Models for Code Reasoning
Solely relying on test passing to evaluate Large Language Models (LLMs) for
code synthesis may result in unfair assessment or promoting models with data
leakage. As an alternative, we introduce CodeMind, a framework designed to
gauge the code reasoning abilities of LLMs. CodeMind currently supports three
code reasoning tasks: Independent Execution Reasoning (IER), Dependent
Execution Reasoning (DER), and Specification Reasoning (SR). The first two
evaluate how well models predict the execution output of arbitrary code or of code
the model could correctly synthesize. The third one evaluates the extent to
which LLMs implement the specified expected behavior.
Our extensive evaluation of nine LLMs across five benchmarks in two different
programming languages using CodeMind shows that LLMs fairly follow control flow
constructs and, in general, explain how inputs evolve to output, specifically
for simple programs and the ones they can correctly synthesize. However, their
performance drops for code with higher complexity, non-trivial logical and
arithmetic operators, non-primitive types, and API calls. Furthermore, we
observe that, while correlated, specification reasoning (essential for code
synthesis) does not imply execution reasoning (essential for broader
programming tasks such as testing and debugging): a ranking of LLMs based on test
passing can differ from one based on code reasoning.
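As a rough illustration of what an execution reasoning check involves, the sketch below compares a predicted output against the result of actually running the code. It is not CodeMind's API: the function names are hypothetical, and the hard-coded `predictions` stand in for responses that would come from LLMs.

```python
def actual_output(func, args):
    """Ground truth: execute the code under evaluation on the given input."""
    return func(*args)

def execution_reasoning_correct(func, args, predicted):
    """True when the predicted output matches the real execution output."""
    return actual_output(func, args) == predicted

# Code whose output the model is asked to predict.
def sum_of_evens(numbers):
    return sum(n for n in numbers if n % 2 == 0)

# Hypothetical model predictions for the input [1, 2, 3, 4].
predictions = {"model_a": 6, "model_b": 10}

results = {
    name: execution_reasoning_correct(sum_of_evens, ([1, 2, 3, 4],), predicted)
    for name, predicted in predictions.items()
}
print(results)
```

Here `model_b` appears to ignore the `if` filter and sums every element, which is the kind of control-flow mistake such a check is meant to surface.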
Comparison of the corneal power measurements with the TMS4-topographer, Pentacam HR, IOL Master, and Javal keratometer
Purpose: The aim was to compare the corneal curvature and power measured with a corneal topographer, Scheimpflug camera, optical biometer, and Javal keratometer. Materials and Methods: A total of 76 myopic individuals who were candidates for photorefractive keratectomy were selected in a cross-sectional study. Manual keratometry (Javal Schiotz type; Haag-Streit AG, Koeniz, Switzerland), automated keratometry (IOL Master version 3.02, Carl Zeiss Meditec, Jena, Germany), topography (TMS4, Tomey, Erlangen, Germany), and Pentacam HR (Oculus, Wetzlar, Germany) were performed for all participants. The 95% limits of agreement (LOAs) were reported to evaluate the agreement between devices. Results: The mean corneal power measurements were 44.3 ± 1.59, 44.25 ± 1.59, 43.68 ± 1.44, and 44.31 ± 1.61 D with the Javal keratometer, TMS4-topographer, Pentacam, and IOL Master, respectively. Only the IOL Master showed no significant difference from the Javal keratometer in measuring the corneal power (P = 0.965). The correlations of the Javal keratometer with TMS4-topography, Pentacam, and IOL Master were 0.991, 0.982, and 0.993, respectively. The 95% LOAs of the Javal keratometer with TMS4-topography, Pentacam, and IOL Master were -0.361 to 0.49, -0.01 to 1.14, and -0.36 to 0.36 D, respectively. Conclusion: Although the correlation of Pentacam, TMS4-topography, IOL Master, and Javal keratometer in measuring keratometry was high, only the IOL Master showed no significant difference from the Javal keratometer. The IOL Master had the best agreement with Javal keratometry.
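For readers unfamiliar with limits of agreement, the sketch below computes them under the conventional Bland-Altman definition (mean of the paired differences ± 1.96 standard deviations). The abstract does not state which formula the authors used, so this definition is an assumption, and the readings are invented illustrative values, not data from the study.

```python
import statistics

def limits_of_agreement(device_a, device_b):
    """Return (lower, upper) 95% limits of agreement for paired readings.

    Assumes the conventional definition: mean difference +/- 1.96 * SD of differences.
    """
    diffs = [a - b for a, b in zip(device_a, device_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff

# Invented corneal power readings in diopters (illustrative only).
keratometer = [44.0, 43.5, 45.0, 44.5]
other_device = [43.9, 43.6, 44.8, 44.6]

lower, upper = limits_of_agreement(keratometer, other_device)
print(f"95% LOA: {lower:.3f} to {upper:.3f} D")
```

A narrow interval centered near zero indicates that two devices can be used interchangeably, which is why an interval such as -0.36 to 0.36 D reads as better agreement than -0.01 to 1.14 D.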
White-box Compiler Fuzzing Empowered by Large Language Models
Compiler correctness is crucial, as miscompilation falsifying the program
behaviors can lead to serious consequences. In the literature, fuzzing has been
extensively studied to uncover compiler defects. However, compiler fuzzing
remains challenging: existing approaches focus on black- and grey-box fuzzing, which
generates tests without sufficient understanding of internal compiler
behaviors. As such, they often fail to construct programs to exercise
conditions of intricate optimizations. Meanwhile, traditional white-box
techniques are computationally inapplicable to the giant codebase of compilers.
Recent advances demonstrate that Large Language Models (LLMs) excel in code
generation/understanding tasks and have achieved state-of-the-art performance
in black-box fuzzing. Nonetheless, prompting LLMs with compiler source-code
information remains a missing piece of research in compiler testing.
To this end, we propose WhiteFox, the first white-box compiler fuzzer using
LLMs with source-code information to test compiler optimization. WhiteFox
adopts a dual-model framework: (i) an analysis LLM examines the low-level
optimization source code and produces requirements on the high-level test
programs that can trigger the optimization; (ii) a generation LLM produces test
programs based on the summarized requirements. Additionally,
optimization-triggering tests are used as feedback to further enhance the test
generation on the fly. Our evaluation on four popular compilers shows that
WhiteFox can generate high-quality tests to exercise deep optimizations
requiring intricate conditions, exercising up to 80 more optimizations than
state-of-the-art fuzzers. To date, WhiteFox has found in total 96 bugs, with 80
confirmed as previously unknown and 51 already fixed. Beyond compiler testing,
WhiteFox can also be adapted for white-box fuzzing of other complex, real-world
software systems in general.
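The control flow of the dual-model framework with feedback can be sketched as follows. This is not WhiteFox's code: `analysis_llm`, `generation_llm`, and `triggers_optimization` are hypothetical stubs that return placeholder strings instead of calling a model or a compiler, and the round and batch sizes are arbitrary.

```python
def analysis_llm(optimization_source):
    # Stub: a real analysis model would summarize the trigger conditions.
    return f"requirement derived from: {optimization_source}"

def generation_llm(requirement, examples, count):
    # Stub: a real generation model would emit test programs that follow
    # the requirement, guided by the example tests that triggered it before.
    return [f"test_{len(examples)}_{i}" for i in range(count)]

def triggers_optimization(test_program):
    # Stub: a real check would compile the test and inspect the optimizer.
    return test_program.endswith("_0")

def fuzz(optimization_source, rounds=3, tests_per_round=2):
    """Analyze once, then generate in rounds, feeding triggering tests back in."""
    requirement = analysis_llm(optimization_source)
    feedback = []    # tests observed to trigger the optimization
    generated = []   # every test produced so far
    for _ in range(rounds):
        batch = generation_llm(requirement, feedback, tests_per_round)
        generated.extend(batch)
        feedback.extend(t for t in batch if triggers_optimization(t))
    return generated, feedback

generated, feedback = fuzz("hypothetical optimization pass source")
print(len(generated), len(feedback))
```

The point of the structure is that the expensive source-code analysis happens once per optimization, while the generation step is repeated and steered by whatever has already been shown to reach the optimization.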
Energy Wars - Chrome vs. Firefox: Which browser is more energy efficient?
This paper presents a preliminary study on the energy consumption
of two popular web browsers. In order to properly measure
the energy consumption of both environments, we simulate the
usage of various applications, with the goal of mimicking typical user
interactions and usage.
Our preliminary results show interesting findings based on observation,
such as what types of interactions generate high peaks
of energy consumption, and which browser is more efficient
overall. Our goal with this preliminary study is to show users
how different the efficiency of web browsers can be, and to
help advance this area of study. FCT - Fundação para a Ciência e a Tecnologia (UIDB/50014/2020)
