Semantic Source Code Models Using Identifier Embeddings
The emergence of online open source repositories in recent years has led
to an explosion in the volume of openly available source code, coupled with
metadata that relate to a variety of software development activities. As a
result, in line with recent advances in machine learning research, software
maintenance activities are switching from symbolic formal methods to
data-driven methods. In this context, the rich semantics hidden in source code
identifiers provide opportunities for building semantic representations of code
which can assist tasks of code search and reuse. To this end, we deliver, in the
form of pretrained vector space models, distributed code representations for
six popular programming languages, namely, Java, Python, PHP, C, C++, and C#.
The models are produced using fastText, a state-of-the-art library for learning
word representations. Each model is trained on data from a single programming
language; the code mined for producing all models amounts to over 13,000
repositories. We indicate dissimilarities between natural language and source
code, as well as variations in coding conventions among the different
programming languages we processed. We describe how these heterogeneities
guided the data preprocessing decisions we took and the selection of the
training parameters in the released models. Finally, we propose potential
applications of the models and discuss their limitations.
Comment: 16th International Conference on Mining Software Repositories (MSR 2019): Data Showcase Track
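For concreteness, a minimal Python sketch of how such a pretrained identifier model might be queried. The model file name, the example identifiers, and the use of the fastText Python bindings are illustrative assumptions, not details taken from the paper.

```python
# Sketch: querying a pretrained fastText identifier model.
# "java_identifiers.bin" is a placeholder file name, not the released artifact.
import fasttext
import numpy as np

model = fasttext.load_model("java_identifiers.bin")  # hypothetical path

def identifier_similarity(a: str, b: str) -> float:
    """Cosine similarity between two identifier embeddings."""
    va, vb = model.get_word_vector(a), model.get_word_vector(b)
    return float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Semantically related identifiers should score higher than unrelated ones,
# which is the property code search and reuse tasks would exploit.
print(identifier_similarity("readFile", "loadFile"))
print(model.get_nearest_neighbors("parseJson", k=5))
```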
On the Feasibility of Transfer-learning Code Smells using Deep Learning
Context: A substantial amount of work has been done to detect smells in
source code using metrics-based and heuristics-based methods. Machine learning
methods have been recently applied to detect source code smells; however, the
current practices are considered far from mature. Objective: First, explore the
feasibility of applying deep learning models to detect smells without extensive
feature engineering, just by feeding the source code in tokenized form. Second,
investigate the possibility of applying transfer-learning in the context of
deep learning models for smell detection. Method: We use existing metric-based
state-of-the-art methods for detecting three implementation smells and one
design smell in C# code. Using these results as the annotated gold standard, we
train smell detection models on three different deep learning architectures.
These architectures use Convolutional Neural Networks (CNNs) of one or two
dimensions, or Recurrent Neural Networks (RNNs) as their principal hidden
layers. For the first objective of our study, we perform training and
evaluation on C# samples, whereas for the second objective, we train the models
from C# code and evaluate the models over Java code samples. We perform the
experiments with various combinations of hyper-parameters for each model.
Results: We find it feasible to detect smells using deep learning methods. Our
comparative experiments find that there is no clearly superior method between
CNN-1D and CNN-2D. We also observe that performance of the deep learning models
is smell-specific. Our transfer-learning experiments show that
transfer-learning is definitely feasible for implementation smells with
performance comparable to that of direct-learning. This work opens up a new
paradigm for detecting code smells by transfer-learning, especially for
programming languages where comprehensive code smell detection tools are not
available.
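As a rough illustration of the kind of architecture described (not the authors' exact configuration or framework), a 1D-CNN over tokenized code could be written in Keras as follows; vocabulary size, sequence handling, and layer sizes are placeholder assumptions.

```python
# Illustrative sketch: binary smell classifier over tokenized source code.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 5000   # number of distinct code tokens (assumed)

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),          # token ids -> dense vectors
    layers.Conv1D(32, 5, activation="relu"),   # 1D convolution over the token sequence
    layers.MaxPooling1D(2),
    layers.GlobalMaxPooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # smelly vs. clean
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# For the transfer-learning setting, such a network trained on C# samples
# would be evaluated on tokenized Java samples.
```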
Identifying Bugs in Make and JVM-Oriented Builds
Incremental and parallel builds are crucial features of modern build systems.
Parallelism enables fast builds by running independent tasks simultaneously,
while incrementality saves time and computing resources by processing only the build
operations that were affected by a particular code change. Writing build
definitions that lead to error-free incremental and parallel builds is a
challenging task. This is mainly because developers are often unable to predict
the effects of build operations on the file system and how different build
operations interact with each other. Faulty build scripts may seriously degrade
the reliability of automated builds, as they cause build failures, and
non-deterministic and incorrect build results.
To reason about arbitrary build executions, we present buildfs, a
generally-applicable model that takes into account the specification (as
declared in build scripts) and the actual behavior (low-level file system
operations) of build operations. We then formally define different types of
faults related to incremental and parallel builds in terms of the conditions
under which a file system operation violates the specification of a build
operation. Our testing approach, which relies on the proposed model, analyzes
the execution of a single full build, translates it into buildfs, and uncovers
faults by checking for corresponding violations.
We evaluate the effectiveness, efficiency, and applicability of our approach
by examining hundreds of Make and Gradle projects. Notably, our method is the
first to handle Java-oriented build systems. The results indicate that our
approach is (1) able to uncover several important issues (245 issues found in
45 open-source projects have been confirmed and fixed by the upstream
developers), and (2) orders of magnitude faster than a state-of-the-art tool
for Make builds.
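The core idea behind the fault definitions can be sketched informally: compare what a build task declares against what its file system operations reveal. The Python toy below assumes a simplified task representation and is not the buildfs implementation.

```python
# Toy sketch of a missing-dependency check in the spirit of buildfs:
# flag a task that reads a file produced by another task it does not depend on.
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    declared_deps: set = field(default_factory=set)    # task names from the build script
    observed_reads: set = field(default_factory=set)   # files, from the trace
    observed_writes: set = field(default_factory=set)  # files, from the trace

def missing_dependencies(tasks):
    producers = {}  # file -> name of the task that writes it
    for t in tasks:
        for f in t.observed_writes:
            producers[f] = t.name
    faults = []
    for t in tasks:
        for f in t.observed_reads:
            producer = producers.get(f)
            if producer and producer != t.name and producer not in t.declared_deps:
                faults.append((t.name, f, producer))  # (consumer, file, undeclared producer)
    return faults
```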
Detecting Missing Dependencies and Notifiers in Puppet Programs
Puppet is a popular computer system configuration management tool. It
provides abstractions that enable administrators to set up their computer
systems declaratively. Its use suffers from two potential pitfalls. First, if
ordering constraints are not specified whenever an abstraction depends on
another, the non-deterministic application of abstractions can lead to race
conditions. Second, if a service is not tied to its resources through
notification constructs, the system may operate in a stale state whenever a
resource gets modified. Such faults can degrade a computing infrastructure's
availability and functionality.
We have developed an approach that identifies these issues through the
analysis of a Puppet program and its system call trace. Specifically, we
present a formal model for traces, which allows us to capture the interactions
of Puppet abstractions with the file system. By analyzing these interactions we
identify (1) abstractions that are related to each other (e.g., operate on the
same file), and (2) abstractions that should act as notifiers so that changes
are correctly propagated. We then check the relationships from the trace's
analysis against the program's dependency graph: a representation containing
all the ordering constraints and notifications declared in the program. If a
mismatch is detected, our system reports a potential fault.
We have evaluated our method on a large set of Puppet modules, and discovered
57 previously unknown issues in 30 of them. Benchmarking further shows that our
approach can analyze, in minutes, real-world configurations whose size is
measured in thousands of lines and millions of system calls.
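A hedged Python sketch of the two checks described above, using a deliberately simplified representation of abstractions, ordering constraints, and notifications (the actual approach works on Puppet programs and system call traces):

```python
# Illustrative only: report abstractions that touch the same file without an
# ordering constraint, and services not notified by the resources they share files with.
def check_puppet_trace(file_accesses, ordered_pairs, notify_edges, services):
    """
    file_accesses: dict abstraction -> set of files it reads or writes (from the trace)
    ordered_pairs: set of (before, after) ordering constraints, transitively closed
    notify_edges:  set of (resource, service) notification pairs
    services:      set of abstractions that are services
    """
    issues = []
    names = list(file_accesses)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if file_accesses[a] & file_accesses[b]:          # related abstractions
                if (a, b) not in ordered_pairs and (b, a) not in ordered_pairs:
                    issues.append(("missing ordering", a, b))
    for svc in services:
        for res in names:
            if res != svc and file_accesses[res] & file_accesses[svc]:
                if (res, svc) not in notify_edges:
                    issues.append(("missing notifier", res, svc))
    return issues
```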
Improving the quality of APIs through the analysis of software crash reports
Modern programs depend on APIs to implement a significant part of their functionality. Apart from the way developers use APIs to build their software, the stability of these programs relies on the APIs' design and implementation. In this work, we evaluate the reliability of APIs by examining software telemetry data, in the form of stack traces, coming from Android application crashes. We obtained 4.9 GB of crash data that thousands of applications sent to a centralized crash report management service. We processed that data to extract approximately a million stack traces, stitching together parts of chained exceptions, and established heuristic rules to draw the border between application code and API calls.
We examined 80% of the stack traces to map the space of the most common application failure reasons. Our findings show that the top ones can be attributed to memory exhaustion, race conditions or deadlocks, and missing or corrupt resources. At the same time,
a significant number of our stack traces (over 10%) remain unclassified due to generic unchecked exceptions, which do not highlight the problems that lead to crashes. Finally, given the classes of crash causes we found, we argue that API design and implementation improvements, such as specific exceptions, non-blocking algorithms, and default resources, can eliminate common failures.
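The boundary-drawing and classification heuristics might be sketched as follows; the package prefixes and the exception-to-cause mapping are illustrative assumptions, not the study's actual rules.

```python
# Hedged sketch: split a stack trace into application and API frames by package
# prefix, and bucket the crash cause by exception type.
API_PREFIXES = ("android.", "java.", "javax.", "dalvik.")  # assumed prefixes

CAUSE_BY_EXCEPTION = {  # illustrative mapping, not the study's full taxonomy
    "java.lang.OutOfMemoryError": "memory exhaustion",
    "java.util.ConcurrentModificationException": "race condition",
    "android.content.res.Resources$NotFoundException": "missing resource",
}

def split_frames(frames):
    """frames: list of fully qualified 'package.Class.method' strings."""
    api = [f for f in frames if f.startswith(API_PREFIXES)]
    app = [f for f in frames if not f.startswith(API_PREFIXES)]
    return app, api

def classify(exception_type):
    return CAUSE_BY_EXCEPTION.get(exception_type, "unclassified")
```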
Evaluating Maintainability Prejudices with a Large-Scale Study of Open-Source Projects
Exaggeration or context changes can turn maintainability experience into
prejudice. For example, JavaScript is often seen as the least elegant language and
hence of lowest maintainability. Such prejudice should not guide decisions
without prior empirical validation. We formulated 10 hypotheses about
maintainability based on prejudices and tested them in a large set of open-source
projects (6,897 GitHub repositories, 402 million lines, 5 programming
languages). We operationalize maintainability with five static analysis
metrics. We found that JavaScript code is not worse than other code, Java code
shows higher maintainability than C# code, and C code has longer methods than
other code. The quality of interface documentation is better in Java code than
in other code. Code developed by teams is not of higher maintainability, and
large code bases are not of lower maintainability. Projects with high maintainability are not more
popular or more often forked. Overall, most hypotheses are not supported by
open-source data.
Comment: 20 pages
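The abstract does not name the statistical machinery used to test the hypotheses; purely as an illustration, one operationalized metric could be compared across languages with a non-parametric test such as Mann-Whitney U.

```python
# Illustrative sketch only: compare method lengths between two language samples.
from scipy.stats import mannwhitneyu

def longer_methods(lang_a_lengths, lang_b_lengths, alpha=0.01):
    """True if language A's methods are significantly longer than language B's."""
    _, p = mannwhitneyu(lang_a_lengths, lang_b_lengths, alternative="greater")
    return p < alpha

# e.g. longer_methods(c_method_lengths, java_method_lengths) for the
# "C code has longer methods than other code" hypothesis.
```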
Commands as AI Conversations
Developers and data scientists often struggle to write command-line inputs,
even though graphical interfaces or tools like ChatGPT can assist. The
solution? "ai-cli," an open-source system inspired by GitHub Copilot that
converts natural language prompts into executable commands for various Linux
command-line tools. By tapping into OpenAI's API, which allows interaction
through JSON HTTP requests, "ai-cli" transforms user queries into actionable
command-line instructions. However, integrating AI assistance across multiple
command-line tools, especially in open source settings, can be complex.
Historically, operating systems could mediate, but individual tool
functionality and the lack of a unified approach have made centralized
integration challenging. The "ai-cli" tool, by bridging this gap through
dynamic loading and linking with each program's Readline library API, makes
command-line interfaces smarter and more user-friendly, opening avenues for
further enhancement and cross-platform applicability.
Comment: 5 pages
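A minimal Python sketch of the prompt-to-command step described above; ai-cli itself hooks into each program's Readline library through dynamic loading, so the model name, prompt wording, and use of the chat-completions HTTP endpoint here are illustrative assumptions rather than the tool's implementation.

```python
# Sketch: turn a natural language request into a single shell command via
# OpenAI's JSON HTTP API.
import os
import requests

def natural_language_to_command(prompt: str, tool: str = "bash") -> str:
    response = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-3.5-turbo",  # assumed model name
            "messages": [
                {"role": "system",
                 "content": f"Reply only with a single {tool} command."},
                {"role": "user", "content": prompt},
            ],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"].strip()

# e.g. natural_language_to_command("list files modified in the last hour")
```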
Undocumented and unchecked: Exceptions that spell trouble
Modern programs rely on large application programming interfaces (APIs). The Android framework comprises 231 core APIs and is used by countless developers. We examine a sample of 4,900 distinct crash stack traces from 1,800 different Android applications, looking for Android API methods with undocumented exceptions that are part of application crashes. For the purposes of this study, we take as a reference version 15 of the Android API, which matches our stack traces. Our results show that a significant number of crashes (19%) might have been avoided if these methods had the corresponding exceptions documented as part of their interface.
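The core check can be sketched as a lookup against the documented exceptions of each API method; the data structure below is assumed, and populating it would require parsing the Android API level 15 reference documentation.

```python
# Hedged sketch: flag an exception raised at an API frame that is absent from
# that method's documented exceptions.
documented_throws = {
    # "fully.qualified.Class.method": {"fully.qualified.ExceptionType", ...}
}

def is_undocumented(api_method: str, exception_type: str) -> bool:
    """True if the exception raised at this API frame is not in its docs."""
    return exception_type not in documented_throws.get(api_method, set())
```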
