Revisiting Visual Question Answering Baselines
Visual question answering (VQA) is an interesting learning setting for
evaluating the abilities and shortcomings of current systems for image
understanding. Many of the recently proposed VQA systems include attention or
memory mechanisms designed to support "reasoning". For multiple-choice VQA,
nearly all of these systems train a multi-class classifier on image and
question features to predict an answer. This paper questions the value of these
common practices and develops a simple alternative model based on binary
classification. Instead of treating answers as competing choices, our model
receives the answer as input and predicts whether or not an
image-question-answer triplet is correct. We evaluate our model on the Visual7W
Telling and the VQA Real Multiple Choice tasks, and find that even simple
versions of our model perform competitively. Our best model achieves
state-of-the-art performance on the Visual7W Telling task and compares
surprisingly well with the most complex systems proposed for the VQA Real
Multiple Choice task. We explore variants of the model and study its
transferability between both datasets. We also present an error analysis of our
model that suggests a key problem of current VQA systems lies in the lack of
visual grounding of concepts that occur in the questions and answers. Overall,
our results suggest that the performance of current VQA systems is not
significantly better than that of systems designed to exploit dataset biases.
Comment: European Conference on Computer Visio
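The binary formulation in this abstract (score each image-question-answer triplet, then choose the highest-scoring candidate) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the feature vectors are plain lists, the scorer is a single logistic unit rather than the authors' model, and the names `score_triplet` and `pick_answer` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_triplet(image_feat, question_feat, answer_feat, weights, bias=0.0):
    """Return an estimated probability that one triplet is correct.

    The answer features are part of the input, so the same scorer is
    reused for every candidate instead of one output unit per answer.
    """
    x = list(image_feat) + list(question_feat) + list(answer_feat)
    if len(x) != len(weights):
        raise ValueError("weights must match the concatenated feature length")
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def pick_answer(image_feat, question_feat, candidate_answers, weights, bias=0.0):
    """Score every candidate answer independently and return the best index."""
    scores = [
        score_triplet(image_feat, question_feat, a, weights, bias)
        for a in candidate_answers
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Toy 1-dimensional features; the weights are illustrative, not learned.
    best, scores = pick_answer([0.0], [0.0], [[-1.0], [2.0], [0.5]], [0.0, 0.0, 1.0])
    print(best, scores)
```

In this sketch the candidates never compete inside a softmax; each triplet receives its own probability, and the multiple-choice decision is only the final argmax over those scores.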
Bandit Models of Human Behavior: Reward Processing in Mental Disorders
Drawing inspiration from behavioral studies of human decision making, we
propose here a general parametric framework for the multi-armed bandit problem,
which extends the standard Thompson Sampling approach to incorporate reward
processing biases associated with several neurological and psychiatric
conditions, including Parkinson's and Alzheimer's diseases,
attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain.
We demonstrate empirically that the proposed parametric approach can often
outperform the baseline Thompson Sampling on a variety of datasets. Moreover,
from the behavioral modeling perspective, our parametric framework can be
viewed as a first step towards a unifying computational model capturing reward
processing abnormalities across multiple mental conditions.
Comment: Conference on Artificial General Intelligence, AGI-1
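For orientation, the baseline that this abstract extends, standard Thompson Sampling, can be sketched for Bernoulli rewards as follows. This shows only the textbook Beta-Bernoulli baseline; the reward-processing parameters proposed in the paper are not described in the abstract and are not modeled here. The function name, the uniform Beta(1, 1) prior, and the simulated arm probabilities are assumptions for illustration.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling on simulated arms.

    true_probs: hypothetical success probability of each arm (simulation only).
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [1] * k  # Beta(1, 1) prior: alpha parameter per arm
    failures = [1] * k   # Beta(1, 1) prior: beta parameter per arm
    pulls = [0] * k
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)

        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Posterior update: count the observed success or failure.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print(pulls, total)
```

A parametric variant in the spirit of the abstract would presumably change how observed rewards enter the posterior update, but the exact form is given in the paper, not here.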
Colloidal crystal growth at externally imposed nucleation clusters
We study the conditions under which and how an imposed cluster of fixed
colloidal particles at prescribed positions triggers crystal nucleation from a
metastable colloidal fluid. Dynamical density functional theory of freezing and
Brownian dynamics simulations are applied to a two-dimensional colloidal system
with dipolar interactions. The externally imposed nucleation clusters involve
colloidal particles either on a rhombic lattice or along two linear arrays
separated by a gap. Crystal growth occurs after the peaks of the nucleation
cluster have first relaxed to a cutout of the stable bulk crystal.
Comment: 4 pages, accepted for publication in Phys. Rev. Let
Phase Diagram of alpha-Helical and beta-Sheet Forming Peptides
The intrinsic property of proteins to form structural motifs such as
alpha-helices and beta-sheets leads to a complex phase behavior in which
proteins can assemble into various types of aggregates including crystals,
liquidlike phases of unfolded or natively folded proteins, and amyloid fibrils.
Here we use a coarse-grained protein model that enables us to perform Monte
Carlo simulations for determining the phase diagram of natively folded
alpha-helical and unfolded beta-sheet forming peptides. The simulations reveal
the existence of various metastable peptide phases. The liquidlike phases are
metastable with respect to the fibrillar phases, and there is a hierarchy of
metastability
Hi-Val: Iterative Learning of Hierarchical Value Functions for Policy Generation
Task decomposition is effective in manifold applications where the global complexity of a problem makes planning and decision-making too demanding. This is true, for example, in high-dimensional robotics domains, where (1) unpredictabilities and modeling limitations typically prevent the manual specification of robust behaviors, and (2) learning an action policy is challenging due to the curse of dimensionality. In this work, we borrow the concept of Hierarchical Task Networks (HTNs) to decompose the learning procedure, and we exploit Upper Confidence Tree (UCT) search to introduce HOP, a novel iterative algorithm for hierarchical optimistic planning with learned value functions. To obtain better generalization and generate policies, HOP simultaneously learns and uses action values. These are used to formalize constraints within the search space and to reduce the dimensionality of the problem. We evaluate our algorithm both on a fetching task using a simulated 7-DOF KUKA lightweight arm and on a pick-and-delivery task with a Pioneer robot
Optical manipulation of Berry phase in a solid-state spin qubit
The phase relation between quantum states represents an essential resource
for the storage and processing of quantum information. While quantum phases are
commonly controlled dynamically by tuning energetic interactions, utilizing
geometric phases that accumulate during cyclic evolution may offer superior
robustness to noise. To date, demonstrations of geometric phase control in
solid-state systems rely on microwave fields that have limited spatial
resolution. Here, we demonstrate an all-optical method based on stimulated
Raman adiabatic passage to accumulate a geometric phase, the Berry phase, in an
individual nitrogen-vacancy (NV) center in diamond. Using diffraction-limited
laser light, we guide the NV center's spin along loops on the Bloch sphere to
enclose arbitrary Berry phase and characterize these trajectories through
time-resolved state tomography. We investigate the limits of this control due
to loss of adiabaticity and decoherence, as well as its robustness to noise
intentionally introduced into the experimental control parameters, finding its
resilience to be independent of the amount of Berry phase enclosed. These
techniques set the foundation for optical geometric manipulation in future
implementations of photonic networks of solid state qubits linked and
controlled by light.
Comment: 18 pages, 5 figure
ELAN as flexible annotation framework for sound and image processing detectors
Annotation of digital recordings in humanities research still is, to a large extent, a process that is performed manually. This paper describes the first pattern recognition based software components developed in the AVATecH project and their integration in the annotation tool ELAN. AVATecH (Advancing Video/Audio Technology in Humanities Research) is a project that involves two Max Planck Institutes (Max Planck Institute for Psycholinguistics, Nijmegen; Max Planck Institute for Social Anthropology, Halle) and two Fraunhofer Institutes (Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS, Sankt Augustin; Fraunhofer Heinrich-Hertz-Institute, Berlin) and that aims to develop and implement audio and video technology for semi-automatic annotation of heterogeneous media collections as they occur in multimedia based research. The highly diverse nature of the digital recordings stored in the archives of both Max Planck Institutes poses a huge challenge to most of the existing pattern recognition solutions and is a motivation to make such technology available to researchers in the humanities
