    MP 2009-08

    Revisiting Visual Question Answering Baselines

    Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms designed to support "reasoning". For multiple-choice VQA, nearly all of these systems train a multi-class classifier on image and question features to predict an answer. This paper questions the value of these common practices and develops a simple alternative model based on binary classification. Instead of treating answers as competing choices, our model receives the answer as input and predicts whether or not an image-question-answer triplet is correct. We evaluate our model on the Visual7W Telling and the VQA Real Multiple Choice tasks, and find that even simple versions of our model perform competitively. Our best model achieves state-of-the-art performance on the Visual7W Telling task and compares surprisingly well with the most complex systems proposed for the VQA Real Multiple Choice task. We explore variants of the model and study its transferability between both datasets. We also present an error analysis of our model that suggests a key problem of current VQA systems lies in the lack of visual grounding of concepts that occur in the questions and answers. Overall, our results suggest that the performance of current VQA systems is not significantly better than that of systems designed to exploit dataset biases.
    Comment: European Conference on Computer Visio
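
The binary-classification idea in this abstract (score each image-question-answer triplet, then pick the highest-scoring candidate) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `score_triplet` and `pick_answer`, the toy feature vectors, the hand-set weights, and the single linear-plus-sigmoid scorer are hypothetical and are not taken from the paper.

```python
import math

def score_triplet(image_feat, question_feat, answer_feat, weights, bias=0.0):
    """Return a probability-like score that one (image, question, answer) triplet is correct."""
    # Concatenate the three feature lists into one input vector.
    x = image_feat + question_feat + answer_feat
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # logistic sigmoid

def pick_answer(image_feat, question_feat, candidates, weights):
    """Score every candidate answer independently and return (best index, all scores)."""
    scores = [score_triplet(image_feat, question_feat, a, weights) for a in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return best, scores

# Toy, hand-made features and weights (hypothetical values, not learned).
image_feat = [1.0, 0.0]
question_feat = [0.0, 1.0]
candidates = [[1.0, 0.0], [0.0, 1.0]]
weights = [0.5, -0.5, 0.2, 0.1, -1.0, 2.0]

best, scores = pick_answer(image_feat, question_feat, candidates, weights)
```

In this purely linear toy the image and question terms are identical for every candidate, so only the answer features change the ranking; a scorer meant to capture interactions between image, question, and answer would need a non-linear model.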

    AFES Miscellaneous Publication 2008-03

    Bandit Models of Human Behavior: Reward Processing in Mental Disorders

    Drawing inspiration from behavioral studies of human decision making, we propose here a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
    Comment: Conference on Artificial General Intelligence, AGI-1
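
For orientation, the baseline that this abstract extends, standard Thompson Sampling, can be sketched for Bernoulli rewards as follows. This is only the textbook baseline with uniform Beta(1, 1) priors; the paper's parametric reward-processing extension is not reproduced here, and the function name `thompson_sampling`, the arm means, and the seed are assumptions chosen for illustration.

```python
import random

def thompson_sampling(true_means, steps, seed=0):
    """Run Bernoulli Thompson Sampling and return how often each arm was pulled."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    alpha = [1] * n_arms  # Beta posterior parameter: 1 + observed successes
    beta = [1] * n_arms   # Beta posterior parameter: 1 + observed failures
    pulls = [0] * n_arms
    for _ in range(steps):
        # Draw one plausible mean reward per arm from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulate a Bernoulli reward from the (hypothetical) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Hypothetical three-armed problem; the third arm has the highest mean reward.
pulls = thompson_sampling([0.2, 0.5, 0.8], steps=2000)
```

Because arms are chosen by sampling from the posterior rather than by its mean, uncertain arms are still explored early on, while play concentrates on the best arm as evidence accumulates.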

    Colloidal crystal growth at externally imposed nucleation clusters

    We study the conditions under which and how an imposed cluster of fixed colloidal particles at prescribed positions triggers crystal nucleation from a metastable colloidal fluid. Dynamical density functional theory of freezing and Brownian dynamics simulations are applied to a two-dimensional colloidal system with dipolar interactions. The externally imposed nucleation clusters involve colloidal particles either on a rhombic lattice or along two linear arrays separated by a gap. Crystal growth occurs after the peaks of the nucleation cluster have first relaxed to a cutout of the stable bulk crystal.
    Comment: 4 pages, accepted for publication in Phys. Rev. Let

    Phase Diagram of alpha-Helical and beta-Sheet Forming Peptides

    The intrinsic property of proteins to form structural motifs such as alpha-helices and beta-sheets leads to a complex phase behavior in which proteins can assemble into various types of aggregates including crystals, liquidlike phases of unfolded or natively folded proteins, and amyloid fibrils. Here we use a coarse-grained protein model that enables us to perform Monte Carlo simulations for determining the phase diagram of natively folded alpha-helical and unfolded beta-sheet forming peptides. The simulations reveal the existence of various metastable peptide phases. The liquidlike phases are metastable with respect to the fibrillar phases, and there is a hierarchy of metastability

    Hi-Val: Iterative Learning of Hierarchical Value Functions for Policy Generation

    Task decomposition is effective in manifold applications where the global complexity of a problem makes planning and decision-making too demanding. This is true, for example, in high-dimensional robotics domains, where (1) unpredictabilities and modeling limitations typically prevent the manual specification of robust behaviors, and (2) learning an action policy is challenging due to the curse of dimensionality. In this work, we borrow the concept of Hierarchical Task Networks (HTNs) to decompose the learning procedure, and we exploit Upper Confidence Tree (UCT) search to introduce HOP, a novel iterative algorithm for hierarchical optimistic planning with learned value functions. To obtain better generalization and generate policies, HOP simultaneously learns and uses action values. These are used to formalize constraints within the search space and to reduce the dimensionality of the problem. We evaluate our algorithm both on a fetching task using a simulated 7-DOF KUKA lightweight arm and on a pick-and-delivery task with a Pioneer robot

    Optical manipulation of Berry phase in a solid-state spin qubit

    The phase relation between quantum states represents an essential resource for the storage and processing of quantum information. While quantum phases are commonly controlled dynamically by tuning energetic interactions, utilizing geometric phases that accumulate during cyclic evolution may offer superior robustness to noise. To date, demonstrations of geometric phase control in solid-state systems rely on microwave fields that have limited spatial resolution. Here, we demonstrate an all-optical method based on stimulated Raman adiabatic passage to accumulate a geometric phase, the Berry phase, in an individual nitrogen-vacancy (NV) center in diamond. Using diffraction-limited laser light, we guide the NV center's spin along loops on the Bloch sphere to enclose arbitrary Berry phase and characterize these trajectories through time-resolved state tomography. We investigate the limits of this control due to loss of adiabaticity and decoherence, as well as its robustness to noise intentionally introduced into the experimental control parameters, finding its resilience to be independent of the amount of Berry phase enclosed. These techniques set the foundation for optical geometric manipulation in future implementations of photonic networks of solid-state qubits linked and controlled by light.
    Comment: 18 pages, 5 figure

    ELAN as flexible annotation framework for sound and image processing detectors

    Annotation of digital recordings in humanities research is still, to a large extent, a process that is performed manually. This paper describes the first pattern-recognition-based software components developed in the AVATecH project and their integration in the annotation tool ELAN. AVATecH (Advancing Video/Audio Technology in Humanities Research) is a project that involves two Max Planck Institutes (Max Planck Institute for Psycholinguistics, Nijmegen; Max Planck Institute for Social Anthropology, Halle) and two Fraunhofer Institutes (Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS, Sankt Augustin; Fraunhofer Heinrich-Hertz-Institute, Berlin) and that aims to develop and implement audio and video technology for semi-automatic annotation of heterogeneous media collections as they occur in multimedia-based research. The highly diverse nature of the digital recordings stored in the archives of both Max Planck Institutes poses a huge challenge to most of the existing pattern recognition solutions and is a motivation to make such technology available to researchers in the humanities