    Combining Experience Replay with Exploration by Random Network Distillation

    Our work is a simple extension of the paper "Exploration by Random Network Distillation". More specifically, we show how to combine intrinsic rewards with experience replay to achieve more efficient and robust exploration than PPO/RND and, consequently, better agent performance and sample efficiency. We do this with a new technique named Prioritized Oversampled Experience Replay (POER), which is built upon a definition of which experience is important and useful to replay. Finally, we evaluate our technique on the famous Atari game Montezuma's Revenge and some other hard-exploration Atari games. Comment: 8 pages, 6 figures, accepted as full-paper at IEEE Conference on Games (CoG) 201
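    The idea of an intrinsic reward in Random Network Distillation can be sketched as follows: a predictor is trained to match a fixed, randomly initialised target network, and the remaining prediction error serves as a novelty bonus. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `LinearRND`, the helper `sample_by_priority`, the linear networks and all sizes are hypothetical, and the proportional sampling shown is a generic stand-in, not the POER rule, which the abstract does not specify.

```python
import numpy as np


class LinearRND:
    """Toy RND module with linear networks (an assumption for brevity)."""

    def __init__(self, obs_dim, feat_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random target network: never trained.
        self.w_target = rng.normal(size=(obs_dim, feat_dim))
        # Predictor network: trained to imitate the target on visited states.
        self.w_pred = np.zeros((obs_dim, feat_dim))

    def intrinsic_reward(self, obs):
        """Mean squared prediction error per observation (novelty bonus)."""
        err = obs @ self.w_pred - obs @ self.w_target
        return (err ** 2).mean(axis=-1)

    def update(self, obs, lr=0.5):
        """One gradient-descent step on the batch mean squared error."""
        err = obs @ self.w_pred - obs @ self.w_target
        self.w_pred -= lr * 2.0 * obs.T @ err / err.size


def sample_by_priority(priorities, batch_size, rng):
    """Hypothetical replay sampling: probability proportional to priority."""
    p = priorities + 1e-8          # avoid an all-zero distribution
    p = p / p.sum()
    return rng.choice(len(priorities), size=batch_size, p=p)
```

    Under these assumptions, observations similar to those already used in `update` receive a small bonus, while unfamiliar ones keep a large bonus and are therefore drawn more often by `sample_by_priority`.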

    Dolcher fixed point theorem and its connections with recent developments on compressive/expansive maps

    Elisa Sovrano and Fabio Zanolin, "Dolcher fixed point theorem and its connections with recent developments on compressive/expansive maps", in: Rendiconti dell’Istituto di Matematica dell’Università di Trieste. An International Journal of Mathematics, 46 (2014), pp. 101-121. In 1948 Mario Dolcher proposed an expansive version of the Brouwer fixed point theorem for planar maps. In this article we reconsider Dolcher's result in connection with some properties, such as covering relations, which appear in the study of chaotic dynamics.

    Reorientation ability in redtail splitfin (Xenotoca eiseni): Role of environmental shape, rearing in group and exposure time

    When passively disoriented in an enclosed space, animals use the geometry of the environment (angular cues and metrically distinct surfaces) to find a position. Whether the ability to deal with geometry is a mechanism available at birth, with little influence of previous experience with the same kind of information, is still debated. We reared fish (Xenotoca eiseni) in tanks of different shapes (circular or rectangular), either singly or in groups, and tested them at different ages (at one week or at one, five or ten months). Fish were trained to reorient in an enclosure with a distinctive geometry (a rectangular arena) and a blue wall providing non-geometric, featural information. They were then tested after an affine transformation that created a conflict between the geometric and non-geometric information learned during training. We found that all fish, from one week of age, relied significantly more on the geometry of the enclosure for reorientation, regardless of their experience in circular or rectangular tanks. At one month of age, we observed a modulatory effect of rearing experience during learning, with an advantage for individuals reared singly in rectangular cages, but no difference was evident at test. Furthermore, this effect on learning propensity disappeared later in development, i.e., when fish were trained at five or ten months of age. These results confirm that the use of geometric information provided by the shape of an enclosure is spontaneous and inborn, and that a modulatory effect of experience can appear briefly during ontogeny, but experience is not essential for dealing with geometry.

    Remarks on Dirichlet problems with sublinear growth at infinity

    We present some existence and multiplicity results for positive solutions to the Dirichlet problem associated with; under suitable conditions on the nonlinearity $g(u)$ and the weight function $a(x)$. The assumptions considered are related to classical theorems about positive solutions to a sublinear elliptic equation due to Brezis-Oswald and Brown-Hess.

    Navigation as a Source of Geometric Knowledge: Young Children’s Use of Length, Angle, Distance, and Direction in a Reorientation Task

    Geometry is one of the highest achievements of our species, but its foundations are obscure. Consistent with longstanding suggestions that geometrical knowledge is rooted in processes guiding navigation, the present study examines potential sources of geometrical knowledge in the navigation processes by which young children establish their sense of orientation. Past research reveals that children reorient both by the shape of the surface layout and the shapes of distinctive landmarks, but it fails to clarify what shape properties children use. The present study explores 2-year-old children’s sensitivity to angle, length, distance and direction by testing disoriented children’s search in a variety of fragmented rhombic and rectangular environments. Children reoriented themselves in accord with surface distances and directions, but they failed to use surface lengths or corner angles either for directional reorientation or as local landmarks. Thus, children navigate by some but not all of the abstract properties captured by formal Euclidean geometry. While navigation systems may contribute to children’s developing geometric understanding, they likely are not the sole source of abstract geometric intuitions.

    Nonlinear differential equations having non-sign-definite weights

    In the present PhD thesis we deal with the study of the existence, multiplicity and complex behaviors of solutions for some classes of boundary value problems associated with second order nonlinear ordinary differential equations of the form $u''+f(u)u'+g(t,u)=s$ or $u''+g(t,u)=0$, $t\in I$, where $I$ is a bounded interval, $f\colon\mathbb{R}\to\mathbb{R}$ is continuous, $s\in\mathbb{R}$ and $g\colon I\times\mathbb{R}\to\mathbb{R}$ is a perturbation term characterizing the problems. The results carried out in this dissertation are mainly based on dynamical and topological approaches. The issues we address have arisen in the field of partial differential equations. For this reason, we do not treat only the case of ordinary differential equations, but we also take advantage of some results achieved in the one-dimensional setting to give applications to nonlinear boundary value problems associated with partial differential equations. In the first part of the thesis, we are interested in a problem suggested by Antonio Ambrosetti in "Observations on global inversion theorems" (2011). In more detail, we deal with a periodic boundary value problem associated with the first differential equation, where the perturbation term is given by $g(t,u):=a(t)\phi(u)-p(t)$. We assume that $a, p\in L^{\infty}(I)$ and that $\phi\colon\mathbb{R}\to\mathbb{R}$ is a continuous function satisfying $\lim_{|\xi|\to\infty}\phi(\xi)=+\infty$. In this context, if the weight term $a(t)$ is such that $a(t)\geq 0$ for a.e. $t\in I$ and $\int_{I}a(t)\,dt>0$, we generalize the result of multiplicity of solutions given by Fabry, Mawhin and Nkashama in "A multiplicity result for periodic solutions of forced nonlinear second order ordinary differential equations" (1986). We extend this kind of improvement also to more general nonlinear terms under local coercivity conditions.
    In this framework, we also treat in the same spirit Neumann problems associated with second order ordinary differential equations and periodic problems associated with first order ones. Furthermore, we face the classical case of a periodic Ambrosetti-Prodi problem with a weight term $a(t)$ which is constant and positive. Here, considering in the second differential equation a nonlinearity $g(t,u):=\phi(u)-h(t)$, we provide several conditions on the nonlinearity and the perturbative term that ensure the presence of complex behaviors for the solutions of the associated $T$-periodic problem. We also compare these outcomes with the result of stability carried out by Ortega in "Stability of a periodic problem of Ambrosetti-Prodi type" (1990). The case with damping term is discussed as well. In the second part of this work, we solve a conjecture by Yuan Lou and Thomas Nagylaki stated in "A semilinear parabolic system for migration and selection in population genetics" (2002). The problem refers to the number of positive solutions for Neumann boundary value problems associated with the second differential equation when the perturbation term is given by $g(t,u):=\lambda w(t)\psi(u)$ with $\lambda>0$, $w\in L^{\infty}(I)$ a sign-changing weight term such that $\int_{I}w(t)\,dt<0$ and $\psi\colon[0,1]\to[0,\infty[$ a non-concave continuous function satisfying $\psi(0)=0=\psi(1)$ and such that the map $\xi\mapsto\psi(\xi)/\xi$ is monotone decreasing. In addition to this outcome, other new results of multiplicity of positive solutions are presented as well, for both Neumann and Dirichlet boundary value problems, by means of a particular choice of indefinite weight terms $w(t)$ and different positive nonlinear terms $\psi(u)$ defined on the interval $[0,1]$ or on the positive real semi-axis $[0,+\infty[$.

    Deep Reinforcement Learning and sub-problem decomposition using Hierarchical Architectures in partially observable environments

    Reinforcement Learning (RL) is based on the Markov Decision Process (MDP) framework, but not all problems of interest can be modeled with MDPs, because some of them have non-Markovian temporal dependencies. To handle them, one of the solutions proposed in the literature is Hierarchical Reinforcement Learning (HRL). HRL takes inspiration from hierarchical planning in the artificial intelligence literature and is an emerging sub-discipline of RL, in which RL methods are augmented with some kind of prior knowledge about the high-level structure of behavior in order to decompose the underlying problem into simpler sub-problems. The high-level goal of our thesis is to investigate the advantages that an HRL approach may have over a simple RL approach. Thus, we study problems of interest (rarely tackled by means of RL) such as Sentiment Analysis, Rogue and Car Controller, showing how the ability of RL algorithms to solve them in a partially observable environment is affected by using (or not using) generic hierarchical architectures based on RL algorithms of the Actor-Critic family. Remarkably, we claim that our work in Sentiment Analysis in particular is very innovative for RL, resulting in state-of-the-art performance; as far as the author knows, Reinforcement Learning is only rarely applied to the domain of computational linguistics and sentiment analysis. Furthermore, our work on the famous video game Rogue is probably the first example of a Deep RL architecture able to explore Rogue dungeons and fight against its monsters, achieving a success rate of more than 75% on the first game level. Finally, our work on Car Controller allowed us to make some interesting considerations on the nature of some components of the policy gradient equation.

    Wavefronts for a degenerate reaction-diffusion system with application to bacterial growth models

    We investigate wavefront solutions in a nonlinear system of two coupled reaction-diffusion equations with degenerate diffusivity: $n_t = n_{xx} - nb, \quad b_t = [D n b b_x]_x + nb$, where $t\geq 0$, $x\in\mathbb{R}$, and $D$ is a positive diffusion coefficient. This model, introduced by Kawasaki et al. (J. Theor. Biol. 188, 1997), describes the spatial-temporal dynamics of bacterial colonies $b=b(x,t)$ and nutrients $n=n(x,t)$ on agar plates. Kawasaki et al. provided numerical evidence for wavefronts, leaving the analytical confirmation of these solutions an open problem. We prove the existence of an infinite family of wavefronts parameterized by their wave speed, which varies on a closed positive half-line. We provide an upper bound for the threshold speed and a lower bound for it when $D$ is sufficiently large. The proofs are based on several analytical tools, including the shooting method and the fixed-point theory in Fréchet spaces, to establish existence, and the central manifold theorem to ascertain uniqueness.
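    To make the notion of a wavefront concrete, the reduction below assumes the usual traveling-wave ansatz with speed $c>0$ and profiles $N$, $B$. The symbols $N$, $B$, $c$, $\xi$ and the direction of propagation are assumptions introduced here for illustration; the abstract does not state them. Substituting into the system gives a pair of ordinary differential equations in $\xi$, with primes denoting $d/d\xi$:

```latex
\begin{align*}
n(x,t) &= N(\xi), \quad b(x,t) = B(\xi), \quad \xi = x - ct, \\
-c\,N' &= N'' - N B, \\
-c\,B' &= \bigl[D\,N B\,B'\bigr]' + N B.
\end{align*}
```

    Under this assumption, each admissible wave speed $c$ corresponds to one member of the family of wavefronts mentioned in the abstract.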

    How to Quantify the Degree of Explainability: Experiments and Practical Implications

    Explainable AI was born as a pathway to allow humans to explore and understand the inner workings of complex systems. However, establishing what an explanation is and objectively evaluating explainability are not trivial tasks. In this paper, we present a new model-agnostic metric to measure the Degree of Explainability of (correct) information in an objective way, exploiting a specific theoretical model from Ordinary Language Philosophy called Achinstein’s Theory of Explanations, implemented with an algorithm relying on deep language models for knowledge graph extraction and information retrieval. In order to understand whether this metric actually behaves as explainability is expected to, we devised an experiment on two realistic Explainable AI-based systems for healthcare and finance, using well-known AI technology including Artificial Neural Networks and TreeSHAP. The results we obtained suggest that our proposed metric for measuring the Degree of Explainability is robust in several scenarios.