
    Smoothing Policies and Safe Policy Gradients

    Policy gradient algorithms are among the best candidates for the much anticipated application of reinforcement learning to real-world control tasks, such as the ones arising in robotics. However, the trial-and-error nature of these methods introduces safety issues whenever the learning phase itself must be performed on a physical system. In this paper, we address a specific safety formulation, where danger is encoded in the reward signal and the learning agent is constrained to never worsen its performance. By studying actor-only policy gradient from a stochastic optimization perspective, we establish improvement guarantees for a wide class of parametric policies, generalizing existing results on Gaussian policies. This, together with novel upper bounds on the variance of policy gradient estimators, allows us to identify those meta-parameter schedules that guarantee monotonic improvement with high probability. The two key meta-parameters are the step size of the parameter updates and the batch size of the gradient estimators. By a joint, adaptive selection of these meta-parameters, we obtain a safe policy gradient algorithm.

    Portfolio of compositions

    This text contains a short general description of my experience at the University of Birmingham. It is an attempt to communicate how my perception of composing music has changed and evolved. The focus is to briefly introduce my experience before I arrived at the University of Birmingham, then go through all the compositions I have worked on during my PhD programme. The aim is to explain the main processes I have used for composing, giving a wider view of the issues that I was interested in developing. Furthermore, I will consider some technical aspects with reference to the facilities that the University of Birmingham offers to students. This appears to be the right opportunity for them to explore technology almost without any restrictions. I also give some information about other nonmusical issues, which I was interested in developing in order to look into personal aesthetic directions. My main reason for being at the University of Birmingham was to explore compositional processes different from my previous experiences, in order to enlarge my abilities and perspectives in music composition.

    Heavy Quark Production: Theory vs. Experiment

    The current status of the comparisons between some experimental results and theoretical predictions for heavy quark production is reviewed. It is shown that the combination of new theoretical tools and better experimental input allows for a good description of charm, bottom and top hadroproduction, with no significant discrepancies between theory and experiment. Theoretical progress in the resummation of large logarithms and inclusion of power corrections for the heavy quark fragmentation function is also discussed. Comment: 6 pages, LaTeX, Talk given at the Workshop on High Energy Physics IFAE 2003, Lecce, Italy, 23-26 April 2003

    Regularity results and Harnack inequalities for minimizers and solutions of nonlocal problems: a unified approach via fractional De Giorgi classes

    We study energy functionals obtained by adding a possibly discontinuous potential to an interaction term modeled upon a Gagliardo-type fractional seminorm. We prove that minimizers of such non-differentiable functionals are locally bounded, Hölder continuous, and that they satisfy a suitable Harnack inequality. Hence, we provide an extension of celebrated results of M. Giaquinta and E. Giusti to the nonlocal setting. To do this, we introduce a particular class of fractional Sobolev functions, reminiscent of that considered by E. De Giorgi in his seminal paper of 1957. The flexibility of these classes allows us to also establish regularity of solutions to rather general nonlinear integral equations. Comment: 59 pages

    Combinatorially two-orbit convex polytopes

    Any convex polytope whose combinatorial automorphism group has two orbits on the flags is isomorphic to one whose group of Euclidean symmetries has two orbits on the flags (equivalently, to one whose automorphism group and symmetry group coincide). Hence, a combinatorially two-orbit convex polytope is isomorphic to one of a known finite list, all of which are 3-dimensional: the cuboctahedron, icosidodecahedron, rhombic dodecahedron, or rhombic triacontahedron. The same is true of combinatorially two-orbit normal face-to-face tilings by convex polytopes. Comment: 20 pages

    Elliptic Genus Derivation of 4d Holomorphic Blocks

    We study elliptic vortices on $\mathbb{C}\times T^2$ by considering the 2d quiver gauge theory describing their moduli spaces. The elliptic genus of these moduli spaces is the elliptic version of vortex partition function of the 4d theory. We focus on two examples: the first is a $\mathcal{N}=1$, $\mathrm{U}(N)$ gauge theory with fundamental and anti-fundamental matter; the second is a $\mathcal{N}=2$, $\mathrm{U}(N)$ gauge theory with matter in the fundamental representation. The results are instances of 4d "holomorphic blocks" into which partition functions on more complicated surfaces factorize. They can also be interpreted as free-field representations of elliptic Virasoro algebrae. Comment: 15 pages, 2 figures