78 research outputs found

    A dynamic programming algorithm for the optimal control of piecewise deterministic Markov processes

    Get PDF
    Publisher's version/PDF
    A piecewise deterministic Markov process (PDP) is a continuous time Markov process consisting of continuous, deterministic trajectories interrupted by random jumps. The trajectories may be controlled with the object of minimizing the expected costs associated with the process. A method of representing this controlled PDP as a discrete time decision process is presented, allowing the value function for the problem to be expressed as the fixed point of a dynamic programming operator. Decisions take the form of trajectory segments. The expected costs may then be minimized through a dynamic programming algorithm, rather than through the solution of the Bellman–Hamilton–Jacobi equation, assuming the trajectory segments are numerically tractable. The technique is applied to the optimal capacity expansion problem, that is, the problem of planning the construction of new production facilities to meet rising demand.
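    The fixed-point formulation above lends itself to ordinary value iteration once trajectory segments and post-jump states have been discretised. The sketch below is only an illustration of that idea, not the paper's algorithm; segment_cost, transition, the discount factor and the discretisation are all hypothetical stand-ins.

```python
# Minimal value-iteration sketch over a hypothetical discretisation: the value
# function is computed as the fixed point of a dynamic programming operator,
# with each "action" standing in for a candidate trajectory segment
# (segment cost plus a post-jump state distribution).
import numpy as np

def value_iteration(segment_cost, transition, discount=0.95, tol=1e-8, max_iter=10_000):
    """segment_cost[s, a]: expected cost of following segment a from state s.
    transition[s, a, s']: probability of landing in state s' after the jump."""
    n_states, n_actions = segment_cost.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman operator: segment cost plus discounted continuation value.
        Q = segment_cost + discount * transition @ V
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    policy = Q.argmin(axis=1)   # best trajectory segment from each state
    return V, policy
```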

    Testing Sparsity Assumptions in Bayesian Networks

    Full text link
    Bayesian network (BN) structure discovery algorithms typically either make assumptions about the sparsity of the true underlying network, or are limited by computational constraints to networks with a small number of variables. While these sparsity assumptions can take various forms, frequently the assumptions focus on an upper bound for the maximum in-degree of the underlying graph G, denoted ∇_G. Theorem 2 in Duttweiler et al. (2023) demonstrates that the largest eigenvalue of the normalized inverse covariance matrix (Ω) of a linear BN is a lower bound for ∇_G. Building on this result, this paper provides the asymptotic properties of, and a debiasing procedure for, the sample eigenvalues of Ω, leading to a hypothesis test that may be used to determine if the BN has max in-degree greater than 1. A linear BN structure discovery workflow is suggested in which the investigator uses this hypothesis test to aid in selecting an appropriate structure discovery algorithm. The hypothesis test performance is evaluated through simulations and the workflow is demonstrated on data from a human psoriasis study.
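    A minimal sketch of the eigenvalue statistic described above, assuming "normalized" means scaling the sample inverse covariance matrix to unit diagonal; the paper's debiasing procedure and the null distribution of the test are not reproduced here.

```python
# Illustrative computation only: estimate the inverse covariance matrix,
# normalise it to unit diagonal, and take its largest eigenvalue, which
# (per the cited Theorem 2) lower-bounds the max in-degree of G.
import numpy as np

def largest_omega_eigenvalue(X):
    """X: n-by-p data matrix of observations from a linear Bayesian network."""
    S = np.cov(X, rowvar=False)             # sample covariance
    precision = np.linalg.inv(S)            # sample inverse covariance
    d = np.sqrt(np.diag(precision))
    omega = precision / np.outer(d, d)      # scaled to unit diagonal (assumed normalisation)
    return np.linalg.eigvalsh(omega).max()  # largest sample eigenvalue of Omega
```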

    Statistical evaluation of improvement in RNA secondary structure prediction

    Get PDF
    With the discovery of diverse roles for RNA, its centrality in cellular functions has become increasingly apparent. A number of algorithms have been developed to predict RNA secondary structure. Their performance has been benchmarked by comparing structure predictions to reference secondary structures. Generally, algorithms are compared against each other and one is selected as best without statistical testing to determine whether the improvement is significant. In this work, it is demonstrated that the prediction accuracies of methods correlate with each other over sets of sequences. One possible reason for this correlation is that many algorithms use the same underlying principles. A set of benchmarks published previously for programs that predict a structure common to three or more sequences is statistically analyzed as an example to show that it can be rigorously evaluated using paired two-sample t-tests. Finally, a pipeline of statistical analyses is proposed to guide the choice of data set size and performance assessment for benchmarks of structure prediction. The pipeline is applied using 5S rRNA sequences as an example.
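    A minimal sketch of the paired comparison advocated above, using scipy's paired t-test on per-sequence accuracies of two hypothetical methods; the numbers below are placeholders, not benchmark results.

```python
# Compare two prediction methods on the same set of sequences with a paired t-test.
import numpy as np
from scipy import stats

# Placeholder per-sequence accuracies (e.g. sensitivity or PPV) for two methods.
accuracy_method_a = np.array([0.72, 0.65, 0.80, 0.58, 0.91])
accuracy_method_b = np.array([0.70, 0.61, 0.78, 0.60, 0.88])

t_stat, p_value = stats.ttest_rel(accuracy_method_a, accuracy_method_b)
print(f"paired t = {t_stat:.3f}, p = {p_value:.3f}")
```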

    Selection of Statistical Thresholds in Graphical Models

    No full text
    Reconstruction of gene regulatory networks based on experimental data usually relies on statistical evidence, necessitating the choice of a statistical threshold which defines a significant biological effect. Approaches to this problem found in the literature range from rigorous multiple testing procedures to ad hoc P-value cut-off points. However, when the data implies graphical structure, it should be possible to exploit this feature in the threshold selection process. In this article we propose a procedure based on this principle. Using coding theory we devise a measure of graphical structure, for example, highly connected nodes or chain structure. The measure for a particular graph can be compared to that of a random graph and structure inferred on that basis. By varying the statistical threshold the maximum deviation from random structure can be estimated, and the threshold is then chosen on that basis. A global test for graph structure follows naturally.
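    A rough sketch of the threshold-selection loop described above. The paper's coding-theoretic structure measure is not reproduced; structure_score below is a crude placeholder (variance of node degrees), and the candidate cut-offs are arbitrary.

```python
# For each candidate p-value cut-off, build the graph of significant edges,
# compare a structure score against the average score of random graphs with
# the same numbers of nodes and edges, and keep the cut-off whose graph
# deviates most from the random baseline.
import numpy as np

rng = np.random.default_rng(0)

def structure_score(adj):
    # Placeholder structure measure: variance of node degrees.
    return np.var(adj.sum(axis=1))

def random_baseline(n_nodes, n_edges, reps=200):
    iu = np.triu_indices(n_nodes, k=1)
    scores = []
    for _ in range(reps):
        adj = np.zeros((n_nodes, n_nodes), dtype=int)
        idx = rng.choice(iu[0].size, size=n_edges, replace=False)
        adj[iu[0][idx], iu[1][idx]] = 1
        scores.append(structure_score(adj + adj.T))
    return np.mean(scores)

def select_threshold(pvals, candidates=(0.001, 0.01, 0.05)):
    """pvals: symmetric matrix of edge-wise p-values."""
    n = pvals.shape[0]
    best, best_dev = None, -np.inf
    for t in candidates:
        adj = (pvals < t).astype(int)
        np.fill_diagonal(adj, 0)
        m = adj.sum() // 2
        if m == 0:
            continue
        dev = structure_score(adj) - random_baseline(n, m)
        if dev > best_dev:
            best, best_dev = t, dev
    return best
```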

    An information theoretic approach to pedigree reconstruction

    Full text link

    A NOTE ON THE CALCULATION OF N-STATISTICS

    Full text link
    A class of statistics suitable for testing against equality of multivariate distributions is described by Klebanov and co-workers in 2007. Referred to as N-statistics, their discriminating ability is based on various forms of distance kernels in ℝ^d, the intention being to capture distinct forms of deviation from equality. This makes them particularly suitable for large-scale genomic screening applications, in which such variety of alternatives can be anticipated. One of these kernels, denoted as L4, introduces weighting by directional densities, hence the evaluation of L4 requires integration on the unit sphere in ℝ^d. In this note we introduce a methodology for the evaluation of integrals related to L4. It is shown that for a class of directional densities including, but not limited to, the uniform density, L4 reduces to Euclidean distance. For other cases, the methodology permits a direct interpretation of L4 in terms of the directional weighting.
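    For concreteness, the Euclidean special case mentioned above (L4 under a uniform directional density) corresponds to the familiar energy-distance form of an N-statistic. The sketch below computes that statistic for two samples; it does not implement the directional-density weighting itself.

```python
# Two-sample N-statistic with the Euclidean distance kernel (energy-distance form).
import numpy as np
from scipy.spatial.distance import cdist

def n_statistic_euclidean(X, Y):
    """X: n-by-d sample, Y: m-by-d sample; larger values suggest unequal distributions."""
    between = cdist(X, Y).mean()   # mean cross-sample distance
    within_x = cdist(X, X).mean()  # mean within-sample distance, sample X
    within_y = cdist(Y, Y).mean()  # mean within-sample distance, sample Y
    return 2.0 * between - within_x - within_y
```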

    A Regulatory Principle for Robust Reciprocal-Time Decay of the Adaptive Immune Response

    No full text

    A Simplification of the Calculation of the Joint Genotype Distribution in Two Noninbred Individuals

    No full text
    In this paper we consider the special case of four genes from two individuals, m_1, p_1, m_2, p_2, where m_i and p_i are the maternally and paternally inherited genes of individual i = 1, 2. We assume that neither individual is inbred. In this context, the probabilistic consequence of this assumption is that for any a ∈ A at most one of m_i and p_i can assume the value a for each i = 1, 2; hence at most one of P(m_i = a) and P(p_i = a) can be nonzero. Lemma 1: If individuals 1 and 2 are noninbred, then P({m_1 = m_2} ∩ {p_1 = p_2}) = P(m_1 = m_2) × P(p_1 = p_2) (2.8).
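    A purely illustrative application of the factorisation (2.8), with made-up identity probabilities chosen only to show how the lemma is used:

```latex
% Hypothetical values, for illustration only:
% suppose P(m_1 = m_2) = 0.25 and P(p_1 = p_2) = 0.5 under some allele model.
\[
  P\bigl(\{m_1 = m_2\} \cap \{p_1 = p_2\}\bigr)
    = P(m_1 = m_2)\, P(p_1 = p_2)
    = 0.25 \times 0.5
    = 0.125 .
\]
```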