78 research outputs found
A dynamic programming algorithm for the optimal control of piecewise deterministic Markov processes
A piecewise deterministic Markov process (PDP) is a continuous time Markov process consisting of continuous, deterministic trajectories interrupted by random jumps. The trajectories may be controlled with the object of minimizing the expected costs associated with the process. A method of representing this controlled PDP as a discrete time decision process is presented, allowing the value function for the problem to be expressed as the fixed point of a dynamic programming operator. Decisions take the form of trajectory segments. The expected costs may then be minimized through a dynamic programming algorithm, rather than through the solution of the Bellman–Hamilton–Jacobi equation, assuming the trajectory segments are numerically tractable. The technique is applied to the optimal capacity expansion problem, that is, the problem of planning the construction of new production facilities to meet rising demand
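The idea of a value function as the fixed point of a dynamic programming operator can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a finite, discounted, discrete time decision process rather than a PDP with trajectory-segment decisions, and the names `value_iteration`, `cost`, and `transition`, along with the two-state "build or wait" example, are hypothetical.

```python
def value_iteration(n_states, actions, cost, transition,
                    discount=0.9, tol=1e-10, max_iter=10_000):
    """Iterate V <- T V until the largest change falls below tol.

    cost(s, a) returns the one-step cost of action a in state s.
    transition(s, a) returns a list of (probability, next_state) pairs.
    """
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [
            min(
                cost(s, a)
                + discount * sum(p * values[t] for p, t in transition(s, a))
                for a in actions
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


# Toy capacity example (invented numbers): state 0 = capacity shortfall,
# state 1 = capacity built (absorbing, no further cost).
def toy_cost(state, action):
    if state == 1:
        return 0.0
    return 3.0 if action == "build" else 1.0  # build once, or pay 1 per period


def toy_transition(state, action):
    if state == 1 or action == "build":
        return [(1.0, 1)]
    return [(1.0, 0)]


toy_values = value_iteration(2, ["wait", "build"], toy_cost, toy_transition)
print(toy_values)
```

In this toy problem, waiting forever costs 1 / (1 - 0.9) = 10 in discounted terms, so the fixed point assigns state 0 the cheaper cost of building immediately.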
Testing Sparsity Assumptions in Bayesian Networks
Bayesian network (BN) structure discovery algorithms typically either make
assumptions about the sparsity of the true underlying network, or are limited
by computational constraints to networks with a small number of variables.
While these sparsity assumptions can take various forms, frequently the
assumptions focus on an upper bound for the maximum in-degree of the underlying
graph. Theorem 2 in Duttweiler et al. (2023) demonstrates that the
largest eigenvalue of the normalized inverse covariance matrix of a
linear BN is a lower bound for this maximum in-degree. Building on this result, this paper
provides the asymptotic properties of, and a debiasing procedure for, the
sample eigenvalues of that matrix, leading to a hypothesis test that may be used
to determine if the BN has max in-degree greater than 1. A linear BN structure
discovery workflow is suggested in which the investigator uses this hypothesis
test to aid in selecting an appropriate structure discovery algorithm. The
hypothesis test performance is evaluated through simulations and the workflow
is demonstrated on data from a human psoriasis study
Statistical evaluation of improvement in RNA secondary structure prediction
With discovery of diverse roles for RNA, its centrality in cellular functions has become increasingly apparent. A number of algorithms have been developed to predict RNA secondary structure. Their performance has been benchmarked by comparing structure predictions to reference secondary structures. Generally, algorithms are compared against each other and one is selected as best without statistical testing to determine whether the improvement is significant. In this work, it is demonstrated that the prediction accuracies of methods correlate with each other over sets of sequences. One possible reason for this correlation is that many algorithms use the same underlying principles. A set of benchmarks published previously for programs that predict a structure common to three or more sequences is statistically analyzed as an example to show that it can be rigorously evaluated using paired two-sample t-tests. Finally, a pipeline of statistical analyses is proposed to guide the choice of data set size and performance assessment for benchmarks of structure prediction. The pipeline is applied using 5S rRNA sequences as an example
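The paired comparison described above can be sketched as follows. This is a minimal illustration using only the Python standard library: the function name `paired_t_statistic` is hypothetical, the accuracy values are made up for demonstration and are not taken from any benchmark, and the sketch computes only the t statistic and degrees of freedom, leaving the p-value lookup to a statistics package.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    scores_a[i] and scores_b[i] are the accuracies of two prediction
    methods on the same sequence i, so the test works on differences.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Made-up per-sequence accuracies for two hypothetical methods.
method_a = [0.70, 0.65, 0.80, 0.75, 0.60]
method_b = [0.68, 0.66, 0.74, 0.71, 0.59]
t_stat, dof = paired_t_statistic(method_a, method_b)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

Pairing matters because accuracies of different methods correlate across sequences: differencing removes the shared per-sequence variation, which an unpaired test would count as noise.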
Selection of Statistical Thresholds in Graphical Models
Reconstruction of gene regulatory networks based on experimental data usually relies on statistical evidence, necessitating the choice of a statistical threshold which defines a significant biological effect. Approaches to this problem found in the literature range from rigorous multiple testing procedures to ad hoc P-value cut-off points. However, when the data implies graphical structure, it should be possible to exploit this feature in the threshold selection process. In this article we propose a procedure based on this principle. Using coding theory we devise a measure of graphical structure, for example, highly connected nodes or chain structure. The measure for a particular graph can be compared to that of a random graph and structure inferred on that basis. By varying the statistical threshold the maximum deviation from random structure can be estimated, and the threshold is then chosen on that basis. A global test for graph structure follows naturally
A model for the regulation of follicular dendritic cells predicts invariant reciprocal‐time decay of post‐vaccine antibody response
A NOTE ON THE CALCULATION OF N-STATISTICS
A class of statistics suitable for testing against equality of multivariate distributions is described by Klebanov and co-workers in 2007. Referred to as N-statistics, their discriminating ability is based on various forms of distance kernels in ℝ^d, the intention being to capture distinct forms of deviation from equality. This makes them particularly suitable for large-scale genomic screening applications, in which such variety of alternatives can be anticipated. One of these kernels, denoted as L4, introduces weighting by directional densities, hence the evaluation of L4 requires integration on the unit sphere in ℝ^d. In this note we introduce a methodology for the evaluation of integrals related to L4. It is shown that for a class of directional densities including, but not limited to, the uniform density L4 reduces to Euclidean distance. For other cases, the methodology permits a direct interpretation of L4 in terms of the directional weighting.
A commentary on some recent methods in sibling group reconstruction based on set coverings
A Simplification of the Calculation of the Joint Genotype Distribution in Two Noninbred Individuals
In this paper we consider the special case of four genes from two individuals, $m_1, p_1, m_2, p_2$, where $m_i$ and $p_i$ are the maternally and paternally inherited genes of individuals $i = 1, 2$. We assume that neither individual is inbred. In this context, the probabilistic consequence of this assumption is that for any $a \in A$ at most one of $m_i$ and $p_i$ can assume the value $a$ for each $i = 1, 2$, hence at most one of $P(m_i = a)$ and $P(p_i = a)$ can be nonzero. Lemma 1. If individuals 1 and 2 are noninbred, then $P(\{m_1 = m_2\} \cap \{p_1 = p_2\}) = P(m_1 = m_2) \times P(p_1 = p_2)$ (2.8) an
