Expected degree for RNA secondary structure networks
Consider the network of all secondary structures of a given RNA sequence,
where nodes are connected when the corresponding structures have base pair
distance one. The expected degree of the network is the average number of
neighbors, where the average may be computed with respect to either the uniform
or Boltzmann probability. Here we describe the first algorithm, RNAexpNumNbors,
that can compute the expected number of neighbors, or expected network degree,
of an input sequence. For RNA sequences from the Rfam database, the expected
degree is significantly less than the degree of the CMFE structure, defined to
have minimum free energy over all structures consistent with the Rfam consensus
structure.
The expected degree of structural RNAs, such as purine riboswitches,
paradoxically appears to be smaller than that of random RNA, yet the difference
between the degree of the MFE structure and the expected degree is larger than
that of random RNA. Expected degree does not seem to correlate with standard
structural diversity measures of RNA, such as positional entropy, ensemble
defect, etc. The program RNAexpNumNbors is written in C, runs in cubic
time and quadratic space, and is publicly available at
http://bioinformatics.bc.edu/clotelab/RNAexpNumNbors.
Comment: 25 pages, 5 figures, 5 tables
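To make the definition of expected degree concrete, the following is a minimal brute-force sketch in Python. It is not the RNAexpNumNbors algorithm, which runs in cubic time; this version enumerates every structure and is exponential, so it only suits very short sequences. The function names, the minimum hairpin loop size theta = 3, the restriction to canonical pairs, and the toy energy of -1 per base pair are all assumptions made for illustration.

```python
import math
from functools import lru_cache

# Assumption: only Watson-Crick and GU wobble pairs may form.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}

def all_structures(seq, theta=3):
    """Enumerate all non-crossing structures as frozensets of pairs (i, j)."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        # Intervals too short to hold a pair admit only the empty structure.
        if j - i <= theta:
            return (frozenset(),)
        out = list(rec(i, j - 1))          # case: position j is unpaired
        for k in range(i, j - theta):      # case: j pairs with k, j - k > theta
            if (seq[k], seq[j]) in CANONICAL:
                for left in rec(i, k - 1):
                    for inner in rec(k + 1, j - 1):
                        out.append(left | inner | {(k, j)})
        return tuple(out)
    return rec(0, len(seq) - 1)

def toy_boltzmann(s, rt=0.6):
    """Hypothetical weight: energy of -1 per base pair, not the Turner model."""
    return math.exp(len(s) / rt)

def expected_degree(seq, weight=lambda s: 1.0, theta=3):
    """Weighted average number of neighbors at base pair distance one."""
    structures = all_structures(seq, theta)
    known = set(structures)
    n = len(seq)
    candidates = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = weighted = 0.0
    for s in structures:
        # Toggling one pair yields a neighbor iff the result is a valid structure.
        deg = sum(1 for p in candidates if (s ^ {p}) in known)
        w = weight(s)
        total += w
        weighted += w * deg
    return weighted / total
```

For the sequence GAAAC there are only two structures, the empty one and the single pair (0, 4), and each has exactly one neighbor, so the expected degree is 1 under either weighting.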
The weak pigeonhole principle for function classes in S^1_2
It is well known that S^1_2 cannot prove the injective weak pigeonhole
principle for polynomial time functions unless RSA is insecure. In this note we
investigate the provability of the surjective (dual) weak pigeonhole principle
in S^1_2 for provably weaker function classes.
Comment: 11 pages
An IP Algorithm for RNA Folding Trajectories
The Vienna RNA Package program Kinfold implements the Gillespie algorithm for RNA secondary structure folding kinetics, for the move sets MS1 [resp. MS2], consisting of base pair additions and removals [resp. base pair additions, removals and shifts]. In this paper, for arbitrary secondary structures s, t of a given RNA sequence, we present the first optimal algorithm to compute the shortest MS2 folding trajectory s = s_0, s_1, ..., s_m = t, where each intermediate structure s_{i+1} is obtained from its predecessor s_i by the addition, removal or shift of a single base pair. The length of the shortest MS1 trajectory between s and t is trivially equal to the number of base pairs belonging to s but not t, plus the number of base pairs belonging to t but not s. Our optimal algorithm applies integer programming (IP) to solve (essentially) the minimum feedback vertex set (FVS) problem for the "conflict digraph" associated with the input secondary structures s, t, and then applies topological sort to generate an optimal MS2 folding pathway from s to t that maximizes the use of shift moves. Since the optimal algorithm may require excessive run time, we also sketch a fast, near-optimal algorithm (details to appear elsewhere). Software for our algorithm will be publicly available at http://bioinformatics.bc.edu/clotelab/MS2distance/
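The MS1 case described above can be sketched in a few lines of Python. This is only an illustration of the trivial MS1 distance, not the IP-based MS2 algorithm of the paper; the function names and the use of dot-bracket notation as input are assumptions.

```python
def parse_dot_bracket(db):
    """Return the set of base pairs (i, j) of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(db):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return frozenset(pairs)

def ms1_distance(s, t):
    """Number of pairs in s but not t, plus pairs in t but not s."""
    return len(parse_dot_bracket(s) ^ parse_dot_bracket(t))

def ms1_trajectory(s, t):
    """One shortest MS1 path: remove pairs of s not in t, then add pairs of t."""
    ps, pt = parse_dot_bracket(s), parse_dot_bracket(t)
    current = set(ps)
    path = [frozenset(current)]
    for p in sorted(ps - pt):      # removals always leave a valid structure
        current.remove(p)
        path.append(frozenset(current))
    for p in sorted(pt - ps):      # current is now a subset of t, so adds are valid
        current.add(p)
        path.append(frozenset(current))
    return path
```

For s = "(...)...." and t = "....(...)" the MS1 distance is 2, one removal followed by one addition. Because the pairs (0, 4) and (4, 8) share position 4, a single shift move would connect the two structures under MS2, which is the kind of saving the MS2 algorithm seeks to maximize.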
Introduction to clarithmetic II
The earlier paper "Introduction to clarithmetic I" constructed an axiomatic
system of arithmetic based on computability logic (see
http://www.cis.upenn.edu/~giorgi/cl.html), and proved its soundness and
extensional completeness with respect to polynomial time computability. The
present paper elaborates three additional sound and complete systems in the
same style and sense: one for polynomial space computability, one for
elementary recursive time (and/or space) computability, and one for primitive
recursive time (and/or space) computability.
Expected distance between terminal nucleotides of RNA secondary structures.
In "The ends of a large RNA molecule are necessarily close", Yoffe et al. (Nucleic Acids Res 39(1):292-299, 2011) used the programs RNAfold [resp. RNAsubopt] from Vienna RNA Package to calculate the distance between 5' and 3' ends of the minimum free energy secondary structure [resp. thermal equilibrium structures] of viral and random RNA sequences. Here, the 5'-3' distance is defined to be the length of the shortest path from the 5' node to the 3' node in the undirected graph whose edge set consists of edges {i, i + 1} corresponding to covalent backbone bonds and of edges {i, j} corresponding to canonical base pairs. From repeated simulations and using a heuristic theoretical argument, Yoffe et al. conclude that the 5'-3' distance is less than a fixed constant, independent of RNA sequence length. In this paper, we provide a rigorous mathematical framework to study the expected distance from 5' to 3' ends of an RNA sequence. We present recurrence relations that precisely define the expected distance from 5' to 3' ends of an RNA sequence, both for the Turner nearest neighbor energy model and for a simple homopolymer model first defined by Stein and Waterman. We implement dynamic programming algorithms to compute (rather than approximate by repeated application of Vienna RNA Package) the expected distance between 5' and 3' ends of a given RNA sequence, with respect to the Turner energy model. Using methods of analytical combinatorics, which depend on complex analysis, we prove that the asymptotic expected 5'-3' distance of length n homopolymers is approximately equal to the constant 5.47211, while the asymptotic distance is 6.771096 if hairpins have a minimum of 3 unpaired bases and the probability that any two positions can form a base pair is 1/4. Finally, we analyze the 5'-3' distance for secondary structures from the STRAND database, and conclude that the 5'-3' distance is correlated with RNA sequence length.
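The graph definition of the 5'-3' distance for a single structure can be illustrated with a breadth-first search. This is a minimal Python sketch for one given structure, not the paper's dynamic programming computation of the expected distance over the ensemble; the function name and the dot-bracket input format are assumptions.

```python
from collections import deque

def five_three_distance(db):
    """Shortest path length from the 5' node (0) to the 3' node (n - 1).

    Edges are backbone bonds {i, i + 1} and the base pairs {i, j}
    given by the dot-bracket string db.
    """
    n = len(db)
    if n <= 1:
        return 0
    # Build adjacency lists: backbone bonds first, then base pairs.
    adj = [[] for _ in range(n)]
    for i in range(n - 1):
        adj[i].append(i + 1)
        adj[i + 1].append(i)
    stack = []
    for pos, ch in enumerate(db):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            i = stack.pop()
            adj[i].append(pos)
            adj[pos].append(i)
    # Breadth-first search from node 0; every edge has length one.
    dist = [-1] * n
    dist[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        if u == n - 1:
            return dist[u]
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist[n - 1]
```

An unpaired chain of length 5 has distance 4 (backbone only), while ".(...)." has distance 3: one backbone step, one base pair step, one backbone step.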
