RNA secondary structure design

Phys Rev E Stat Nonlin Soft Matter Phys. 2007 Feb;75(2 Pt 1):021920. doi: 10.1103/PhysRevE.75.021920. Epub 2007 Feb 28.

Abstract

We consider the inverse-folding problem for RNA secondary structures: given a (pseudoknot-free) secondary structure, we want to find a sequence that has this structure as its ground state. If such a sequence exists, the structure is called designable. We have implemented a branch-and-bound algorithm that performs an exhaustive search of the sequence space, i.e., it gives an exact answer as to whether such a sequence exists. The bounds required by the branch-and-bound algorithm are calculated by a dynamic programming algorithm. We consider different alphabet sizes and an ensemble of random structures that we attempt to design. We find that for two letters almost none of these structures are designable. The designability improves for the three-letter case, but a significant fraction of structures remains undesignable. This changes when we look at the natural four-letter case with two pairs of complementary bases: undesignable structures are the exception, although they still exist. Finally, we also study the relation between designability and the algorithmic complexity of the branch-and-bound algorithm. Within the ensemble of structures, a high average degree of undesignability is correlated with a long time to prove that a given structure is (un-)designable. In the four-letter case, where the designability is high everywhere, the algorithmic complexity is highest in the region of naturally occurring RNA.
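To make the designability question concrete, here is a minimal Python sketch of an exhaustive inverse-folding check for very short structures. It is not the authors' branch-and-bound algorithm: it enumerates all sequences, discards those whose paired positions are not complementary, and uses a Nussinov-style dynamic program as a stand-in energy model (each base pair counts as one unit, with an assumed minimum hairpin loop of three bases). It also does not check that the ground state is unique. The function names `parse_pairs`, `max_pairs`, and `design`, the dot-bracket input, and all parameters are hypothetical choices made for this illustration.

```python
from itertools import product


def parse_pairs(structure):
    """Return a partner map {i: j, j: i} for a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def max_pairs(seq, can_pair, min_loop=3):
    """Nussinov-style DP: maximum number of nested pairs seq can form.

    Assumption: energy is minus the number of pairs, and a hairpin
    must enclose at least min_loop unpaired bases.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: position i stays unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in can_pair:  # case: i pairs with k
                    left = best[i + 1][k - 1]
                    right = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + left + right)
            best[i][j] = score
    return best[0][n - 1]


def design(structure, alphabet="ACGU", pairs=(("A", "U"), ("C", "G"))):
    """Return the first sequence whose optimum matches the target, or None."""
    can_pair = {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}
    partner = parse_pairs(structure)
    target = len(partner) // 2  # number of pairs in the target structure
    for letters in product(alphabet, repeat=len(structure)):
        # Skip sequences that cannot form the target structure at all.
        if any((letters[i], letters[j]) not in can_pair
               for i, j in partner.items()):
            continue
        # The target is a ground state if no structure has more pairs.
        if max_pairs(letters, can_pair) == target:
            return "".join(letters)
    return None
```

For example, `design("(...)")` returns a sequence with complementary end bases, while `design("(..)")` returns `None` under the assumed minimum loop size. The search cost grows exponentially with sequence length, which is why an exact method needs bounds that prune the search tree, as the abstract describes.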

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computer Simulation
  • Models, Chemical*
  • Models, Molecular*
  • Models, Statistical
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • RNA / chemistry*
  • RNA / ultrastructure*
  • Sequence Analysis, RNA / methods*

Substances

  • RNA