Statistics of noncoding RNAs: alignment and secondary structure prediction

S. K. Nechaev 1, 2, 3, M. V. Tamm 4, O. V. Valba 1, 5

Journal of Physics A: Mathematical and Theoretical 44 (2011) 195001

A new statistical method is proposed for aligning two heteropolymers that can form hierarchical cloverleaf-like secondary structures. It offers a constructive algorithm for the quantitative determination of the binding free energy of two noncoding RNAs (ncRNAs) with arbitrary primary sequences. The alignment of ncRNAs differs from the complete alignment of two RNA sequences: in the ncRNA case we align only those nucleotide sequences that form pairs between the two different RNAs, while the secondary structure of each RNA enters only through the combinatorial factors affecting the entropic contribution of each molecule to the total cost function. The proposed algorithm is based on two observations: i) the standard alignment problem is considered as the zero-temperature limit of a more general statistical problem of the binding of two associating heteropolymer chains; ii) this latter problem is generalized to sequences with hierarchical cloverleaf-like structures (i.e. of RNA type). Taking the zero-temperature limit at the very end, we arrive at the desired 'cost function' of the system, which accounts for the entropy of the side cactus-like loops. Moreover, we demonstrate in detail how our algorithm makes it possible to solve the 'structure recovery' problem: in the zero-temperature limit, we can predict the cloverleaf-like (i.e. secondary) structure of interacting ncRNAs knowing only their primary sequences.
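Observation (i) can be illustrated with a minimal sketch. This is not the authors' algorithm: it ignores secondary structure and loop entropy entirely, and it rests on simplifying assumptions chosen only for illustration. Two linear chains may form non-crossing pairs, a pair of identical letters contributes a score of 1 (Boltzmann weight exp(1/T)), and pairs of different letters are forbidden. Under these assumptions the quantity T ln Z should approach the length of the longest common subsequence, i.e. the standard alignment score, as T tends to zero. All function names below (`match_weight`, `partition_function`, `lcs_length`, `thermal_score`) are hypothetical.

```python
import math


def match_weight(x, y, T):
    """Boltzmann weight of a pair (assumed scoring: 1 for identical letters,
    mismatched pairs forbidden)."""
    return math.exp(1.0 / T) if x == y else 0.0


def partition_function(a, b, T):
    """Sum of weights over all non-crossing sets of pairs between a and b.

    Z[i][j] covers prefixes a[:i] and b[:j]. The last letter of a[:i] is
    either unpaired, or paired with some b[k-1], in which case all later
    letters of b[:j] must stay unpaired to avoid crossings.
    """
    n, m = len(a), len(b)
    Z = [[1.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            total = Z[i - 1][j]  # a[i-1] left unpaired
            for k in range(1, j + 1):  # a[i-1] paired with b[k-1]
                total += match_weight(a[i - 1], b[k - 1], T) * Z[i - 1][k - 1]
            Z[i][j] = total
    return Z[n][m]


def lcs_length(a, b):
    """Zero-temperature counterpart: longest common subsequence length."""
    n, m = len(a), len(b)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[n][m]


def thermal_score(a, b, T):
    """T * ln Z, which tends to the optimal alignment score as T -> 0."""
    return T * math.log(partition_function(a, b, T))


if __name__ == "__main__":
    a, b = "GAUCCGAU", "GUACGCAU"  # arbitrary example sequences
    for T in (1.0, 0.2, 0.05, 0.02):
        print(T, thermal_score(a, b, T))
    print("LCS length:", lcs_length(a, b))
```

At finite T the thermal score exceeds the optimal alignment score by an entropic term of order T times the logarithm of the number of contributing configurations; this term vanishes in the zero-temperature limit. The sketch uses plain floating-point weights, so it is only suitable for short sequences and moderately small T.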

  • 1. Laboratoire de Physique Théorique et Modèles Statistiques (LPTMS),
    CNRS : UMR8626 – Université Paris XI - Paris Sud
  • 2. P. N. Lebedev Physical Institute,
    Russian Academy of Sciences
  • 3. JV Poncelet Laboratory,
    Independent University
  • 4. Physics Department,
    Moscow State University
  • 5. Moscow Institute of Physics and Technology (MIPT),
    Moscow Institute of Physics and Technology