Title: Unique Folding of Precursor MicroRNAs: Quantitative Evidence and Implications for De Novo Identification

Supplementary materials:

  1. PDF (357 Kbytes; algorithm details of RNAspectral, Tables S1, S2, and S3).
  2. Datasets (744 Kbytes; FASTA files of precursor miRNAs, ncRNAs, mRNAs, and pseudo hairpins).
  3. Results (4.1 Mbytes; Excel files; please see the Readme.txt contained within for more details).
Computational pipeline requirements:
Please ensure the following software is installed before proceeding:
  1. A Linux-based system.
  2. Perl 5.8.3 or later.
  3. GNU GCC 3.0 or later.
  4. Perl module source code for BioPerl (version 1.4, stable release). Original link: http://www.bioperl.org/wiki/Getting_BioPerl
  5. Perl module source code for Statistics::Basic (version 0.42). Original link: http://search.cpan.org/~jettero/Statistics-Basic-0.42/Basic.pm
  6. Perl module source code for Algorithm::Numerical::Shuffle (version 1.4). Original link: http://search.cpan.org/~abigail/Shuffle-1.4/Shuffle.pm
  7. Source code for RNAlib and RNAfold from ViennaRNA (version 1.4). I have patched RNAfold so that it does not generate PostScript (ps) files. Original link: http://www.tbi.univie.ac.at/~ivo/RNA/
Please download the following scripts, executable, and datasets:

  1. genRNAStats.pl
  2. genRandomRNA.pl
  3. genRNARandomStats.pl
  4. RNAspectral. This executable runs on Linux-based x86 32-bit (Pentium) systems; please email me for other platforms.
  5. Datasets. These are the fasta files of precursor miRNAs, ncRNAs, mRNAs, and pseudo hairpins.
How to use the scripts and executables:
Given an RNA sequence in FASTA format, TPP.fasta:
>AC084406.7
AAGUUGCACCAGGGGUGCCUGUAUUCUCAACGAUCUGAAGGCCUCUUGGCCUGGAUUGUUGUGAAUUGGGCUGAGAAAGUCCCUUUGAA
CCUGAACAGGAUAAUGCCUGCGAAGGGAGUGUGCAUUUCUACUUUU

To compute the sequence and structural statistics from TPP.fasta:
  1. "perl genRNAStats.pl < TPP.fasta".
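The text does not list which statistics genRNAStats.pl reports, so the following Python sketch is only an illustration of simple sequence-level statistics and is not a reimplementation of that script. It reads a FASTA file whose sequence may wrap over several lines (as TPP.fasta does above) and reports the length, base counts, and %G+C. The function names read_fasta and composition are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped sequence lines."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def composition(seq):
    """Return length, per-base counts, and %G+C for an RNA sequence."""
    seq = seq.upper()
    length = len(seq)
    counts = {base: seq.count(base) for base in "ACGU"}
    gc_percent = 100.0 * (counts["G"] + counts["C"]) / length if length else 0.0
    return {"length": length, "counts": counts, "gc_percent": gc_percent}


if __name__ == "__main__":
    # Toy record for illustration only; not taken from the datasets.
    toy = ">toy\nGGCC\nAAUU\n"
    for name, seq in read_fasta(toy).items():
        print(name, composition(seq))
```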
To predict the secondary structure from TPP.fasta:
  1. "RNAfold < TPP.fasta > TPP.rnafold"
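For downstream processing, a file such as TPP.rnafold can be read back into a script. The Python sketch below assumes the usual RNAfold text layout: an optional ">name" header, the sequence, and then a dot-bracket structure followed by the minimum free energy in parentheses. That layout and the function name parse_rnafold are assumptions, not details given in the text above; the patched RNAfold may differ.

```python
import re

# Assumed record layout (not confirmed by the text above):
#   >name
#   SEQUENCE
#   STRUCTURE (ENERGY)
_STRUCT_LINE = re.compile(r"^([().]+)\s+\(\s*(-?\d+(?:\.\d+)?)\s*\)\s*$")


def parse_rnafold(text):
    """Return a list of (name, sequence, structure, mfe) tuples."""
    records = []
    name, seq = None, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name, seq = line[1:].strip(), None
            continue
        match = _STRUCT_LINE.match(line)
        if match:
            if seq is None:
                raise ValueError("structure line without a preceding sequence")
            records.append((name, seq, match.group(1), float(match.group(2))))
            name, seq = None, None
        else:
            seq = line
    return records


if __name__ == "__main__":
    # Toy record for illustration only; not real RNAfold output.
    toy = ">toy\nGGGAAACCC\n(((...))) ( -1.20)\n"
    print(parse_rnafold(toy))
```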
To predict the topological properties from TPP.rnafold:
  1. "RNAspectral -v1 < TPP.rnafold" or "RNAspectral < TPP.rnafold".
  2. The output of the former is identical to that obtained from the "RNA Matrix Computer Program" by uploading TPP.ct (TPP.rnafold in another format); the output of the latter is more concise.
To predict the secondary structures of 10 random RNA sequences generated from TPP.fasta using the mononucleotide shuffling algorithm:
  1. "perl genRandomRNA.pl -n 10 -m m < TPP.fasta | RNAfold > TPP10.mrnafold"
  2. The -m option selects the randomization method: m, mononucleotide shuffling; d, dinucleotide shuffling; z, zero-order Markov model; f, first-order Markov model.
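Mononucleotide shuffling permutes the bases of a sequence, so every randomized sequence keeps the length and base composition of the original. The Python sketch below illustrates the idea with the standard library's random.shuffle (a Fisher-Yates shuffle). It is not the code of genRandomRNA.pl; the function names and the "_shuffled_N" header suffix are hypothetical.

```python
import random


def mononucleotide_shuffle(seq, rng):
    """Return a random permutation of seq; base composition is preserved."""
    bases = list(seq)
    rng.shuffle(bases)  # in-place Fisher-Yates shuffle
    return "".join(bases)


def random_fasta(name, seq, n, seed=None):
    """Build FASTA text holding n shuffled versions of seq."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    lines = []
    for i in range(1, n + 1):
        lines.append(f">{name}_shuffled_{i}")  # hypothetical naming scheme
        lines.append(mononucleotide_shuffle(seq, rng))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(random_fasta("toy", "GGGAAACCCUUU", n=3, seed=1), end="")
```

Dinucleotide shuffling and the Markov-model options need different algorithms and are not covered by this sketch.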
To compute the z-scores from TPP10.mrnafold:
  1. "perl genRNARandomStats.pl -n 10 -i TPP10.mrnafold -m TPP.rnafold".
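A z-score compares the value of a statistic for the native sequence with the mean and standard deviation of the same statistic over the randomized sequences: z = (x - mean) / sd. The Python sketch below applies this to minimum free energies. Both the choice of statistic and the use of the sample standard deviation (n - 1 denominator) are assumptions made for illustration; genRNARandomStats.pl may differ on either point.

```python
import statistics


def z_score(native_value, random_values):
    """z = (native - mean(random)) / sd(random), using the sample sd."""
    if len(random_values) < 2:
        raise ValueError("need at least two randomized values")
    mean = statistics.mean(random_values)
    sd = statistics.stdev(random_values)  # assumption: n - 1 denominator
    if sd == 0:
        raise ValueError("randomized values have zero variance")
    return (native_value - mean) / sd


if __name__ == "__main__":
    # Toy energies (kcal/mol) for illustration only; not real results.
    native_mfe = -10.0
    shuffled_mfes = [-4.0, -6.0, -8.0]
    print(z_score(native_mfe, shuffled_mfes))
```

A strongly negative z-score here would mean the native sequence folds with a lower free energy than its shuffled counterparts.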
These scripts and executables are free to use for non-profit and academic purposes only, in accordance with the GNU General Public License (GPL).

