Title: Unique folding of Precursor MicroRNAs: Quantitative Evidence and Implications for De Novo Identification
Supplementary materials:
- PDF (357 Kbytes; algorithm details of RNAspectral, Tables S1, S2, and S3).
- Datasets (744 Kbytes; fasta files of precursor miRNAs, ncRNAs, mRNAs, and pseudo hairpins).
- Results (4.1 Mbytes; Excel files; please see the Readme.txt contained within for more details).
Computational pipeline requirements:
Please ensure the following software is installed before proceeding:
- Linux-based system.
- Perl 5.8.3 or later.
- GNU GCC 3.0 or later.
- Perl module source code for Bioperl (Version 1.4, stable release). Original link is at http://www.bioperl.org/wiki/Getting_BioPerl.
- Perl module source code for Statistics::Basic (Version 0.42). Original link is at http://search.cpan.org/~jettero/Statistics-Basic-0.42/Basic.pm.
- Perl module source code for Algorithm::Numerical::Shuffle (Version 1.4). Original link is at http://search.cpan.org/~abigail/Shuffle-1.4/Shuffle.pm.
- Source code for RNAlib and RNAfold from ViennaRNA (Version 1.4). I have patched RNAfold so that it does not generate ps files. Original link is at http://www.tbi.univie.ac.at/~ivo/RNA/.
Please download the following scripts, executable, and datasets:
- genRNAStats.pl
- genRandomRNA.pl
- genRNARandomStats.pl
- RNAspectral. This executable runs on Linux-based x86 32-bit Pentium systems; please email me for other platforms.
- Datasets. These are the fasta files of precursor miRNAs, ncRNAs, mRNAs, and pseudo hairpins.
How to use the scripts and executables:
Given an RNA sequence in fasta format, TPP.fasta:
>AC084406.7
AAGUUGCACCAGGGGUGCCUGUAUUCUCAACGAUCUGAAGGCCUCUUGGCCUGGAUUGUUGUGAAUUGGGCUGAGAAAGUCCCUUUGAA
CCUGAACAGGAUAAUGCCUGCGAAGGGAGUGUGCAUUUCUACUUUU
To compute the sequence and structural statistics from TPP.fasta:
- "perl genRNAStats.pl < TPP.fasta"
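As a rough illustration of what a sequence statistic looks like, the Python sketch below reads fasta records and reports the length and G+C fraction of each. It is not a port of genRNAStats.pl: the statistics that script actually reports are not listed on this page, and the function names `read_fasta` and `base_composition` are hypothetical.

```python
def read_fasta(text):
    """Parse fasta text into a list of (identifier, sequence) pairs.

    Sequence lines belonging to one record are concatenated, so a
    sequence wrapped over several lines is returned as one string.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def base_composition(seq):
    """Return the length and G+C fraction of an RNA sequence.

    These two statistics are illustrative assumptions only.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return {"length": len(seq), "gc_fraction": gc / len(seq) if seq else 0.0}


if __name__ == "__main__":
    # Toy record, not taken from the datasets.
    toy = ">toy\nGGGA\nAACCC\n"
    for name, seq in read_fasta(toy):
        print(name, base_composition(seq))
```

In practice the text would come from a file such as TPP.fasta rather than from an inline string.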
To predict the secondary structure from TPP.fasta:
- "RNAfold < TPP.fasta > TPP.rnafold"
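The resulting TPP.rnafold file can be inspected programmatically. The Python sketch below assumes the usual RNAfold text layout: an optional ">" header line, the sequence, then a line holding the dot-bracket structure followed by the minimum free energy in parentheses. The patched RNAfold used here may differ in detail, and the names `parse_rnafold` and `count_base_pairs` are hypothetical.

```python
import re

# Assumed layout of a structure line: dot-bracket string, whitespace,
# then the energy in parentheses, e.g. "(((...))) ( -1.20)".
STRUCT_LINE = re.compile(r"^([().]+)\s+\(\s*(-?\d+(?:\.\d+)?)\s*\)\s*$")


def parse_rnafold(text):
    """Return one dict per folded sequence found in RNAfold-style text."""
    records = []
    name, seq = None, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name, seq = line[1:].strip(), None
            continue
        match = STRUCT_LINE.match(line)
        if match and seq is not None:
            records.append({
                "name": name,
                "sequence": seq,
                "structure": match.group(1),
                "mfe": float(match.group(2)),
            })
            seq = None
        else:
            # Any other non-header line is taken to be the sequence.
            seq = line
    return records


def count_base_pairs(structure):
    """Each '(' in a dot-bracket string opens exactly one base pair."""
    return structure.count("(")


if __name__ == "__main__":
    # Toy input with a made-up energy value, for illustration only.
    toy = ">toy\nGGGAAACCC\n(((...))) ( -1.20)\n"
    for rec in parse_rnafold(toy):
        print(rec["name"], rec["mfe"], count_base_pairs(rec["structure"]))
```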
To predict the topological properties from TPP.rnafold:
- "RNAspectral -v1 < TPP.rnafold" or "RNAspectral < TPP.rnafold".
- The former gives output identical to that obtained from the "RNA Matrix Computer Program" by uploading TPP.ct (TPP.rnafold in another format); the latter is more concise.
To predict the secondary structures of 10 random RNA sequences generated from TPP.fasta using the mononucleotide shuffling algorithm:
- "perl genRandomRNA.pl -n 10 -m m < TPP.fasta | RNAfold > TPP10.mrnafold"
- Values for the -m option: m, mononucleotide shuffling; d, dinucleotide shuffling; z, zero-order Markov model; f, first-order Markov model.
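To make two of these randomization modes concrete, here is a minimal Python sketch of mononucleotide shuffling and of sampling from a zero-order Markov model. It is an illustration under stated assumptions, not the code of genRandomRNA.pl; the function names are hypothetical, and dinucleotide shuffling and the first-order model need more involved procedures that are not shown.

```python
import random
from collections import Counter


def mononucleotide_shuffle(seq, rng=random):
    """Permute the bases of seq uniformly at random.

    The shuffled sequence keeps exactly the same base counts as the
    original; only the order changes.
    """
    bases = list(seq)
    rng.shuffle(bases)  # in-place Fisher-Yates shuffle
    return "".join(bases)


def zero_order_markov(seq, rng=random):
    """Draw each position independently from the base frequencies of seq.

    Base counts are preserved only on average, not exactly.
    """
    counts = Counter(seq)
    alphabet = sorted(counts)
    weights = [counts[base] for base in alphabet]
    return "".join(rng.choices(alphabet, weights=weights, k=len(seq)))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    original = "AAGUUGCACCAGGGGUGCCUGUAUUCUCAACG"
    print(mononucleotide_shuffle(original, rng))
    print(zero_order_markov(original, rng))
```

The difference between the two matters when interpreting z-scores: shuffling holds the composition fixed, whereas the zero-order model lets it fluctuate between random sequences.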
To compute the z-scores from TPP10.mrnafold:
- "perl genRNARandomStats.pl -n 10 -i TPP10.mrnafold -m TPP.rnafold"
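For reference, a z-score compares the value observed for the native sequence with the distribution of the same value over the randomized sequences: z = (x_native - mean_random) / sd_random. The Python sketch below shows that arithmetic only; it is not a port of genRNARandomStats.pl, the function name `z_score` is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the script's convention is not stated here.

```python
import statistics


def z_score(native, randomized):
    """Return (native - mean(randomized)) / sd(randomized).

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption for illustration.
    """
    if len(randomized) < 2:
        raise ValueError("need at least two randomized values")
    mean = statistics.mean(randomized)
    sd = statistics.stdev(randomized)
    if sd == 0:
        raise ValueError("randomized values have zero variance")
    return (native - mean) / sd


if __name__ == "__main__":
    # Made-up free energies (kcal/mol), for illustration only.
    native_mfe = -30.0
    random_mfes = [-20.0, -22.0, -18.0]
    print(z_score(native_mfe, random_mfes))
```

When the statistic is a minimum free energy, a strongly negative z-score means the native sequence folds into a more stable structure than its randomized counterparts.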
The scripts and executable are free to use purely for non-profit and academic purposes, in accordance with the GNU General Public License (GPL).