MIReStruC predicts structural clusters of miRNAs: for each cluster, it reports the positions of the miRNA/pre-miRNA pairs that compose it. It can be run in three different ways, each with pre-treatments specific to the kind of input data it handles. Structural clusters of miRNAs are identified along a genomic sequence either 1. with an ab initio sequence analysis that looks for repeated sequences in palindromic regions (black path, Figure 1 of the article), or 2. with a structural analysis that considers deep sequencing reads as potential miRNAs (red path, Figure 1 of the article), or 3. with a combination of sequence analysis and deep sequencing data that finds structural clusters from deep sequencing reads and from multiple palindromic sequences (green path, Figure 1 of the article). The algorithm starts with the pre-treatments appropriate to the type of input data and then filters the potential miRNA structural clusters using five combinatorial and structural criteria that describe acceptable pre-miRNAs.
The MIReStruC-1.0 package can be downloaded here.
Unpack the archive with the command:
tar xzf MIReStruC-1.0.tar.gz
To compile the C sources used by MIReStruC, type the following commands (see the INSTALL file for more information):
cd [MIReStruC-1.0 directory]
./configure
make
To see how to run MIReStruC, type the following command:
./MIReStruC.sh -h
Look at the files in the directory named "dataset/" for examples of input files. Go into the dataset directory and type one of the following commands to obtain the corresponding out.txt file:
../MIReStruC.sh -P
../MIReStruC.sh -D
../MIReStruC.sh -C
When invoking MIReStruC to predict structural clusters from a genomic sequence, you need a single file. The file contains the DNA sequence (e.g. dataset/par_seq.txt) to which method 1. (black path, Figure 1 of the article) is applied. The genomic sequence must be given on a single line.
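Since the genomic sequence has to sit on a single line, a sequence stored as multi-line FASTA needs to be flattened first. The following Python sketch is not part of MIReStruC; the function name flatten_sequence is hypothetical, and it assumes a single-record file in which header lines start with ">".

```python
def flatten_sequence(text):
    """Join the sequence lines of a single-record FASTA text into one line.

    Assumptions (not taken from the MIReStruC documentation):
    header lines start with '>' and the file holds one sequence only.
    """
    lines = (line.strip() for line in text.splitlines())
    # Skip blank lines and FASTA headers, keep the nucleotides as they are.
    return "".join(line for line in lines if line and not line.startswith(">"))
```

The returned string can then be written to a text file and used as the DNA sequence input.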
When invoking MIReStruC to predict structural clusters from deep sequencing reads (method 2., red path, Figure 1 of the article), you need two files. The first file contains a DNA sequence (e.g. dataset/deep_seq.txt) and the second file is an output file of the MicroRazerS algorithm (e.g. dataset/deep_mraz.txt). This output is obtained by running the MicroRazerS software to map the deep sequencing reads onto the DNA sequence.
When invoking MIReStruC to predict structural clusters by combining sequence analysis and deep sequencing reads (method 3., green path, Figure 1 of the article), you need two files. The first file contains a DNA sequence (e.g. dataset/deep_seq.txt) and the second file is an output file of the MicroRazerS algorithm (e.g. dataset/deep_mraz.txt). This output is obtained by running the MicroRazerS software to map the deep sequencing reads onto the DNA sequence.
After MIReStruC has run, the output file contains the structural clusters predicted by the corresponding algorithm. It is a list of clusters, each composed of several miRNA/pre-miRNA pairs whose positions are given in the following format:
Cluster n°A:
>B-C D E
>F-G H I
>J-K L M
Here, A stands for the number of the corresponding cluster in the list. The positions of the miRNAs composing the cluster (in the corresponding genomic sequence) are given by the numbers B, C, F, G, J and K, where B, F and J stand for starting positions and C, G and K stand for ending positions. Given a miRNA, the positions of its predicted pre-miRNA can be obtained from the two numbers that follow. For instance, ">B-C D E" indicates that the pre-miRNA is obtained by adding D nt before the miRNA sequence and E nt after it. When ".rc" is given for a miRNA/pre-miRNA pair, it indicates that the corresponding miRNA and pre-miRNA lie on the complementary strand.
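The format described above can be read back with a short script. The following Python sketch is not part of MIReStruC; the names Entry and parse_clusters are hypothetical. It assumes that ".rc" may appear anywhere on an entry line (its exact position is not specified here) and that the pre-miRNA spans from start - D to end + E. For ".rc" entries, which side counts as "before" is also an assumption.

```python
import re
from dataclasses import dataclass

# "Cluster n°A:" header; \D* skips everything between "Cluster" and the number.
CLUSTER_RE = re.compile(r"^Cluster\D*(\d+)\s*:")
# ">B-C D E" entry: miRNA start-end, then nt added before and after.
ENTRY_RE = re.compile(r"^>\s*(\d+)-(\d+)\s+(\d+)\s+(\d+)")


@dataclass
class Entry:
    cluster: int
    mirna_start: int
    mirna_end: int
    pre_start: int   # assumed: mirna_start - D
    pre_end: int     # assumed: mirna_end + E
    reverse: bool    # True when ".rc" appears on the line


def parse_clusters(text):
    """Parse MIReStruC-style output text into a flat list of Entry records."""
    entries = []
    cluster = None
    for raw in text.splitlines():
        line = raw.strip()
        header = CLUSTER_RE.match(line)
        if header:
            cluster = int(header.group(1))
            continue
        # Assumption: ".rc" can sit anywhere on the line; record it, then drop it.
        reverse = ".rc" in line
        match = ENTRY_RE.match(line.replace(".rc", " "))
        if match and cluster is not None:
            start, end, before, after = map(int, match.groups())
            entries.append(
                Entry(cluster, start, end, start - before, end + after, reverse)
            )
    return entries
```

For example, parse_clusters(open("out.txt").read()) returns one record per miRNA/pre-miRNA pair, which can then be grouped by its cluster field.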
The MIReStruC program has been developed under the CeCILL licence (see LICENCE).
MIReStruC uses a modified implementation of RNAfold from the ViennaRNA package [Hofacker et al. 1994].
For questions, comments, or suggestions feel free to contact Alessandra Carbone or Anthony Mathelier.
If you are using MIReStruC, please cite:
Last Update Sept. 2013