FAQs - miRWalk

Data Inputs:

How to perform target search?

There are two ways to search miRWalk for predicted miRNA targets.

Search by miRNA name (e.g. hsa-miR-214-3p) or accession number (e.g. MIMAT0000271) as listed in the current miRBase release. When searching for a single miRNA, short names or family names (e.g. let-7) that belong to several miRNAs are also acceptable; in that case a list of matching miRNAs is shown.
Search by gene target information such as official gene symbols (e.g. GAS2), Entrez IDs (e.g. 10608), Ensembl IDs (e.g. ENSG00000148935 or ENST00000454584) and RefSeq IDs (e.g. NM_001143830).
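The identifier types listed above can be told apart by their shape. The following is a minimal sketch in Python; the regular expressions are assumptions inferred from the examples in this FAQ, not the validation rules used by miRWalk itself, and the function name classify_query is hypothetical.

```python
import re

# Patterns inferred from the examples in this FAQ (assumptions, not
# miRWalk's own rules). Order matters: the first match wins.
PATTERNS = [
    ("mirbase_accession", re.compile(r"^MIMAT\d{7}$")),
    ("ensembl_gene", re.compile(r"^ENS[A-Z]*G\d{11}$")),
    ("ensembl_transcript", re.compile(r"^ENS[A-Z]*T\d{11}$")),
    ("refseq", re.compile(r"^[NX][MR]_\d+(\.\d+)?$")),
    ("entrez_id", re.compile(r"^\d+$")),
    ("mirna_name", re.compile(r"^([a-z]{3,4}-)?(miR|let|lin)-\w+(-[35]p)?$")),
]


def classify_query(term: str) -> str:
    """Guess which kind of identifier a search term is."""
    term = term.strip()
    for label, pattern in PATTERNS:
        if pattern.match(term):
            return label
    # Anything else is treated as an official gene symbol.
    return "gene_symbol"


for example in ["hsa-miR-214-3p", "MIMAT0000271", "let-7", "GAS2",
                "10608", "ENSG00000148935", "ENST00000454584", "NM_001143830"]:
    print(example, "->", classify_query(example))
```

Treating every unmatched term as a gene symbol is a simplification; a real lookup would check the term against the current gene annotation.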

Why does miRWalk search possible miRNA binding sites within the complete sequence of a gene?

For more than a decade, attempts to study the interaction of miRNAs with their targets were limited to the mRNA 3'-UTR region. However, several investigators have recently suggested an alternative mode of gene regulation in which miRNAs anneal within the CDS, 5'-UTR and/or 3'-UTR regions of their targets, thereby regulating their translation. It is therefore important to search for possible miRNA binding sites within the complete sequence (5'-UTR, CDS and 3'-UTR) of a gene.

What does miRWalk cover?

The novelties of miRWalk 3 are as follows:

It hosts predicted binding site interactions between genes (covering the complete sequence) and miRNAs, computed with the TarPmiR algorithm.
Two other miRNA-target prediction datasets (TargetScan and miRDB) are included to extend the comparative platform of miRNA binding sites.
Validated interaction data from miRTarBase are included.

Does miRWalk integrate all transcripts encoded by a gene?

Yes, miRWalk integrates all transcripts encoded by a gene. It has previously been shown that a gene can encode several transcripts of different lengths as a result of alternative splicing. For example, the TP63 gene is known to encode six different transcripts that vary in the length of their 5'-UTR, CDS and 3'-UTR regions.

What are the future plans of miRWalk?

More annotations and additional species will be integrated to further expand this resource. Please let us know if you encounter problems while using miRWalk or if you have suggestions for improving the user interface or for adding new features. For further information about miRWalk, please contact the miRWalk team at mirwalkteam[ at ]medma.uni-heidelberg.de

What kind of algorithm is used in miRWalk?

The miRWalk database contains information produced with TarPmiR, a miRNA-mRNA binding site prediction tool that can make use of miRNA-mRNA binding experiment data such as CLASH. TarPmiR applies a random-forest-based approach that integrates six conventional features and seven new features to predict miRNA target sites. These features were learned from the only CLASH dataset available for mammals (Ding et al., 2016). Another advantage of this approach is the possibility of adding new features to the algorithm.

Reference: TarPmiR: a new approach for microRNA target site prediction. Ding J, Li X, Hu H. Bioinformatics. 2016 May 20. (Pubmed)

How often will the database be updated?

The complete database is updated twice a year. For this purpose, dedicated scripts (in Python 3) were written that automatically download all necessary data, process them and save them in the corresponding formats and tables. The actual prediction of gene-miRNA interactions using TarPmiR (the most time-consuming part) is then performed on a grid server, and the results are finally integrated into miRWalk.

How to perform target mining?

The Target Mining page provides an advanced search option for several miRNAs or gene targets. You may provide your own miRNA or gene list. Alternatively, you may choose the pre-compiled pathway gene list from the page [not implemented yet]. When searching for miRNA gene targets, full mature miRNA names are required. For the search of miRNA regulators, you may provide either NCBI gene IDs or official gene symbols.

What about the old database miRWalk 2.0?

The miRWalk 2.0 database will coexist with the new version for a while. The plan is to shut down the server hardware in two years. Data that are not migrated to the new database will be available for download.

What is different in miRWalk version 3 compared to version 2?

Our aim is to provide predictions of miRNA-target gene interactions with good accuracy. The strategy of miRWalk version 2 was to combine the data of several different algorithms that use different approaches to predict possible binding sites. However, most of those databases are no longer updated, and running all programs on our own server takes too long to maintain a suitable update cycle. We therefore focus on a smaller subset of resources together with a machine learning approach, which tries to include most of the features previously covered by several programs in a single run. Third-party data have been reduced to TargetScan, miRDB and miRTarBase. The interface has been completely redesigned for a better user experience. Instead of setting several options before searching for miRNA-target interactions, users can search for genes or miRNAs, filter the results dynamically, and then save the results or perform a gene set enrichment analysis.

How to cite miRWalk?

Sticht C, De La Torre C, Parveen A, Gretz N.: miRWalk: An online resource for prediction of microRNA binding sites. PLoS One. 2018 Oct 18;13(10):

miRWalk Version

The current Version is 3.

Data Filtering and Export:

Should I search for higher p-values?

TarPmiR uses a random-forest-based approach for miRNA target site prediction. The model was trained with 13 features, and the trained random-forest predictor is then applied to predict target sites. The output of the random-forest model is the predicted probability that a candidate target site is a true target site. The "p-value" shown in miRWalk is this probability, so higher values are better.

How to obtain only the validated miRNA-gene interaction?

Enter the term "MIRT" in the "validated contains" field to keep only interactions that have a miRTarBase entry.

How to filter the results?

To filter the interaction table, several options are available:

miRNA-ID or GeneID - shows only the interactions of this miRNA or gene. The IDs must match exactly. For genes, use an Ensembl ID (e.g. ENSG) or an official gene symbol.
binding probability - keeps only results with at least the given binding probability (higher is better). The p-values are calculated by TarPmiR with its random-forest-based approach for miRNA target site prediction.
binding site position - shows only the results from 5UTR, CDS or 3UTR. Only one entry is allowed.
validated interactions - enter "MIRT" in "validated contains" to exclude all interactions without an entry in miRTarBase.
other databases - select miRDB or TargetScan to keep only interactions that also have an entry in one of these databases.
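The same filters can be applied to an exported interaction table. The sketch below is illustrative only: the column names (mirna, gene, binding_probability, position, validated) and the example rows, including the probabilities and the placeholder ID MIRT000000, are made up for this example and are not the real export format or real results.

```python
def filter_interactions(rows, min_probability=0.0, position=None,
                        validated_only=False):
    """Keep rows that pass all requested filters.

    rows is a list of dicts with hypothetical column names; adapt the
    keys to the header of the file you actually exported.
    """
    kept = []
    for row in rows:
        # Binding probability: higher is better, so this is a lower bound.
        if float(row["binding_probability"]) < min_probability:
            continue
        # Binding site position: exactly one of 5UTR, CDS or 3UTR.
        if position is not None and row["position"] != position:
            continue
        # Validated interactions carry a miRTarBase ID containing "MIRT".
        if validated_only and "MIRT" not in row.get("validated", ""):
            continue
        kept.append(row)
    return kept


# Made-up example rows (values are placeholders, not real predictions).
rows = [
    {"mirna": "hsa-miR-214-3p", "gene": "GAS2", "binding_probability": "0.95",
     "position": "3UTR", "validated": "MIRT000000"},
    {"mirna": "hsa-miR-214-3p", "gene": "GAS2", "binding_probability": "0.60",
     "position": "CDS", "validated": ""},
    {"mirna": "hsa-miR-214-3p", "gene": "TP63", "binding_probability": "0.99",
     "position": "3UTR", "validated": ""},
]

hits = filter_interactions(rows, min_probability=0.9, position="3UTR")
validated_hits = filter_interactions(rows, validated_only=True)
print(len(hits), len(validated_hits))
```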

Why TarPmiR?

Many computational techniques have been developed to predict miRNA target genes, and multiple features have been introduced to identify targets, such as complementarity of different regions of the miRNA, binding site conservation and target site accessibility. Different prediction algorithms are based on different features; therefore, integrating diverse algorithms may improve target prediction. One strategy was to include the prediction results of several different algorithms to cover all these factors and achieve better accuracy in predicting miRNA-target gene interactions. TarPmiR instead applies a random-forest-based approach that integrates most of these features to predict miRNA target sites. One important reason for choosing TarPmiR was the possibility to extend the binding class and include new features.
Reference: TarPmiR: a new approach for microRNA target site prediction. Ding J, Li X, Hu H. Bioinformatics. 2016 May 20. (Pubmed)

What are the features used with TarPmiR?

Folding Energy
Seed match
Accessibility
AU content
Stem Conservation
Flanking conservation
Conservation Difference
m/e motif
Total number of paired positions
The length of target mRNA region
The length of the largest consecutive pairs
Position of the largest consecutive pairs
The length of the largest consecutive pairs allowing 2 mismatches
The position of the largest consecutive pairs allowing 2 mismatches
The number of paired positions at the miRNA 3' end
The total number of paired positions in the seed region and the miRNA 3' end
The difference between the number of paired positions in the seed region and that in the miRNA 3' end
Exon preference

What are "Au" and "Me"?

Au refers to AU-rich elements (AREs). The local AU content is the AU content of the transcript within 30 nt upstream and downstream of the predicted site.

Me stands for m/e motif. This feature describes the pairing probabilities at different positions of the miRNA.
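To make the local AU content concrete, here is a minimal sketch that computes the plain fraction of A and U bases in the 30 nt flanking a site on each side. It is a simplification under stated assumptions: the function name and the 0-based, end-exclusive coordinates are hypothetical, and TarPmiR's own feature may be computed differently (for example with position-dependent weights).

```python
def local_au_content(transcript: str, site_start: int, site_end: int,
                     flank: int = 30) -> float:
    """Unweighted AU fraction in the flanks around a predicted site.

    site_start and site_end are 0-based, end-exclusive coordinates of
    the site on the transcript. The site itself is not counted.
    """
    # Accept DNA-style input by treating T as U.
    seq = transcript.upper().replace("T", "U")
    upstream = seq[max(0, site_start - flank):site_start]
    downstream = seq[site_end:site_end + flank]
    window = upstream + downstream
    if not window:
        return 0.0
    return sum(base in "AU" for base in window) / len(window)


# Toy sequence: AU-only upstream flank, GC-only downstream flank.
toy = "AAAA" + "GGGG" + "CCCC"
print(local_au_content(toy, site_start=4, site_end=8, flank=4))
```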

What does the "score" exactly mean?

The score is calculated by the TarPmiR algorithm, which uses a random-forest-based approach for miRNA target site prediction. Based on the training data, it gives the probability that this interaction "works", i.e. that the predicted site is a true target site.

The Duplex information

The binding site is determined using the RNAduplex programme from the ViennaRNA software package. RNAduplex forms intermolecular pairs and neglects the competition between intramolecular folding and hybridization. It is used as a pre-filter in the TarPmiR prediction software.

More information:
https://www.tbi.univie.ac.at/RNA/tutorial/#sec6_3

"&" character as separator.
. denotes bases that are essentially unpaired
, denotes weakly paired bases
| denotes strongly paired bases without preference
{ and } denote weakly (>33%) upstream and downstream paired bases, respectively
( and ) denote strongly (>66%) upstream and downstream paired bases, respectively
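A duplex string in this notation can be summarised by splitting it at the "&" separator and counting the symbols on each side. The sketch below assumes that the string contains exactly one "&"; the function name and the example string are hypothetical and not taken from miRWalk output.

```python
from collections import Counter


def summarize_duplex(structure: str):
    """Split a duplex string at '&' and count the symbols per strand."""
    first, separator, second = structure.partition("&")
    if not separator or "&" in second:
        raise ValueError("expected exactly one '&' separator")
    return Counter(first), Counter(second)


# Hypothetical example: four strongly paired bases and two unpaired
# bases on each strand.
first, second = summarize_duplex("((((..&..))))")
print(dict(first), dict(second))
```

Counting the symbols on each side gives a quick overview of how many positions of each strand are paired.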

Link from Outside

The miRWalk database does not have an API yet. It is planned to create one together with an R package. For now it is only possible to link to a gene or miRNA page:

http://mirwalk.umm.uni-heidelberg.de/{species}/{type}/{ID}
for {species}: human, mouse, rat, dog, cow or fish
for {type}: gene or mirna
for {ID}: entrezIDs for genes and MIMAT-IDs for miRNAs

Example: http://mirwalk.umm.uni-heidelberg.de/human/gene/595/
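The link scheme above can be wrapped in a small helper. This is a sketch, not an official client: the function name mirwalk_link and the input checks are assumptions, and the trailing slash follows the example URL rather than the pattern line.

```python
BASE_URL = "http://mirwalk.umm.uni-heidelberg.de"
SPECIES = {"human", "mouse", "rat", "dog", "cow", "fish"}
TYPES = {"gene", "mirna"}


def mirwalk_link(species: str, kind: str, identifier) -> str:
    """Build a link to a miRWalk gene or miRNA page."""
    identifier = str(identifier)
    if species not in SPECIES:
        raise ValueError(f"unknown species: {species}")
    if kind not in TYPES:
        raise ValueError(f"unknown type: {kind}")
    # Genes are addressed by Entrez ID, miRNAs by MIMAT accession.
    if kind == "gene" and not identifier.isdigit():
        raise ValueError("genes need a numeric Entrez ID")
    if kind == "mirna" and not identifier.startswith("MIMAT"):
        raise ValueError("miRNAs need a MIMAT accession")
    return f"{BASE_URL}/{species}/{kind}/{identifier}/"


print(mirwalk_link("human", "gene", 595))
print(mirwalk_link("human", "mirna", "MIMAT0000271"))
```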

Why can't I find the TargetScan results?

To minimise false positives, the results are compared with other databases. Only those results from miRDB and TargetScan that were also found with TarPmiR are used. In particular, only the results with conserved regions are used.
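In other words, an external prediction is shown only if TarPmiR predicted the same interaction. A minimal sketch of that rule as a set intersection follows; the function name and the example pairs are made up for illustration, and the additional restriction to conserved regions is not modelled here.

```python
def confirmed_by_tarpmir(tarpmir_pairs, external_pairs):
    """Return the external (miRNA, gene) pairs that TarPmiR also predicted."""
    return set(external_pairs) & set(tarpmir_pairs)


# Made-up pairs for illustration only, not real predictions.
tarpmir = {("hsa-miR-214-3p", "GAS2"), ("hsa-let-7a-5p", "TP63")}
targetscan = {("hsa-miR-214-3p", "GAS2"), ("hsa-miR-214-3p", "TP63")}

shown = confirmed_by_tarpmir(tarpmir, targetscan)
print(shown)
```

A TargetScan entry without a matching TarPmiR prediction, like the second pair in the example, would therefore not appear in the results.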

Statistics & Functional Analysis:

How is the functional enrichment analysis performed in miRWalk?

The gene set enrichment analysis tests whether any functional group of genes (e.g. a pathway or the targets of a transcription factor) from the user-selected library is significantly enriched among the genes of interest. miRWalk offers a standard enrichment analysis based on the hypergeometric test (Fisher's exact test).
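The hypergeometric test can be sketched with the Python standard library. The function below computes the one-sided probability of seeing at least the observed overlap by chance; the parameter names are hypothetical, and details such as the choice of background gene set or multiple-testing correction in miRWalk are not covered here.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           selected: int, overlap: int) -> float:
    """P(X >= overlap) for a hypergeometric draw.

    population: number of genes in the background
    annotated:  background genes that belong to the functional group
    selected:   number of genes of interest
    overlap:    genes of interest that belong to the functional group
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probabilities of all outcomes at least as extreme.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 5 in the group, 5 selected,
# and all 5 selected genes fall into the group.
print(hypergeom_enrichment_p(10, 5, 5, 5))
```

A small value indicates that the functional group is over-represented among the genes of interest.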

David Tools Usage

The API for sending a gene list to DAVID Tools is limited (max. 400 genes). The button is only active as long as the limit of 400 genes is not reached. If you have more than 400 genes, export the gene list and upload it at https://david.ncifcrf.gov/tools.jsp.
