CMV offers tools for the visualisation of RNA family models, also known as covariance models (CM), and Hidden Markov Models (HMM).
Moreover, comparisons between models, the multiple sequence alignments they were
constructed from and, in the case of RNA families, the consensus secondary
structure can be visualised.
The aim is to simplify model construction and evaluation
by providing visualisations with different levels of detail. Minimal and simple detail
representations give an overview of the size of the model by showing all nodes and, in
the case of covariance models, depict the guide tree of the model, that is, the nestedness
of the secondary structure elements. Detailed views show nodes, states, and emission as well as transition probabilities.
Comparison results are highlighted by color labels which are applied consistently to the model,
alignment and secondary structure visualisations, so that similarities found
between models can be inspected in their context.
The tools can be applied to already existing HMMs
and CMs from the Pfam and Rfam databases,
as well as newly constructed models from RNAlien.
Comparisons between covariance models can be computed via CMCompare and its webservice.
If you are looking for source code or an installation guide for the tool, please refer
to the Tool subpage of the webservice.
The second part of the guide explains how to apply the tools for Hidden Markov models,
HMMV (Hidden Markov model visualisation tool), and for their comparisons, HMMCV (Hidden Markov model
comparison visualisation tool), on the command line as well as on the webservice.
The third part addresses the usage of the tools for covariance models, CMV (Covariance model visualisation),
and for their comparisons, CMCV (Covariance model comparison visualisation). For each of these four
tools an example is depicted in this guide, which is also available as a single-click example
submission on the webservice, complete with input files.
The fourth part introduces the three auxiliary tools: CMCwstoCMCV, which converts CMCompare webserver output to input for CMCV;
CMCtoHMMC, which maps CMCompare output onto HMMs; and HMMCtoCMC, which maps HMMCompare output onto CMs.
A variant of this manual is included with the tool as manual.pdf.
Hidden Markov models are used to represent the sequence information of biopolymers. Nodes of the model represent columns of a multiple sequence alignment. The guide describes the required input and parameters for both tools as well as the output. A visualisation for the EGF-protein family and a comparison visualisation for the hammerhead-RNA clan, with corresponding command line calls, are used as examples.
HMMV program flowchart
Flowchart representation of HMMV showing the possible options for the commandline tool and the processing of input models via HMMDraw. Optional input alignments trigger the output of an alignment visualisation via StockholmDraw. Both modules are based on the diagrams library and a cairo backend. Options are shown on the left, processing in the center and output on the right.
HMMCV program flowchart
Flowchart representation of HMMCV showing the possible options for the commandline tool and the processing of input models and input comparisons via HMMDraw. Optional input alignments trigger the output of an alignment visualisation via StockholmDraw. Linked nodes are highlighted in both the alignment and the model visualisation. Both modules are based on the diagrams library and a cairo backend. Options are shown on the left, processing in the center and output on the right.
Input models are supported in HMMER3 format (see HMMER User-guide), as used by Pfam and as part of Rfam INFERNAL models. The EGF protein family serves as an example. Multiple input models can be provided by concatenating them in one file, see the hammerhead-RNA family clan. The webservice accepts a file upload, the commandline tool an absolute filepath.
Optionally, the multiple sequence alignment used to construct the input model can be provided in Stockholm format, e.g. the EGF alignment. For multiple input models, the same number of alignments must be provided in the same order as the models, also concatenated into one file, e.g. hammerhead-RNA. The webservice accepts a file upload, the commandline tool an absolute filepath.
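The requirement that models and alignments match in number can be checked before submitting. The following is a minimal sketch, not part of the tools: the function names are hypothetical, and it assumes that each HMMER3 record starts with a header line beginning with "HMMER3/" and each Stockholm record with a line beginning with "# STOCKHOLM". It only counts records and cannot verify that they are in the same order.

```python
def count_hmmer3_models(text: str) -> int:
    """Count HMMER3 records by their 'HMMER3/...' header lines."""
    return sum(1 for line in text.splitlines() if line.startswith("HMMER3/"))


def count_stockholm_alignments(text: str) -> int:
    """Count Stockholm records by their '# STOCKHOLM ...' header lines."""
    return sum(1 for line in text.splitlines() if line.startswith("# STOCKHOLM"))


def check_matching_counts(model_text: str, alignment_text: str) -> int:
    """Return the shared record count, or raise if the two files disagree."""
    n_models = count_hmmer3_models(model_text)
    n_alignments = count_stockholm_alignments(alignment_text)
    if n_models != n_alignments:
        raise ValueError(
            f"{n_models} model(s) but {n_alignments} alignment(s); "
            "one alignment per model is required"
        )
    return n_models
```

In practice the two arguments would be the contents of the concatenated model file and the concatenated alignment file, read with open(path).read().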
HMMCV requires a comparison file detailing the relationship between the input models. The webservice accepts a file upload, the commandline tool an absolute filepath. The format is derived from the CMCompare output format; each line contains the following whitespace-separated fields:
model1Name model2Name linkscore1 linkscore2 linksequence model1matchednodes model2matchednodes
Here is an example line from a hammerhead clan comparison; the whole file can be found here: hammerheadClan-comparsion.
Hammerhead_1 Hammerhead_3 6.168 5.244 GUCCCAGUAAUAGGAC [17,18,19,20,21,22,23,23,24,25,26,27,28] [36,37,38,39,40,41,42,43,44,45,46,47,48]
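To illustrate the field layout, here is a minimal sketch of how one such line could be read. It is not the parser used by HMMCV; the names Link and parse_comparison_line are hypothetical, and the sketch assumes that the two node lists contain no internal whitespace, as in the example line above.

```python
import json
from typing import List, NamedTuple


class Link(NamedTuple):
    """One line of a comparison file (field names follow the format description)."""
    model1_name: str
    model2_name: str
    linkscore1: float
    linkscore2: float
    linksequence: str
    model1_matched_nodes: List[int]
    model2_matched_nodes: List[int]


def parse_comparison_line(line: str) -> Link:
    """Split a comparison line on whitespace and convert each field."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, found {len(fields)}")
    name1, name2, score1, score2, sequence, nodes1, nodes2 = fields
    # The node lists look like "[17,18,19]", which is also valid JSON.
    return Link(name1, name2, float(score1), float(score2), sequence,
                json.loads(nodes1), json.loads(nodes2))
```

Applied to the example line, the result holds the two model names, both link scores, the link sequence and two lists of node indices.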
Detail level (-d)
Three detail levels are available for each node:
- minimal - showing the node number
- simple - showing emission probabilities
- detailed - showing emission and transition probabilities
Emission layout (-e)
Controls display of emission probabilities for detail levels simple and detailed. The selected variant is shown next to the emitted symbol.
- box - fill state of a box
- score - bit score, as a floating point number
- probability - as a floating point number
Output format (-f)
Available output formats are pdf, png, svg and ps. The webservice always generates svg for rendering the preview.
Max. number of alignment entries (-n)
This controls how many entries are displayed for optionally uploaded alignments.
Image size scaling factor (-c)
Scales the resulting image by the given factor. Note that .svg output can easily be rescaled afterwards.
Transition probability cutoff (-t)
Minimum transition probability required for a transition to be displayed.
Output directory path (only cmdline, -o)
Absolute path to the output directory.
Help (only cmdline, --help)
Prints help with all default options and commandline parameters.
Commandline usage
HMMV visualisation for the Piwi protein family, as used for the webservice, can be obtained with the following commandline call:
HMMV -d detailed -m Piwi.hmm -s /home/user/PF02171_seed.txt
HMMCV comparison visualisation for the hammerhead-RNA clan can be computed like this:
HMMCV -d detailed -m /home/user/hammerhead.hmm -s /home/user/hammerhead.stockholm.txt -r /home/user/hammerhead.hmmc -f pdf
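When such calls are generated from a script, the argument list can be assembled programmatically. The following is a minimal sketch under stated assumptions: build_command is a hypothetical helper, not part of the tools, and it uses only the flags and values that appear in this guide (-d, -m, -s, -r, -f). It builds the argument list without running anything.

```python
import shlex
from typing import List, Optional

DETAIL_LEVELS = ("minimal", "simple", "detailed")
OUTPUT_FORMATS = ("pdf", "png", "svg", "ps")


def build_command(tool: str, model: str, detail: str = "detailed",
                  alignment: Optional[str] = None,
                  comparison: Optional[str] = None,
                  output_format: Optional[str] = None) -> List[str]:
    """Assemble an HMMV/HMMCV argument list from the options in this guide."""
    if tool not in ("HMMV", "HMMCV"):
        raise ValueError(f"unknown tool: {tool}")
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail}")
    if output_format is not None and output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format}")
    if tool == "HMMCV" and comparison is None:
        raise ValueError("HMMCV requires a comparison file (-r)")
    argv = [tool, "-d", detail, "-m", model]
    if alignment is not None:
        argv += ["-s", alignment]      # optional Stockholm alignment
    if comparison is not None:
        argv += ["-r", comparison]     # comparison file, as in the HMMCV call
    if output_format is not None:
        argv += ["-f", output_format]
    return argv


if __name__ == "__main__":
    cmd = build_command("HMMCV", "/home/user/hammerhead.hmm",
                        alignment="/home/user/hammerhead.stockholm.txt",
                        comparison="/home/user/hammerhead.hmmc",
                        output_format="pdf")
    print(shlex.join(cmd))  # the list could be passed to subprocess.run instead
```

Passing the list to subprocess.run(cmd, check=True) would execute the call, provided the tool is installed and on the PATH.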
HMMV-Output
For each input model an output file in the requested format is generated. The model name encoded in the file is used as the filename and the requested output format as the file extension (.png, .svg, .ps, .pdf). If Stockholm alignments have been provided, an alignment visualisation with index columns is created for each model. The alignment file name is the model name, followed by ".aln" and then the requested file format extension. The webservice gzips all results and provides a download link. The following is an example result table for the EGF-protein family from Pfam (PF00008).
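The naming scheme can be summarised in a few lines. This is a minimal sketch of one reading of the description, not code taken from the tool: expected_output_files is a hypothetical helper, and the exact placement of the dots (for example "EGF.aln.pdf") is an assumption.

```python
from typing import List

OUTPUT_FORMATS = ("pdf", "png", "svg", "ps")


def expected_output_files(model_name: str, output_format: str,
                          with_alignment: bool = False) -> List[str]:
    """List the file names expected for one model under the scheme described above."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format}")
    # Model visualisation: model name plus format extension.
    files = [f"{model_name}.{output_format}"]
    if with_alignment:
        # Alignment visualisation: model name, ".aln", then the format extension.
        files.append(f"{model_name}.aln.{output_format}")
    return files
```

Such a list is useful for checking that a run produced every expected file in the output directory.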
|Model name||Minimal Model-pdf||Minimal Model-svg||Simple Model-pdf||Simple Model-svg||Detailed Model-pdf||Detailed Model-svg||Alignment-pdf||Alignment-svg|
The webservice also creates a zoomable and pannable preview that can be expanded by clicking. Here the detailed visualisation is shown: