Rank nsSNPs According to Specific Disease Concepts


Overview:


Our disease-specific algorithm ranks protein missense mutations according to seventeen disease concepts. It combines sequence conservation within hidden Markov models (HMMs), which represent the alignment of homologous sequences and conserved protein domains, with "pathogenicity weights", which represent the overall tolerance of the corresponding model to disease concepts, e.g. musculoskeletal and/or metabolic disease.

For more information, please refer to the following publications:

Shihab HA, Gough J, Mort M, Cooper DN, Day INM, Gaunt TR. A Method for Ranking Non-Synonymous Single Nucleotide Polymorphisms based on Disease Concepts (submitted).

Shihab HA, Gough J, Cooper DN, Stenson PD, Barker GLA, Edwards KJ, Day INM, Gaunt TR. (2013). Predicting the Functional, Molecular and Phenotypic Consequences of Amino Acid Substitutions using Hidden Markov Models. Hum. Mutat., 34:57-65.



Input Format:


Our software accepts either of the following formats (to annotate VCF files first, see the VCF Annotation section):

  • <protein> <substitution>
  • dbSNP rs identifiers

In the above, <protein> is the protein identifier and <substitution> is the amino acid substitution in the conventional one-letter format. At present, our server accepts SwissProt/TrEMBL, RefSeq and Ensembl protein identifiers, e.g.:

P43026 L441P
or:

rs137854462
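
As a rough illustration of the two formats above, the following Python sketch classifies one input line before submission. The regular expressions and the function name classify_line are assumptions made for this example and are not the server's own validation rules; the sketch also handles only one substitution per line.

```python
import re

# Assumed patterns for illustration; the server's own validation may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SUBSTITUTION = re.compile(r"^[{0}]\d+[{0}]$".format(AMINO_ACIDS))
RS_ID = re.compile(r"^rs\d+$")


def classify_line(line):
    """Return ("rsid", id) or ("substitution", protein, substitution)."""
    fields = line.split()
    if len(fields) == 1 and RS_ID.match(fields[0]):
        return ("rsid", fields[0])
    if len(fields) == 2 and SUBSTITUTION.match(fields[1]):
        return ("substitution", fields[0], fields[1])
    raise ValueError("unrecognised input line: {!r}".format(line))


print(classify_line("P43026 L441P"))
print(classify_line("rs137854462"))
```

The protein identifier itself is not checked here, since SwissProt/TrEMBL, RefSeq and Ensembl identifiers follow different naming schemes.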



Batch Submission:


Multiple amino acid substitutions can be submitted to our server as a 'Batch Submission'. Here, all amino acid substitutions for a protein are entered on a single line, separated by commas, e.g.:

P43026 L441P
ENSP00000325527 N548I,E1073K,C2307S 

Note: this option is not available when analysing dbSNP rs identifiers.
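
If your substitutions are held as one (protein, substitution) pair per record, a short Python sketch such as the following could group them into batch lines of the form shown above. The function name to_batch_lines is hypothetical and is not part of our software.

```python
def to_batch_lines(pairs):
    """Group (protein, substitution) pairs into one batch line per protein."""
    grouped = {}
    for protein, substitution in pairs:
        # Dictionaries keep insertion order (Python 3.7+), so proteins
        # appear in the order in which they were first seen.
        grouped.setdefault(protein, []).append(substitution)
    return [
        "{} {}".format(protein, ",".join(substitutions))
        for protein, substitutions in grouped.items()
    ]


pairs = [
    ("P43026", "L441P"),
    ("ENSP00000325527", "N548I"),
    ("ENSP00000325527", "E1073K"),
    ("ENSP00000325527", "C2307S"),
]
for line in to_batch_lines(pairs):
    print(line)
```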



Prediction Score:


Our disease-specific predictions are still experimental; therefore, we have not defined clear prediction thresholds for deciding whether or not a mutation is associated with your disease of interest. However, a score below zero indicates that the mutation may be associated with your disease of interest, and lower scores indicate greater confidence in the association.
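
The interpretation above can be sketched in Python as follows: keep the predictions scoring below zero and list them from lowest (most confident) to highest. The scores in this example are invented purely for illustration, and the function name rank_candidates is hypothetical; zero is used only as the guide described above, not as a validated threshold.

```python
def rank_candidates(predictions):
    """Keep predictions scoring below zero, most negative score first."""
    candidates = [(mutation, score) for mutation, score in predictions if score < 0]
    return sorted(candidates, key=lambda item: item[1])


# Invented scores, for illustration only.
predictions = [
    ("P43026 L441P", -2.5),
    ("ENSP00000325527 N548I", 0.75),
    ("ENSP00000325527 E1073K", -0.5),
]
for mutation, score in rank_candidates(predictions):
    print(mutation, score)
```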



VCF Annotation:


Unfortunately, due to disk space constraints, we are unable to annotate Variant Call Format (VCF) files on your behalf. However, the consequences of all VCF variants can be derived using the Ensembl Variant Effect Predictor (VEP). Once the variants are annotated, our parseVCF.py script (available here) can parse these annotations and provide a list of protein consequences, which can then be used as input to our server/software.

Additional help on using our script is available by typing the following command:

python parseVCF.py --help
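
To show the kind of conversion involved, the Python sketch below turns one annotated protein consequence into a line of server input. It is not the parseVCF.py script and does not reproduce its behaviour; it assumes that the annotation supplies an Ensembl protein identifier, a protein position, and a reference/alternate amino acid pair written as N/I, and the function name to_server_input is hypothetical.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def to_server_input(protein_id, position, amino_acids):
    """Return "<protein> <substitution>" for a missense change, else None."""
    ref, sep, alt = amino_acids.partition("/")
    if not sep or not position.isdigit():
        return None  # no amino acid change, or not a single position
    if ref not in AMINO_ACIDS or alt not in AMINO_ACIDS or ref == alt:
        return None  # stop codons, indels and unchanged residues are skipped
    return "{} {}{}{}".format(protein_id, ref, position, alt)


print(to_server_input("ENSP00000325527", "548", "N/I"))
```

Consequences other than single amino acid substitutions are skipped because the server expects substitutions in the one-letter format described under Input Format.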

