
PP Help 01: Introduction



  1. What is PredictProtein?
  2. How does PP work?
  3. What is META PP?
  4. How to use PP and META-PP?
    1. email
    2. www
  5. What can we do for you?
  6. Responsible administrator
  7. If You Do NOT Receive a reply


What is PredictProtein (PP)?

PP is an automatic service for protein database searches and the prediction of aspects of protein structure. You send an amino acid sequence and PP returns:

  1. a multiple sequence alignment (i.e. database search),
  2. ProSite sequence motifs (more info),
  3. low-complexity regions (SEG) (more info),
  4. ProDom domain assignments (more info),
  5. nuclear localisation signals (more info),
  6. and predictions of
    1. secondary structure (more info),
    2. solvent accessibility (more info),
    3. globular regions (more info),
    4. transmembrane helices (more info),
    5. coiled-coil regions (more info),
    6. structural switch regions (more info).
The following features are available upon request:
  1. fold recognition by prediction-based threading (more info):
    PDB is searched for possible remote homologues (sequence identity 0-25%) to your sequence,
  2. evaluation of prediction accuracy (more info):
    for a given predicted and observed secondary structure (for one or several proteins), per-residue and per-segment scores are compiled.
For all services, you can submit your sequence (or prediction) either by electronic mail or interactively via the World Wide Web.

How does PredictProtein work?

Generating an alignment. The following steps are performed.

  1. the sequence database (currently only SWISSPROT) is scanned for similar sequences (by BLASTP),
  2. a multiple sequence alignment is generated by a weighted dynamic programming method (by MaxHom),
  3. ProSite motifs are retrieved from the ProSite database,
  4. low-complexity regions (e.g. composition bias) are marked by the program SEG,
  5. and your protein is compared to a domain database (ProDom).
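
Step 4 above relies on the program SEG. As a rough illustration of what "low complexity" means, the sketch below flags sequence windows whose residue composition has a low Shannon entropy. It is a simplified stand-in and not the SEG algorithm itself; the function names, the window length and the entropy threshold are assumptions chosen for this example.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of one window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_mask(sequence: str, window: int = 12, threshold: float = 2.2) -> str:
    """Replace residues that fall inside any low-entropy window by 'x'.

    `window` and `threshold` are illustrative values (assumptions),
    not parameters taken from the PP server.
    """
    masked = [False] * len(sequence)
    # Slide a fixed-length window along the sequence.
    for start in range(len(sequence) - window + 1):
        if window_entropy(sequence[start:start + window]) <= threshold:
            for i in range(start, start + window):
                masked[i] = True
    return "".join("x" if flag else aa for aa, flag in zip(sequence, masked))
```

A window made of a single repeated residue has entropy 0 and is masked, whereas a window of twelve different residues has entropy log2(12), about 3.58 bits, and is left untouched.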

Prediction of protein structure in 1D. The multiple alignment is used as input for profile-based neural network predictions (PHD methods). The following levels of prediction accuracy have been evaluated in cross-validation experiments:

  1. Secondary structure prediction (PHDsec or PROFsec):
    expected three-state (helix, strand, rest) overall accuracy >72% (PHD) and >76% (PROF) for water-soluble globular proteins. For an automatic, continuous comparison of prediction accuracy with other programs, see EVA.
    You may find details about accuracy in graphs, in tables, and in the literature: Rost 1997 (paper) and 1996 (paper); Rost & Sander 1993 (abstract) and 1994 (abstract).
  2. Solvent accessibility prediction (PHDacc or PROFacc):
    expected correlation between observed and predicted relative accessibility > 0.5.
    You may find details about accuracy in graphs, in tables, and in the literature: Rost 1997 (paper) and 1996 (paper), Rost & Sander 1994 (abstract).
  3. Transmembrane helix prediction (PHDhtm):
    expected overall two-state accuracy (transmembrane, non-transmembrane) > 95%. For the refined prediction of transmembrane helices and topology: expected likelihood of predicting all helices correctly about 89%, expected accuracy of topology prediction > 86%.
    You may find details about accuracy in tables, and in the literature: Rost, Casadio & Fariselli 1996 (abstract), and Rost, Casadio, Fariselli & Sander 1995 (abstract).
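
The accessibility figure in item 2 is a correlation between observed and predicted relative accessibility. The sketch below shows one common way to compute such a value, the Pearson correlation coefficient. Whether the PP evaluation uses exactly this formula is an assumption, and the function name and the input format (two equal-length lists of per-residue values) are hypothetical.

```python
import math


def pearson_correlation(observed, predicted):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    covariance = sum((o - mean_obs) * (p - mean_pred)
                     for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_obs * var_pred)
```

A perfect prediction gives a value of 1, a prediction unrelated to the observation gives a value near 0, so "> 0.5" describes a clearly positive but far from perfect agreement.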

Fold recognition by prediction-based threading. Predictions of secondary structure and accessibility are aligned against PDB to detect remote homologues (prediction-based threading). As with other threading methods, results should be interpreted with caution.

You may find details about accuracy in the literature: Rost, Schneider & Sander, 1996 (paper), Rost 1995 (abstract) and 1994 (abstract).

Evaluation of prediction accuracy. If you opt for 'evaluate prediction accuracy', we evaluate the accuracy of the secondary structure prediction provided by you. The following per-residue and per-segment scores are returned: overall three-state accuracy, single state accuracy, correlation coefficients, information entropy, fractional segment overlap, and finally the accuracy of predicting secondary structure content and structural class (Rost et al., JMB, 1994, 235, 13-26, example for output).
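
To make the per-residue scores more concrete, the sketch below computes three of them for a pair of observed and predicted secondary structure strings: overall three-state accuracy, single-state accuracy, and a per-state correlation coefficient (here the Matthews coefficient). The one-letter state symbols (H, E, L), the function names and the exact formulas are assumptions made for illustration; the definitions used by the server are those given in the cited paper.

```python
import math


def three_state_accuracy(observed: str, predicted: str) -> float:
    """Percentage of residues predicted in the correct state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def single_state_accuracy(observed: str, predicted: str, state: str):
    """Percentage of residues observed in `state` that are predicted in it.

    Returns None if `state` never occurs in the observed string.
    """
    preds = [p for o, p in zip(observed, predicted) if o == state]
    if not preds:
        return None
    return 100.0 * sum(p == state for p in preds) / len(preds)


def state_correlation(observed: str, predicted: str, state: str):
    """Matthews correlation coefficient for one state versus the rest.

    Returns None if the coefficient is undefined (zero denominator).
    """
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if o == state and p == state:
            tp += 1
        elif o != state and p == state:
            fp += 1
        elif o == state and p != state:
            fn += 1
        else:
            tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return None
    return (tp * tn - fp * fn) / denominator
```

For the observed string HHHHEEEELL and the predicted string HHHLEEELLL, eight of ten residues agree, so the overall three-state accuracy is 80%, and three of the four observed helix residues are predicted as helix, so the single-state accuracy for H is 75%.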

What is META-PP?

META-PP provides a single-page interface to various World Wide Web services for sequence analysis (list of servers available at the moment). 'Single-page interface' means that you fill in your sequence only once and can then select any number of services from a list. For each selected service, you will receive the results by email. Currently, the following features of sequence analysis are covered by META-PP:

  1. signal peptides
  2. cleavage sites
  3. O-glycosylation sites
  4. cleavage sites of picornaviral proteases
  5. chloroplast transit peptides and cleavage sites
  6. secondary structure prediction
  7. membrane helix prediction
  8. threading, or remote homology modelling (searching for proteins of known 3D structure that appear structurally similar to your protein)
  9. database searches
  10. homology modelling (prediction of protein 3D structure by homology to a sequence-similar protein of known structure). NOTE: this will only work if there is a protein of known structure that has sufficient sequence similarity to your protein!

How to use PP and META-PP?

Use of the PredictProtein server is free for academics. Commercial users may want to apply for a license.
The use of META-PredictProtein is currently restricted to academic users.

  1. Using email (internet, not for META):
    1. Prepare a file with your sequence(s) according to the required format (see below), and:
      Send sequence(s) to:
    2. Send questions to:
    3. Send problem reports to:
  2. Using World Wide Web (WWW):
    1. Home page:
    2. Help page (this):
    3. Submit request to PP:
    4. Submit request to META-PP:
    5. Questions, feedback:

What can we do for you?

Responsible for PP

Burkhard Rost, CUBIC, Biochemistry, Columbia University, New York
