
PP Help 03: Default input and output

DEFAULT INPUT AND OUTPUT

Default input format (www)

For WWW submissions, simply fill in the appropriate form, and let us know if you have trouble understanding what the form asks you to do.

Please:

make sure that you fill in your correct email address,
paste only the protein sequence into the corresponding box.

Thanks!

Default input format (email)

For email submission create a file with the following content (example):

Your email address (one line)
Commercial users: "password(my_password)"
A one-line description of the protein sequence submitted that must start with the symbol "#".
The amino acid sequence in one-letter code (one or more lines of any length; letters and blanks only, no special symbols or numbers; NOTHING may follow your sequence).

PredictProtein insists that you adhere to this format, because your input file will in general be read only by a computer program. Any additional messages meant to be read by one of us in person should be sent:

by fax via:     Predict-Help at +1-212-305-7932
by email via:   predict_help@columbia.edu
or via WWW:     http://cubic.bioc.columbia.edu/predictprotein/send_feedback.html
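
The email input format described above can be assembled and checked before sending. The following is a minimal sketch, not part of PredictProtein: the function name build_submission, the loose email check, and the error messages are illustrative assumptions. It only enforces the rules stated above (one-line description starting with "#", sequence of letters and blanks only, nothing after the sequence).

```python
import re


def build_submission(email, description, sequence, password=None):
    """Return the text of an email submission in the format described above.

    Hypothetical helper for illustration; PredictProtein does not provide it.
    """
    # Very loose sanity check on the address (assumption, not an official rule).
    if not re.fullmatch(r"[^@\s]+@[^@\s]+", email):
        raise ValueError("email address looks malformed")
    if "\n" in description:
        raise ValueError("description must fit on one line")

    # The sequence may span several lines; only letters and blanks are allowed.
    seq_lines = [line.strip() for line in sequence.splitlines() if line.strip()]
    if not seq_lines:
        raise ValueError("empty sequence")
    for line in seq_lines:
        if not re.fullmatch(r"[A-Za-z ]+", line):
            raise ValueError("sequence may contain letters and blanks only")

    lines = [email]
    if password is not None:
        # Commercial users add a password line of the form password(my_password).
        lines.append(f"password({password})")
    lines.append("# " + description)
    lines.extend(seq_lines)
    # The sequence is the last thing in the file; nothing follows it.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submission(
        "joe@amino.churn.edu",
        "incredulase from paracoccus dementiae, translated from cDNA",
        "KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKD\nWWKVEVNDRQGFVPAAYVKKLD",
    ))
```

Rejecting bad input locally is a design choice for this sketch: since the file is read by a program, a malformed submission is cheaper to catch before it is mailed.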

Default output

The output format is self-documenting. Examples are given for the major prion protein precursor prio_human (all, PHD only) and for HIV protease 1hhp, whose 3D structure is known. The output contains:

A list of likely homologues found in the protein database (SWISSPROT), and the multiple sequence alignment of these sequences (by default in 'MSF' format).
If found: a list of the putative ProSite motifs.
If found: a list of ProDom domain assignments.
If found: a prediction of coiled-coil regions.
Information about the expected levels of accuracy of structure predictions. (We suggest that newcomers read this carefully.)
Prediction of aspects of protein structure. These are grouped in the following way:
1. Prediction of secondary structure for all residues, with an expected average three-state accuracy of > 72%;
2. Prediction of secondary structure for reliably scored residues only, with an expected three-state accuracy for these residues of > 82%;
3. Prediction of solvent accessibility for all residues, with an expected average correlation to the experimentally observed values of 0.54;
4. Prediction of solvent accessibility for reliably scored residues only, with an expected correlation between experimental observation and prediction of 0.69;
5. Prediction of transmembrane helices and their topology (if any detected), with an expected prediction accuracy of about 95% in two states.
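
The three-state accuracy quoted above is commonly computed as the percentage of residues whose predicted state matches the observed one. The text here does not spell out the formula, so the following is a sketch under that assumption; the function name and the state letters (H for helix, E for strand, L for other) are illustrative.

```python
def three_state_accuracy(observed, predicted):
    """Percentage of residues predicted in the same state as observed.

    Both arguments are equal-length strings with one state letter per residue.
    Illustrative sketch; the letters H/E/L are an assumed encoding.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Seven of eight residues agree in this toy example.
    print(three_state_accuracy("HHHEELLL", "HHHEEELL"))
```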

Note: for the prediction of transmembrane helices (HTMs) a conservative threshold is chosen. Thus, your protein may contain an HTM even though none is reported. If you explicitly request the refined prediction of transmembrane helices and topology ("predict htm"), four predictions are given (see the example output):

neural network prediction (expected accuracy for HTMs about 78%);
result of empirical filter (expected accuracy for HTMs about 97%);
refined prediction (expected accuracy for HTMs about 99%);
prediction of topology (expected accuracy about 86%).

Example for input and output (important for email submission)

Submitting a single sequence

INPUT is: your protein sequence,
OUTPUT is: alignment + prediction

INPUT
You send the following file:

joe@amino.churn.edu
# incredulase from paracoccus dementiae, translated from cDNA
KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKD
WWKVEVNDRQGFVPAAYVKKLD

Notes:

The hash (#) is a control character for PredictProtein. It is crucial, because the parser interprets everything after the line that starts with it as protein sequence.
On the same line, following the hash, put a one-line description of the protein.
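
To illustrate why the hash matters, here is a minimal sketch of how a reader of such a file might split it. This is a hypothetical parser written for this explanation, not the actual PredictProtein code; the function name parse_submission and its return shape are assumptions.

```python
def parse_submission(text):
    """Split a submission into (header_lines, description, sequence).

    Hypothetical reader for illustration, not the PredictProtein parser.
    """
    lines = text.splitlines()
    # Find the first line that starts with '#': the description line.
    for i, line in enumerate(lines):
        if line.lstrip().startswith("#"):
            break
    else:
        raise ValueError("no line starting with '#' found")

    # Lines before the hash line: email address and optional password line.
    header = [entry.strip() for entry in lines[:i] if entry.strip()]
    description = lines[i].lstrip()[1:].strip()

    # Everything after the hash line is taken as sequence; blanks are dropped.
    sequence = "".join("".join(lines[i + 1:]).split())
    if not sequence.isalpha():
        raise ValueError("sequence is empty or contains non-letter symbols")
    return header, description, sequence.upper()


if __name__ == "__main__":
    example = (
        "joe@amino.churn.edu\n"
        "# incredulase from paracoccus dementiae, translated from cDNA\n"
        "KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKD\n"
        "WWKVEVNDRQGFVPAAYVKKLD\n"
    )
    header, description, sequence = parse_submission(example)
    print(header, description, len(sequence))
```

In this sketch a signature or comment placed after the sequence would be rejected as non-letter symbols, which is one way to see why nothing may follow the sequence.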

OUTPUT (detailed example)
If your sequence has at least one non-trivial homologue in the database of protein sequences, you receive a multiple sequence alignment and the annotated prediction in the following form:

Block with multiple sequence alignment.
Block with explanations about the prediction method.
Block with prediction (example for secondary structure prediction follows).

    .........1.........2.........3.........4.........5.........6
AA  KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
PHD   EEEEEE                EEEEEE     EEEEEE    EEEE  EEE   
Rel 854777641334566643102441577762566642443213663122112234155
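
A prediction block like the one above can be read column by column: each residue in the AA line has a predicted state in the PHD line and a one-digit reliability in the Rel line. The sketch below assumes the layout shown in this example (a four-character label field, then one column per residue) and uses a reliability cutoff of 5 purely for illustration; the cutoff that defines "reliably scored" residues is not stated here. The function names are hypothetical.

```python
def parse_prediction_block(block):
    """Return a list of (residue, state, reliability) tuples.

    Assumes lines labelled AA, PHD and Rel whose columns start at the
    fifth character, as in the example block. Illustrative sketch only.
    """
    rows = {}
    for line in block.splitlines():
        label = line.partition(" ")[0]
        if label in ("AA", "PHD", "Rel"):
            rows[label] = line[4:]
    aa = rows["AA"].rstrip()
    rel = rows["Rel"].rstrip()
    if len(rel) != len(aa):
        raise ValueError("AA and Rel lines differ in length")
    # Trailing blanks of the PHD line may have been trimmed, so pad it back.
    phd = rows["PHD"][:len(aa)].ljust(len(aa))
    return [(a, s, int(r)) for a, s, r in zip(aa, phd, rel)]


def reliable_residues(rows, min_rel=5):
    """Keep (position, residue, state) where reliability >= min_rel.

    Positions are 1-based. min_rel=5 is an arbitrary illustrative cutoff.
    """
    return [(i + 1, a, s) for i, (a, s, r) in enumerate(rows) if r >= min_rel]


if __name__ == "__main__":
    toy = "AA  KELV\nPHD  EE\nRel 1892\n"
    print(reliable_residues(parse_prediction_block(toy)))
```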
