
PP Help 07: Examples for input formats

Contents

Note: these input format examples matter mainly for users who submit requests by email (rather than filling in the default, advanced, or expert form on the WWW).

  1. Single sequence
  2. FASTA list (unaligned)
  3. PIR list (unaligned)
  4. SAF alignment
  5. MSF alignment
  6. FASTA list (aligned)
  7. PIR list (aligned)
  8. COLUMN (for TOPITS)
  9. COLUMN (for EvalSec)
  10. SWISSPROT identifier



EXAMPLES for input formats (required for email submissions)

Note: the examples of allowed PP input formats matter mainly when you submit the request by email.

Example for input and output (important for email submission)

Submitting a single sequence

INPUT
You send the following file:
joe@amino.churn.edu
# incredulase from paracoccus dementiae, translated from cDNA
KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKD
WWKVEVNDRQGFVPAAYVKKLD

Notes:

OUTPUT (detailed example)
If your sequence has at least one non-trivial homologue in the database of protein sequences, you receive a multiple sequence alignment and the annotated prediction in the following form:

Block with multiple sequence alignment.
Block with explanations about the prediction method.
Block with prediction (example for secondary structure prediction follows).

    .........1.........2.........3.........4.........5.........6
AA  KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
PHD   EEEEEE                EEEEEE     EEEEEE    EEEE  EEE   
Rel 854777641334566643102441577762566642443213663122112234155
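
The prediction block above can be read back into per-residue records. The following is a minimal sketch, not part of PP itself. It assumes that the row labels occupy the first four columns (as in the example), that a blank in the PHD row means "neither helix nor strand" (written here as L), and that trailing blanks of the PHD row may have been lost in transit. The function names and the reliability threshold are hypothetical.

```python
PREFIX_WIDTH = 4  # assumption: labels fill the first four columns, as in the example


def parse_prediction_block(block):
    """Return a list of (residue, state, reliability) tuples from an AA/PHD/Rel block."""
    rows = {}
    for line in block.splitlines():
        label = line[:PREFIX_WIDTH].strip()
        if label in ("AA", "PHD", "Rel"):
            rows[label] = line[PREFIX_WIDTH:]
    seq = rows["AA"].strip()
    # Pad the PHD row: trailing blanks are significant but are easily stripped.
    states = rows["PHD"].rstrip("\n").ljust(len(seq))[:len(seq)]
    rel = rows["Rel"].strip()
    if len(rel) != len(seq):
        raise ValueError("AA and Rel rows differ in length")
    # Assumption: a blank state means neither helix (H) nor strand (E).
    return [
        (aa, state if state != " " else "L", int(r))
        for aa, state, r in zip(seq, states, rel)
    ]


def mask_low_reliability(residues, min_rel=5):
    """Replace states whose reliability is below min_rel with '.'."""
    return "".join(state if rel >= min_rel else "." for _, state, rel in residues)
```

Calling parse_prediction_block on the three labelled rows and then mask_low_reliability on the result gives a string in which only the more reliably predicted residues keep their state. The cutoff of 5 is an arbitrary illustration, not a recommendation from this help page.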


Submitting a set of unaligned sequences (in FASTA format)

INPUT
You send the following file:
joe@amino.churn.edu
# FASTA list incredulase from paracoccus dementiae, translated from cDNA
> Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
> Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:

Notes:

OUTPUT (example)

Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.
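
A request file like the FASTA example in this section can also be assembled by a small script. The sketch below only mirrors the layout shown above (email address, a '#' description line, then one '>' name line and one sequence line per protein). The function name is hypothetical, and any further rules the server may apply (for example on line length) are not covered.

```python
def build_fasta_request(email, description, sequences):
    """Assemble the text of an email request holding an unaligned FASTA list.

    sequences is a list of (name, sequence) pairs in one-letter code.
    """
    lines = [email, f"# FASTA list {description}"]
    for name, seq in sequences:
        if not seq.isalpha():
            raise ValueError(f"sequence {name!r} contains non-letter characters")
        lines.append(f"> {name}")
        lines.append(seq.upper())
    return "\n".join(lines) + "\n"


request = build_fasta_request(
    "joe@amino.churn.edu",
    "incredulase from paracoccus dementiae, translated from cDNA",
    [
        ("Andr_Mouse", "RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT"),
        ("Prgr_Rabit", "QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK"),
    ],
)
```

The resulting string can be pasted into the body of the email. The letter check is a simple guard against stray digits or gap characters in input that is meant to be unaligned.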


Submitting a set of unaligned sequences (in PIR format)

INPUT
You send the following file:
joe@amino.churn.edu
# PIR list incredulase from paracoccus dementiae, translated from cDNA
>P1;
Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
>P1;
Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:

Notes:

OUTPUT (example)

Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.


Submitting your alignment (in SAF format)

Note: for non-experts, I strongly recommend this format (rather than the MSF format).

INPUT
You send the following file:
joe@amino.churn.edu
# SAF incredulase from paracoccus dementiae, translated from cDNA
Andr_Human RQLVHVVKWA KALPGFRNLH VDDQMAVIQY SWMGLMVFAM GWRSFT
Prgr_Rabit .QLLSVVKWS KSLPGFRNLH IDDQITLIQY SWMSLMVFGL GWRSYK

Notes:

OUTPUT (example)

Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.


Submitting your alignment (in MSF format)

Note: for non-experts, I strongly recommend the SAF format instead (see above).

INPUT
You send the following file:
joe@amino.churn.edu
# MSF incredulase from paracoccus dementiae, translated from cDNA
MSF of: x.hssp from: 1 to: 176
x.msf MSF: 176 Type: P 11-Oct-93 21:17:4 Check: 5859 ..
Name: Andr_Human Len: 176 Check: 750 Weight: 1.00
Name: Prgr_Rabit Len: 176 Check: 3980 Weight: 1.00
//
Andr_Human RQLVHVVKWA KALPGFRNLH VDDQMAVIQY SWMGLMVFAM GWRSFT
Prgr_Rabit .QLLSVVKWS KSLPGFRNLH IDDQITLIQY SWMSLMVFGL GWRSYK

Notes:

OUTPUT (example)

Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.


Submitting a set of aligned sequences (in FASTA format)

INPUT
You send the following file:
joe@amino.churn.edu
do NOT align
# FASTA list incredulase from paracoccus dementiae, translated from cDNA
> Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
> Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:

Notes:

OUTPUT (example)

Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.


Submitting a set of aligned sequences (in PIR format)

INPUT
You send the following file:
joe@amino.churn.edu
do NOT align
# PIR list incredulase from paracoccus dementiae, translated from cDNA
>P1;
Andr_Mouse
RQLVHVVKWAKALPGFRNLHVDDQMAVIQYSWMGLMVFAMGWRSFT
>P1;
Prgr_Rabit
QLLSVVKWSKSLPGFRNLHIDDQITLIQYSWMSLMVFGLRSYK
:

Notes:

OUTPUT (example)

Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.


Submitting 1D structure prediction for fold recognition (in COLUMN format)

INPUT
You send the following file:
joe@amino.churn.edu
prediction-based threading
# COLUMN format
AA PSEC PACC RI_SEC RI_ACC
E L 11 9 6
F E 7 9 0
: : : : :
V H 61 3 0
L H 113 3 0
R H 39 1 1
: : : : :
P L 17 9 4
A L 89 9 2
: : : : :
  1. Column delimiters: spaces, commas, and tabs are allowed.
  2. Compulsory information: (1) sequence (AA) in one-letter code; (2) secondary structure (PSEC) in one of the states H=helix, E=strand, or L=rest; (3) solvent accessibility (PACC) in square Angstrom (note: for prediction-based threading, accessibility is converted to relative accessibility in two states: buried (<15%) or exposed (≥15%)).
  3. Optional: (1) reliability, or strength, of the secondary structure prediction (RI_SEC), scaled from 0 (low) to 9 (high); (2) reliability, or strength, of the relative accessibility prediction (RI_ACC), scaled from 0 (low) to 9 (high).
  4. Notes:
    • The string '# COLUMN format' is crucial, as the parser interprets anything after this line as a prediction.
    • To receive a PHD prediction in this format, use the output option 'return COLUMN format'.
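
The rules above can be checked locally before a file is sent. The following is a minimal sketch of such a reader, not the parser used by PP. It assumes that the first non-empty line after the '# COLUMN format' marker is the header row, as in the example, and the function name is hypothetical.

```python
import re

SPLIT = re.compile(r"[,\s]+")  # spaces, commas, and tabs all delimit columns
INT_COLUMNS = ("PACC", "RI_SEC", "RI_ACC")


def parse_column_format(text):
    """Return one dict per residue from the part after '# COLUMN format'."""
    lines = [ln.strip() for ln in text.splitlines()]
    try:
        start = lines.index("# COLUMN format")
    except ValueError:
        raise ValueError("missing '# COLUMN format' marker") from None
    body = [ln for ln in lines[start + 1:] if ln]
    if not body:
        raise ValueError("no header row after the marker")
    header = SPLIT.split(body[0])  # assumption: first row names the columns
    records = []
    for ln in body[1:]:
        fields = SPLIT.split(ln)
        if len(fields) != len(header):
            raise ValueError(f"wrong number of columns in: {ln!r}")
        rec = dict(zip(header, fields))
        if rec.get("PSEC") not in {"H", "E", "L"}:
            raise ValueError(f"PSEC must be H, E, or L in: {ln!r}")
        for key in INT_COLUMNS:
            if key in rec:
                rec[key] = int(rec[key])
        records.append(rec)
    return records
```

Everything before the marker (email address, keywords) is ignored, which matches the note that only the lines after '# COLUMN format' are read as a prediction.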
OUTPUT (example)
Block with ProDom domain assignment (if found).
Block with ProSite motif (if found).
Block with predictions of coiled-coil regions (if found).
Block with explanations about the prediction method.
Block with prediction.


Submitting secondary structure for evaluation of prediction accuracy (in COLUMN format)

INPUT
You send the following file:
joe@amino.churn.edu
evaluate prediction accuracy
# COLUMN format
NAME AA PSEC OSEC
first M L L
first Q L L
first T L H
first S H H
first S H H
first I H H
: : : :
second G L L
second V L L
second K E L
second S L H
second I L H
: : : :

  1. Column delimiters: spaces, commas, and tabs are allowed.
  2. Compulsory information: (1) sequence (AA) in one-letter code; (2) predicted secondary structure (PSEC) in one of the states H=helix, E=strand, or L=rest; (3) observed secondary structure (OSEC) in one of the states H=helix, E=strand, or L=rest (e.g. from a DSSP assignment); (4) if more than one protein is used, simply append all requested proteins (in that case, make sure that the first column (NAME) gives a unique name for each protein).
  3. Optional: (1) name of protein (compulsory for more than one protein).
  4. Note: your email address is required. The string '# COLUMN format' is crucial, as the parser interprets anything after this line as a prediction.
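
To illustrate what a per-residue score computed from the PSEC and OSEC columns can look like, the sketch below counts, for each protein, the fraction of residues whose predicted state equals the observed state (often called Q3 for three states). This is only an illustration under that assumption. The text here does not say which scores the server reports, and the function name is hypothetical.

```python
from collections import defaultdict


def per_residue_accuracy(rows):
    """Fraction of residues with PSEC == OSEC, per protein name.

    rows is an iterable of (name, aa, psec, osec) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for name, _aa, psec, osec in rows:
        total[name] += 1
        if psec == osec:
            correct[name] += 1
    return {name: correct[name] / total[name] for name in total}


# The residues listed explicitly in the example above (elided rows omitted).
example_rows = [
    ("first", "M", "L", "L"), ("first", "Q", "L", "L"), ("first", "T", "L", "H"),
    ("first", "S", "H", "H"), ("first", "S", "H", "H"), ("first", "I", "H", "H"),
    ("second", "G", "L", "L"), ("second", "V", "L", "L"), ("second", "K", "E", "L"),
    ("second", "S", "L", "H"), ("second", "I", "L", "H"),
]
scores = per_residue_accuracy(example_rows)
```

Keying the counts on the NAME column is the reason the names must be unique when several proteins are appended in one file.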
OUTPUT (example)
Block with definition of scores for prediction accuracy.
Tables with per-residue and per-segment prediction accuracy.


Submitting a single sequence through its SWISSPROT identifier

INPUT
You send the following file:
joe@amino.churn.edu
# SWISSid paho_chick

Notes:




