Example for EVALSEC (default output)


The output consists of the following parts:



The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

rost
evaluate prediction accuracy
return concise
# COL format
NAME     AA  PSEC OSEC   RI_SEC
first    M   L    L     9
first    Q   L    L     9
first    T   L    H     8
first    S   H    H     4
first    E   H    H     9
first    R   H    H     8
first    L   H    H     7
first    A   L    H     4
first    G   L    L     8
first    V   L    L     9
first    K   L    L     3
first    Q   H    H     6
first    Q   H    H     9
first    S   H    H     9
first    I   H    H     9
second   G   L    L     8
second   V   L    L     9
second   K   E    L     3
second   Q   E    H     6
second   Q   E    H     9
second   S   L    H     9
second   I   L    H     9
________________________________________________________________________________
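
The per-protein tables A(i,j) shown further down can be rebuilt from this
input. The following is a minimal sketch, not the server's own code. It
assumes that PSEC holds the predicted state and OSEC the observed state (an
interpretation inferred from the tables below), and the helper names tally
and matrix are hypothetical.

```python
from collections import defaultdict

STATES = "HEL"  # row/column order used in the DAT tables


def tally(col_text):
    """Return {protein name: {(observed, predicted): count}} from COL-format text."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in col_text.splitlines():
        fields = line.split()
        # Skip comment lines, the header row, and anything that is not a 5-column row.
        if len(fields) != 5 or line.startswith("#") or fields[0] == "NAME":
            continue
        name, _aa, prd, obs, _ri = fields  # assumption: PSEC = predicted, OSEC = observed
        counts[name][(obs, prd)] += 1
    return counts


def matrix(pairs):
    """3x3 list A[i][j]: observed state i (rows), predicted state j (columns)."""
    return [[pairs.get((o, p), 0) for p in STATES] for o in STATES]


example = """\
# COL format
NAME     AA  PSEC OSEC   RI_SEC
second   G   L    L     8
second   V   L    L     9
second   K   E    L     3
second   Q   E    H     6
second   Q   E    H     9
second   S   L    H     9
second   I   L    H     9
"""

A_second = matrix(tally(example)["second"])
print(A_second)
```

For the rows of protein "second" this gives the same counts as the NUMBERS
table printed for SECOND.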

The resulting scores for accuracy are:   	
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

SYM  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SYM  Abbreviations for accuracy of secondary structure prediction
SYM  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SYM
SYM          For an explanation of the scores, please see:
SYM          per-residue accuracy: Rost & Sander, JMB, 1993, 232, 584-599
SYM          per-segment accuracy: Rost et al., JMB, 1994, 235, 13-26
SYM
SYM  H, E, L:   helix (H), extended strand (E), all others (L = loop)
SYM
SYM  obs, prd:  observed, predicted
SYM
SYM  ~~~~~~~~~~~~~~~~~~
SYM  Per-residue scores
SYM  ~~~~~~~~~~~~~~~~~~
SYM
SYM  A(i,j): number of residues observed in secondary structure state i and
SYM          predicted in secondary structure state j, where i and j can
SYM          each be one of the following: helix (H), strand (E), or other (L)
SYM
SYM            number of residues correctly predicted in state i
SYM  Q(i)obs = ------------------------------------------------- * 100
SYM            number of residues observed in state i
SYM
SYM            number of residues correctly predicted in state i
SYM  Q(i)prd = ------------------------------------------------- * 100
SYM            number of residues predicted in state i
SYM
SYM  Q3      overall three-state per-residue accuracy (three states: H,E,L)
SYM          defined by:
SYM            number of residues correctly predicted
SYM          = --------------------------------------- * 100
SYM            number of all residues
SYM
SYM  BAD     percentage of residues predicted in helix, observed in strand
SYM          or predicted in strand and observed in helix
SYM
SYM  OVER    percentage of residues predicted in helix or strand, and ob-
SYM          served in loop
SYM
SYM  UNDER   percentage of residues predicted in loop, and observed in
SYM          helix or strand
SYM
SYM  Iobs    information entropy contained in matrix A(i,j), defined by:
SYM
SYM              SUM                    SUM
SYM              SUM   a(i)*ln a(i)  -  SUM  A(i,j) * ln A(i,j)
SYM              SUM                    SUM
SYM               i                      ij
SYM          =  ________________________________________________
SYM
SYM                                 SUM
SYM                    N * ln N  -  SUM  b(i) * ln b(i)
SYM                                 SUM
SYM                                  i
SYM
SYM          where N is the number of residues, a(i) the number of residues
SYM          predicted to be in secondary structure i; b(i) the number of re-
SYM          sidues observed to be in i; and A(i,j) the number of residues
SYM          predicted to be in i and observed to be in j.
SYM
SYM  Iprd    information entropy but weighted by the predicted numbers, i.e.,
SYM          same as Iobs by exchanging b(i) <-> a(i).
SYM
SYM  COR(i)  Matthews correlation coefficient for structure i
SYM
SYM  /prot
SYM          overall three-state accuracy averaged over proteins (as opposed
SYM          to residues).
SYM
SYM  Dcontent(i)
SYM          difference between observed and predicted content of secondary
SYM          structure type i (percentage)
SYM
SYM
SYM  ~~~~~~~~~~~~~~~~~~
SYM  Per-segment scores
SYM  ~~~~~~~~~~~~~~~~~~
SYM
SYM  avL(i)obs   average length for the structure type i as observed
SYM          e.g., average length of an observed helix
SYM
SYM  avL(i)prd   average length for the structure type i as predicted
SYM          e.g., average length of a predicted helix
SYM
SYM  SOV(i)obs
SYM  SOV(i)prd
SYM          fractional overlap (in percentage) between segments predicted
SYM          and observed in structure type i, defined by:
SYM
SYM            SUM   1     MINOV(S1;S2) + DELTA
SYM  SOV(i)  = SUM   -  *  --------------------  *  LEN(S1)
SYM            SUM   N         MAXOV(S1;S2)
SYM             S
SYM
SYM          where N is the total number of residues, S1 and S2 are the
SYM          observed and predicted secondary structure segments (in state
SYM          i), and LEN(S1) is the number of residues in segment S1.
SYM          The sum (SUM) is taken over all segment pairs S={S1,S2}.
SYM          The actual overlap between the two segments is MINOV, i.e., the
SYM          number of residues for which both segments have, e.g., an H
SYM          (helix) in common; MAXOV is the total extent of both segments,
SYM          i.e., the number of residues for which either of the two has,
SYM          say, the assigned state H.  The accepted variation DELTA assures
SYM          a ratio of 1.0 when there are only minor deviations at segment
SYM          ends; it is chosen to be smaller than MINOV and smaller than
SYM          half the length of segment S1.  The ratio MINOV/MAXOV is
SYM          constrained to a maximum value of 1.0, i.e., the allowance
SYM          cannot lead to a "more than perfect" value of fractional overlap
SYM  obs     for any segment comparison.  The addition of 'obs' (SOV(i)obs)
SYM          indicates that the length of the observed segments was used for
SYM          weighting (likelihood that an observed segment is correctly
SYM  prd     predicted), i.e., S1 is the observed segment. In contrast, 'prd'
SYM          labels the weighting by the length of the predicted segments
SYM          (likelihood that a predicted segment is correct).
SYM
SYM  SOV3    fractional segment overlap for all three states H, E, L
SYM
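
The per-residue definitions above translate directly into arithmetic on the
matrix A(i,j). The sketch below is illustrative only: the function name
per_residue_scores is hypothetical, UNDER is taken as "predicted loop,
observed helix or strand", and scores with a zero denominator are reported
as 0 (an assumption that matches the zero rows in the sample tables).

```python
import math

STATES = "HEL"  # index 0 = helix, 1 = strand, 2 = loop


def per_residue_scores(A):
    """A[i][j]: residues observed in state i and predicted in state j."""
    n = sum(map(sum, A))
    obs = [sum(row) for row in A]                          # row sums
    prd = [sum(A[i][j] for i in range(3)) for j in range(3)]  # column sums

    def pct(num, den):
        # Assumption: undefined ratios are reported as 0.
        return 100.0 * num / den if den else 0.0

    scores = {
        "Q3": pct(sum(A[i][i] for i in range(3)), n),
        "BAD": pct(A[0][1] + A[1][0], n),    # helix/strand confusions
        "OVER": pct(A[2][0] + A[2][1], n),   # observed loop, predicted H or E
        "UNDER": pct(A[0][2] + A[1][2], n),  # observed H or E, predicted loop
    }
    for i, s in enumerate(STATES):
        tp = A[i][i]
        fn = obs[i] - tp
        fp = prd[i] - tp
        tn = n - tp - fn - fp
        den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        scores["Q" + s + "obs"] = pct(tp, obs[i])
        scores["Q" + s + "prd"] = pct(tp, prd[i])
        # Matthews correlation coefficient for state s
        scores["COR" + s] = (tp * tn - fp * fn) / den if den else 0.0
    return scores


# Matrix printed for protein FIRST in the output below.
first = per_residue_scores([[8, 0, 2], [0, 0, 0], [0, 0, 5]])
print(round(first["Q3"], 1), round(first["UNDER"], 1), round(first["CORH"], 2))
```

Rounded to the precision of the tables, these values agree with the Q3,
UNDER and COR(H) entries reported for FIRST.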

#
#
# Prediction accuracy for FIRST
#
#
# A(i,j): number of residues observed in state i, predicted in j:
#
DAT +---------+---------+---------+---------+---------+
DAT | NUMBERS |  prd  H |  prd  E |  prd  L | obs Sum |
DAT +---------+---------+---------+---------+---------+
DAT |  obs  H |       8 |       0 |       2 |      10 |
DAT |  obs  E |       0 |       0 |       0 |       0 |
DAT |  obs  L |       0 |       0 |       5 |       5 |
DAT +---------+---------+---------+---------+---------+
DAT | prd Sum |       8 |       0 |       7 |      15 |
DAT +---------+---------+---------+---------+---------+
#
# Per-residue and Per-segment scores:
#
DAT +---------------------------------+ +---------------------------------------+
DAT |        Per-residue scores       | |           Per-segment scores          |
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
DAT | SCORES  |Q(i)obs|Q(i)prd| COR(i)| |SOV(i)obs|SOV(i)prd|avL(i)obs|avL(i)prd|
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
DAT | i =  H  |    80 |   100 |  0.76 | |   100.0 |   100.0 |     5.0 |     4.0 |
DAT | i =  E  |     0 |     0 |  0.00 | |     0.0 |     0.0 |     0.0 |     0.0 |
DAT | i =  L  |   100 |    71 |  0.76 | |   100.0 |   100.0 |     2.5 |     3.5 |
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
#
# Overall scores:
#
DAT +---------------------------------+ +---------------------------------------+
DAT |   Overall per-residue scores    | |       Overall per-segment scores      |
DAT +-------+--------+-------+--------+ +---------+---------+---------+---------+
DAT | OVER  |   0.0  | UNDER |  13.3  | |                                       |
DAT | I obs |   0.48 | I prd |   0.48 | |                                       |
DAT |  Q3   |  86.7  |  BAD  |   0.0  | | SOV3obs |  100.0  | SOV3prd |  100.0  |
DAT +-------+========+-------+--------+ +---------+=========+---------+---------+
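
The avL(i) columns can be checked by hand: a segment is an uninterrupted run
of one state. The sketch below uses the OSEC and PSEC columns of protein
"first" written out as strings; the function name average_segment_length is
hypothetical, and returning 0.0 when a state never occurs is an assumption
taken from the zero entries in the table.

```python
from itertools import groupby


def average_segment_length(states, state):
    """Mean length of the uninterrupted runs of `state` in a string like 'LLHHHHLL'."""
    runs = [len(list(group)) for key, group in groupby(states) if key == state]
    return sum(runs) / len(runs) if runs else 0.0


# Columns of protein "first" from the input, read top to bottom.
observed = "LLHHHHHHLLLHHHH"   # OSEC
predicted = "LLLHHHHLLLLHHHH"  # PSEC

for state in "HEL":
    print(state,
          average_segment_length(observed, state),
          average_segment_length(predicted, state))
```

The observed helices have lengths 6 and 4 and the predicted helices 4 and 4,
which gives the averages 5.0 and 4.0 listed under avL(i)obs and avL(i)prd.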
#
#
# Prediction accuracy for SECOND
#
#
# A(i,j): number of residues observed in state i, predicted in j:
#
DAT +---------+---------+---------+---------+---------+
DAT | NUMBERS |  prd  H |  prd  E |  prd  L | obs Sum |
DAT +---------+---------+---------+---------+---------+
DAT |  obs  H |       0 |       2 |       2 |       4 |
DAT |  obs  E |       0 |       0 |       0 |       0 |
DAT |  obs  L |       0 |       1 |       2 |       3 |
DAT +---------+---------+---------+---------+---------+
DAT | prd Sum |       0 |       3 |       4 |       7 |
DAT +---------+---------+---------+---------+---------+
#
# Per-residue and Per-segment scores:
#
DAT +---------------------------------+ +---------------------------------------+
DAT |        Per-residue scores       | |           Per-segment scores          |
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
DAT | SCORES  |Q(i)obs|Q(i)prd| COR(i)| |SOV(i)obs|SOV(i)prd|avL(i)obs|avL(i)prd|
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
DAT | i =  H  |     0 |     0 |  0.00 | |     0.0 |     0.0 |     4.0 |     0.0 |
DAT | i =  E  |     0 |     0 |  0.00 | |     0.0 |     0.0 |     0.0 |     3.0 |
DAT | i =  L  |    66 |    50 |  0.17 | |   100.0 |    50.0 |     3.0 |     2.0 |
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
#
# Overall scores:
#
DAT +---------------------------------+ +---------------------------------------+
DAT |   Overall per-residue scores    | |       Overall per-segment scores      |
DAT +-------+--------+-------+--------+ +---------+---------+---------+---------+
DAT | OVER  |  14.3  | UNDER |  28.6  | |                                       |
DAT | I obs |  -0.38 | I prd |  -0.38 | |                                       |
DAT |  Q3   |  28.6  |  BAD  |  28.6  | | SOV3obs |   42.9  | SOV3prd |   28.6  |
DAT +-------+========+-------+--------+ +---------+=========+---------+---------+
#
#
# Prediction accuracy for Average over all residues
#
#
# A(i,j): number of residues observed in state i, predicted in j:
#
DAT +---------+---------+---------+---------+---------+
DAT | NUMBERS |  prd  H |  prd  E |  prd  L | obs Sum |
DAT +---------+---------+---------+---------+---------+
DAT |  obs  H |       8 |       2 |       4 |      14 |
DAT |  obs  E |       0 |       0 |       0 |       0 |
DAT |  obs  L |       0 |       1 |       7 |       8 |
DAT +---------+---------+---------+---------+---------+
DAT | prd Sum |       8 |       3 |      11 |      22 |
DAT +---------+---------+---------+---------+---------+
#
# Per-residue and Per-segment scores:
#
DAT +---------------------------------+ +---------------------------------------+
DAT |        Per-residue scores       | |           Per-segment scores          |
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
DAT | SCORES  |Q(i)obs|Q(i)prd| COR(i)| |SOV(i)obs|SOV(i)prd|avL(i)obs|avL(i)prd|
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
DAT | i =  H  |    57 |   100 |  0.57 | |    71.4 |   100.0 |     4.7 |     4.0 |
DAT | i =  E  |     0 |     0 |  0.00 | |     0.0 |     0.0 |     0.0 |     3.0 |
DAT | i =  L  |    87 |    63 |  0.57 | |   100.0 |    81.8 |     2.7 |     2.8 |
DAT +---------+-------+-------+-------+ +---------+---------+---------+---------+
#
# Overall scores:
#
DAT +---------------------------------+ +---------------------------------------+
DAT |   Overall per-residue scores    | |       Overall per-segment scores      |
DAT +-------+--------+-------+--------+ +---------+---------+---------+---------+
DAT | OVER  |   4.5  | UNDER |  18.2  | |                                       |
DAT | I obs |   0.27 | I prd |   0.27 | |                                       |
DAT |  Q3   |  68.2  |  BAD  |   9.1  | | SOV3obs |   81.8  | SOV3prd |   77.3  |
DAT +-------+========+-------+--------+ +---------+=========+---------+---------+
#
# Per-residue accuracy averaged over all     2 proteins:
#
+---------------------+---------------------------------+
| /prot  =  57.62 | one standard deviation =  57.62 |
+---------------------+---------------------------------+
#
# Accuracy of predicting secondary structural content:
#
DAT +---------------------+---------------------------------+
DAT | Dcontent H =  35.24 | one standard deviation =  35.24 |
DAT | Dcontent E =  21.43 | one standard deviation =  21.43 |
DAT +---------------------+---------------------------------+
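
The per-protein averages above can be reproduced from the two matrices
A(i,j). This is a sketch under two assumptions: Dcontent is taken as the
absolute difference between observed and predicted content, and the
"one standard deviation" column is not recomputed here. The function names
are hypothetical.

```python
from statistics import mean

# A[i][j] for each protein, copied from the NUMBERS tables (order H, E, L).
matrices = {
    "first": [[8, 0, 2], [0, 0, 0], [0, 0, 5]],
    "second": [[0, 2, 2], [0, 0, 0], [0, 1, 2]],
}


def q3(A):
    """Three-state per-residue accuracy of one protein, in percent."""
    return 100.0 * sum(A[i][i] for i in range(3)) / sum(map(sum, A))


def content_difference(A, i):
    """Absolute difference (percent) between observed and predicted content of state i."""
    n = sum(map(sum, A))
    observed = sum(A[i])                 # row sum
    predicted = sum(row[i] for row in A)  # column sum
    return abs(100.0 * observed / n - 100.0 * predicted / n)


q3_per_protein = mean(q3(A) for A in matrices.values())
dcontent_h = mean(content_difference(A, 0) for A in matrices.values())
dcontent_e = mean(content_difference(A, 1) for A in matrices.values())
print(round(q3_per_protein, 2), round(dcontent_h, 2), round(dcontent_e, 2))
```

Averaging per protein weights the 7-residue protein as heavily as the
15-residue one, which is why the /prot value differs from the Q3 of 68.2
computed over all residues.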
#
# Accuracy of predicting secondary structural class:
#
#        Sorting into structure class according to
#        Zhang, C.-T. and Chou, K.-C., Prot. Sci. 1:401-408, 1992:
#           all-H: percentage of H >= 45% , percentage of E <  5%
#           all-E: percentage of H <   5% , percentage of E >=45%
#           mix  : percentage of H >= 30% , percentage of E >=20%
#
DAT +-------+-------+-------+-------+-------+-------+
DAT |       |  sum  |  sum  |  sum  |   Q   |   Q   |
DAT | class |  obs  |  prd  |correct| %obs  | %prd  |
DAT +-------+-------+-------+-------+-------+-------+
DAT | all-H |    2  |    1  |    1  |  50.0 | 100.0 |
DAT | all-E |    0  |    0  |    0  |   0.0 |   0.0 |
DAT | mix   |    0  |    0  |    0  |   0.0 |   0.0 |
DAT | other |    0  |    1  |    0  |   0.0 |   0.0 |
DAT +-------+-------+-------+-------+-------+-------+
DAT | SUM   |    4  |    4  |    2  |  50.0 |  50.0 |
DAT +-------+-------+-------+-------+-------+-------+
END
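
The class thresholds quoted above can be written as a small decision
function. This is an illustrative sketch, not the program's own code; the
name structure_class is hypothetical, and anything that matches none of the
three rules is labelled "other", as in the table.

```python
def structure_class(pct_h, pct_e):
    """Sort a protein into a structural class from its helix and strand content (percent)."""
    if pct_h >= 45 and pct_e < 5:
        return "all-H"
    if pct_h < 5 and pct_e >= 45:
        return "all-E"
    if pct_h >= 30 and pct_e >= 20:
        return "mix"
    return "other"


# Content of protein "second": 4 of 7 residues observed in helix,
# 3 of 7 residues predicted in strand and none predicted in helix.
observed_class = structure_class(100.0 * 4 / 7, 0.0)
predicted_class = structure_class(0.0, 100.0 * 3 / 7)
print(observed_class, predicted_class)
```

Here the predicted strand content of about 42.9% stays below the 45%
threshold for all-E, so the prediction for "second" falls into "other" while
the observed structure is all-H.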