Example of the default output from PredictProtein: MaxHom and PHD

The following information has been received by the server:          
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~          
________________________________________________________________________________

b.rost
EMBL, 69012 Heidelberg, Europe
rost@embl-heidelberg.de
# CYTOCHROME C OXIDASE POLYPEPTIDE I (cox1_parde)
MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ
YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP
DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI
FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT
MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF
GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW
GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF
AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW
NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET
LPKPEDWDRAQAHR
________________________________________________________________________________
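
The submission shown above appears to consist of sender lines (name, address, e-mail), a description line starting with '#', and the sequence in one-letter code. A minimal Python sketch of splitting such a submission into its parts is given here. The function name parse_submission and the rule "everything before the '#' line is sender information" are assumptions made for illustration, not the server's documented parser. The sequence in the sketch is abridged to its first and last lines.

```python
def parse_submission(text):
    """Split a submission into sender lines, description, and sequence.

    Assumption (not taken from the server documentation): all non-empty
    lines before the first line starting with '#' are sender information,
    and all lines after it belong to the sequence.
    """
    sender, description, seq_parts = [], None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if description is None:
            if line.startswith("#"):
                description = line.lstrip("#").strip()
            else:
                sender.append(line)
        else:
            # Drop any internal blanks so the residues form one string.
            seq_parts.append("".join(line.split()))
    return sender, description, "".join(seq_parts)


# Abridged copy of the example submission (first and last sequence line only).
example = """b.rost
EMBL, 69012 Heidelberg, Europe
rost@embl-heidelberg.de
# CYTOCHROME C OXIDASE POLYPEPTIDE I (cox1_parde)
MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ
LPKPEDWDRAQAHR
"""

sender, description, sequence = parse_submission(example)
print(sender)
print(description)
print(len(sequence))
```

Applied to the complete submission, the same function should return all ten sequence lines joined into a single string.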




The sequence has been interpreted as being:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

________________________________________________________________________________

>P1; t2
(#)  cytochrome c oxidase polypeptide i (cox1_parde)
MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ
YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP
DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI
FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT
MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF
GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW
GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF
AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW
NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET
LPKPEDWDRAQAHR
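
The interpreted sequence above resembles a PIR/NBRF-style record: a line starting with '>', a description line, then the residues. A hedged Python sketch for reading such a record is given here. The name parse_pir_like is hypothetical, and the handling of a trailing '*' is an assumption based on the usual PIR convention (no terminator appears in the record above). The sequence is again abridged to its first and last lines.

```python
def parse_pir_like(text):
    """Read one PIR-like record: '>CODE; id', a description line, residues.

    Assumption: the record contains exactly one entry. A trailing '*'
    (the usual PIR terminator) is removed if present.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a first line starting with '>'")
    # '>P1; t2' -> code 'P1', identifier 't2'
    code, _, identifier = lines[0][1:].partition(";")
    title = lines[1] if len(lines) > 1 else ""
    residues = "".join("".join(ln.split()) for ln in lines[2:]).rstrip("*")
    return code.strip(), identifier.strip(), title, residues


# Abridged copy of the record above (first and last sequence line only).
record = """>P1; t2
(#)  cytochrome c oxidase polypeptide i (cox1_parde)
MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ
LPKPEDWDRAQAHR
"""

code, identifier, title, residues = parse_pir_like(record)
print(code, identifier, len(residues))
```

Comparing the residues returned here with the sequence that was submitted is one simple way to confirm that the server read the input as intended.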



The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

________________________________________________________________________________

--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
--- 
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY 
--- ID           : identifier of aligned (homologous) protein
--- STRID        : PDB identifier (only for known structures)
--- PIDE         : percentage of pairwise sequence identity
--- WSIM         : percentage of weighted similarity
--- LALI         : number of residues aligned
--- NGAP         : number of insertions and deletions (indels)
--- LGAP         : number of residues in all indels
--- LSEQ2        : length of aligned sequence
--- ACCNUM       : SwissProt accession number
--- NAME         : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID         STRID  IDE WSIM LALI NGAP LGAP LEN2 ACCNUM NAME
cox1_parde        100  100  554    0    0  554 P08305 SUBUNIT 1).
cox1_rhosh         78   84  550    3   13  565 P33517 SUBUNIT 1).
cox1_scapl         67   76  153    2    3  157 P29654 CYTOCHROME C OXIDASE POLY
cox1_gomva         67   76  152    2    3  155 P29646 CYTOCHROME C OXIDASE POLY
cox1_polsp         67   76  155    2    3  158 P29650 CYTOCHROME C OXIDASE POLY
cox1_lepsp         67   76  154    2    3  157 P29644 CYTOCHROME C OXIDASE POLY
cox1_megat         67   76  157    2    3  160 P29648 CYTOCHROME C OXIDASE POLY
cox1_lepoc         67   76  157    2    3  163 P29647 CYTOCHROME C OXIDASE POLY
cox1_pomni         66   76  158    2    3  161 P29652 CYTOCHROME C OXIDASE POLY
cox1_saltr         64   76  107    1    1  109 P29653 CYTOCHROME C OXIDASE POLY
cox1_geosd         62   72  149    2    3  152 P29645 CYTOCHROME C OXIDASE POLY
cox1_braja         61   70  513    4   20  541 P31833 CYTOCHROME C OXIDASE POLY
cox1_panbu         61   72  181    2    3  184 P29649 CYTOCHROME C OXIDASE POLY
cox1_prowi         61   69  506    6   27  514 Q05143 CYTOCHROME C OXIDASE POLY
cox1_polsx         60   71  181    2    3  184 P29651 CYTOCHROME C OXIDASE POLY
cox1_amica         60   72  185    2    3  188 P29643 CYTOCHROME C OXIDASE POLY
cox1_marpo         60   69  512    5   25  522 P26856 CYTOCHROME C OXIDASE POLY
cox1_maize         60   69  511    5   27  528 P08742 CYTOCHROME C OXIDASE POLY
cox1_orysa         60   69  511    5   27  524 P14578 CYTOCHROME C OXIDASE POLY
cox1_wheat         60   69  511    5   27  524 P08741 CYTOCHROME C OXIDASE POLY
cox1_sorbi         59   69  519    5   27  530 P05502 CYTOCHROME C OXIDASE POLY
cox1_betvu         59   66  505    6   27  516 P24794 CYTOCHROME C OXIDASE POLY
cox1_soybn         57   66  509    5   27  527 P07506 CYTOCHROME C OXIDASE POLY
cox1_oenbe         56   65  509    5   27  527 P08743 CYTOCHROME C OXIDASE POLY
cox1_pea           56   66  509    5   27  527 P12786 CYTOCHROME C OXIDASE POLY
cox1_parli         56   67  502    7   31  517 P12700 CYTOCHROME C OXIDASE POLY
cox1_crola         56   67  499    6   30  516 P34188 CYTOCHROME C OXIDASE POLY
cox1_cypca         56   66  499    6   30  516 P24985 CYTOCHROME C OXIDASE POLY
cox1_strpu         56   66  502    7   31  517 P15544 CYTOCHROME C OXIDASE POLY
cox1_pisoc         56   66  504    7   31  517 P25001 CYTOCHROME C OXIDASE POLY
cox1_chick         56   65  500    7   31  515 P18943 CYTOCHROME C OXIDASE POLY
cox1_triru         55   66  508    5   38  528 Q01555 CYTOCHROME C OXIDASE POLY
cox1_yeast         55   66  503    5   27  512 P00401 CYTOCHROME C OXIDASE POLY
cox1_podan         55   66  515    5   38  541 P20681 CYTOCHROME C OXIDASE POLY
cox1_mouse         55   66  501    6   30  514 P00397 CYTOCHROME C OXIDASE POLY
cox1_rat           55   66  501    6   30  514 P05503 CYTOCHROME C OXIDASE POLY
cox1_human         55   66  501    6   30  513 P00395 CYTOCHROME C OXIDASE POLY
cox1_neucr         55   66  525    5   38  557 P03945 CYTOCHROME C OXIDASE POLY
cox1_emeni         55   66  524    5   38  567 P00402 CYTOCHROME C OXIDASE POLY
cox1_xenla         55   66  499    6   30  519 P00398 CYTOCHROME C OXIDASE POLY
cox1_bovin         55   66  501    6   30  514 P00396 CYTOCHROME C OXIDASE POLY
cox1_balph         55   66  501    6   30  516 P24983 CYTOCHROME C OXIDASE POLY
cox1_balmu         54   66  501    6   30  516 P41293 CYTOCHROME C OXIDASE POLY
cox1_chlre         54   63  494    6   33  505 P08681 CYTOCHROME C OXIDASE POLY
cox1_didma         54   66  500    6   30  513 P41310 CYTOCHROME C OXIDASE POLY
cox1_drome         54   65  507    6   30  512 P00399 CYTOCHROME C OXIDASE POLY
cox1_droya         54   65  507    6   30  512 P00400 CYTOCHROME C OXIDASE POLY
cox1_halgr         54   66  501    6   30  514 P38595 CYTOCHROME C OXIDASE POLY
cox1_phovi         54   66  501    6   30  514 Q00527 CYTOCHROME C OXIDASE POLY
cox1_anoqu         54   65  507    6   30  514 P33504 CYTOCHROME C OXIDASE POLY
cox1_anoga         53   65  507    6   30  514 P34838 CYTOCHROME C OXIDASE POLY
cox1_cotja         52   63  100    1   17  102 P24984 CYTOCHROME C OXIDASE POLY
cox1_schpo         52   63  510    8   41  537 P07657 CYTOCHROME C OXIDASE POLY
cox1_apime         50   62  501    6   31  521 P20374 CYTOCHROME C OXIDASE POLY
cox1_ascsu         48   59  503    6   29  525 P24881 CYTOCHROME C OXIDASE POLY
cox1_caeel         48   59  509    6   29  525 P24893 CYTOCHROME C OXIDASE POLY
cox1_thep3         46   52  504    8   32  615 P16262 SUBUNIT 1).
cox1_bacfi         45   52  514    7   30  624 Q04440 SUBUNIT 1).
cox1_syny3         44   51  515    5   29  533 Q06473 SUBUNIT 1).
cox1_bacsu         41   49  526    8   32  621 P24010 SUBUNIT 1) (CAA-3605 SUBU
qox1_bacsu         39   46  530    9   43  649 P34956 SUBUNIT QOXB).
cox1_leita         38   49  501    4   24  549 P14544 CYTOCHROME C OXIDASE POLY
cyob_ecoli         38   43  520    6   25  663 P18401 CYTOCHROME O UBIQUINOL OX
qoxm_sulac         37   42  477    9   37  788 P39481 QUINOL OXIDASE POLYPEPTID
cox1_halha         37   45  530    6   27  593 P33518 SUBUNIT 1).
cox1_trybb         37   48  501    4   24  549 P04371 CYTOCHROME C OXIDASE POLY
cox1_parte         32   39  518    8  148  645 P05489 CYTOCHROME C OXIDASE POLY
cox1_tetpy         32   41  543    6  150  698 P11947 CYTOCHROME C OXIDASE POLY
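
The summary rows above can be read programmatically. The Python sketch below assumes that the column labels IDE and LEN2 in the table header correspond to PIDE and LSEQ2 in the abbreviation list, and that the STRID column may be empty, as it is in every row of this example. The regular expression, the function name parse_summary_row, and the 50% identity cut-off are illustrative choices, not part of MaxHom.

```python
import re

# One summary row: ID, optional STRID, six integer columns, ACCNUM, NAME.
ROW = re.compile(
    r"^(?P<id>\S+)\s+(?:(?P<strid>\S+)\s+)?"
    r"(?P<pide>\d+)\s+(?P<wsim>\d+)\s+(?P<lali>\d+)\s+"
    r"(?P<ngap>\d+)\s+(?P<lgap>\d+)\s+(?P<lseq2>\d+)\s+"
    r"(?P<accnum>\S+)\s*(?P<name>.*)$"
)
INT_FIELDS = ("pide", "wsim", "lali", "ngap", "lgap", "lseq2")


def parse_summary_row(line):
    """Return a dict for one summary row, or None for non-data lines."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    row = match.groupdict()
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


lines = [
    "ID         STRID  IDE WSIM LALI NGAP LGAP LEN2 ACCNUM NAME",
    "cox1_parde        100  100  554    0    0  554 P08305 SUBUNIT 1).",
    "cox1_rhosh         78   84  550    3   13  565 P33517 SUBUNIT 1).",
    "cox1_tetpy         32   41  543    6  150  698 P11947 CYTOCHROME C OXIDASE POLY",
]

# The column header line does not match the pattern and is skipped.
parsed = [row for row in map(parse_summary_row, lines) if row is not None]

# Hypothetical filter: keep entries with at least 50% pairwise identity.
close = [row["id"] for row in parsed if row["pide"] >= 50]
print(close)
```

Rows with a filled STRID column are matched by the optional group in the pattern. That case does not occur in this example, so it is untested against real output.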

---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/tmp/t2_12833.hssp from:    1 to:  554
 /home/phd/tmp/t2_12833.ret_msf  MSF:  554  Type: P 15-Nov-95  04:01:3  Check: 3510  ..


 Name: t2_12833     Len:   554  Check: 3342  Weight:  1.00
 Name: cox1_parde   Len:   554  Check: 3342  Weight:  1.00
 Name: cox1_rhosh   Len:   554  Check: 2597  Weight:  1.00
 Name: cox1_scapl   Len:   554  Check: 6174  Weight:  1.00
 Name: cox1_gomva   Len:   554  Check: 4345  Weight:  1.00
 Name: cox1_polsp   Len:   554  Check: 8332  Weight:  1.00
 Name: cox1_lepsp   Len:   554  Check: 7195  Weight:  1.00
 Name: cox1_megat   Len:   554  Check: 1022  Weight:  1.00
 Name: cox1_lepoc   Len:   554  Check:  689  Weight:  1.00
 Name: cox1_pomni   Len:   554  Check: 1867  Weight:  1.00
 Name: cox1_saltr   Len:   554  Check: 6114  Weight:  1.00
 Name: cox1_geosd   Len:   554  Check: 6855  Weight:  1.00
 Name: cox1_braja   Len:   554  Check:  665  Weight:  1.00
 Name: cox1_panbu   Len:   554  Check: 7209  Weight:  1.00
 Name: cox1_prowi   Len:   554  Check:  482  Weight:  1.00
 Name: cox1_polsx   Len:   554  Check: 6803  Weight:  1.00
 Name: cox1_amica   Len:   554  Check: 2170  Weight:  1.00
 Name: cox1_marpo   Len:   554  Check: 5772  Weight:  1.00
 Name: cox1_maize   Len:   554  Check: 8474  Weight:  1.00
 Name: cox1_orysa   Len:   554  Check: 7904  Weight:  1.00
 Name: cox1_wheat   Len:   554  Check: 8039  Weight:  1.00
 Name: cox1_sorbi   Len:   554  Check: 8243  Weight:  1.00
 Name: cox1_betvu   Len:   554  Check: 4687  Weight:  1.00
 Name: cox1_soybn   Len:   554  Check: 1559  Weight:  1.00
 Name: cox1_oenbe   Len:   554  Check: 9577  Weight:  1.00
 Name: cox1_pea     Len:   554  Check: 1152  Weight:  1.00
 Name: cox1_parli   Len:   554  Check: 7876  Weight:  1.00
 Name: cox1_crola   Len:   554  Check: 9853  Weight:  1.00
 Name: cox1_cypca   Len:   554  Check: 1625  Weight:  1.00
 Name: cox1_strpu   Len:   554  Check: 9042  Weight:  1.00
 Name: cox1_pisoc   Len:   554  Check: 2768  Weight:  1.00
 Name: cox1_chick   Len:   554  Check: 8102  Weight:  1.00
 Name: cox1_triru   Len:   554  Check: 1564  Weight:  1.00
 Name: cox1_yeast   Len:   554  Check: 9105  Weight:  1.00
 Name: cox1_podan   Len:   554  Check: 8167  Weight:  1.00
 Name: cox1_mouse   Len:   554  Check: 8729  Weight:  1.00
 Name: cox1_rat     Len:   554  Check: 1347  Weight:  1.00
 Name: cox1_human   Len:   554  Check: 1065  Weight:  1.00
 Name: cox1_neucr   Len:   554  Check: 6844  Weight:  1.00
 Name: cox1_emeni   Len:   554  Check: 5797  Weight:  1.00
 Name: cox1_xenla   Len:   554  Check: 9868  Weight:  1.00
 Name: cox1_bovin   Len:   554  Check:  587  Weight:  1.00
 Name: cox1_balph   Len:   554  Check: 3044  Weight:  1.00
 Name: cox1_balmu   Len:   554  Check: 2779  Weight:  1.00
 Name: cox1_chlre   Len:   554  Check: 2203  Weight:  1.00
 Name: cox1_didma   Len:   554  Check: 7872  Weight:  1.00
 Name: cox1_drome   Len:   554  Check: 8497  Weight:  1.00
 Name: cox1_droya   Len:   554  Check: 1096  Weight:  1.00
 Name: cox1_halgr   Len:   554  Check: 9892  Weight:  1.00
 Name: cox1_phovi   Len:   554  Check:  319  Weight:  1.00
 Name: cox1_anoqu   Len:   554  Check: 9265  Weight:  1.00
 Name: cox1_anoga   Len:   554  Check: 1521  Weight:  1.00
 Name: cox1_cotja   Len:   554  Check: 7055  Weight:  1.00
 Name: cox1_schpo   Len:   554  Check: 6761  Weight:  1.00
 Name: cox1_apime   Len:   554  Check: 3689  Weight:  1.00
 Name: cox1_ascsu   Len:   554  Check:  252  Weight:  1.00
 Name: cox1_caeel   Len:   554  Check:  393  Weight:  1.00
 Name: cox1_thep3   Len:   554  Check: 9619  Weight:  1.00
 Name: cox1_bacfi   Len:   554  Check: 9313  Weight:  1.00
 Name: cox1_syny3   Len:   554  Check: 5842  Weight:  1.00
 Name: cox1_bacsu   Len:   554  Check: 7123  Weight:  1.00
 Name: qox1_bacsu   Len:   554  Check: 8281  Weight:  1.00
 Name: cox1_leita   Len:   554  Check: 5925  Weight:  1.00
 Name: cyob_ecoli   Len:   554  Check: 2221  Weight:  1.00
 Name: qoxm_sulac   Len:   554  Check: 7747  Weight:  1.00
 Name: cox1_halha   Len:   554  Check: 6043  Weight:  1.00
 Name: cox1_trybb   Len:   554  Check: 6181  Weight:  1.00
 Name: cox1_parte   Len:   554  Check: 5633  Weight:  1.00
 Name: cox1_tetpy   Len:   554  Check: 7995  Weight:  1.00

//


           1                                                   50
t2_12833   MSAQISDSIE EKRGFFTRWF MSTNHKDIGV LYLFTAGLAG LISVTLTVYM
cox1_parde MSAQISDSIE EKRGFFTRWF MSTNHKDIGV LYLFTAGLAG LISVTLTVYM
cox1_rhosh ..AAIHGHEH DRRGFFTRWF MSTNHKDIGV LYLFTGGLVG LISVAFTVYM
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja .......... .....WRRYV YSTNHKDIGT MYLIFAVIAG VIGAAMSIAI
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi .......... ....MVTRWL YSTNHKDIGT MYLIFGAFSG VLGTVFSLLI
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo .......... ....FAQRWL FSTNHKDIGT LYLIFGAIAG VMGTCFSVLI
cox1_maize .......... .....LVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_orysa .......... .....LVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_wheat .......... .....MVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_sorbi .......... .....LVRWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_betvu .......... .......... VSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_soybn .......... .......RWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_oenbe .......... .......RWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_pea   .......... .......RWL FSTNHKDIGT LYFIFGAIAG VMGTCFSVLI
cox1_parli .......... .....LSRWL FSTNHKDIGT LYLIFGAWAG MVGTAMSVII
cox1_crola .......... ......TRWF FSTNHKDIGT LYLVFGAWAG MVGTALSLLI
cox1_cypca .......... ......TRWF FSTNHKDIGT LYLVFGAWAG MVGTALSLLI
cox1_strpu .......... .....LSRWL FSTNHKDIGT LYLIFGAWAG MVGTAMSVII
cox1_pisoc .......... .....LSRWL FSTNHKDIGT LYLIFGAWAG MIGTAMSVII
cox1_chick .......... ....FINRWL FSTNHKDIGT LYLIFGTWAG MAGTALSLLI
cox1_triru .......... ......ERWF LSTNAKDIGT LYLMFRYFSG LVGTAFSVLI
cox1_yeast .......... ....MVQRWL YSTNAKDIAV LYFMLAIFSG MAGTAMSLII
cox1_podan .......... ....WIERWM LSTNAKDIGN LYLIFALFSG LLGTAFSVLI
cox1_mouse .......... ....FINRWL FSTNHKDIGT LYLLFGAWAG MVGTALSILI
cox1_rat   .......... ....FVNRWL FSTNHKDIGT LYLLFGAWAG MVGTALSILI
cox1_human .......... ....FADRWL FSTNHKDIGT LYLLFGAWAG VLGTALSLLI
cox1_neucr ........MS SISIWTERWF LSTNAKDIGV LYLIFALFSG LLGTAFSVLI
cox1_emeni IESSSFLTFK QPTEWQERWY LSSNAKDIGT LYLMFALFSG LLGTAFSVLI
cox1_xenla .......... ......TRWL FSTNHKDIGT LYLVFGAWAG LVGTALSLLI
cox1_bovin .......... ....FINRWL FSTNHKDIGT LYLLFGAWAG MVGTALSLLI
cox1_balph .......... ....FMNRWL FSTNHKDIGT LYLLFGAWAG MVGTGLSLLI
cox1_balmu .......... ....FMNRWL FSTNHKDIGT LYLLFGAWAG MVGTGLSLLI
cox1_chlre .......... .......RWL YSTSHKDIGL LYLVFAFFGG LLGTSLSMLI
cox1_didma .......... ....FINRWL FSTNHKDIGT LYLLFGAWAG MVGTALSLLI
cox1_drome .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_droya .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_halgr .......... ....FMDRWL FSTNHKDIGT LYLLFGAWAG MAGTALSLLI
cox1_phovi .......... ....FMNRWL FSTNHKDIGT LYLLFGAWAG MVGTALSLLI
cox1_anoqu .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_anoga .......... ....MSRQWL FSTNHKDIGT LYFIFGAWAG MVGTSLSILI
cox1_cotja .......... ....FINRWL FSTNHKDIGT LYLIFGTWAG MAGTALSLLI
cox1_schpo .......... ....YVNRWI FSTNAKDIAI LYLLFGLVSG IIGSVFSFII
cox1_apime .......... .....MMKWF MSTNHKNIGI LYIILALWSG MLGSSMSLII
cox1_ascsu .......... ..QGGLSVWL ESSNHKDIGT LYFLFGLWSG MVGTSLSLVI
cox1_caeel ......NLYK KYQGGLAVWL ESSNHKDIGT LYFIFGLWSG MVGTSFSLLI
cox1_thep3 .......... ........YL TTVDHKKIAH LYLISGGFFF LLGGLEALFI
cox1_bacfi .........K QEKSVIWDWL TTVDHKKIAI MYLIAGTLFF VKAGVMALFM
cox1_syny3 IAAENLTANH PRRKWTDYFT FCVDHKVIGI QYLVTSFLFF FIGGSFAEAM
cox1_bacsu LNALTEK..R TRGSMLWDYL TTVDHKKIAI LYLVAGGFFF LVGGIEAMFI
qox1_bacsu LGAQVstYFK KWKWLWSEWI TTVDHKKLGI MYIISAVIML FRGGVDGLMM
cox1_leita .......... .......... LSVSHKMIGL CYLLVAILSG FVGYVYSLFI
cyob_ecoli .......... ....LWKEWL TSVDHKRLGI MYIIVAIVML LRGFADAIMM
qoxm_sulac .......... .........L YTTNASDVGQ MYIVLGIVAL IIGSVNAALI
cox1_halha LGERTGYTHE EKPGGIIRWF TTVDHKDIGI LYGVYGTIAF AWGGVSVLLM
cox1_trybb .......... .......... LSVSHKMIGI CYLLVAILCG FIGYIYSLFI
cox1_parte .......... .......... ...NHKRIAL NYFYFSMWTG LSGAALATMI
cox1_tetpy IKKLFTYLND LRKHILKKYV YTINHKRIAI NYLYFSMVTG LSGAALATMI

           51                                                 100
t2_12833   RMELQHPGVQ YMCLEGMRLV ADAAAECTPN AHLWNVVVTY HGILMMFFVV
cox1_parde RMELQHPGVQ YMCLEGMRLV ADAAAECTPN AHLWNVVVTY HGILMMFFVV
cox1_rhosh RMELMAPGVQ FMCAEHlsLW PSAVENCTPN GHLWNVMITG HGILMMFFVV
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja RAELMYPGVQ IFH....... .........E THTYNVFVTS HGLIMIFFMV
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi RMELAQPGNQ IL........ .......NGN HQLYNVIITA HAFLMIFFML
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo RMELAQPGNQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_maize RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_orysa RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_wheat RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_sorbi RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_betvu RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_soybn RMELARPGDQ IL........ .......GGN HQLYNVLITG HAFLMIFFMV
cox1_oenbe RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFLMIFFMV
cox1_pea   RMELARPGDQ IL........ .......GGN HQLYNVLITA HAFFMIFFMV
cox1_parli RAELAQPGSL LN........ .........D DQIYNVVVTA HALVMIFFMV
cox1_crola RAELNQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_cypca RAELSQPGSL LS........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_strpu RAELAQPGSL LN........ .........D DQIYNVVVTA HALVMIFFMV
cox1_pisoc RTELAQPGSL LQ........ .........D DQIYNVIVTA HALVMIFFMV
cox1_chick RAELGQPGTL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_triru RLELSAPGVQ YI........ ........AD NQLYNSIITA HAILMIFFMV
cox1_yeast RLELAAPGSQ YL........ .......HGN SQLFNVLVVG HAVLMIFFLV
cox1_podan RMELSGPSVQ YI........ ........AD NQLYNSIITA HALLMIFFMV
cox1_mouse RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_rat   RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_human RAELGQPGNL LG........ .........N DHIYNVIVTA HAFVMIFFMV
cox1_neucr RMELSGPGVQ YI........ ........AD NQLYNAIITA HAILMIFFMV
cox1_emeni RLELSGPGVQ YI........ ........AD NQLYNSIITA HAIMMIFFMV
cox1_xenla RAELSQPGTL LG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_bovin RAELGQPGTL LG........ .........D DQIYNVVVTA HAFVMIFFMV
cox1_balph RAELGQPGTL IG........ .........D DQVYNVLVTA HAFVMIFFMV
cox1_balmu RAELGQPGTL IG........ .........D DQVYNVLVTA HAFVMIFFMV
cox1_chlre RYELALPGRG LL........ .......DGN GQLYNVIITG HGIIMLLFMV
cox1_didma RAELGQPGTL IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_drome RAELGHPGAL IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_droya RAELGHPGAL IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_halgr RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_phovi RAELGQPGAL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_anoqu RAELGHPGAF IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_anoga RAELGHPGAF IG........ .........D DQIYNVIVTA HAFIMIFFMV
cox1_cotja RAELGQPGTL LG........ .........D DQIYNVIVTA HAFVMIFFMV
cox1_schpo RMELSAPGSQ FL........ .......SGN GQLYNVAISA HGILMIFFFI
cox1_apime RMELSSPGSW IS........ .........N DQIYNTIVTS HAFLMIFFMV
cox1_ascsu RLELAKPGLL LG........ .........S GQLYNSVITA HAILMIFFMV
cox1_caeel RLELAKPGFF LS........ .........N GQLYNSVITA HAILMIFFMV
cox1_thep3 RIQLAKPNND FLV....... .......... GGLYNEVLTM HGTTMIFLAA
cox1_bacfi RIQLMYPEMN FL........ .........S GQTFNEFITM HGTIMLFLAA
cox1_syny3 RTELATPSPD FV........ .........Q PEMYNQLMTL HGTIMIFLWI
cox1_bacsu RIQLAKPENA FL........ .........S AQAYNEVMTM HGTTMIFLAA
qox1_bacsu RAQLALPNNS FL........ .........D SNHYNEIFTT HGTIMIIFMA
cox1_leita RLELSLIGCG IL........ .......FGD YQFYNVLITS HGLIMVFAFI
cyob_ecoli RSQQALASAG EAGFLP.... .......... PHHYDQIFTA HGVIMIFFVA
qoxm_sulac RDQLSFNNL. .......... .........N AVDYYDAVTL HGIFMIFFVV
cox1_halha RTELATSSET LI........ .........S PSLYNGLLTS HGITMLFLFG
cox1_trybb RLELSLIGCG VL........ .......FGD YQFYNVLITS HGLIMVFAFI
cox1_parte RLEMAYPGSP FF........ .......KGD SIKYLQVATA HGLIMVFFVV
cox1_tetpy RMELAHPESP FFKGDSLR.. .......... ...YLQVVTA HGLIMVFFVV

           101                                                150
t2_12833   IPALFGGFGN YFMPLHIGAP DMAFPRLNNL SYWLYVCGVS LAIASLLSPG
cox1_parde IPALFGGFGN YFMPLHIGAP DMAFPRLNNL SYWLYVCGVS LAIASLLSPG
cox1_rhosh IPALFGGFGN YFMPLHIGAP DMAFPRMNNL SYWLYVAGTS LAVASLFAPG
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja MPAMIGGFGN WFVPLMIGAP DMAFPRMNNI SFWLLPASFG LLLMSTFVEG
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi MPALMGGFGN WFLPILIGAP DMAFPRLNNI SFWLLPPSLL LLVSSALVEV
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo MPAMIGGFGN WFVPILIGSP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_maize MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_orysa MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_wheat MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_sorbi MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_betvu MPAMIGGFGN WFVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_soybn MPAMIGGSGN WSVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_oenbe MPAMIGGSGN WSVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_pea   MPAMIGGSGN WSVPILIGAP DMAFPRLNNI SFWLLPPSLL LLLSSALVEV
cox1_parli MPIMIGGFGN WLIPLMIGAP DMAFPRMNNM SFWLIPPSFI LLLASAGVES
cox1_crola MPILIGGFGN WLVPLMIGAP HMAFPRMNNM SFWLLPPSFL LLLASSGVEA
cox1_cypca MPILIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSGVEA
cox1_strpu MPIMIGGFGN WLIPLMIGAP DMAFPRMNNM SFWLIPPSFI LLLASAGVEN
cox1_pisoc MPIMIGGFGN WLIPLMIGAP DMAFPRMNNM SFWLIPPSFL LLLASAGVES
cox1_chick MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSTVEA
cox1_triru MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLIPSLL LFVFASIIEN
cox1_yeast MPALIGGFGN YLLPLMIGAT DTAFPRINNI AFWVLPMGLV CLVTSTLVES
cox1_podan MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLPPSLI LLVFSACIEG
cox1_mouse MPMMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_rat   MPMMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_human MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSLL LLLASAMVEA
cox1_neucr MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLPPSLL LLVFSACIEG
cox1_emeni MPALIGGFGN FLLPLLVGGP DMAFPRLNNI SFWLLVPSLL LFVFSATIEN
cox1_xenla MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSGVEA
cox1_bovin MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_balph MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLMASSMIEA
cox1_balmu MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLMASSMIEA
cox1_chlre MPALFGGFGN WLLPIMIGAP DMAFPRLNNI SFWLNPPALA LLLLSTLVEQ
cox1_didma MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSTIEA
cox1_drome MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWLLPPALS LLLVSSMVEN
cox1_droya MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWLLPPALS LLLVSSMVEN
cox1_halgr MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_phovi MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM SFWLLPPSFL LLLASSMVEA
cox1_anoqu MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWMLPPSLT LLISSSMVEN
cox1_anoga MPIMIGGFGN WLVPLMLGAP DMAFPRMNNM SFWMLPPSLT LLISSSMVEN
cox1_cotja MPIMIGGFGN WLVPLMIGAP DMAFPRMNNM S......... ..........
cox1_schpo IPALFGAFGN YLVPLMIGAP DVAYPRVNNF TFWLLPPALM LLLISALTEE
cox1_apime MPFLIGGFGN WLIPLMLGSP DMAFPRMNNI SFWLLPPSLF MLLLSNLFYP
cox1_ascsu MPTMIGGFGN WMLPLMLGAP DMSFPRLNNL SFWLLPTAMF LILDACFVDM
cox1_caeel MPTMIGGFGN WLLPLMLGAP DMSFPRLNNL SFWLLPTSML LILDACFVDM
cox1_thep3 MPLVFA.FMN AVVPLQIGAR DVAFPFLNAL GFWMFFFGGL FLNCSWFLGG
cox1_bacfi TPLLFA.FMN YVIPLQIGAR DVAFPFVNAL GFWIFFFGGL LLSLSWFFGG
cox1_syny3 VPA.GAAFAN YLIPLMVGTE DMAFPRLNAV AFWLTPPGGI LLISSFFVGA
cox1_bacsu MPLLFA.LMN AVVPLQIGAR DVSFPFLNAL GFWLFFFGGI FLNLSWFLGG
qox1_bacsu MP.FLIGLIN VVVPLQIGAR DVAFPYLNNL SFWTFFVGAM LFNISFVIGG
cox1_leita MPVMMGGLVN YFIPVMAGFP DMVFPRLNNM SFWMYLAGFG CVVNGFLTEE
cyob_ecoli MP.FVIGLMN LVVPLQIGAR DVAFPFLNNL SFWFTVVGVI LVNVSLGVGE
qoxm_sulac MP.LSTGFAN YLVPRMIGAH DLYWPKINAL SFWMLVPAVI LAAISPLLGA
cox1_halha TP.MIAAFGN YFIPLLIDAD DMAFPRINAI AFWLLPPGAI LIWSGFLIPG
cox1_trybb MPITMGGFTN YFAPVMVGFP DMVFPRINNM SFWMFIGGFG CLVSGFLTEE
cox1_parte VPIFFGGFAN FLIPYHVGSK DVAFPRLNSI GFWIQPLGFL LVAKIAFLRT
cox1_tetpy VPILFGGFAN FLIPYHVGSK DVAYPRLNSI GFWIQPCGYI LLAKIGFLRP

           151                                                200
t2_12833   GSDQPGAGVG WVLYPPLSTT EAGYAMDLAI FAVHVSGATS ILGAINIITT
cox1_parde GSDQPGAGVG WVLYPPLSTT EAGYAMDLAI FAVHVSGATS ILGAINIITT
cox1_rhosh GNGQLGSGIG WVLYPPLSTS ESGYSTDLAI FAVHLSGASS ILGAINMITT
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja EPGANGVGAG WTMYVPLSSS gpGPAVDFAI LSLHLAGASS ILGAINFITT
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi GA.....GTG WTVYPPLASI asGGSVDLAI FSLHLAGVSS ILGAINFICT
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo GC.....GSG WTVYPPLSGI tsGGSVDLAI FSLHLSGVSS ILGSINFITT
cox1_maize GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_orysa GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_wheat GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGISS ILGSINFITT
cox1_sorbi GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_betvu GS.....GTG WTVYPPLSGI tsGGAVDLAI FSLHLSGVSS ILGSINFITT
cox1_soybn GS.....GTG WTVYPPLSGI tsGGAVDSAI SSLHLSGVSS ILGSINFITT
cox1_oenbe GS.....GTG WTVYPPLSGI tsGGAVDSAI SSLHLSGVSS ILGSINFITT
cox1_pea   GS.....GTG WTVYPPLSGI tsGGAVDSAI SSLHLSGVSS ILGSINFLTT
cox1_parli GA.....GTG WTIYPPLSSn hAGGSVDLAI FSLHLAGASS ILASINFITT
cox1_crola GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_cypca GA.....GTG WSVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_strpu GA.....GTG WTIYPPLSSn hAGSSVDLAI FSLHLAGASS ILGLINFITT
cox1_pisoc GT.....GTG WTIYPPLSSg hAGGSVDLAI FSLHLAGASS ILASINFITT
cox1_chick GA.....GTG WTVYPPLAGn hAGASVDLAI FH.YLAGVSS ILGAINFITT
cox1_triru GA.....GTG WTLYPPLASI qsGPSVDLAI FGLHLSGISS LLGAMNFITT
cox1_yeast GA.....GTG WTVYPPLSSI qsGPSVDLAI FALHLTSISS LLGAINFIVT
cox1_podan GA.....GTG WTIYPPLSGV qsGPSVDLAI FALHLSGVSS LLGAMNFITT
cox1_mouse GA.....GTG WTVYPPLAGN paGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_rat   GA.....GTG WTVYPPLAGn hAGVSVDLTI FSLHLAGVSS ILGAINFITT
cox1_human GA.....GTG WTVYPPLAGN ypGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_neucr GA.....GTG WTIYPPLSGV qsGPSVDLAI FALHLSGVSS LLGSINFITT
cox1_emeni GA.....GTG WTLYPPLSGI qsGPSVDLAI FGLHLSGISS MLGAMNFITT
cox1_xenla GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGISS ILGAINFITT
cox1_bovin GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_balph GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_balmu GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_chlre GP.....GTG WTAYPPLSVQ HSGTSVDLAI LSLHLNGLSS ILGAVNMLVT
cox1_didma GA.....GTG WTVYPPLAGn hAGASVDLAI FSLHLAGISS ILGAINFITT
cox1_drome GA.....GTG WTVYPPLSAg hGGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_droya GA.....GTG WTVYPPLSSg hGGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_halgr GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_phovi GA.....GTG WTVYPPLAGn hAGASVDLTI FSLHLAGVSS ILGAINFITT
cox1_anoqu GA.....GTG WTVYPPLSSg hAGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_anoga GA.....GTG WTVYPPLSSg hAGASVDLAI FSLHLAGISS ILGAVNFITT
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo GP.....GGG WTVYPPLSSI tsGPAIDLAI LSLQLTGISS TLGSVNLIAT
cox1_apime SP.....GTG WTVYPPLSAy hSSPSVDFAI FSLHMSGISS IMGSLNLMVT
cox1_ascsu GC.....GTS WTVYPPLSTM gpGGSVDLAI FSLHCAGVSS ILGAINFMTT
cox1_caeel GC.....GTS WTVYPPLSTM gpGSSVDLAI FSLHAAGLSS ILGGINFMCT
cox1_thep3 APD.....AG WTSYASLSLD saHHGIDFYT LGLQISGFGT IMGAINFLVT
cox1_bacfi GPD.....AG WTAYVPLSSR dgGLGIDFYV LGLQVSGIGT LISAINFLVT
cox1_syny3 PQA......G WTSYPPLSLL SGKWGEELWI LSLLLVGTSS ILGAINFVTT
cox1_bacsu APD.....AG WTSYASLSLH SKGHGIDFSI LGLQISGLGT LIAGINFLAT
qox1_bacsu SPN.....AG WTSYMPLASN dpGPGENYYL LGLQIAGIGT LMTGINFMVT
cox1_leita GM.....GVG WTLYPTLICI dsSLACDFVM FAVHLLGISS ILNSINLLGT
cyob_ecoli FAQ.....TG WLAYPPLSGI epGVGVDYWI WSLQLSGIGT TLTGINFFVT
qoxm_sulac VD......LG WYMYAPLSVE tyGLGTNLIQ IALILSGLSS TLTGVNFVMT
cox1_halha IAT...AQTS WTMYTPLSLQ MSSPAVDMMM LGLHLTGVSA TMGAINFIAT
cox1_trybb GM.....GVG WTLYPTLICI dsSLACDFII FSVHFLGISS ILNSINVVGT
cox1_parte TSWkaAVTAG WTFITPFSSn sGFGAQDVLS VAVVLAGIST TISLLTLITR
cox1_tetpy QFWrtLTTAG WTFITPFSSn tGVGSQDILI LSVVFAGIST TISFTNLLIT

           201                                                250
t2_12833   FLNMRAPGMT LFKVPLFAWA VFITAWMILL SLPVLAGGIT MLLMDRNFGT
cox1_parde FLNMRAPGMT LFKVPLFAWA VFITAWMILL SLPVLAGGIT MLLMDRNFGT
cox1_rhosh FLNMRAPGMT MHKVPLFAWS IFVTAWLILL ALPVLAGAIT MLLTDRNFGT
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja IFNMRAPGMT LHKMPLFVWS ILVTVFLLLL SLPVLAGAIT MLLTDRNFGT
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi VFNMRAPGMS ML.DLLFVWA VFITAWLLLL CLPVLAGGIT MLLTDRNFNT
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo IFNMRAPGLT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_maize IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_orysa IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_wheat IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_sorbi IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNT
cox1_betvu IFNMRGPGMT MHRLPLFVWS VLVTAFLLLL SLPVLAGAIT MLLTDRNFNR
cox1_soybn ISNMRGPGMT MHRSPLFVWS VPVTAFPLLL SLPVLAGAIT MLLTDRNFNT
cox1_oenbe ISNMRGLGMT MHRSPLFVWS VLATAFPILL SLPVLAGAIT MLLTDRNFNT
cox1_pea   ISNMRGPGMT MHRSPLFVWS VPVTAFPLLL SLPVLAGAIT MLLTDRNFNT
cox1_parli IINMRTPGMS FDRLPLFVWS VFVTAFLLLL SLPVLAGAIT MLLTDRNINT
cox1_crola TINMKPPALS QYQTPLFVWA VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_cypca TINMKPPAIS QYQTPLFVWS VLVTAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_strpu IINMRTPGMS LDRLPLFVWS VFVTAFLLLL SLPVLAGAIT MLLTDRNINT
cox1_pisoc IINMRTPGMS FDRLPLFVWS VFVTAFLLLL SLPVLAGAIT MLLTDRNINT
cox1_chick IINMKPPALS QYQTPLFVWS VLITAILLLL SLPVLAAGIT MLLTDRNLNT
cox1_triru IINMRSPGIR LHKLALFGWA VLITAVLLLL SLPVLAGAIT MLLTDRNFNT
cox1_yeast TLNMRTNGMT MHKLPLFVWS IFITAFLLLL SLPVLSAGIT MLLLDRNFNT
cox1_podan IMNMRTPSIR LHKLALFGWA VIITAVLLLL SLPVLAGAIT MLLTDRNFNT
cox1_mouse IINMKPPAMT QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_rat   IINMKPPAMT QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_human IINMKPPAMT QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_neucr IVNMRTPGIR LHKLALFGWA VVITAVLLLL SLPVLAGAIT MLLTDRNFNT
cox1_emeni ILNMRSPGIR LHKLALFGWA VIITAVLLLL SLPVLAGGIT MVLTDRNFNT
cox1_xenla TINMKPPAMS QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_bovin IINMKPPAMS QYQTPLFVWS VMITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_balph IINMKPPAMT QYQTPLFVWS VLVTAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_balmu IINMKPPAMT QYQTPLFVWS VLVTAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_chlre VAGLRAPGMK LLHMPLFVWA IALTAVLVIL AVPVLAAALV MLLTDRNINT
cox1_didma IINMKPPAMS QYQTPLFVWS VMITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_drome VINMRSTGIS LDRMPLFVWS VVITALLLLL SLPVLAGAIT MLLTDRNLNT
cox1_droya VINMRSTGIT LDRMPLFVWS VVITALLLLL SLPVLAGAIT MLLTDRNLNT
cox1_halgr IINMKPPAMS QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_phovi IINMKPPAMS QYQTPLFVWS VLITAVLLLL SLPVLAAGIT MLLTDRNLNT
cox1_anoqu VINMRAPGIT LDRMPLFVWS VVITAVLLLL SLPVLAGAIT MLLTDRNLNT
cox1_anoga VINMRSPGIT LDRMPLFVWS VVITAVLLLL SLPVLAGAIT MLLTDRNLNT
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo MINMRAPGLS LYQMPLFAWA IMITSILLLL TLPVLAGGLF MLFSDRNLNT
cox1_apime IMMMKNFSMN YDQISLFPWS VFITAILLIM SLPVLAGAIT MLLFDRNFNT
cox1_ascsu TKNLRSSSIS LEHMSLFVWT VFVTVFLLVL SLPVLAGAIT MLLTDRNLNT
cox1_caeel TKNLRSSSIS LEHMTLFVWT VFVTVFLLVL SLPVLAGAIT MLLTDRNLNT
cox1_thep3 IINMRAPGMT FMRMPMFTWA TFVTSALILF AFPPLTVGLI FMMMDRLFGG
cox1_bacfi IVNMRAPGMT MMRLPLFVWT SFISSTLILF AFTPLAAGLA LLMLDRLFEA
cox1_syny3 ILKMRIKDMD LHVCPCFAGA MLATSSLILL STPLLASALI LLSFDLIAGT
cox1_bacsu IINMRAPGMT YMRLPLFTWT TFVASALILF AFPPLTVGLA LMMLDRLFGT
qox1_bacsu ILKMRTKGMT LMRMPMFTWT TLITMVIIVF AFPVLTVALA LLSFDRLFGA
cox1_leita LFCCRRKFFS FLSWSLFIWA ALITAILLII TLPVLAGGVT LILCDRNFNT
cyob_ecoli ILKMRAPGMT MFKMPVFTWA SLCANVLIIA SFPILTVTVA LLTLDRYLGT
qoxm_sulac ITKMK..KVP YLKMPLFVWG FFTTAILMII AMPSLTAGLV FAYLERLWGT
cox1_halha IFTERGEDVG WPDLDIFSWT MLTQSGLILF AFPLFGSALI MLLLDRNFGT
cox1_trybb IFCCRRKYFS FLIWTLFIWG ALLTSILLII TLPVLAGGVT LLLCDRNFNT
cox1_parte .RTLVAPGLR NRRvpFITIS LLLTLRLLAI VTPILGAAVL MSLMDRHWQT
cox1_tetpy RRTLAMPGMR HRRvpFVTIS IFLTLRMLAT ITPVLGAAVI MMAFDRHWQT

           251                                                300
t2_12833   QFFDPAGGGD PVLYQHILWF FGHPEVYMLI LPGFGIISHV ISTFARKPIF
cox1_parde QFFDPAGGGD PVLYQHILWF FGHPEVYMLI LPGFGIISHV ISTFARKPIF
cox1_rhosh TFFQPSGGGD PVLYQHILWF FGHPEVYIIV LPAFGIVSHV IATFAKKPIF
cox1_scapl .......... .......FWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_gomva .......... .........F FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_polsp .......... .....HLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_lepsp .......... .....HLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_megat .......... ...YQHLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_lepoc .......... ...YQHLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_pomni .......... ...YQHLFWF FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_saltr .......... .......FWF FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_geosd .......... ...YEHLFWF FGHPEVYILI LPGFGMISHI VAYYakKEPF
cox1_braja TFFAPDGGGD PVLFQHLFWF FGHPEVYILI LPGFGMISQI VSTFSRKPVF
cox1_panbu .......... ........WF FGHPEVYILI LPGFGMISHI VAYYskKEPF
cox1_prowi SFFDPAGGGD PILYQHLFWF FGHPEVYILI IPGFGIISHV IATFSKKPIF
cox1_polsx .......... .......FWF FGHPEVYILI LPGFGMISHI VAYYSGKnpF
cox1_amica .......... ....QHLFWF FGHPEVYILI LPGFGMVSHI VAYYakKEPF
cox1_marpo TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_maize TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_orysa TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_wheat TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_sorbi TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSRKPVF
cox1_betvu PFLIR.WGGD PILYQHLFWF FGHPEVYILI LPGFGIISHI VSTFSGKPVF
cox1_soybn TFSDPAGGGD PILYQHLFRF FGHPEVYIPI LPGSGIISHI VSTFSGKPVF
cox1_oenbe TFSDPAGGGD PILYQHLFRF FGHPEVYILI LPGSGIISHI VSTFSGKPVF
cox1_pea   TFSDPAGGGD PILYQHLFRF FGHPEVYIPI LPGSGIISHI VSTFSGKPVF
cox1_parli TFFDPAGGGD PILFQHLFWF FGHPEVYILI LPGFGMISHV IAHYSGKrpF
cox1_crola TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VAYYakKEPF
cox1_cypca TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VAYYskKEPF
cox1_strpu TFFDPAGGGD PILFQHLFWL FGHPEVYILI LPGFGMISHV IAHYSGKrpF
cox1_pisoc TFFDPAGGGD PILFQHLFWF FGHPEVYILI LPGFGMISHV IAHYAGKnpF
cox1_chick TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHV VAYYakKEPF
cox1_triru SFFELAGGGD PIFIQHLFWF FGHPEVYILI VPGFGIISTV ISANSSKNVF
cox1_yeast SFFEVAGGGD PILYEHLFWF FGHPEVYILI IPGFGIISHV VSTYSKKPVF
cox1_podan SFFETAGGGD PILFQHLFWF FGHPEVYILI IPAFGIISTT ISAYSNKSVF
cox1_mouse TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VTYYskKEPF
cox1_rat   TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGIISHV VTYYskKEPF
cox1_human TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_neucr SFFETAGGGD PILFQHLFWF FGHPEVYILI IPGFGIISTT ISAYSNKSVF
cox1_emeni SFFEVAGGGD PILFQHLFWF FGHPEVYILI IPGFGIISTV IAAGSGKNVF
cox1_xenla TFFDPAGGGD PVLYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_bovin TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_balph TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_balmu TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_chlre AYFCE..SGD LILYQHLFWF FGHPEVYILI LPAFGIVSQV VSFFSQKPVF
cox1_didma TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_drome SFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI IseSGKKETF
cox1_droya SFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI IseSGKKETF
cox1_halgr TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_phovi TFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI VTYYskKEPF
cox1_anoqu SFFDPAGGGE PNLYQHLFWF FGHPEVYILI LPGFGMISHI IteSGKKETF
cox1_anoga SFFDPAGGGD PILYQHLFWF FGHPEVYILI LPGFGMISHI IteSGKKETF
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo SFYAPEGGGD PVLYQHLFWF FGHPEVYILI MPAFGVVSHI IPSLAHKPIF
cox1_apime SFFDPMGGGD PILYQHLFWF FGHPEVYILI LPGFGLISHI VmeSGKKEIF
cox1_ascsu SFFDPSTGGN PLIYQHLFWF FGHPEVYILI LPAFGIISQS SLYLtkKEVF
cox1_caeel SFFDPSTGGN PLIYQHLFWF FGHPEVYILI LPAFGIVSQS TLYLtkKEVF
cox1_thep3 NFFNPAAGGN TIIWEHLFWV FGHPEVYILV LPAFGIFSEI FATFSRKRLF
cox1_bacfi QYFIPSMGGN VVLWQHIFWI FGHPEVYILV LPAFGIISEV IPAFSRKRLF
cox1_syny3 SFFNRVGGGD PVVYQHLFWF YSHPAVYIMI LPFFGVISEV IPVHARKPIF
cox1_bacsu NFFNPELGGN TVIWEHLFWI FGHPEVYILI LPAFGIFSEV IPVFARKRLF
qox1_bacsu HFFTLEAGGM PMLWANLFWI WGHPEVYIVI LPAFGIFSEI ISSFARKQLF
cox1_leita SFYDVVGGGD LILFQHIFWF FGHPEVYIIL LPVFGLISTI VEVIGFRCVF
cyob_ecoli HFFTNDMGGN MMMYINLIWA WGHPEVYILI LPVFGVFSEI AATFSRKRLF
qoxm_sulac PFFDSALGGS PVLWQQLFWF FGHPEVYILI LPAMGLVSEL LPKMARREIF
cox1_halha TFFTVA.GGD PIFWQHLFWF FGHPEVYVLV LPPMGIVSLI LPKFSGRKLF
cox1_trybb SFYDVVGGGD LVLFQHLFWF FGHPEVYIII LPVFGLVSTI IEVTSFRCVF
cox1_parte SFFDFAYGGD PILFQHLFWF FGHPEVYILI IPSFGVANIV LPFYTMRRMS
cox1_tetpy TFFEYAYGGD PILSQHLFWF FGHPEVYVLI IPTFGFINMI VPHNNTRRVA

           301                                                350
t2_12833   GYLPMVLAMA AIAFLGFIVW AHHMYTAGMS LTQQTYFQMA TMTIAVPTGI
cox1_parde GYLPMVLAMA AIAFLGFIVW AHHMYTAGMS LTQQTYFQMA TMTIAVPTGI
cox1_rhosh GYLPMVYAMV AIGVLGFVVW AHHMYTAGLS LTQQSYFMMA TMVIAVPTGI
cox1_scapl GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_gomva GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_polsp GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_lepsp GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_megat GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_lepoc GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_pomni GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_saltr GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_geosd GCMGMIWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMVIAIPTGV
cox1_braja GYLGMAYAMV AIGGIGFVVW AHHMYTVGMS SATQAYFVAA TMVIAVPTGV
cox1_panbu GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_prowi GYLGMVYAMC SIGILGFIVW AHHMYVVGLD IDTRAYFTAA TMIIAVPTGI
cox1_polsx GYMGMVWAMM AIGLLGFIVW AHHMYTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_amica GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMVIAIPTGV
cox1_marpo GYLGMVYAMI SIGVLGFIVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_maize GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_orysa GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_wheat GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_sorbi GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_betvu GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_soybn GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_oenbe GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGV
cox1_pea   GYLGMVYAMI SIGVLGFLVW AHHMFTVGLD VDTRAYFTAA TMIIAVPTGI
cox1_parli GYLGMVYAMI AIGVLGFLVW AHHMFTVGMD VDTRAYFTAA TMIIAVPTGI
cox1_crola GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_cypca GYMGMVWAMM AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_strpu GYLGLVYAMI AIGVLGFLVW AHHMFTVGMD VDTRAYFTAA TMIIAVPTGL
cox1_pisoc GYLGMVYAII SIGILGFLVW AHHMFTVGMD VDTRAYFTAA TMIIAVPTGI
cox1_chick GYMGMVWAML SIGFLGFIVW AHHMFTVRMD VDTRAYFTSA TMIIAIPTGI
cox1_triru GYLGMVYAMM SIGILGFVFW SHHMYTVGLD VDTRAYFIAA TLIIAVPTGI
cox1_yeast GEISMVYAMA SIGLLGFLVW SHHMYIVGLD ADTRAYFTSA TMIIAIPTGI
cox1_podan GYIGMVYAMM SIGILGFIVW SHHMYTVGLD VDTRAYFTAA TLIIAVPTGI
cox1_mouse GYMGMVWAMM SIGFLGFIVW AHHMFTVGLD VDTRACFTSA TMIIAIPTGV
cox1_rat   GYMGMVWTMM SIGFLGFIVW AHHMFTVGLD VDTRAYFTSA TMIIAIPTGV
cox1_human GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_neucr GYIGMVYAMM SIGILGFIVW SHHMYTVGLD VDTRAYFTAA TLIIAVPTGI
cox1_emeni GYLGMVYAMM SIGVLGFLVW SHHMYTVGLD VDTRAYFTAA TLIIAVPTGI
cox1_xenla GYMGMVWAMM SIGLLGFIVW AHHMFTVDLN VDTRAYFTSA TMIIAIPTGV
cox1_bovin GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_balph GYMGMVWAMV SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_balmu GYMGMIWAMV SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_chlre GLTGMICAMG AISLLGFIVW AHHMFTVGLD LDTVAYFTSA TMIIAVPTGM
cox1_didma GYMGMVWAMM SIGFLGFIVW AHHMFTVGLD VDTRAYFTSA TMIIAIPTGV
cox1_drome GSLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_droya GSLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_halgr GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_phovi GYMGMVWAMM SIGFLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAIPTGV
cox1_anoqu GNLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_anoga GNLGMIYAML AIGLLGFIVW AHHMFTVGMD VDTRAYFTSA TMIIAVPTGI
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo GKEGMLWAML SIALLGLMVW SHHLFTVGLD VDTRAYFSAA TMVIAIPTGI
cox1_apime GNLSMIYAML GIGFLGFIVW AHHMFTVGLD VDTRAYFTSA TMIIAVPTGI
cox1_ascsu GSLGMVYAIL SIGLIGCVVW AHHMYTVGMD LDSRAYFTAA TMVIAVPTGV
cox1_caeel GALGMVYAIL SIGLIGCVVW AHHMYTVGMD LDSRAYFSAA TMVIAVPTGV
cox1_thep3 GYSSMVFATV LIAFLGFMVW AHHMFTVGMG PIANAIFAVA TMTIAVPTGV
cox1_bacfi GYTAMVFATM IIAFLGFMVW AHHMFTVGMG PVANSIFAVA TMTIAVPTGI
cox1_syny3 GYRAIAYSSL AISFLGLIVW AHHMFTHGTP GWLRMFFMAT TMLIAVPTGI
cox1_bacsu GYSSMVFAI. VLGFLGFMVW VHHMFTTGLG PIANAIFAVA TMAIAIPTGI
qox1_bacsu GYKAMVGSII AISVLSFLVW THHFFTMGNS ASVNSFFSIT TMAISIPTGV
cox1_leita STVAMIYSMI LIAILGMFVW AHHMFVVGMD VDSRAYFGGV SILIGLPTCV
cyob_ecoli GYTSLVWATV CITVLSFIVW LHHFFTMGAG ANVNAFFGIT TMIIAIPTGV
qoxm_sulac GYTAIALSSI AIAFLSAlvW MHHMFTAIDN TLVQIVSSAT TMAIAIPSGV
cox1_halha GFKFVVYSTL AIGVLSFGVW AHHMFTTGID PRIRSSFMAV SLAISIPSAV
cox1_trybb SSVAMIYSML LISVLGMFVW AHHMFVVGMD VDSRAYFGSI TVLIGLPTCI
cox1_parte SKHHMIWAVY VMAYMGFVVW GHHMYLVGLD HRSRNIYSTI TIMICLPATI
cox1_tetpy SKHHMIWAIY VMAYMGYLVW GHHMYLVGLD HRSRTMYSTI TIMISMPATI

           351                                                400
t2_12833   KVFSWIATMW GGSIEFKTPM LWALAFLFTV GGVTGVVIAQ GSLDRVYHDT
cox1_parde KVFSWIATMW GGSIEFKTPM LWALAFLFTV GGVTGVVIAQ GSLDRVYHDT
cox1_rhosh KIFSWIATMW GGSIELKTPM LWALgfLFTV GGVTGIVLSQ ASVDRYYHDT
cox1_scapl KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_gomva KVFSWLATLH GGSIKWETPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_polsp KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_lepsp KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_megat KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSIDIVLHDT
cox1_lepoc KVFSWLATLH GGSIKWDTPL LWALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_pomni KVFSWLATLH GASIKWETPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_saltr KVFSWLATLH GGSIKWETPL LWAL...... .......... ..........
cox1_geosd KVFSWLATLH GGSLKWETXX XXALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_braja KIFSWIATMW GGSIEFRAPM IWAVgfLFTV GGVTGVVLAN AGVDRVLQET
cox1_panbu KVFSWLATLH GGSIKWDTPM LWALgfLFTV GGLTGIILAN SSLDIVLHDT
cox1_prowi KIFSWVATMW GGSIELRTPM LFAVgfLFTV GGLTGVVLAN SGLDVAFHDT
cox1_polsx KVFSWLATLH GGAIKWETPM LWALgfLFTV GGLTGIILAN SSLDIMLHDT
cox1_amica KVFSWLATLH GGAIKWETPL LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_marpo KIFSWIATMW GGSIQYKTPM LFAVgfLFTV GGLTGIVLAN SGVDIALHDT
cox1_maize KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_orysa KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_wheat KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_sorbi KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_betvu KIFSWIATMW GGSIQYKTPM LFAVgfLFTV GGLTGIVLAN SGLDIALHDT
cox1_soybn KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVLAN SGLDIALHDT
cox1_oenbe KIFSWIATMW GGSIQYKTPM LFAVgfLFTV GGLAGIVPAN SGLDIALHDT
cox1_pea   KIFSWIATMW GGSIQYKTPM LFAVgfLFTI GGLTGIVPAN SGLDIALHDT
cox1_parli KVFSWMATLQ GSNLQWETPL LWALgfLFTL GGLTGIVLAN SSIDVVLHDT
cox1_crola KVFSWLATLH GGTIKWDTPM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_cypca KVFSWLATLH GGSIKWETPM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_strpu KVFSWMAKLQ GSNLQWSLPL LWTLgfLFTL GGLTGIVLAN SSIDFVLHDT
cox1_pisoc KVFSWMATLQ GSNLRWDTPL LWALgfLFTI GGLTGVVLAN SSIDIILHDT
cox1_chick KVFSWLATLH GGTIKWDPPM LWALgfLFTI GGLTGIVLAN SSLDIALHDT
cox1_triru KIFSWLATCY GGSLNLTPAM LFALGfmFTI GGLSGVVLAN ASLDIAFHDT
cox1_yeast KIFSWLATIY GGSIRLATPM LYAIafLFTM GGLTGVALAN ASLDVAFHDT
cox1_podan KIFSWLATCY GGSIRLTPSM LFALgfMFTI GGLSGVVLAN ASLDIAFHDT
cox1_mouse KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_rat   KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLSN SSLDIVLHDT
cox1_human KVFSWLATLH GSNMKWSAAV LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_neucr KIFSWLATCY GGSIRLTPSM LFALgfMFTI GGLSGVVLAN ASLDIAFHDT
cox1_emeni KIFSWLATCY GGSLHLTPPM LFALGflFTI GGLSGVVLAN ASLDVAFHDT
cox1_xenla KVFSWLATMH GGTIKWDAPM LWALgfLFTV GGLTGIVLAN SSLDIMLHDT
cox1_bovin KVFSWLATLH GGNIKWSPAM MWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_balph KVFSWLATLH GGNIKWSPAL MWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_balmu KVFSWLATLH GGNIKWSPAL MWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_chlre KIFSWMATIY SGRVWFTTPM WFAVGflFTL GGVTGVVLAN AGVDMLVHDT
cox1_didma KVFSWLATLH GGNIKWSPAM LWALgfLFTI GGLTGIVLAN SSLDIVLHDT
cox1_drome KIFSWLATLH GTQLSYSPAI LWALgfLFTV GGLTGVVLAN SSVDIILHDT
cox1_droya KIFSWLATLH GTQLSYSPAI LWALgfLFTV GGLTGVVLAN SSVDIILHDT
cox1_halgr KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_phovi KVFSWLATLH GGNIKWSPAM LWALgfLFTV GGLTGIVLAN SSLDIVLHDT
cox1_anoqu KIFSWLATMH GTQLTYSPAM LWAFgfLFTV GGLTGVVLAN SSIDIVLHDT
cox1_anoga KIFSWLATLH GTQLTYSPAM LWAFgfLFTV GGLTGVVLAN SSIDIVLHDT
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo KIFSWLATLT GGAIQWsvPM LYAIGflFTI GGLTGVILSN SVLDIAFHDT
cox1_apime KVFSWLATYH GSKLKLNISI LWSLGflFTI GGLTGIMLSN SSIDIILHDT
cox1_ascsu KVFSWLATLF GMKMVFQPLL LWVMgfLFTI GGLTGVMLSN SSLDIILHDT
cox1_caeel KVFSWLATLF GMKMVFNPLL LWVLgfLFTL GGLTGVVLSN SSLDIILHDT
cox1_thep3 KIFNWLFTMW GGSIKFTTPM HYAVAfsFVM GGVTGVMLAS AAADYQYHDS
cox1_bacfi KIFNWLFTMW GGKITFNTAM LFASSftFVL GGVTGVMLAM APVDYLYHDT
cox1_syny3 QNFQLVRYLW GGKIQLNSAM LFAFGFlfMI GGLTGVMVAS VPFDIHVHDT
cox1_bacsu KIFNWLLTIW GGNVKYTTAM LYAVSfsFVL GGVTGVMLAA AAADYQFHDT
qox1_bacsu KIFNWLFTMY KGRISFTTPM LWALAFifVI GGVTGVMLAM AAADYQYHNT
cox1_leita KLFNWIYSFl dMIITFEVYF VIMFIFMFLI GAVTGLFLSN VGIDIMLHDT
cyob_ecoli KIFNWLFTMY QGRIVFHSAM LWTIGftFSV GGMTGVLLAV PGADFVLHNS
qoxm_sulac KVLNWTATLY GGEIRYKTpl LISFIVMFLL GGITGVFFPL VPIDYALNGT
cox1_halha KVFNWITTMW NGKLRLTAPM LFCIGFvfII GGVTGVFLAV IPIDLILHDT
cox1_trybb KLFNWIYSFl dMCICFEIYF IYMFILMFLA GGLTGLFLSN VGIDILMHDT
cox1_parte KLVNWTLTLA NAAIHVDLVF LFFCsfFFLT GGFTGMWLSH VGLNISVHDT
cox1_tetpy KVVNWTLSLV NGALKVDLPF LFSMSflFLV AGFTGMWLSH VSLNVSMHDT

           401                                                450
t2_12833   YYIVAHFHYV MSLGALFAIF AGTYYWIGKM SGRQYPEWAG QLHFWMMFIG
cox1_parde YYIVAHFHYV MSLGALFAIF AGTYYWIGKM SGRQYPEWAG QLHFWMMFIG
cox1_rhosh YYVVAHFHYV MSLGAVFGIF AGSTSGIGKM SGRQYPEWAG KLHFWMMFVG
cox1_scapl YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_gomva YYVVAHFHYV LSMGAVFAIV A......... .......... ..........
cox1_polsp YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_lepsp YYVVAHFHYV LSMGAVFAI. .......... .......... ..........
cox1_megat YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_lepoc YYVVAHFHYV LSMGAVFAIM .......... .......... ..........
cox1_pomni YYVVAHFHYV LSMGAVFAIV A......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd YYVVAHFHYV LS........ .......... .......... ..........
cox1_braja YYVVAHFHYV LSLGAVFAIF AGWYYWFPKM TGYMYNETLA KAHFWVTFIG
cox1_panbu YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLHNTWT KIHFGVMFM.
cox1_prowi YYVVAHFHYV LSMGAVFALF SGFYYWIGKI TGLQYPETLG QIHFWLMFLG
cox1_polsx YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLHSTWT KIHFGVMF..
cox1_amica YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLHPTWS KIHFGVMFV.
cox1_marpo YYVVAHFHYV LSMGAVFALF AGFYYWIGKI TGLQYPETLG QIHFWITFFG
cox1_maize YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_orysa YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_wheat YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_sorbi YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_betvu YYVVAHFHYV LSMGAVFALF AGFYYWVGKI FGRTYPETLG QIHFWITFFG
cox1_soybn YYVVAHFHYV LSMGAVFALF AGFHYWVGKI FGRTYPETLG QIHFWITFFG
cox1_oenbe YYAGAHFHYV LSMGAVFALF AGFRYWVGKI FGRTYPETLG QIHFWITFFG
cox1_pea   YYVVAHFHYV LSMGAVFALF AGFHYWVGKI FGRTYPETLG KIHFWITFFG
cox1_parli YYVVAHFHYV LSMGAVFAIF AGFTHWFPLF CGYNLHPLWG KAHFFMMFVG
cox1_crola YYVVAHFHYV LSMGAVFAIM AGFVHWFPLF TGFSLHDTWT KIHFGVMFIG
cox1_cypca YYVVAHFHYV LSMGAVFAIM AAFVHWFPLL TGYTLHSTWT KIHFGVMFIG
cox1_strpu YYVVAHFHYV LSMGAVFAIF AGFTHWFPLF SGYSLHPLWG KVHFFIMFVG
cox1_pisoc HYVVAHFHYV LSMGAVFAIF AGFTHWFPLF SGVSLHPLWS KVHFAVMFIG
cox1_chick YYVVAHFHYV LSMGAVFAIL AGFTHWFPLF TGFTLHPSWT KAHFGVMFTG
cox1_triru YYVVAHFHYV LSMGAVFALF SGWYFWIPKL LGLSYDLFAG KVHFWILFVG
cox1_yeast YYVVGHFHYV LSMGAIFSLF AGYYYWSPQI LGLNYNEKLA QIQFWLIFIG
cox1_podan YYVVAHFHYV LSMGAVFAMF SGWYFWIPKM LGLNYNMTLS KVQFWILFIG
cox1_mouse YYVVAHFHYV LSMGAVFAIM AGFVHWFPLF SGFTLDDTWA KAHFAIMFVG
cox1_rat   YYVVAHFHYV LSMGAVFAIM AGFVHWFPLF SGYTLNDTWA KAHFAIMFVG
cox1_human YYVVAHFHYV LSMGAVFAIM GGFIHWFPLF SGYTLDQTYA KIHFTIMFIG
cox1_neucr YYVVAHFHYV LSMGAVFAMF SGWYHWVPKI LGLNYNMVLS KAQFWLLFIG
cox1_emeni YYVVAHFHYV LSMGAVFALF SGWYLWIPKL LGLSYDQFAA KVHFWILFIG
cox1_xenla YYVVAHFHYV LSMGAVFAIM GGFIHWFPLF TGYTLHETWA KIHFGVMFAG
cox1_bovin YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLNDTWA KIHFAIMFVG
cox1_balph YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLNTTWA KIHFMIMFVG
cox1_balmu YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLNTTWA KIHFLIMFVG
cox1_chlre YYVVAHFHYV LSMGAVFGIF AGVYFWGNLI TGLGYHEGRA MVHFWLLFIG
cox1_didma YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF TGYMLNDMWA KIHFFIMFVG
cox1_drome YYVVAHFHYV LSMGAVFAIM AGFIHWYPLF TGLTLNNKWL KSHFIIMFIG
cox1_droya YYVVAHFHYV LSMGAVFAIM AGFIHWYPLF TGLTLNNKWL KSQFIIMFIG
cox1_halgr YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYTLDNTWA KIHFTIMFVG
cox1_phovi YYVVAHFHYV LSMGAVFAIM GGFVHWFPLF SGYMLDDTWA KIHFTIMFVG
cox1_anoqu YYVVAHFHYV LSMGAVFAIM AGFIHWYPLL TGLTMNPNWL KLQFAMMFVG
cox1_anoga YYVVAHFHYV LSMGAVFAIM AGFVHWYPLL TGLTMNPTWL KIQFSIMFVG
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo YFVVAHFHYV LSMGALFGL. CGAYYWSPKM FGLMYNETLA SIQFWILFIG
cox1_apime YYVVGHFHYV LSMGAVFAII SSFIHWYPLI TGLLLNIKWL KIQFIMMFIG
cox1_ascsu YYVVSHFHYV LSLGAVFGIF TGVTLWWSFI TGFVYDKMMM SSVFVLMFVG
cox1_caeel YYVVSHFHYV LSLGAVFGIF TGVTLWWSFI TGYVLDKLMM SAVFILLFIG
cox1_thep3 YFVVAHFHYV IVGGVVFALL AGTHYWWPKM FGRMLNETLG KITFWLFFIG
cox1_bacfi YFVVAHFHYI IVGGIVLSLF AGLFYWYPKM FGHMLNETLG KLFFWVFYIG
cox1_syny3 YFVVGHFHYV LFGGSAFALF SGVYHWFPKM TGRMVNEPLG RLHFILTFIG
cox1_bacsu YFVVAHFHYV IIGGVVFGLL AGVHFWWPKM FGKILHETMG KISFVLFFIG
qox1_bacsu YFLVSHFHYV LIAGTVFACF AGFIFWYPKM FGHKLNERIG KWFFWIFMIG
cox1_leita YFVVGHFHYV LSLGAVVGFF TGFIHFLAKW LPIELYLFWM FYFISTLFIG
cyob_ecoli LFLIAHFHNV IIGGVVFGCF AGMTYWWPKA FGFKLNETWG KRAFWFWIIG
qoxm_sulac YFVVGHFHY. MVYAILYALL GALFYYFPFW SGKWYNDDLG KTGAILLVAG
cox1_halha YYVVGHFHFI VYGAIGFALF AASYYWFPMV TGRMYQKRLA HAHFWTALVG
cox1_trybb YFVVAHFHYV LSLGAVVGVF GGFFHFLMKW IPIELHTFWL FFFISTLWFG
cox1_parte FYVVAHFHLM LAGAAMMGAF TGLYYYYNTF FDVQYSKIFG FLHLVYYSAG
cox1_tetpy FYVVAHFHIM LSGAAITGIF SGFYYYFNAL FGIKFSRMFG YMHLIYYSGG

           451                                                500
t2_12833   SNLIFFPQHF LGRQGMPRRY IDYPVEFSYW NNISSIGAYI SFASFLFFIG
cox1_parde SNLIFFPQHF LGRQGMPRRY IDYPVEFSYW NNISSIGAYI SFASFLFFIG
cox1_rhosh ANLTFFPQHF LGRQGMPRRY IDYPEAFATW NFVSSLGAFL SFASFLFFLG
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja VNLVFFPQHF LGLSGMPRRY VDYPDAFAGW NLVSSVGSYI SGFGVLIFLY
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi VNITFFPMHF LGLAGMPRRI PDYPDCYAGW NAVASYGSYL SITAVLFFFY
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo VNLTFFPMHF LGLAGMPRRI PDYPDAYAGW NAFSSFGSYV SVVGIFCFFV
cox1_maize VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_orysa VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_wheat VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_sorbi VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_betvu VNLTFFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGICCFFV
cox1_soybn VNLTLFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_oenbe VNPTFFPMHF LGLSGMPRPI PDYPESYAGW NALSSFGSYI SVVGIRCFFV
cox1_pea   VNLTLFPMHF LGLSGMPRRI PDYPDAYAGW NALSSFGSYI SVVGIRRFFV
cox1_parli VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTVSSIGSTI SLVAMLFFIF
cox1_crola VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTVSSIGSLI SLVAVIIFLF
cox1_cypca VNLTFFPQHF LGLSAMPRRY SDYPDAYALW NTVSSIGSLI SLVAVIMFLF
cox1_strpu VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTISSIGSTI SVVAMLFFLF
cox1_pisoc VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTVSSIGSTI SLIRTLIFLF
cox1_chick VNLTFFPQHF LGLAGMPRRY SDYPDAYTLW NTLSSIGSLI SMTAVIMLMF
cox1_triru VNLTLFPQHF LGLQGMPRRI GDYPDAFAGW NLISSFGSIV SVVATWYFLN
cox1_yeast ANVIFFPMHF LGINGMPRRI PDYPDAFAGW NYVASIGSFI ATLSLFLFIY
cox1_podan VNVTFFPQHF LGLQGMPRRI SDYPDAFAGW NLISSFGSII SVVAAWLFLY
cox1_mouse VNMTFFPQHF LGLSGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVLIMIF
cox1_rat   VNMTFFPQHF LGLAGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVLVMIF
cox1_human VNLTFFPQHF LGLSGMPRRY SDYPDAYTTW NILSSVGSFI SLTAVMLMIF
cox1_neucr VNLTFFPQHF LGLQGMPRRI SDYPDAFSGW NLISSFGSIV SVVASWLFLY
cox1_emeni VNLTFFPQHF LGLQLMPRRI SDYPDAFYGW NLLSSIGSII SVVATWYFLT
cox1_xenla VNLTFFPQHF LGLSAMPRRY SDYPDAYTLW NTVSSIGSLI SLVAVIMMMF
cox1_bovin VNMTFFPQHF LGLSGMPRRY SDYPDAYTMW NTISSMGSFI SLTAVMLMVF
cox1_balph VNLTFFPQHF LGLSGMPRRY SDYPDAYTTW NTISSMGSFI SLTAVMLMIF
cox1_balmu VNLTFFPQHF LGLSGMPRRY SDYPDAYTTW NTISSMGSFI SLTAVMLMIF
cox1_chlre VNLTFFPQHF LGLAGMPRRM FDYADCFAGW NAVSSFGASI SFISV.....
cox1_didma VNLTFFPQHF LGLSGMPRRY SDYPDAYTMW NVVSSIGSFI SLTAVILMVF
cox1_drome VNLTFFPQHF LGLAGMPRRY SDYPDAYTTW NIVSTIGSTI SLLGILFFFF
cox1_droya VNLTFFPQHF LGLAGMPRRY SDYPDAYTTW NVVSTIGSTI SLLGILFFFY
cox1_halgr VNMTFFPQHF LGLSGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVMLMVF
cox1_phovi VNMTFFPQHF LGLSGMPRRY SDYPDAYTTW NTVSSMGSFI SLTAVMLMVF
cox1_anoqu VNLTFFPQHF LGLAGMPRRY SDFPDSYLAW NIVSSLGSTI SLFAILYFLF
cox1_anoga VNLTFFPQHF LGLAGMPRRY SDFPDSYLTW NVVSSLGSTI SLFAILYFLF
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo VNIVFGPQHF LGLNGMPRRI PDYPEAFVGW NFVSSIGSVI SILSLFLFMY
cox1_apime VNLTFFPQHF LGLMSMPRRY SDYPDSYYCW NSISSMGSMI SLNSMIFLIF
cox1_ascsu VNLTFFPLHF AGIHGYPRKY LDYPDVYSVW NIMASYGSMI SVFALFLFIY
cox1_caeel VNLTFFPLHF AGLHGFPRKY LDYPDVYSVW NIIASYGSII STAGLFLFIY
cox1_thep3 FHLTFFIQHF LGLTGMPRRV FTYLpgWETG NLISTIGAfi AAATVILLIN
cox1_bacfi FHLTFFVQHL LGLMGMPRRV YTYLGdlDAF NFISTIGTFF MSAGVILLVI
cox1_syny3 MDLTFMPMHE LGLMGMNRRI ALYDVEFQPL NVLSTIGAYV LAASTIPFVI
cox1_bacsu FHLTFFIQHF VGLMGMPRRV YTFLpgLETG NLISTIGAFF MAARVILLLV
qox1_bacsu FNICFFPQYF LGLQGMPRRI YTYGpgWTTL NFISTVGAFM MGVGFLILCY
cox1_leita SNMLFFPMHS LGMYAFPRRI SDYPVSFLFW SSFMLYGMLL LASLILFLCA
cyob_ecoli FFVAFMPLYA LGFMGMTRRL sqIDPQFHTM LMIAASGAVL IALGILCLVI
qoxm_sulac TFLTATGMSI AGILGMPRRY AVIPSPIypF QFMASVGAVL TGIGLFILAG
cox1_halha SNATFLAMLW LGYGGMPRRY ATYIPQFATA HRLATVGAFL IGVSTLIWLF
cox1_trybb SNMVFFPLHS LGMFAFPRRI SDYPISFLFW SAFTLYGMLL LTFLVIFCCC
cox1_parte IWTTFFPMFF LGFSGLPRRI HDFPAFFLGW HGLASCGHFL TLAGVCFFFF
cox1_tetpy QWVAFVPQFY LGFSGMPRRI HDYPVVFMGW HSMSTAGHFI TLIGIMFFFL

           501                                                550
t2_12833   IVFYTLFAGK PVNVPNYWNE HADTLEWTLP SPPPEHTFET LPKPEDWDRA
cox1_parde IVFYTLFAGK PVNVPNYWNE HADTLEWTLP SPPPEHTFET LPKPEDWDRA
cox1_rhosh VIFYSL.SGA RVTANNYWNE HADTLEWTLT SPPPEHTFEQ LPKREDWERA
cox1_scapl .......... .......... .......... .......... ..........
cox1_gomva .......... .......... .......... .......... ..........
cox1_polsp .......... .......... .......... .......... ..........
cox1_lepsp .......... .......... .......... .......... ..........
cox1_megat .......... .......... .......... .......... ..........
cox1_lepoc .......... .......... .......... .......... ..........
cox1_pomni .......... .......... .......... .......... ..........
cox1_saltr .......... .......... .......... .......... ..........
cox1_geosd .......... .......... .......... .......... ..........
cox1_braja CVI.DAFAKK VPAGDNPWGA GATTLEWTLP SPPPFHQFEV LPRVQ.....
cox1_panbu .......... .......... .......... .......... ..........
cox1_prowi VVYKTLTSNe pRNPWETTPG VSPTLEWMLP SPPAFHTFEE I.........
cox1_polsx .......... .......... .......... .......... ..........
cox1_amica .......... .......... .......... .......... ..........
cox1_marpo VVFLTLTSEN KCAPSPwvEQ NSTTLEWMVP SPPAFHTFEE LPAIKE....
cox1_maize VVAITSSSGK NKRCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPTIKE....
cox1_orysa VVAITSSSGK NKRCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPAIKE....
cox1_wheat VVAITSSSGK NQKCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPAVKE....
cox1_sorbi VVAITSSSGK NKRCAEsvEQ NPTTLEWLVQ SPPAFHTFGE LPTIKETQGE
cox1_betvu VVTITLSSGK NKRCApwAVE ENSTTLNDVQ SPPAFHTFGE LPAIKE....
cox1_soybn VVTITSSSGN NITRANivEQ NSTTLEWLVQ SPPAFHTFGE LPAIKE....
cox1_oenbe VVTITSSSGN NKrsPWAVEK NSTTLEWMVQ SPPAFHTFGE LPATKE....
cox1_pea   VVTITSSSGN NITRANivEQ NSTTLEWLVQ SPPAFHTFGE LPAIKE....
cox1_parli LIWEAFASQR EGVTPEFAN. ..ASLEWQys FPPSHHTFDE TP........
cox1_crola ILWEAFASKR QVMSVEL... TMTNVEWLHG CPPPYHTFEE ..........
cox1_cypca ILWEAFAAKR EVLSVEL... TATNVEWLHG CPPPYHTYEE ..........
cox1_strpu LIWEAFASQR EGITPEFSH. ..ASLEWQYT spPSHHTFDE TP........
cox1_pisoc LIWEAFSTKR TPIHPEFSS. ..SSLEWQYP spPSHHTFDE TPSA......
cox1_chick IVWEAFSAKR KVLQPEL... TATNIEWIHG CPPPYHTFEE ..........
cox1_triru ILYLQLTQGS PVsfQHLFTR NNSSLEWCLN SPPKPHAFDC LPVQS.....
cox1_yeast ILYDQLVNNK SVI...YAKA PSSSIEFLLT SPPAVHSFNT ..........
cox1_podan IVYLQLVEGE YAglQALLNR SYPSLEWALS SPPKPHAFVS LPLQSNILRS
cox1_mouse MIWEAFASKR EVMSVSY... ASTNLEWLHG CPPPYHTFEE ..........
cox1_rat   MIWEAFASKR EVLSISYSS. ..TNLEWLHG CPPPYHTFEE ..........
cox1_human MIWEAFASKR KVLM...VEE PSMNLEWLYG CPPPYHTFEE ..........
cox1_neucr IVYIQLVQGE YAglRALLNR SYPSLEWSIS SPPKPHSFAS LPLQSSSFFL
cox1_emeni IIYKQLTEGK AVsfQVLFTR NNSSLEWCLT SPPKPHAFAS LPLQS.....
cox1_xenla IIWEAFAAKR EVTT...YEL TSTMLEWLQG CPTPYHTLKT ..........
cox1_bovin IIWEAFASKR EVLTVDL... TTTNLEWLNG CPPPYHTFEE ..........
cox1_balph IIWEAFTSKR EVLAVDL... TSTNLEWLNG CPPPYHTFEE ..........
cox1_balmu IIWEAFTSKR EVLAVDL... TYTNLEWLNG CPPPYHTFEE ..........
cox1_chlre IVFATTFQEA VRTVPR.... TATTLEWVLL ATPAHHALSQ VP........
cox1_didma IIWEAFASKR EVLDVEL... TTTNIEWLYG CPPPYHTFE. ..........
cox1_drome IIWESLVSQR QVIYPIQLN. ..SSIEWYQN TPPAEHSYSE LPLLTN....
cox1_droya IIWESLVSQR QVIYPIQLN. ..SSIEWYQN TPPAEHSYSE LPLLTN....
cox1_halgr MIWEAFASKR EVAAVEL... TTTNIEWLHG CPPPYHTFEE ..........
cox1_phovi MIWEAFASKR EVAAVEL... TTTNIEWLHG CPPPYHTFEE ..........
cox1_anoqu IIWESMITQR TPAFPM...Q LSSSIEWYHT LPPAEHTYAE LPLLTN....
cox1_anoga IIWESMITQR TPAFPM...Q LSSSIEWYHT LPPAEHTYAE LPLLTN....
cox1_cotja .......... .......... .......... .......... ..........
cox1_schpo VMYDQFTSNR VVkiPSYFDD naQSIEWLLH SPVHEHAFNT LPTKS.....
cox1_apime IILESLISKR ML....LFKF NQSSLEWLNF LPPLDHSHLE IP........
cox1_ascsu VLLESFVGHR IFLFDYYVN. ..SGPEYSLS GYVFGHSYQS ..........
cox1_caeel VLLESFFSYR LVISDYYSN. ..SSPEYCMS NYVFGHSYQS ..........
cox1_thep3 IVVTT...AK GEKVPGDAWG DGRTLEWAIA SPPPVYNFAQ TPLVRGLD..
cox1_bacfi NVIYSAFKGE RVTVADPWD. .ARTLEWATP TPVPEYNFAQ TPQVRSLD..
cox1_syny3 NVFWSLFKGE KAARNPW... RALTLEWQTA SPPIIENFEE EP........
cox1_bacsu NVIWTSVKGE YVGADPWHDG R..TLEWTVS SPPPEYNFKQ LPFVRGLDPL
qox1_bacsu NIYYSFRYST REISGDSWG. VGRTLDWAts AIPPHYNFAV LPEVKSQDAF
cox1_leita LFCVFLFWDY CLFFVSLFVF SLYCFFYFST WLPCVMVLYL L.........
cyob_ecoli QMYVSIRDRD QNRDLTGDPW GGRTLEWATS SPPPFYNFAV VPHVHERDAF
qoxm_sulac VLVHGVFRGR AVNGVDPWDN ISVKLQ.... .......... ..........
cox1_halha NMATSWREGP RVDSTDPWdt DQFTNDWAWF RAKEETTVLP DGGDEAQSEA
cox1_trybb LFNVILFWDY CLFFINLFTY SLSIFFYFYT WVPVCMAIYL L.........
cox1_parte GIFDSTSENK SSILANfyNN YTNEIASELP KVEveNTFGE YE........
cox1_tetpy MIFDSHIERR AATSSTLgnG IPGSTVRLML IDRHFAEFEV FKK.......

            554
t2_12833   QAHR
cox1_parde QAHR
cox1_rhosh PAH.
cox1_scapl ....
cox1_gomva ....
cox1_polsp ....
cox1_lepsp ....
cox1_megat ....
cox1_lepoc ....
cox1_pomni ....
cox1_saltr ....
cox1_geosd ....
cox1_braja ....
cox1_panbu ....
cox1_prowi ....
cox1_polsx ....
cox1_amica ....
cox1_marpo ....
cox1_maize ....
cox1_orysa ....
cox1_wheat ....
cox1_sorbi LQTR
cox1_betvu ....
cox1_soybn ....
cox1_oenbe ....
cox1_pea   ....
cox1_parli ....
cox1_crola ....
cox1_cypca ....
cox1_strpu ....
cox1_pisoc ....
cox1_chick ....
cox1_triru ....
cox1_yeast ....
cox1_podan ....
cox1_mouse ....
cox1_rat   ....
cox1_human ....
cox1_neucr SFFR
cox1_emeni ....
cox1_xenla ....
cox1_bovin ....
cox1_balph ....
cox1_balmu ....
cox1_chlre ....
cox1_didma ....
cox1_drome ....
cox1_droya ....
cox1_halgr ....
cox1_phovi ....
cox1_anoqu ....
cox1_anoga ....
cox1_cotja ....
cox1_schpo ....
cox1_apime ....
cox1_ascsu ....
cox1_caeel ....
cox1_thep3 ....
cox1_bacfi ....
cox1_syny3 ....
cox1_bacsu WIEK
qox1_bacsu LHMK
cox1_leita ....
cyob_ecoli WEMK
qoxm_sulac ....
cox1_halha DA..
cox1_trybb ....
cox1_parte ....
cox1_tetpy ....
****************************************************************************
*                                                                          *
*                                                                          *
*      PredictProtein@EMBL-Heidelberg.DE                                   *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                   *
*                                                                          *
*      Prediction of:                                                      *
*                                                                          *
*      - secondary structure,                 by PHDsec                    *
*      - solvent accessibility,               by PHDacc                    *
*      - and helical transmembrane regions,   by PHDhtm                    *
*                                                                          *
*      PHD: Profile fed neural network systems from HeiDelberg             *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             *
*                                                                          *
*      Author:             Burkhard Rost                                   *
*                          EMBL, Heidelberg, FRG                           *
*                          Meyerhofstrasse 1, 69 117 Heidelberg            *
*                          Internet: Predict-Help@EMBL-Heidelberg.DE       *
*                                                                          *
*      All rights reserved.                                                *
*                                                                          *
*                                                                          *
****************************************************************************
*                                                                          *
*                                                                          *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                           *
*      Secondary structure prediction by PHDsec:                           *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                           *
*                                                                          *
*      Author:             Burkhard Rost                                   *
*                          EMBL, Heidelberg, FRG                           *
*                          Meyerhofstrasse 1, 69 117 Heidelberg            *
*                          Internet: Rost@EMBL-Heidelberg.DE               *
*                                                                          *
*      All rights reserved.                                                *
*                                                                          *
*                                                                          *
****************************************************************************
*                                                                          *
*  About the network method                                                *
*  ~~~~~~~~~~~~~~~~~~~~~~~~                                                *
*                                                                          *
*  The network procedure is described in detail in:                        *
*  1) Rost, Burkhard; Sander, Chris:                                       *
*     Prediction of protein secondary structure at better than 70%         *
*     accuracy.                                                            *
*     J. Mol. Biol., 1993, 232, 584-599.                                   *
*                                                                          *
*  A brief description is given in:                                        *
*     Rost, Burkhard; Sander, Chris:                                       *
*     Improved prediction of protein secondary structure by use of         *
*     sequence profiles and neural networks.                               *
*     Proc. Natl. Acad. Sci. U.S.A., 1993, 90, 7558-7562.                  *
*                                                                          *
*  The PHD mail server is described in:                                    *
*  2) Rost, Burkhard; Sander, Chris; Schneider, Reinhard:                  *
*     PHD - an automatic mail server for protein secondary structure       *
*     prediction.                                                          *
*     CABIOS, 1994, 10, 53-60.                                             *
*                                                                          *
*  The latest improvement steps (up to 72%) are explained in:              *
*  3) Rost, Burkhard; Sander, Chris:                                       *
*     Combining evolutionary information and neural networks to predict    *
*     protein secondary structure.                                         *
*     Proteins, 1994,  19, 55-72.                                          *
*                                                                          *
*  To be quoted for publications of PHD output:                            *
*     Papers 1-3 for the prediction of secondary structure and the         *
*     prediction server.                                                   *
*                                                                          *
****************************************************************************
*                                                                          *
*  About the input to the network                                          *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                         *
*                                                                          *
*  The prediction is performed by a system of neural networks.             *
*  The input is a multiple sequence alignment. It is taken from an HSSP    *
*  file produced by the program MaxHom:                                    *
*     Sander, Chris & Schneider, Reinhard: Database of Homology-Derived    *
*     Structures and the Structural Meaning of Sequence Alignment.         *
*     Proteins, 1991, 9, 56-68.                                            *
*                                                                          *
*  For optimal results the alignment should contain sequences with varying *
*  degrees of sequence similarity relative to the input protein.           *
*  The following is an ideal situation:                                    *
*                                                                          *
*  +-----------------+----------------------+                              *
*  |   sequence:     |  sequence identity   |                              *
*  +-----------------+----------------------+                              *
*  | target sequence |  100 %               |                              *
*  | aligned seq. 1  |   90 %               |                              *
*  | aligned seq. 2  |   80 %               |                              *
*  |      ...        |   ...                |                              *
*  | aligned seq. 7  |   30 %               |                              *
*  +-----------------+----------------------+                              *
*                                                                          *
****************************************************************************
*                                                                          *
*  Estimated Accuracy of Prediction                                        *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                       *
*                                                                          *
*  A careful cross validation test on some 250 protein chains (in total    *
*  about 55,000 residues) with less than 25% pairwise sequence identity    *
*  gave the following results:                                             *
*                                                                          *
*  ++================++-----------------------------------------+          *
*  || Qtotal = 72.1% ||      ("overall three state accuracy")   |          *
*  ++================++-----------------------------------------+          *
*                                                                          *
*  +----------------------------+-----------------------------+            *
*  | Qhelix (% of observed)=70% | Qhelix (% of predicted)=77% |            *
*  | Qstrand(% of observed)=62% | Qstrand(% of predicted)=64% |            *
*  | Qloop  (% of observed)=79% | Qloop  (% of predicted)=72% |            *
*  +----------------------------+-----------------------------+            *
*..........................................................................*
*                                                                          *
*  These percentages are defined by:                                       *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                       *
*                                                                          *
*  |                    number of correctly predicted residues             *
*  |Qtotal =            ---------------------------------------      (*100)*
*  |                          number of all residues                       *
*  |                                                                       *
*  |                    no of res correctly predicted to be in helix       *
*  |Qhelix (% of obs) = -------------------------------------------- (*100)*
*  |                    no of all res observed to be in helix              *
*  |                                                                       *
*  |                                                                       *
*  |                    no of res correctly predicted to be in helix       *
*  |Qhelix (% of pred)= -------------------------------------------- (*100)*
*  |                    no of all residues predicted to be in helix        *
*                                                                          *
*..........................................................................*
*                                                                          *
*  Averaging over single chains                                            *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                            *
*                                                                          *
*  The most reasonable way to compute the overall accuracies is the above  *
*  quoted percentage of correctly predicted residues.  However, since the  *
*  user is mainly interested in the expected performance of the prediction *
*  for a particular protein, the mean value when averaging over protein    *
*  chains might be of help as well.  Computing first the three state       *
*  accuracy for each protein chain, and then averaging over 250 chains     *
*  yields the following average:                                           *
*                                                                          *
*  +-------------------------------====--+                                 *
*  | Qtotal/averaged over chains = 72.2% |                                 *
*  +-------------------------------====--+                                 *
*  | standard deviation          =  9.3% |                                 *
*  +-------------------------------------+                                 *
*                                                                          *
*..........................................................................*
*                                                                          *
*  Further measures of performance                                         *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                         *
*                                                                          *
*  Matthews correlation coefficient:                                       *
*                                                                          *
*  +---------------------------------------------+                         *
*  | Chelix = 0.63, Cstrand = 0.53, Cloop = 0.52 |                         *
*  +---------------------------------------------+                         *
*..........................................................................*
*                                                                          *
*  Average length of predicted secondary structure segments:               *
*                                                                          *
*  .           +------------+----------+                                   *
*  .           |  predicted | observed |                                   *
*  +-----------+------------+----------+                                   *
*  | Lhelix  = |    10.3    |    9.3   |                                   *
*  | Lstrand = |     5.0    |    5.3   |                                   *
*  | Lloop   = |     7.2    |    5.9   |                                   *
*  +-----------+------------+----------+                                   *
*..........................................................................*
*                                                                          *
*  The accuracy matrix in detail:                                          *
*                                                                          *
*  +---------------------------------------+                               *
*  |    number of residues with H, E, L    |                               *
*  +---------+------+------+------+--------+                               *
*  |         |net H |net E |net L |sum obs |                               *
*  +---------+------+------+------+--------+                               *
*  | obs H   |12447 | 1255 | 3990 |  17692 |                               *
*  | obs E   |  949 | 7493 | 3750 |  12192 |                               *
*  | obs L   | 2604 | 2875 |19962 |  25441 |                               *
*  +---------+------+------+------+--------+                               *
*  | sum Net |16000 |11623 |27702 |  55325 |                               *
*  +---------+------+------+------+--------+                               *
*                                                                          *
*  Note: This table is to be read in the following manner:                 *
*        Of the 16000 residues predicted to be in helix, 12447 were        *
*        observed in helix, 949 in strands, and 2604 in loop regions.      *
*        The term "observed" refers to the DSSP assignment of secondary    *
*        structure calculated from 3D coordinates of experimentally        *
*        determined structures (Dictionary of Secondary Structure of       *
*        Proteins: Kabsch & Sander (1983) Biopolymers, 22, 2577-2637).     *
*                                                                          *
****************************************************************************
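
As a worked check of the definitions above, the following sketch recomputes Qtotal, the per-state percentages, and the Matthews correlation coefficients from the accuracy matrix. The matrix values are copied from the table; the function and variable names are illustrative and are not part of PHD.

```python
import math

# Accuracy matrix copied from the table above: rows = observed (DSSP),
# columns = predicted by the network, order H (helix), E (strand), L (loop).
STATES = ("H", "E", "L")
MATRIX = [
    [12447, 1255, 3990],   # obs H
    [949, 7493, 3750],     # obs E
    [2604, 2875, 19962],   # obs L
]


def q_total(m):
    """Overall three-state accuracy in percent."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return 100.0 * correct / total


def q_observed(m, i):
    """Percent of residues observed in state i that are predicted in state i."""
    return 100.0 * m[i][i] / sum(m[i])


def q_predicted(m, i):
    """Percent of residues predicted in state i that are observed in state i."""
    return 100.0 * m[i][i] / sum(row[i] for row in m)


def matthews(m, i):
    """Matthews correlation coefficient for state i versus the other states."""
    total = sum(sum(row) for row in m)
    tp = m[i][i]
    fn = sum(m[i]) - tp
    fp = sum(row[i] for row in m) - tp
    tn = total - tp - fn - fp
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(f"Qtotal = {q_total(MATRIX):.1f}%")
    for i, s in enumerate(STATES):
        print(f"{s}: %obs = {q_observed(MATRIX, i):.1f}  "
              f"%pred = {q_predicted(MATRIX, i):.1f}  "
              f"C = {matthews(MATRIX, i):.2f}")
```

****************************************************************************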
*                                                                          *
*  Position-specific reliability index                                     *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                     *
*                                                                          *
*  The network predicts the three secondary structure types using real     *
*  numbers from the output units. The prediction is assigned by choosing   *
*  the maximal unit ("winner takes all").  However, the real numbers       *
*  contain additional information.                                         *
*  E.g. the difference between the maximal and the second largest output   *
*  unit can be used to derive a "reliability index".  This index is given  *
*  for each residue along with the prediction.  The index is scaled to     *
*  have values between 0 (lowest reliability), and 9 (highest).            *
*  The accuracies (Qtot) to be expected for residues whose index is at     *
*  or above a particular value are given below, together with the          *
*  fraction of such residues (%res):                                       *
*                                                                          *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*  | index|  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |    *
*  | %res |100.0| 99.2| 90.4| 80.9| 71.6| 62.5| 52.8| 42.3| 29.8| 14.1|    *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*  |      |     |     |     |     |     |     |     |     |     |     |    *
*  | Qtot | 72.1| 72.3| 74.8| 77.7| 80.3| 82.9| 85.7| 88.5| 91.1| 94.2|    *
*  |      |     |     |     |     |     |     |     |     |     |     |    *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*  | H%obs| 70.4| 70.6| 73.7| 77.1| 80.1| 83.1| 86.0| 89.3| 92.5| 96.4|    *
*  | E%obs| 61.5| 61.7| 63.7| 66.6| 69.1| 71.7| 74.6| 77.0| 77.8| 68.1|    *
*  |      |     |     |     |     |     |     |     |     |     |     |    *
*  | H%prd| 77.8| 78.0| 80.0| 82.6| 84.7| 86.9| 89.2| 91.3| 93.1| 95.4|    *
*  | E%prd| 64.5| 64.7| 67.8| 71.0| 74.2| 77.6| 81.4| 85.1| 89.8| 93.5|    *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*                                                                          *
*  The above table gives the cumulative results, e.g. 62.5% of all         *
*  residues have a reliability of at least 5.  The overall three-state     *
*  accuracy for this subset of almost two thirds of all residues is 82.9%. *
*  For this subset, e.g., 83.1% of the observed helices are correctly      *
*  predicted, and 86.9% of all residues predicted to be in helix are       *
*  correct.                                                                *
*                                                                          *
*..........................................................................*
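
A minimal sketch of the winner-takes-all assignment and of a reliability index derived from the gap between the two largest output units, as described above. The exact scaling used by PHD is not given in this text; the mapping below (outputs assumed to lie in [0, 1], gap multiplied by 10, truncated and clipped to 0-9) is an assumption for illustration only, and `predict_state` is a hypothetical name.

```python
def predict_state(outputs, states=("H", "E", "L")):
    """Return (state, reliability) for one residue.

    outputs: three real-valued network outputs, assumed to lie in [0, 1].
    The state is chosen by "winner takes all"; the reliability index is a
    hypothetical scaling of the gap between the two largest outputs.
    """
    ranked = sorted(zip(outputs, states), reverse=True)
    (best, state), (second, _) = ranked[0], ranked[1]
    # Assumed scaling: 10 * gap, truncated, then clipped to the range 0..9.
    index = min(9, max(0, int(10 * (best - second))))
    return state, index


if __name__ == "__main__":
    print(predict_state([0.90, 0.05, 0.05]))  # clear helix, high index
    print(predict_state([0.40, 0.35, 0.25]))  # close call, low index
```

*..........................................................................*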
*                                                                          *
*  The following table gives the non-cumulative quantities, i.e. the       *
*  values per reliability index range.  These numbers answer the question: *
*  how reliable is the prediction for all residues labeled with the        *
*  particular index i?                                                     *
*                                                                          *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+          *
*  | index|  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |          *
*  | %res |  8.8|  9.5|  9.3|  9.1|  9.7| 10.5| 12.5| 15.7| 14.1|          *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+          *
*  |      |     |     |     |     |     |     |     |     |     |          *
*  | Qtot | 46.6| 50.6| 57.7| 62.6| 67.9| 74.2| 82.2| 88.3| 94.2|          *
*  |      |     |     |     |     |     |     |     |     |     |          *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+          *
*  | H%obs| 36.8| 42.3| 49.5| 55.2| 61.7| 69.9| 78.8| 87.4| 96.4|          *
*  | E%obs| 44.7| 44.5| 52.1| 55.4| 60.9| 68.0| 75.9| 81.0| 68.1|          *
*  |      |     |     |     |     |     |     |     |     |     |          *
*  | H%prd| 49.9| 52.5| 60.3| 64.2| 69.2| 77.5| 85.4| 89.9| 95.4|          *
*  | E%prd| 41.7| 47.1| 53.6| 57.0| 64.0| 71.6| 78.8| 88.8| 93.5|          *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+          *
*                                                                          *
*  For example, for residues with Relindex = 5, 64% of all residues        *
*  predicted to be in beta-strand are correct.                             *
*                                                                          *
*                                                                          *
****************************************************************************
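
The cumulative and non-cumulative tables above are related by a weighted average: the accuracy for all residues with index at or above a threshold is the %res-weighted mean of the per-index accuracies. The sketch below illustrates this with the per-index values copied from the non-cumulative table; the names are illustrative, and agreement with the cumulative table can only be expected up to rounding of the tabulated numbers.

```python
# Non-cumulative values copied from the table above (index 1..9).
PCT_RES = {1: 8.8, 2: 9.5, 3: 9.3, 4: 9.1, 5: 9.7,
           6: 10.5, 7: 12.5, 8: 15.7, 9: 14.1}
QTOT = {1: 46.6, 2: 50.6, 3: 57.7, 4: 62.6, 5: 67.9,
        6: 74.2, 7: 82.2, 8: 88.3, 9: 94.2}


def cumulative(threshold):
    """Return (%res, Qtot) for all residues with index >= threshold (1..9)."""
    selected = [i for i in PCT_RES if i >= threshold]
    fraction = sum(PCT_RES[i] for i in selected)
    # Weight each per-index accuracy by the share of residues at that index.
    accuracy = sum(PCT_RES[i] * QTOT[i] for i in selected) / fraction
    return fraction, accuracy


if __name__ == "__main__":
    for t in range(1, 10):
        frac, acc = cumulative(t)
        print(f"index >= {t}: %res = {frac:5.1f}  Qtot = {acc:4.1f}")
```

****************************************************************************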
*                                                                          *
*                                                                          *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                         *
*      Solvent accessibility prediction by PHDacc:                         *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                         *
*                                                                          *
*      Author:             Burkhard Rost                                   *
*                          EMBL, Heidelberg, FRG                           *
*                          Meyerhofstrasse 1, 69 117 Heidelberg            *
*                          Internet: Rost@EMBL-Heidelberg.DE               *
*                                                                          *
*      All rights reserved.                                                *
*                                                                          *
*                                                                          *
****************************************************************************
*                                                                          *
*  About the network method                                                *
*  ~~~~~~~~~~~~~~~~~~~~~~~~                                                *
*                                                                          *
*  The network for prediction of secondary structure is described in       *
*  detail in:                                                              *
*     Rost, Burkhard; Sander, Chris:                                       *
*     Prediction of protein secondary structure at better than 70%         *
*     accuracy.                                                            *
*     J. Mol. Biol., 1993, 232, 584-599.                                   *
*                                                                          *
*  The analysis of the prediction of solvent exposure is given in:         *
*     Rost, Burkhard; Sander, Chris:                                       *
*     Conservation and prediction of solvent accessibility in protein      *
*     families.  Proteins, 1994, 20, 216-226.                              *
*                                                                          *
*  To be quoted for publications of PHD exposure prediction:               *
*     Both papers quoted above.                                            *
*                                                                          *
****************************************************************************
*                                                                          *
*  Definition of accessibility                                             *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~                                             *
*                                                                          *
*  For training the residue solvent accessibility the DSSP (Dictionary of  *
*  Secondary Structure of Proteins; Kabsch & Sander (1983) Biopolymers, 22,*
*  2577-2637) values of accessible surface area have been used.  The       *
*  prediction provides values for the relative solvent accessibility.  The *
*  normalisation is the following:                                         *
*                                                                          *
*  |                           ACCESSIBILITY (from DSSP, in Angstrom^2)    *
*  |RELATIVE_ACCESSIBILITY =   ------------------------------------- * 100 *
*  |                               MAXIMAL_ACC (amino acid type i)         *
*                                                                          *
*  where MAXIMAL_ACC (i) is the maximal accessibility of amino acid type i.*
*  The maximal values are:                                                 *
*                                                                          *
*  +----+----+----+----+----+----+----+----+----+----+----+----+           *
*  |  A |  B |  C |  D |  E |  F |  G |  H |  I |  K |  L |  M |           *
*  | 106| 160| 135| 163| 194| 197|  84| 184| 169| 205| 164| 188|           *
*  +----+----+----+----+----+----+----+----+----+----+----+----+           *
*  |  N |  P |  Q |  R |  S |  T |  V |  W |  X |  Y |  Z |                *
*  | 157| 136| 198| 248| 130| 142| 142| 227| 180| 222| 196|                *
*  +----+----+----+----+----+----+----+----+----+----+----+                *
*                                                                          *
*  Notation: one letter code for amino acid, B stands for D or N; Z stands *
*     for E or Q; and X stands for undetermined.                           *
*                                                                          *
*  The accessibility in Angstrom^2 (not the relative value) can be used    *
*  to estimate the number of water molecules (W) in contact with the       *
*  residue:                                                                *
*                                                                          *
*  W = ACCESSIBILITY / 10                                                  *
*                                                                          *
*  The prediction is given in 10 states for relative accessibility, with   *
*                                                                          *
*  RELATIVE_ACCESSIBILITY = (PREDICTED_ACC * PREDICTED_ACC)                *
*                                                                          *
*  where PREDICTED_ACC = 0 - 9.                                            *
*                                                                          *
****************************************************************************
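
The normalisation and the 10-state encoding above can be sketched as follows. The maximal accessibilities are copied from the table; the function names are illustrative and are not part of PHD. Converting a predicted state back to an absolute accessibility simply inverts the normalisation and is shown only as an example.

```python
# Maximal accessibilities copied from the table above (one-letter code).
MAXIMAL_ACC = {
    "A": 106, "B": 160, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164, "M": 188,
    "N": 157, "P": 136, "Q": 198, "R": 248, "S": 130, "T": 142,
    "V": 142, "W": 227, "X": 180, "Y": 222, "Z": 196,
}


def relative_accessibility(acc, residue):
    """Relative accessibility in percent from a DSSP accessibility value."""
    return 100.0 * acc / MAXIMAL_ACC[residue]


def relative_from_state(state):
    """Relative accessibility in percent encoded by a predicted state 0..9."""
    if not 0 <= state <= 9:
        raise ValueError("predicted state must be in 0..9")
    return state * state


def accessibility_from_state(state, residue):
    """Absolute accessibility implied by a predicted state for a residue."""
    return relative_from_state(state) * MAXIMAL_ACC[residue] / 100.0


def water_molecules(acc):
    """Estimated number of water molecules in contact: W = ACCESSIBILITY / 10."""
    return acc / 10.0


if __name__ == "__main__":
    print(relative_accessibility(53, "A"))      # half-exposed alanine
    print(accessibility_from_state(5, "G"))     # state 5 = 25% of 84
    print(water_molecules(120))
```

****************************************************************************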
*                                                                          *
*  Estimated Accuracy of Prediction                                        *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                       *
*                                                                          *
*  A careful cross validation test on some 238 protein chains (in total    *
*  about 62,000 residues) with less than 25% pairwise sequence identity    *
*  gave the following results:                                             *
*                                                                          *
*                                                                          *
*  Correlation                                                             *
*  ...........                                                             *
*                                                                          *
*  The correlation between observed and predicted solvent accessibility    *
*  is:                                                                     *
*                                                                          *
*  -----------                                                             *
*  corr = 0.53                                                             *
*  -----------                                                             *
*                                                                          *
*  This value ought to be compared to the worst- and best-case prediction  *
*  scenarios: random prediction (corr = 0.0) and homology modelling        *
*  (corr = 0.66).  (Note: homology modelling yields a relatively accurate  *
*  prediction in 3D if, and only if, a sequence with significant identity  *
*  to the target has a known 3D structure.)                                *
*                                                                          *
*                                                                          *
*  3-state accuracy                                                        *
*  ................                                                        *
*                                                                          *
*  Often the relative accessibility is projected onto, e.g., 3 states:     *
*     b  = buried       (here defined as < 9% relative accessibility),     *
*     i  = intermediate ( 9% <= rel. acc. < 36% ),                         *
*     e  = exposed      ( rel. acc. >= 36% ).                              *
*                                                                          *
*  A projection onto 3 states or 2 states (buried/exposed) enables the     *
*  compilation of a 3- and 2-state prediction accuracy.  PHD reaches an    *
*  overall 3-state accuracy of:                                            *
*     Q3 = 57.5%                                                           *
*  (compared to 35% for random prediction and 70% for homology modelling). *
*                                                                          *
*  In detail:                                                              *
*                                                                          *
*  +-----------------------------------+-------------------------+         *
*  | Qburied       (% of observed)=77% | Qb (% of predicted)=60% |         *
*  | Qintermediate (% of observed)= 9% | Qi (% of predicted)=44% |         *
*  | Qexposed      (% of observed)=78% | Qe (% of predicted)=56% |         *
*  +-----------------------------------+-------------------------+         *
*                                                                          *
*                                                                          *
*  10-state accuracy                                                       *
*  .................                                                       *
*                                                                          *
*  The network predicts relative solvent accessibility in 10 states, with  *
*  state i (i = 0-9) corresponding to a relative solvent accessibility of  *
*  i*i %.  The 10-state accuracy of the network is:                        *
*                                                                          *
*     Q10 = 24.5%                                                          *
*                                                                          *
*..........................................................................*
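
The projection of the 10-state prediction onto the three states b/i/e follows directly from the thresholds given above (9% and 36%) and the i*i encoding. The sketch below assumes the prediction is available as a string of digits 0-9, one per residue; that input format and the function names are assumptions made for illustration.

```python
def three_state(rel_acc):
    """Map a relative accessibility in percent onto b, i, or e."""
    if rel_acc < 9:
        return "b"   # buried
    if rel_acc < 36:
        return "i"   # intermediate
    return "e"       # exposed


def project_prediction(states):
    """Project a string of predicted states (digits 0-9) onto b/i/e."""
    return "".join(three_state(int(c) ** 2) for c in states)


def q3(predicted, observed):
    """Percentage of positions where two equal-length b/i/e strings agree."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("strings must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * hits / len(predicted)


if __name__ == "__main__":
    print(project_prediction("0123456789"))
```

With this encoding, states 0-2 are buried, 3-5 intermediate, and 6-9 exposed.

*..........................................................................*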
*                                                                          *
*  These percentages are defined by:                                       *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                       *
*                                                                          *
*  |                     number of correctly predicted residues            *
*  |Q3                = ---------------------------------------      (*100)*
*  |                           number of all residues                      *
*  |                                                                       *
*  |                     no of res. correctly predicted to be buried       *
*  |Qburied (% of obs) = ------------------------------------------- (*100)*
*  |                     no of all res. observed to be buried              *
*  |                                                                       *
*  |                                                                       *
*  |                     no of res. correctly predicted to be buried       *
*  |Qburied (% of pred)= ------------------------------------------- (*100)*
*  |                     no of all residues predicted to be buried         *
*                                                                          *
*..........................................................................*
*                                                                          *
*  Averaging over single chains                                            *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                            *
*                                                                          *
*  The most reasonable way to compute the overall accuracies is the above  *
*  quoted percentage of correctly predicted residues.  However, since the  *
*  user is mainly interested in the expected performance of the prediction *
*  for a particular protein, the mean value when averaging over protein    *
*  chains might be of help as well.  Computing first the correlation       *
*  between observed and predicted accessibility for each protein chain, and*
*  then averaging over all 238 chains yields the following average:        *
*                                                                          *
*  +-------------------------------====--+                                 *
*  | corr/averaged over chains   = 0.53  |                                 *
*  +-------------------------------====--+                                 *
*  | standard deviation          = 0.11  |                                 *
*  +-------------------------------------+                                 *
*                                                                          *
*..........................................................................*
*                                                                          *
*  Further details of performance accuracy                                 *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                 *
*                                                                          *
*  The accuracy matrix in detail:                                          *
*  ..............................                                          *
*                                                                          *
* -------+----------------------------------------------------+----------- *
*  \ PHD |    0    1   2   3    4    5     6     7    8    9  |  SUM  %obs *
* -------+----------------------------------------------------+----------- *
* OBS  0 | 8611  140   8  44   82  169   772   334   27    0  | 10187 16.6 *
* OBS  1 | 4367  164   0  50  106  231   738   346   44    3  |  6049  9.8 *
* OBS  2 | 3194  168   1  68  125  303   951   513   42    7  |  5372  8.7 *
* OBS  3 | 2760  159   8  80  136  327  1246   746   58   19  |  5539  9.0 *
* OBS  4 | 2312  144   2  72  166  396  1615  1245  124   19  |  6095  9.9 *
* OBS  5 | 1873   96   3  84  138  425  1979  1834  187   27  |  6646 10.8 *
* OBS  6 | 1387   67   1  60   80  278  2237  2627  231   51  |  7019 11.4 *
* OBS  7 | 1082   35   0  32   56  225  1871  3107  302   60  |  6770 11.0 *
* OBS  8 |  660   25   0  27   43  136  1206  2374  325   87  |  4883  7.9 *
* OBS  9 |  325   20   2  27   29   74   648  1159  366  214  |  2864  4.7 *
* -------+----------------------------------------------------+----------- *
* SUM    |26571 1018  25 544  961 2564 13263 14285 1706  487  |            *
* %pred  | 43.3  1.7 0.0 0.9  1.6  4.2  21.6  23.3  2.8  0.8  |            *
* -------+----------------------------------------------------+----------- *
*                                                                          *
*  Note: This table is to be read in the following manner:                 *
*        8611 of all residues predicted to be exposed by 0% were           *
*        observed with 0% relative accessibility.  However, 325 of all     *
*        residues predicted to have 0% are observed as completely exposed  *
*        (obs = 9 -> rel. acc. >= 81%).  The term "observed" refers to the *
*        DSSP compilation of area of solvent accessibility calculated from *
*        3D coordinates of experimentally determined structures (Diction-  *
*        ary of Secondary Structure  of Proteins: Kabsch & Sander (1983)   *
*        Biopolymers, 22, 2577-2637).                                      *
*                                                                          *
*                                                                          *
*  Accuracy for each amino acid:                                           *
*  .............................                                           *
*                                                                          *
*  +---+------------------------------+-----+-------+------+               *
*  |AA |   Q3 b%o b%p i%o i%p e%o e%p | Q10 |  corr |    N |               *
*  +---+------------------------------+-----+-------+------+               *
*  | A | 59.0  87  60   2  38  66  57 |  31 | 0.530 | 5054 |               *
*  | C | 62.0  91  67   5  39  25  21 |  34 | 0.244 |  893 |               *
*  | D | 56.5  21  45   6  49  94  57 |  20 | 0.321 | 3536 |               *
*  | E | 60.8   9  40   3  41  98  61 |  21 | 0.347 | 3743 |               *
*  | F | 63.3  94  67   9  46  29  37 |  27 | 0.366 | 2436 |               *
*  | G | 52.1  75  51   1  31  67  53 |  22 | 0.405 | 4787 |               *
*  | H | 50.9  63  53  23  45  71  50 |  18 | 0.442 | 1366 |               *
*  | I | 64.9  95  68   6  41  30  38 |  34 | 0.360 | 3437 |               *
*  | K | 66.6   2  11   2  37  98  67 |  23 | 0.267 | 3652 |               *
*  | L | 61.6  93  65   8  44  31  40 |  31 | 0.368 | 5016 |               *
*  | M | 60.1  92  64   5  39  45  44 |  29 | 0.452 | 1371 |               *
*  | N | 55.5  45  45   8  38  87  59 |  17 | 0.410 | 2923 |               *
*  | P | 53.0  48  48   9  39  83  56 |  18 | 0.364 | 2920 |               *
*  | Q | 54.3  27  44   7  44  92  56 |  20 | 0.344 | 2225 |               *
*  | R | 49.9  15  47  36  47  76  51 |  18 | 0.372 | 2765 |               *
*  | S | 55.6  69  53   3  51  81  56 |  22 | 0.464 | 3981 |               *
*  | T | 51.8  61  51   8  38  78  53 |  21 | 0.432 | 3740 |               *
*  | V | 61.1  93  65   5  40  39  42 |  34 | 0.418 | 4156 |               *
*  | W | 56.2  85  62  20  49  29  27 |  21 | 0.318 |  891 |               *
*  | Y | 49.7  73  52  33  49  36  38 |  19 | 0.359 | 2301 |               *
*  +---+------------------------------+-----+-------+------+               *
*                                                                          *
*  Abbreviations:                                                          *
*                                                                          *
*  AA:   amino acid in one-letter code                                     *
*  b%o, i%o, e%o:   = Qburied, Qintermediate, Qexposed (% of observed),    *
*        i.e. percentage of correct prediction in each state, see above    *
*  b%p, i%p, e%p:   = Qburied, Qintermediate, Qexposed (% of predicted),   *
*        i.e. probability of correct prediction in each state, see above   *
*  Q10:  percentage of correctly predicted residues in each of the 10      *
*        states of predicted relative accessibility.                       *
*  corr: correlation between predicted and observed rel. acc.              *
*  N:    number of residues in data set                                    *
*                                                                          *
*                                                                          *
*  Accuracy for different secondary structure:                             *
*  ...........................................                             *
*                                                                          *
*  +--------+------------------------------+----+-------+-------+          *
*  | type   |   Q3 b%o b%p i%o i%p e%o e%p |Q10 |  corr |     N |          *
*  +--------+------------------------------+----+-------+-------+          *
*  | helix  | 59.5  79  64   8  44  80  56 | 27 | 0.574 | 20100 |          *
*  | strand | 61.3  84  73   9  46  69  37 | 35 | 0.524 | 13356 |          *
*  | loop   | 54.4  64  43  11  44  78  61 | 18 | 0.442 | 27968 |          *
*  +--------+------------------------------+----+-------+-------+          *
*                                                                          *
*  Abbreviations as before.                                                *
*                                                                          *
****************************************************************************
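
The 10-state figures can be re-derived from the accuracy matrix above.  The
Python sketch below is an illustration only and not part of PHD; the names
ACC_MATRIX and ten_state_summary are made up for this example.  It copies the
counts from the table (rows = observed state, columns = predicted state) and
computes the fraction of residues on the diagonal together with the %obs and
%pred marginals.

```python
# Counts copied from the accuracy matrix above.
# Rows: observed state 0-9.  Columns: predicted (PHD) state 0-9.
ACC_MATRIX = [
    [8611, 140, 8, 44,  82, 169,  772,  334,  27,   0],
    [4367, 164, 0, 50, 106, 231,  738,  346,  44,   3],
    [3194, 168, 1, 68, 125, 303,  951,  513,  42,   7],
    [2760, 159, 8, 80, 136, 327, 1246,  746,  58,  19],
    [2312, 144, 2, 72, 166, 396, 1615, 1245, 124,  19],
    [1873,  96, 3, 84, 138, 425, 1979, 1834, 187,  27],
    [1387,  67, 1, 60,  80, 278, 2237, 2627, 231,  51],
    [1082,  35, 0, 32,  56, 225, 1871, 3107, 302,  60],
    [ 660,  25, 0, 27,  43, 136, 1206, 2374, 325,  87],
    [ 325,  20, 2, 27,  29,  74,  648, 1159, 366, 214],
]


def ten_state_summary(matrix):
    """Return totals, diagonal fraction and marginals of a square count matrix."""
    total = sum(sum(row) for row in matrix)
    # Residues whose predicted state equals the observed state.
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    row_sums = [sum(row) for row in matrix]        # observed per state
    col_sums = [sum(col) for col in zip(*matrix)]  # predicted per state
    return {
        "total": total,
        "correct": correct,
        "q10": 100.0 * correct / total,
        "pct_obs": [100.0 * s / total for s in row_sums],
        "pct_pred": [100.0 * s / total for s in col_sums],
    }


summary = ten_state_summary(ACC_MATRIX)
print(summary["total"], summary["correct"], round(summary["q10"], 1))
```

With the counts as tabulated, 15330 of 61424 residues lie on the diagonal,
i.e. roughly a quarter, which is close to the Q10 quoted above.

****************************************************************************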
*                                                                          *
*  Position-specific reliability index                                     *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                     *
*                                                                          *
*  The network predicts the 10 states for relative accessibility using real*
*  numbers from the output units. The prediction is assigned by choosing   *
*  the maximal unit ("winner takes all").  However, the real numbers       *
*  contain additional information.                                         *
*  E.g. the difference between the maximal and the second largest output   *
*  unit (with the constraint that the second largest output is compiled    *
*  among all units at least 2 positions off the maximal unit) can be used  *
*  to derive a "reliability index".  This index is given for each residue  *
*  along with the prediction.  The index is scaled to have values between  *
*  0 (lowest reliability), and 9 (highest).                                *
*  The accuracies (Q3, corr, etc.) to be expected for residues with values *
*  above a particular value of the index are given below as well as the    *
*  fraction of such residues (%res):                                       *
*                                                                          *
*  +---+------------------------------+----+-------+-------+               *
*  |RI |   Q3 b%o b%p i%o i%p e%o e%p |Q10 |  corr |  %res |               *
*  +---+------------------------------+----+-------+-------+               *
*  | 0 | 57.5  77  60   9  44  78  56 | 24 | 0.535 | 100.0 |               *
*  | 1 | 59.1  76  63   9  45  82  57 | 25 | 0.560 |  91.2 |               *
*  | 2 | 61.7  79  66   4  47  87  58 | 27 | 0.594 |  77.1 |               *
*  | 3 | 66.6  87  70   1  51  89  63 | 30 | 0.650 |  57.1 |               *
*  | 4 | 70.0  89  72   0  83  91  67 | 32 | 0.686 |  45.8 |               *
*  | 5 | 72.9  92  75   0   0  93  70 | 34 | 0.722 |  35.6 |               *
*  | 6 | 76.3  95  77   0   0  93  75 | 36 | 0.769 |  24.7 |               *
*  | 7 | 79.0  97  79   0   0  93  78 | 39 | 0.803 |  16.0 |               *
*  | 8 | 80.9  98  80   0   0  91  81 | 43 | 0.824 |   9.6 |               *
*  | 9 | 81.2  99  80   0   0  88  83 | 45 | 0.828 |   5.9 |               *
*  +---+------------------------------+----+-------+-------+               *
*                                                                          *
*  Abbreviations as before.                                                *
*                                                                          *
*  The above table gives the cumulative results, e.g. 45.8% of all         *
*  residues have a reliability of at least 4.  The correlation for this    *
*  most reliably predicted half of the residues is 0.686, i.e. a value     *
*  comparable to what could be expected if homology modelling were         *
*  possible.  For this subset of 45.8% of all residues, 89% of the buried  *
*  residues are correctly predicted, and 72% of all residues predicted to  *
*  be buried are correct.                                                  *
*                                                                          *
*..........................................................................*
*                                                                          *
*  The following table gives the non-cumulative quantities, i.e. the       *
*  values per reliability index range.  These numbers answer the question: *
*  how reliable is the prediction for all residues labeled with the        *
*  particular index i.                                                     *
*                                                                          *
*  +---+------------------------------+----+-------+-------+               *
*  |RI |   Q3 b%o b%p i%o i%p e%o e%p |Q10 |  corr |  %res |               *
*  +---+------------------------------+----+-------+-------+               *
*  | 0 | 40.9  79  40  16  41  21  40 | 14 | 0.175 |   8.8 |               *
*  | 1 | 45.4  61  46  28  44  48  44 | 17 | 0.278 |  14.1 |               *
*  | 2 | 47.4  53  52  10  46  80  44 | 19 | 0.343 |  19.9 |               *
*  | 3 | 52.9  75  59   4  50  77  47 | 23 | 0.439 |  11.4 |               *
*  | 4 | 60.0  81  63   0  83  84  56 | 25 | 0.547 |  10.1 |               *
*  | 5 | 65.2  82  70   0   0  93  62 | 28 | 0.607 |  10.9 |               *
*  | 6 | 71.3  90  72   0   0  94  70 | 31 | 0.692 |   8.8 |               *
*  | 7 | 76.0  94  76   0   0  95  75 | 34 | 0.762 |   6.3 |               *
*  | 8 | 80.5  97  81   0   0  94  79 | 39 | 0.808 |   3.8 |               *
*  | 9 | 81.2  99  80   0   0  88  83 | 45 | 0.828 |   5.9 |               *
*  +---+------------------------------+----+-------+-------+               *
*                                                                          *
*  For example, for residues with RI = 4, 83% of all predicted intermediate*
*  residues are correctly predicted as such.                               *
*                                                                          *
*                                                                          *
****************************************************************************
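
The reliability index described above can be sketched as follows.  This is
a minimal illustration, not the PHD implementation: the function name
reliability_index is hypothetical, the output units are assumed to lie
between 0 and 1, and the linear scaling of the difference onto 0-9 is an
assumption, since the text only states that the index is scaled to that
range.

```python
def reliability_index(outputs, min_offset=2, scale=10):
    """Return (winning state, reliability index 0-9) for one residue.

    outputs    -- values of the 10 output units (assumed to lie in [0, 1])
    min_offset -- rival units must be at least this many positions away
                  from the winning unit (2 according to the text above)
    scale      -- ASSUMPTION: linear factor mapping the difference to 0-9
    """
    # "Winner takes all": the predicted state is the maximal unit.
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    # Only units at least `min_offset` positions off the winner compete.
    rivals = [v for i, v in enumerate(outputs) if abs(i - winner) >= min_offset]
    diff = outputs[winner] - max(rivals)
    # Truncate and clip to the documented range 0 (lowest) to 9 (highest).
    return winner, max(0, min(9, int(scale * diff)))


# Hypothetical output values, chosen only to show the effect of the offset.
example = [0.0, 0.125, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
print(reliability_index(example))
```

In this made-up example unit 3 is the second largest overall, but it is
adjacent to the winner (unit 2) and is therefore ignored; the difference is
taken against unit 4 instead, which yields a higher index.

****************************************************************************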
*                                                                          *
*                                                                          *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             *
*      Prediction of helical transmembrane segments by PHDhtm:             *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             *
*                                                                          *
*      Author:             Burkhard Rost                                   *
*                          EMBL, Heidelberg, FRG                           *
*                          Meyerhofstrasse 1, 69 117 Heidelberg            *
*                          Internet: Rost@EMBL-Heidelberg.DE               *
*                                                                          *
*      All rights reserved.                                                *
*                                                                          *
*                                                                          *
****************************************************************************
*                                                                          *
*  About the network method                                                *
*  ~~~~~~~~~~~~~~~~~~~~~~~~                                                *
*                                                                          *
*  The PHD mail server is described in:                                    *
*     Rost, Burkhard; Sander, Chris; Schneider, Reinhard:                  *
*     PHD - an automatic mail server for protein secondary structure       *
*     prediction.                                                          *
*     CABIOS, 1994, 10, 53-60.                                             *
*                                                                          *
*  To be quoted for publications of PHDhtm output:                         *
*     Rost, Burkhard; Casadio, Rita; Fariselli, Piero; Sander, Chris:      *
*     Prediction of helical transmembrane segments at 95% accuracy.        *
*     Protein Science, 1995, 4, 521-533.                                   *
*                                                                          *
****************************************************************************
*                                                                          *
*  Estimated Accuracy of Prediction                                        *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                       *
*                                                                          *
*  A cross-validation test on 69 helical trans-membrane proteins (in total *
*  about 30,000 residues) with less than 25% pairwise sequence identity    *
*  gave the following results:                                             *
*                                                                          *
*  ++================++-----------------------------------------+          *
*  || Qtotal = 94.7% ||      ("overall two state accuracy")     |          *
*  ++================++-----------------------------------------+          *
*                                                                          *
*  +----------------------------+-----------------------------+            *
*  | Qhelix (% of observed)=92% | Qhelix (% of predicted)=83% |            *
*  | Qloop  (% of observed)=96% | Qloop  (% of predicted)=97% |            *
*  +----------------------------+-----------------------------+            *
*                                                                          *
*..........................................................................*
*                                                                          *
*  These percentages are defined by:                                       *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                       *
*                                                                          *
*  |                    number of correctly predicted residues             *
*  |Qtotal =            ---------------------------------------      (*100)*
*  |                          number of all residues                       *
*  |                                                                       *
*  |                    no of res correctly predicted to be in helix       *
*  |Qhelix (% of obs) = -------------------------------------------- (*100)*
*  |                    no of all res observed to be in helix              *
*  |                                                                       *
*  |                                                                       *
*  |                    no of res correctly predicted to be in helix       *
*  |Qhelix (% of pred)= -------------------------------------------- (*100)*
*  |                    no of all residues predicted to be in helix        *
*                                                                          *
*..........................................................................*
*                                                                          *
*  Further measures of performance                                         *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                         *
*                                                                          *
*  Matthews correlation coefficient:                                       *
*                                                                          *
*  +---------------------------------------------+                         *
*  | Chelix = 0.84, Cloop = 0.84                 |                         *
*  +---------------------------------------------+                         *
*..........................................................................*
*                                                                          *
*  Average length of predicted secondary structure segments:               *
*                                                                          *
*  |           +------------+----------+                                   *
*  |           |  predicted | observed |                                   *
*  +-----------+------------+----------+                                   *
*  | Lhelix  = |    24.6    |   22.2   |                                   *
*  +-----------+------------+----------+                                   *
*..........................................................................*
*                                                                          *
*  The accuracy matrix in detail:                                          *
*                                                                          *
*  +---------------------------------+                                     *
*  |    number of residues with H, L |                                     *
*  +---------+------+-------+--------+                                     *
*  |         |net H | net L |sum obs |                                     *
*  +---------+------+-------+--------+                                     *
*  | obs H   | 5214 |   492 |   5706 |                                     *
*  | obs L   | 1050 | 22423 |  23473 |                                     *
*  +---------+------+-------+--------+                                     *
*  | sum Net | 6264 | 22915 |  29179 |                                     *
*  +---------+------+-------+--------+                                     *
*                                                                          *
*  Note: This table is to be read in the following manner:                 *
*        5214 of all residues predicted to be in a helical trans-membrane  *
*        region were observed to be in the lipid bilayer; 1050, however,   *
*        were observed outside the lipid bilayer, i.e. in                  *
*        loop (or non-membrane) regions. The term "observed" refers to DSSP*
*        assignment of secondary structure calculated from 3D coordinates  *
*        of experimentally determined structures (Dictionary of Secondary  *
*        Structure  of Proteins: Kabsch & Sander (1983) Biopolymers, 22,   *
*        2577-2637) where these were available.  For all other proteins,   *
*        the assignment of trans-membrane segments has been taken from the *
*        Swissprot data bank (Bairoch, A.; Boeckmann, B.: The SWISS-PROT   *
*        protein sequence data bank. Nucl. Acids Res. 20: 2019-2022, 1992).*
*                                                                          *
*..........................................................................*
*                                                                          *
*  Overlap between predicted and observed segments:                        *
*                                                                          *
*  +-----------------+---------------+----------------+                    *
*  | segment overlap | % of observed | % of predicted |                    *
*  |   Sov helix     |      95.6%    |      95.5%     |                    *
*  |   Sov loop      |      83.6%    |      97.2%     |                    *
*  +-----------------+---------------+----------------+                    *
*  |   Sov total     |      86.0%    |      96.8%     |                    *
*  +-----------------+---------------+----------------+                    *
*                                                                          *
*        Definition of Sov in: Rost et al., JMB, 1994, 235, 13-26.         *
*                                                                          *
*        As helical trans-membrane segments are longer than globular heli- *
*        ces, correctly predicted segments can easily be made out.  PHDhtm *
*        misses 5 out of 258 observed segments, predicts 6 where none is   *
*        observed and 3 times the predicted helical segment overlaps two   *
*        observed regions.  Thus, in total more than 95% of all segments   *
*        are correctly predicted.                                          *
*                                                                          *
*..........................................................................*
*                                                                          *
*  Entropy of prediction (information measure):                            *
*                                                                          *
*  +-----------------+                                                     *
*  | I = 0.64        |                                                     *
*  +-----------------+                                                     *
*                                                                          *
*        (For comparison: homology modelling of globular proteins in three *
*        states: I=0.62.)                                                  *
*                                                                          *
****************************************************************************
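
The two-state measures defined above (Qtotal, Qhelix as % of observed and
% of predicted, Matthews correlation) follow directly from the 2x2 accuracy
matrix.  The Python sketch below is illustrative only; the function name
two_state_measures is an assumption of this example, and the four counts are
copied from the table above.

```python
from math import sqrt


def two_state_measures(tp, fn, fp, tn):
    """Per-residue two-state measures with helix as the positive class.

    tp -- observed H, predicted H      fn -- observed H, predicted L
    fp -- observed L, predicted H      tn -- observed L, predicted L
    """
    total = tp + fn + fp + tn
    return {
        "total": total,
        "q_total": 100.0 * (tp + tn) / total,
        "q_helix_obs": 100.0 * tp / (tp + fn),   # % of observed helix
        "q_helix_pred": 100.0 * tp / (tp + fp),  # % of predicted helix
        "q_loop_obs": 100.0 * tn / (tn + fp),
        "q_loop_pred": 100.0 * tn / (tn + fn),
        # Matthews correlation; for two states it is the same for H and L.
        "matthews": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Counts copied from the accuracy matrix above.
m = two_state_measures(tp=5214, fn=492, fp=1050, tn=22423)
print(round(m["q_total"], 1), round(m["matthews"], 2))
```

With these counts the sketch gives a Qtotal of about 94.7% and a Matthews
coefficient of about 0.84, matching the quoted values; the per-class
percentages agree with the quoted ones to within about one percentage point.

****************************************************************************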
*                                                                          *
*  Position-specific reliability index                                     *
*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                     *
*                                                                          *
*  The network predicts two states: helical trans-membrane region and rest *
*  using two output units.  The prediction is assigned by choosing the ma- *
*  ximal unit ("winner takes all").  However, the real numbers of the out- *
*  put units contain additional information.                               *
*  E.g. the difference between the two output units can be used to derive  *
*  a "reliability index".  This index is given for each residue along with *
*  the prediction.  The index is scaled to have values between 0 (lowest   *
*  reliability), and 9 (highest).                                          *
*  The accuracies (Qtot) to be expected for residues with values above a   *
*  particular value of the index are given below as well as the fraction   *
*  of such residues (%res):                                                *
*                                                                          *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*  | index|  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |    *
*  | %res |100.0| 98.8| 97.3| 95.9| 94.1| 92.3| 89.9| 86.2| 75.0| 66.8|    *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*  |      |     |     |     |     |     |     |     |     |     |     |    *
*  | Qtot | 94.7| 95.2| 95.6| 96.2| 96.7| 97.2| 97.7| 98.4| 99.4| 99.8|    *
*  |      |     |     |     |     |     |     |     |     |     |     |    *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*  | H%obs| 91.8| 92.9| 93.8| 94.4| 95.0| 95.7| 96.2| 96.8| 95.5| 78.7|    *
*  | L%obs| 95.3| 95.7| 96.1| 96.6| 97.0| 97.5| 98.1| 98.8| 99.7|100.0|    *
*  |      |     |     |     |     |     |     |     |     |     |     |    *
*  | H%prd| 82.7| 83.8| 85.0| 86.7| 88.1| 89.7| 91.4| 93.8| 96.3| 97.1|    *
*  | L%prd| 97.9| 98.3| 98.5| 98.7| 98.8| 99.0| 99.2| 99.4| 99.7| 99.9|    *
*  +------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+    *
*                                                                          *
*  The above table gives the cumulative results, e.g. 92.3% of all         *
*  residues have a reliability of at least 5.  The overall two-state       *
*  accuracy for this subset is 97.2%.  For this subset, e.g., 95.7% of     *
*  the observed helical trans-membrane residues are correctly predicted,   *
*  and 89.7% of all residues predicted to be in helical trans-membrane     *
*  segment are correct.                                                    *
*                                                                          *
*                                                                          *
*                                                                          *
****************************************************************************
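
One practical use of the cumulative table above is to pick the lowest
reliability index that reaches a desired expected accuracy.  The following
Python sketch only looks up the tabulated values; the names PCT_RES, Q_TOT
and min_index_for_accuracy are hypothetical and not part of the server
output.

```python
# Cumulative values copied from the PHDhtm reliability table above;
# list position = reliability index 0-9.
PCT_RES = [100.0, 98.8, 97.3, 95.9, 94.1, 92.3, 89.9, 86.2, 75.0, 66.8]
Q_TOT = [94.7, 95.2, 95.6, 96.2, 96.7, 97.2, 97.7, 98.4, 99.4, 99.8]


def min_index_for_accuracy(target_qtot):
    """Return (index, %res) for the lowest index whose Qtot >= target.

    Returns None if no tabulated index reaches the target.
    """
    for index, qtot in enumerate(Q_TOT):
        if qtot >= target_qtot:
            return index, PCT_RES[index]
    return None


print(min_index_for_accuracy(97.0))
```

For a target of 97% the lookup returns index 5, which covers 92.3% of all
residues, in agreement with the worked example given in the text above.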


The resulting network (PHD) prediction is:                             
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                             
________________________________________________________________________________

****************************************************************************
*                                                                          *
*      PredictProtein@EMBL-Heidelberg.DE                                   *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                   *
*                                                                          *
*      PHD: Profile fed neural network systems from HeiDelberg             *
*      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             *
*      Prediction of:                                                      *
*       - secondary structure,                  by PHDsec                  *
*       - solvent accessibility,                by PHDacc                  *
*       - and helical transmembrane regions,    by PHDhtm                  *
*                                                                          *
*      Author:             Burkhard Rost                                   *
*                          EMBL, Heidelberg, FRG                           *
*                          Meyerhofstrasse 1, 69 117 Heidelberg            *
*                          Internet: Predict-Help@EMBL-Heidelberg.DE       *
*      All rights reserved.                                                *
*                                                                          *
****************************************************************************
*                                                                          *
*  The network systems are described in:                                   *
*                                                                          *
*       PHDsec: B Rost & C Sander: JMB, 1993, 232, 584-599.                *
*               B Rost & C Sander: Proteins, 1994, 19, 55-72.              *
*       PHDacc: B Rost & C Sander: Proteins, 1994, 20, 216-226.            *
*       PHDhtm: B Rost, R Casadio, P Fariselli & C Sander,                 *
*                       Prot. Science, 1995, 4, 521-533.                   *
*                                                                          *
****************************************************************************
*                                                                          *
*    Some statistics                                                       *
*    ~~~~~~~~~~~~~~~                                                       *
*                                                                          *
*    Percentage of amino acids:                                            *
*    +--------------+--------+--------+--------+--------+--------+         *
*    | AA:          |    L   |    G   |    A   |    F   |    I   |         *
*    | % of AA:     |    9.6 |    9.2 |    9.0 |    8.3 |    7.2 |         *
*    +--------------+--------+--------+--------+--------+--------+         *
*    | AA:          |    V   |    T   |    P   |    M   |    S   |         *
*    | % of AA:     |    6.7 |    6.1 |    6.0 |    5.4 |    5.2 |         *
*    +--------------+--------+--------+--------+--------+--------+         *
*    | AA:          |    Y   |    H   |    W   |    R   |    Q   |         *
*    | % of AA:     |    4.9 |    3.4 |    3.2 |    2.7 |    2.7 |         *
*    +--------------+--------+--------+--------+--------+--------+         *
*    | AA:          |    N   |    E   |    D   |    K   |    C   |         *
*    | % of AA:     |    2.7 |    2.7 |    2.7 |    1.6 |    0.5 |         *
*    +--------------+--------+--------+--------+--------+--------+         *
*                                                                          *
*    Percentage of secondary structure predicted:                          *
*    +--------------+--------+--------+--------+                           *
*    | SecStr:      |    H   |    E   |    L   |                           *
*    | % Predicted: |   49.6 |   12.6 |   37.7 |                           *
*    +--------------+--------+--------+--------+                           *
*                                                                          *
*    According to the following classes:                                   *
*       all-alpha:   %H>45 and %E< 5; all-beta : %H<5 and %E>45            *
*       alpha-beta : %H>30 and %E>20; mixed:    rest,                      *
*    this means that the predicted class is:           mixed class         *
*                                                                          *
****************************************************************************
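The class assignment above can be reproduced with a few lines of Python. This is only a sketch of the thresholds as printed in the box, not the code the server runs; the function names `percentages` and `structural_class` are made up for illustration.

```python
from collections import Counter


def percentages(symbols: str) -> dict[str, float]:
    """Percentage of each symbol in a string, rounded to one decimal.

    Works for an amino acid sequence as well as for a string of
    predicted states (H, E, and blank for loop).
    """
    if not symbols:
        return {}
    counts = Counter(symbols)
    total = len(symbols)
    return {s: round(100.0 * n / total, 1) for s, n in counts.items()}


def structural_class(pct_h: float, pct_e: float) -> str:
    """Apply the four class rules exactly as listed in the box above."""
    if pct_h > 45 and pct_e < 5:
        return "all-alpha"
    if pct_h < 5 and pct_e > 45:
        return "all-beta"
    if pct_h > 30 and pct_e > 20:
        return "alpha-beta"
    return "mixed"


# Values reported above: 49.6% helix, 12.6% strand.
print(structural_class(49.6, 12.6))  # mixed
```

With 49.6% helix the protein passes the helix threshold for all-alpha, but 12.6% strand is too much for all-alpha and too little for alpha-beta, so it falls into the mixed class.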
*                                                                          *
*    PHD output for your protein                                           *
*    ~~~~~~~~~~~~~~~~~~~~~~~~~~~                                           *
*                                                                          *
*    Wed Nov 15 04:29:17 1995                                              *
*    Jury on:       10    different architectures (version   5.94_317 ).   *
*    Note: differently trained architectures, i.e., different versions can *
*    result in different predictions.                                      *
*                                                                          *
****************************************************************************
*                                                                          *
*    About the protein                                                     *
*    ~~~~~~~~~~~~~~~~~                                                     *
*                                                                          *
*    HEADER     /home/phd/tmp/t2_12969.seq                                 *
*    COMPND                                                                *
*    SOURCE                                                                *
*    AUTHOR                                                                *
*    SEQLENGTH   554                                                       *
*    NCHAIN        1 chain(s) in t2_12969 data set                         *
*    NALIGN       68                                                       *
*    (=number of aligned sequences in HSSP file)                           *
*                                                                          *
****************************************************************************
*                                                                          *
*    Abbreviations: PHDsec                                                 *
*    ~~~~~~~~~~~~~~~~~~~~~                                                 *
*                                                                          *
*    sequence:                                                             *
*       AA : amino acid sequence                                           *
*    secondary structure:                                                  *
*       HEL: H=helix, E=extended (sheet), blank=other (loop)               *
*       PHD: Profile network prediction HeiDelberg                         *
*       Rel: Reliability index of prediction (0-9)                         *
*    detail:                                                               *
*       prH: 'probability' for assigning helix                             *
*       prE: 'probability' for assigning strand                            *
*       prL: 'probability' for assigning loop                              *
*            note: the 'probabilities' are scaled to the interval 0-9,     *
*                  e.g., prH=5 means that the first output node is 0.5-0.6 *
*    subset:                                                               *
*       SUB: a subset of the prediction, for all residues with an expected *
*            average accuracy > 82% (tables in header)                     *
*            note: for this subset the following symbols are used:         *
*         L: is loop (for which above " " is used)                         *
*       ".": means that no prediction is made for this residue, as the     *
*            reliability is:  Rel < 5                                      *
*                                                                          *
*    Abbreviations: PHDacc                                                 *
*    ~~~~~~~~~~~~~~~~~~~~~                                                 *
*                                                                          *
*    solvent accessibility:                                                *
*       3st: relative solvent accessibility (acc) in 3 states:             *
*            b = 0-9%, i = 9-36%, e = 36-100%.                             *
*       PHD: Profile network prediction HeiDelberg                         *
*       Rel: Reliability index of prediction (0-9)                         *
*       P_3: predicted relative accessibility in 3 states                  *
*            note: for convenience a blank is used for intermediate (i).   *
*       10st:relative accessibility in 10 states:                          *
*            = n corresponds to a relative acc. of n*n %                   *
*    subset:                                                               *
*       SUB: a subset of the prediction, for all residues with an expected *
*            average correlation > 0.69 (tables in header)                 *
*            note: for this subset the following symbols are used:         *
*       "I": is intermediate (for which above " " is used)                 *
*       ".": means that no prediction is made for this residue, as the     *
*            reliability is: Rel < 4                                       *
*                                                                          *
*                                                                          *
*    Abbreviations: PHDhtm                                                 *
*    ~~~~~~~~~~~~~~~~~~~~~                                                 *
*                                                                          *
*    secondary structure:                                                  *
*       HL:  T=helical transmembrane region, blank=other (loop)            *
*       PHD: Profile network prediction HeiDelberg                         *
*       PHDF:filtered prediction, i.e., too long transmembrane segments    *
*            are split, too short ones are deleted                         *
*       Rel: Reliability index of prediction (0-9)                         *
*    detail:                                                               *
*       prH: 'probability' for assigning helical transmembrane region      *
*       prL: 'probability' for assigning loop                              *
*            note: the 'probabilities' are scaled to the interval 0-9,     *
*                  e.g., prH=5 means that the first output node is 0.5-0.6 *
*    subset:                                                               *
*       SUB: a subset of the prediction, for all residues with an expected *
*            average accuracy > 82% (tables in header)                     *
*            note: for this subset the following symbols are used:         *
*         L: is loop (for which above " " is used)                         *
*       ".": means that no prediction is made for this residue, as the     *
*            reliability is:  Rel < 5                                      *
*                                                                          *
****************************************************************************
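The single-digit encodings described in the abbreviations can be decoded as sketched below. The function names are hypothetical. Two points are assumptions rather than statements from the legend: the treatment of the boundary values (9% counted as intermediate, 36% as exposed) is read off the example rows in this output, and the interval for a scaled 'probability' digit d is taken to be d/10 to (d+1)/10, generalising the single example given for prH=5.

```python
def acc10_to_percent(n: int) -> int:
    """10-state accessibility value n (0-9) -> relative accessibility n*n in %."""
    if not 0 <= n <= 9:
        raise ValueError("10-state value must be in 0-9")
    return n * n


def acc3_state(percent: float) -> str:
    """Relative accessibility in % -> 3-state symbol b, i or e.

    Assumption: boundaries are handled as 9 -> i and 36 -> e,
    which is what the example rows suggest.
    """
    if percent < 9:
        return "b"
    if percent < 36:
        return "i"
    return "e"


def node_interval(digit: int) -> tuple[float, float]:
    """Scaled 'probability' digit -> assumed (lower, upper) output-node value."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be in 0-9")
    return (digit / 10, (digit + 1) / 10)


print(acc10_to_percent(7))              # 49
print(acc3_state(acc10_to_percent(3)))  # i
print(node_interval(5))                 # (0.5, 0.6)
```

In the prediction rows the intermediate state is printed as a blank, so a caller comparing against a P_3 row would map "i" to " " first.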
*                                                                          *
*    protein:       t2_1296        length      554                         *
*                                                                          *
                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ|
         PHD sec |           HHHHHHHEEEE     HHHHHHHHHHHHHHHHHHHHHHHHHHHH     |
         Rel sec |997555777634454432144445331213345779998765599999999999467964|
 detail: 
         prH sec |000001111136676655422211334545566878988776699999999998621011|
         prE sec |001221000000000113466521000112322110000000000000000000000002|
         prL sec |987666777763222221111257554331010000001112200000000001377976|
 subset: SUB sec |LLLLLLLLLL...H.........L........HHHHHHHHHHHHHHHHHHHHHH.LLLL.|
 
 ACCESSIBILITY 
 3st:    P_3 acc |eeeebeeeeeeeee beebbbb bbeebbbbbbbbbbbbbbbbbbbbbbbbbebeeebee|
 10st:   PHD acc |998707777776774066000050076000000000000000000000000060777076|
         Rel acc |464404533750550112451210041664958665336332425665691415343031|
 subset: SUB acc |eeee.ee..ee.ee....bb.....e.bbbbbbbbb..b...b.bbbbbb.b.b.e....|
                  ....,....7....,....8....,....9....,....10...,....11...,....12
         AA      |YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP|
         PHD sec |        HHHHHHHH     EEEEEHHHHHHHHHHHHHHHHHHHH HHHHHHHH     |
         Rel sec |137899731364457358886176433688577676333247665312467664013225|
 detail: 
         prH sec |000000135676678621101111236788777777665567777545677766321011|
         prE sec |431000000000000000000477653100001112333321000100000111333332|
         prL sec |468888754322321368887301000101211000000011112343321112345556|
 subset: SUB sec |..LLLLL...H..HH.LLLLL.EE...HHHHHHHHH.....HHHH....HHHH......L|
 
 ACCESSIBILITY 
 3st:    P_3 acc |bbeeeebebbeebbeebeeeeebbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb|
 10st:   PHD acc |006777060078008808977600000000000000000000000000000000000000|
         Rel acc |201234111135214615543164367967387899998977466213312778544230|
 subset: SUB acc |.....e.....e..ee.eee..bb.bbbbb.bbbbbbbbbbbbbb......bbbbbb...|
                  ....,....13...,....14...,....15...,....16...,....17...,....18
         AA      |DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI|
         PHD sec |         HHHHH  HHHHHHHHHHHH             EEE          HHHHHH|
         Rel sec |466345433131112236866544564325788778887415735876699997121368|
 detail: 
         prH sec |222332333454433367877666676531111100011000000010000001444578|
         prE sec |000000000011131000011222211211000010001347762001100000122210|
         prL sec |677566665434334532111001112246788888887642137877799997433110|
 subset: SUB sec |.LL..L...........HHHHH..HH...LLLLLLLLLL..EE.LLLLLLLLLL....HH|
 
 ACCESSIBILITY 
 3st:    P_3 acc |ebbbb bbbbbbbbbbebbbbbbbbbbbeebeeeeeebbbbbbbbbbbeeeeeebbbbbb|
 10st:   PHD acc |600005000000000060000000000078077779700000000000877977000000|
         Rel acc |126332400458562112446653213343036553400332613021345633141677|
 subset: SUB acc |..b...b..bbbbb....bbbbb.....e...eee.e.....b......eee...b.bbb|
                  ....,....19...,....20...,....21...,....22...,....23...,....24
         AA      |FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT|
         PHD sec |HHHHHHHHHHHHHHHHHHHHEE        E     EHHHHHHHHHHHHHHHHHHHHHHH|
         Rel sec |999973267997555553112211676652123331113689999999986579998878|
 detail: 
         prH sec |889875577887666665443343111121100123245779999999987788988888|
         prE sec |000000000000111113445532100013443222333110000000000000000000|
         prL sec |000013421001211111000124677764355553311000000000012210001111|
 subset: SUB sec |HHHHH..HHHHHHHHHH.......LLLLL..........HHHHHHHHHHHHHHHHHHHHH|
 
 ACCESSIBILITY 
 3st:    P_3 acc |bbbbbbbbbbbbbbbbbbbbbbebeeebbeb ebbbbbbbbbbbbbbbbbbbbbbbbbbb|
 10st:   PHD acc |000000000000000000000060677007056000000000000000000000000000|
         Rel acc |866051354554565277675315133023001202456675769577765625664695|
 subset: SUB acc |bbb.b..bbbbbbbb.bbbbb..b............bbbbbbbbbbbbbbbb.bbbbbbb|
                  ....,....25...,....26...,....27...,....28...,....29...,....30
         AA      |MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF|
         PHD sec |HHHEE              HHHHHHHHHHHH    EEEEEE   HHHHHHHHHH      |
         Rel sec |762112455551258799707999999998457536899856842332323331436522|
 detail: 
         prH sec |775332122211000100148888899998621221000000124555556554210012|
         prE sec |013432211124320000000000000000000026888871001133333333121233|
         prL sec |101234666664568889841000000001378741000027863311000012557644|
 subset: SUB sec |HH.....LLLL..LLLLLL.HHHHHHHHHH.LLL.EEEEEELL.............LL..|
 
 ACCESSIBILITY 
 3st:    P_3 acc |bbbbb ebbb bbe eeee bbbbbbbbbbbb bebbbbbbbbbbbbbbbbbe eeeebb|
 10st:   PHD acc |000005600050065999950000000000003060000000000000000074787600|
         Rel acc |855101110114320363211575038745211219697951231576187040366202|
 subset: SUB acc |bbb........b....e....bbb..bbbb.....bbbbbb....bbb.bb.e..ee...|
                  ....,....31...,....32...,....33...,....34...,....35...,....36
         AA      |GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW|
         PHD sec |EHHHHHHHHHHHHHHHHHHHHHHHHHEEE   EHHHHHHEEEEEEE    HHHHHHHHH |
         Rel sec |145799999999987786886644113313222244432166787379514432565521|
 detail: 
         prH sec |356799999999887787887755442110112465554311100000145555776654|
         prE sec |311000000000000001101122334543334322123477888610111233212110|
         prL sec |322100000000011101000111222245453211222210000379642111000134|
 subset: SUB sec |..HHHHHHHHHHHHHHHHHHHH..................EEEEE.LLL.....HHHH..|
 
 ACCESSIBILITY 
 3st:    P_3 acc |b bbbbbbbbbbbbbbbbbbbbbbbbbbbebebebbbbbbbbbbbbbebbebbbbbbbbe|
 10st:   PHD acc |030000000000000000000000000006070600000000000007006000000007|
         Rel acc |303577598668457576889446536232031140503748789513342676458213|
 subset: SUB acc |...bbbbbbbbbbbbbbbbbbbbbb.b.......b.b..bbbbbbb...b.bbbbbb...|
                  ....,....37...,....38...,....39...,....40...,....41...,....42
         AA      |GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF|
         PHD sec |   EEE  HHHHHHHHHHEE     EEEEE    EEE   EEEEEHHHHHHHHHHHHHHH|
         Rel sec |894222213589755663341422133652576266525347852312245788999999|
 detail: 
         prH sec |100000145688866775311233332111110000000011123545566788999998|
         prE sec |003555310000000013563112355764101477632367865233322110000000|
         prL sec |896333443210122111114654311123677522256621001111110000000000|
 subset: SUB sec |LL.......HHHHHHHH..........EE.LLL.EEE.L..EEE......HHHHHHHHHH|
 
 ACCESSIBILITY 
 3st:    P_3 acc |eeebebebebbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb|
 10st:   PHD acc |866070707000000000000000000000000000000000000000000000000000|
         Rel acc |301161303353642257342231655650103065311343999484588533788974|
 subset: SUB acc |....e.....b.bb..bb.b....bbbbb.....bb....b.bbbbbbbbbb..bbbbbb|
                  ....,....43...,....44...,....45...,....46...,....47...,....48
         AA      |AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW|
         PHD sec |H HHHEEEE      HHHHHHH EEEEEEE  EE HHHHH               HHHHH|
         Rel sec |412402452125332247643203578764132121113136756876346894212233|
 detail: 
         prH sec |645532112210013567766432110100000114445421122111221003455555|
         prE sec |001134564332332000000224678776435431110121100011111000000002|
         prL sec |343232223456554421133232100112463444443347767876567886543332|
 subset: SUB sec |.......E...L.....HH.....EEEEE............LLLLLLL..LLL.......|
 
 ACCESSIBILITY 
 3st:    P_3 acc |bbbb bbbebebeebeeebbebbbbbbbbbbbbbbbbbbbbbbebbb  beebbebbebb|
 10st:   PHD acc |000030006060662777007000000000000000000000070005407720700600|
         Rel acc |233004301201110454015308382635437254301645231101114410420101|
 subset: SUB acc |.....b.........eee..e..b.b.b.bb.b.bb...bbb........ee..e.....|
                  ....,....49...,....50...,....51...,....52...,....53...,....54
         AA      |NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET|
         PHD sec |HHHHHH HHHHHHHHHHHHHHHHHHHHH                EEE             |
         Rel sec |434433168999999999999999995237554556678897521653789995346787|
 detail: 
         prH sec |556555477899999999999999987531101111110101111000000002322101|
         prE sec |232210000000000000000000000001222211100000234773100000010001|
         prL sec |111123421000000000000000002357666666778887654225889996567787|
 subset: SUB sec |.......HHHHHHHHHHHHHHHHHHHH..LLL.LLLLLLLLLL..EE.LLLLLL..LLLL|
 
 ACCESSIBILITY 
 3st:    P_3 acc |bbbbbbbbbbbbbbbbbbbbbbbebbeeeeeeeeeeeeeeeeeebebbee eeee bbee|
 10st:   PHD acc |000000000000000000000007007777776777677877760600785677740077|
         Rel acc |016332443755466348746833003465521335027643412131260034401353|
 subset: SUB acc |..b...bb.bbbbbb.bbbbbb.....eeee....e..eee.e......e...ee...e.|
                  ....,....55...,....56...,....57...,....58...,....59...,....60
         AA      |LPKPEDWDRAQAHR|
         PHD sec |              |
         Rel sec |88754523212289|
 detail: 
         prH sec |01122233345410|
         prE sec |00000000000000|
         prL sec |88876655554589|
 subset: SUB sec |LLLL.L......LL|
 
 ACCESSIBILITY 
 3st:    P_3 acc |beeeeeeeeeeeee|
 10st:   PHD acc |07767777767799|
         Rel acc |03516535504549|
 subset: SUB acc |..e.ee.ee.eeee|
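
The SUB rows above follow from the prediction row and the reliability row: a residue keeps its predicted state if its reliability reaches the cut-off given in the abbreviations (Rel >= 5 for PHDsec, Rel >= 4 for PHDacc), and is shown as "." otherwise. A minimal sketch, with the hypothetical helper name `reliable_subset`:

```python
def reliable_subset(pred: str, rel: str, min_rel: int, blank: str = "L") -> str:
    """Build a SUB row from a prediction row and its reliability row.

    pred    : predicted states, one character per residue
    rel     : reliability indices, one digit (0-9) per residue
    min_rel : lowest reliability that is still reported
    blank   : symbol used where the prediction row has a blank
              ("L" for PHDsec, "I" for PHDacc)
    """
    if len(pred) != len(rel):
        raise ValueError("prediction and reliability rows differ in length")
    out = []
    for state, r in zip(pred, rel):
        if int(r) < min_rel:
            out.append(".")      # too unreliable: no prediction shown
        elif state == " ":
            out.append(blank)    # blank is spelled out in the subset
        else:
            out.append(state)
    return "".join(out)


# First 15 residues of the PHDsec rows above.
phd = "           HHHH"
rel = "997555777634454"
print(reliable_subset(phd, rel, min_rel=5))  # LLLLLLLLLL...H.
```

For a PHDacc row the call would use `min_rel=4` and `blank="I"`. Note that the PHDhtm subset rows in this output print H where the prediction row prints T, so that symbol would have to be translated separately.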
 

************************************************************
*    PHDhtm Helical transmembrane prediction
*           note: PHDacc and PHDsec are reliable for water-
*                 soluble globular proteins, only.  Thus, 
*                 please take the  predictions above with 
*                 particular caution wherever transmembrane
*                 helices are predicted by PHDhtm!
************************************************************
 
 PHDhtm
                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |MSAQISDSIEEKRGFFTRWFMSTNHKDIGVLYLFTAGLAGLISVTLTVYMRMELQHPGVQ|
         PHD htm |                               TTTTTTTTTTTTTTTTT            |
         Rel htm |999999999999999999999999998754204677888888877652146778999999|
 detail: 
         prH htm |000000000000000000000000000122357888999999988876421110000000|
         prL htm |999999999999999999999999999877642111000000011123578889999999|
 subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLL....HHHHHHHHHHHHHH...LLLLLLLLLL|
                  ....,....7....,....8....,....9....,....10...,....11...,....12
         AA      |YMCLEGMRLVADAAAECTPNAHLWNVVVTYHGILMMFFVVIPALFGGFGNYFMPLHIGAP|
         PHD htm |                         TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT   |
         Rel htm |999999999999999999999886314677888888888888887776532333331047|
 detail: 
         prH htm |000000000000000000000001357888999999999999998888766666665421|
         prL htm |999999999999999999999998642111000000000000001111233333334578|
 subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLL...HHHHHHHHHHHHHHHHHHHHHH..........L|
                  ....,....13...,....14...,....15...,....16...,....17...,....18
         AA      |DMAFPRLNNLSYWLYVCGVSLAIASLLSPGGSDQPGAGVGWVLYPPLSTTEAGYAMDLAI|
         PHD htm |           TTTTTTTTTTTTT                                   T|
         Rel htm |767667666513454345666641266788999999999999999999999998876411|
 detail: 
         prH htm |111111111246777677888875311100000000000000000000000000011245|
         prL htm |888888888753222322111124688899999999999999999999999999988754|
 subset: SUB htm |LLLLLLLLLL...H...HHHHH...LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL...|
                  ....,....19...,....20...,....21...,....22...,....23...,....24
         AA      |FAVHVSGATSILGAINIITTFLNMRAPGMTLFKVPLFAWAVFITAWMILLSLPVLAGGIT|
         PHD htm |TTTTTTTTTTTTTTTTTTTTT              TTTTTTTTTTTTTTTTTTTTTTTTT|
         Rel htm |456676777777777765431046777777777513567888888888888888887652|
 detail: 
         prH htm |778888888888888887765421111111111246788999999999999999998876|
         prL htm |221111111111111112234578888888888753211000000000000000001123|
 subset: SUB htm |.HHHHHHHHHHHHHHHHH.....LLLLLLLLLLL..HHHHHHHHHHHHHHHHHHHHHHH.|
                  ....,....25...,....26...,....27...,....28...,....29...,....30
         AA      |MLLMDRNFGTQFFDPAGGGDPVLYQHILWFFGHPEVYMLILPGFGIISHVISTFARKPIF|
         PHD htm |T                            TTTTTTTTTTTTTTTTTTTTTT         |
         Rel htm |013678999999999999999899986520110025677777777777542024677776|
 detail: 
         prH htm |543110000000000000000000001235555567888888888888776432111111|
         prL htm |456889999999999999999999998764444432111111111111223567888888|
 subset: SUB htm |...LLLLLLLLLLLLLLLLLLLLLLLLL.......HHHHHHHHHHHHHH.....LLLLLL|
                  ....,....31...,....32...,....33...,....34...,....35...,....36
         AA      |GYLPMVLAMAAIAFLGFIVWAHHMYTAGMSLTQQTYFQMATMTIAVPTGIKVFSWIATMW|
         PHD htm |   TTTTTTTTTTTTTTTTTTTTT             TTTTTTTTTTTTTTTTTTT    |
         Rel htm |531256788888888888776520123567888775202346788887776655410234|
 detail: 
         prH htm |234678899999999999888765432211000012356678899998888877754332|
         prL htm |765321100000000000111234567788999987643321100001111122245667|
 subset: SUB htm |L...HHHHHHHHHHHHHHHHHH.....LLLLLLLLL.....HHHHHHHHHHHHH......|
                  ....,....37...,....38...,....39...,....40...,....41...,....42
         AA      |GGSIEFKTPMLWALAFLFTVGGVTGVVIAQGSLDRVYHDTYYIVAHFHYVMSLGALFAIF|
         PHD htm |        TTTTTTTTTTTTTTTTTTTT             TTTTTTTTTTTTTTTTTTT|
         Rel htm |444455411456777788888877765301234542234302334456777777777788|
 detail: 
         prH htm |222222245778888899999988887644332223332346667778888888888899|
         prL htm |777777754221111100000011112355667776667653332221111111111100|
 subset: SUB htm |....LL....HHHHHHHHHHHHHHHHH......L............HHHHHHHHHHHHHH|
                  ....,....43...,....44...,....45...,....46...,....47...,....48
         AA      |AGTYYWIGKMSGRQYPEWAGQLHFWMMFIGSNLIFFPQHFLGRQGMPRRYIDYPVEFSYW|
         PHD htm |TTTTTTT                TTTTTTTTTTTTTT                       |
         Rel htm |777764203455677777655402567777777754102457899999999999999998|
 detail: 
         prH htm |888887643222111111122246788888888877543221000000000000000000|
         prL htm |111112356777888888877753211111111122456778999999999999999999|
 subset: SUB htm |HHHHH.....LLLLLLLLLLL...HHHHHHHHHHH.....LLLLLLLLLLLLLLLLLLLL|
                  ....,....49...,....50...,....51...,....52...,....53...,....54
         AA      |NNISSIGAYISFASFLFFIGIVFYTLFAGKPVNVPNYWNEHADTLEWTLPSPPPEHTFET|
         PHD htm |   TTTTTTTTTTTTTTTTTTTTTT                                   |
         Rel htm |762036778888888888888765304678999999999999999999999999999999|
 detail: 
         prH htm |113568889999999999999887642110000000000000000000000000000000|
         prL htm |886431110000000000000112357889999999999999999999999999999999|
 subset: SUB htm |LL...HHHHHHHHHHHHHHHHHHH...LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
                  ....,....55...,....56...,....57...,....58...,....59...,....60
         AA      |LPKPEDWDRAQAHR|
         PHD htm |              |
         Rel htm |99999999999999|
 detail: 
         prH htm |00000000000000|
         prL htm |99999999999999|
 subset: SUB htm |LLLLLLLLLLLLLL|
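
To list the predicted transmembrane helices as residue ranges, the PHD htm rows of all 60-residue blocks can be joined into one string and scanned for runs of T. The sketch below assumes the rows have already been extracted and concatenated; `tm_segments` is a made-up name, and the example string is invented, not taken from the output above.

```python
from itertools import groupby


def tm_segments(htm: str, symbol: str = "T") -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start, end) ranges of consecutive `symbol`."""
    segments = []
    pos = 1  # position of the first residue of the current run
    for state, run in groupby(htm):
        length = len(list(run))
        if state == symbol:
            segments.append((pos, pos + length - 1))
        pos += length
    return segments


print(tm_segments("   TTTT  TT "))  # [(4, 7), (10, 11)]
```

Joining the blocks before scanning matters because a helix can continue across a block boundary, as happens in the rows above between residues 180 and 181.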
________________________________________________________________________________

***                                                                          ***
********************************************************************************
***                                                                          ***
***   Prediction of transmembrane regions (PHDhtm)                           ***
***   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                           ***
***                                                                          ***
***                                                                          ***
***   Note: The accuracy of predicting helical trans-membrane regions is     ***
***   about 95%.  In a test on 69 proteins only one was not predicted to be  ***
***   a trans-membrane protein (2mlt).  PHDsec, which is meant for globular  ***
***   proteins, predicted this protein more accurately than PHDhtm, which is ***
***   meant for trans-membrane proteins.  Conversely, about 5% out of 300    ***
***   globular proteins were misclassified as trans-membrane molecules.      ***
***   These results have two practical consequences:                         ***
***   (i)   if you know that your sequence is partly in a membrane and       ***
***         PHDhtm does not predict a clear membrane region:                 ***
***         ->  try PHDsec; it may be more accurate, although in general     ***
***             it is not suited for membrane proteins.                      ***
***   (ii)  if you assume your sequence is not at all in a membrane and      ***
***         PHDhtm does predict a membrane segment:                          ***
***         ->  ignore the trans-membrane prediction.                        ***
***                                                                          ***
***   For residues predicted to be outside of the lipid bilayer (predicted   ***
***   as loop), PHDsec should give reasonably accurate results, provided the ***
***   regions sticking out of the membrane are long enough.                  ***
***                                                                          ***