CUBIC - TOC - Previous - Next - Bottom
Introductions, Search engines, Central sites - Literature, Pathways, Genomes - Databases - Services - Software
Contents: Services

  1. Services, Hotlists
  2. Services, general
  3. Services, alignments and database searches
    1.  Central sites
    2.  Fast db searches
    3.  Full dynamic programming
    4.  Profile-based analysis
    5.  Hidden Markov models
    6.  Analyse and display alignments
    7.  Find motifs
    8.  Composition bias
    9.  Other services
  4. Services, analysing nucleotide sequences
  5. Services, protein structure prediction
    1.  Collection of tools
    2.  Secondary structure
    3.  Sec Str from CD
    4.  Solvent accessibility
    5.  HTM + signal pep
    6.  Coiled-coils
    7.  O-glycosylation sites
    8.  Contact prediction
    9.  Homology modelling
    10.  Threading




Services

Services, Hotlists


Services, general
  • HyperCLDB:     A hypertext on cell culture availability, extracted from the Cell Line Data Base of the Interlab Project.
  • QUEST:     The QUEST Protein Database Center is a facility for the construction and analysis of Protein Databases. The data is generated by two-dimensional (2D) electrophoresis of proteins on polyacrylamide gels. We are located at the Cold Spring Harbor Laboratory (CSHL) on Long Island, New York, and we have a computer facility where gels are analyzed and 2D gel protein databases are built. Our goal is the construction of protein databases for scientific investigations.
  • Compute pI/Mw:     Compute pI/Mw is a tool that computes the theoretical pI (isoelectric point) and Mw (molecular weight) for a list of SWISS-PROT entries or for a user-entered sequence.
  • K2d server:     Estimation of the percentages of protein secondary structure from UV circular dichroism spectra using a neural network.
  • Biotech validation:     Biotech validation suite for protein structures (quality checks of protein structures). The server gives you a comprehensive check report of your protein.
  • Dali server:     The Dali server is a network service for comparing protein structures in 3D. You submit the coordinates of a query protein structure and Dali compares them against those in the Protein Data Bank. A multiple alignment of structural neighbours is mailed back to you. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences.
  • SCWRL:     Program for adding sidechains to a protein backbone based on a backbone-dependent rotamer library.
  • Structure2Function:     Structure determination of 'hypothetical' proteins from Haemophilus influenzae to gain an understanding of their function.
  • Rotamer library:     The protein sidechain webpage (Dunbrack).
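
The Mw half of the Compute pI/Mw entry above can be illustrated with a short sketch. It assumes the usual convention of summing average residue masses and adding one water molecule for the free termini; the mass table and the function name `molecular_weight` are illustrative and are not taken from the server itself.

```python
# Average residue masses in daltons (amino acid minus one water).
# Assumption: standard average-isotope values, not the server's own table.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight(sequence):
    """Return the theoretical average Mw (Da) of an unmodified protein chain."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError("unsupported residue codes: " + "".join(unknown))
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    print(round(molecular_weight("GG"), 2))
```

The sketch ignores post-translational modifications and ambiguous residue codes, which a production tool would need to handle.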


Services, alignments and database searches
Central sites

  • EBI (England):   Sequence similarity searches (FASTA, BLITZ, PROSITE, BLAST, MAXHOM-PredictProtein).
  • BCM (USA):   General protein sequence/pattern searches. Programs include fast methods (BLAST, FASTA, PROSITE) and full dynamic programming methods (FASTA, BLAST, BLITZ, MPSEARCH).
  • BioSCAN:   The BioSCAN Server allows searching, retrieving and comparing of protein and DNA sequences.
  • NCSA Biology Workbench:   The NCSA Biology Workbench provides a point and click interface for rapid access to biological databases and analysis tools.
  • BCM search launcher:   The BCM (Baylor College of Medicine) search launcher is an on-going project to organise molecular biology-related search and analysis services available on the WWW by providing a single point-of-entry for related searches.
Fast db searches

  • BLAST:   BLAST performs fast database searching combined with rigorous statistics for judging the significance of matches. Five BLAST programs (blastp, blastn, blastx, tblastn, tblastx) cover all combinations of protein and nucleotide query and database sequences.
Full dynamic programming

  • BIOACCELERATOR:   Sequence database searches using a fast parallel computer at the Weizmann Institute.
  • ClustalW:   ClustalW is a progressive (tree guided) multiple alignment program.
  • SSEARCH:   Sequence comparison using a full dynamic programming algorithm.
  • ToPLign:   ToPLign implements standard pairwise and multiple alignment methods with flexible parameter handling. The analysis of alignments is supported by offering different visualisations of alignments. Furthermore, the stability of the resulting alignments can be explored.
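
The "full dynamic programming" comparison mentioned for SSEARCH above can be sketched as a local alignment score computed over the complete score matrix. This is a minimal illustration only: the match, mismatch and linear gap values are assumptions chosen for readability, whereas real servers typically use substitution matrices and affine gap penalties.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman recurrence)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j]: best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            # The 0 lets an alignment restart anywhere, which makes it local.
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best


if __name__ == "__main__":
    print(local_alignment_score("XXACGTYY", "ZZACGTWW"))
```

Every cell of the matrix is filled, so the cost grows with the product of the two sequence lengths. That is why full dynamic programming searches are slower than heuristic methods such as BLAST.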
Profile-based analysis

  • ClustalW:   ClustalW is a progressive (tree guided) multiple alignment program.
  • MatchBox:   Alignment refinement program based on merging boxes of highly similar residues in all columns of a given multiple alignment.
Hidden Markov models

  • SAM:   The Sequence Alignment and Modeling system (SAM) is a collection of flexible software tools for creating, refining, and using linear hidden Markov models for biological sequence analysis.
Analyse and display alignments

  • ToPLign:   ToPLign implements standard pairwise and multiple alignment methods with flexible parameter handling. The analysis of alignments is supported by offering different visualisations of alignments. Furthermore, the stability of the resulting alignments can be explored.
  • BOX:   Pretty Printing and Shading of Multiple-Alignment files.
Find motifs

  • PFSCAN:   This server uses the PFSCAN program to search a single sequence against all profile entries in the current release of PROSITE.
  • PatScan:   The PatScan pattern matcher lets you search protein or DNA sequence archives for instances of a pattern. You provide the pattern and indicate which protein or DNA sequences you wish to scan.
  • PROSITE:   Dictionary of protein sites and patterns. PROSITE helps to determine the function of uncharacterized proteins translated from genomic or cDNA sequences. It consists of a database of biologically significant sites, patterns and profiles that help to reliably identify to which known family of proteins (if any) a new sequence belongs.
  • Pmotif:   Searches for a given protein motif in a chosen protein or nucleotide database.
  • REPRO:   A service for the recognition of protein sequence repeats.
  • FPAT:   Regular expression searches of sequence databases (Washington University).
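
Several entries above search sequences with patterns or regular expressions. As a rough sketch of how the two relate, a PROSITE-style pattern can be translated into a regular expression. The converter below covers only the basic syntax (`x`, `[..]`, `{..}`, repeat counts, and the `<` / `>` terminus anchors); the function name `prosite_to_regex` is hypothetical and the code is not taken from any of the listed servers.

```python
import re

# One pattern element: [ABC], {ABC}, a single residue or x, plus optional (n) or (n,m).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: " + element)
        core, low, high = m.groups()
        if core == "x":
            body = "."                       # any residue
        elif core.startswith("{"):
            body = "[^" + core[1:-1] + "]"   # any residue except those listed
        else:
            body = core                      # single residue or [allowed set]
        if low and high:
            body += "{%s,%s}" % (low, high)
        elif low:
            body += "{%s}" % low
        parts.append(prefix + body + suffix)
    return "".join(parts)


if __name__ == "__main__":
    regex = prosite_to_regex("N-{P}-[ST]-{P}.")
    print(regex, [m.start() for m in re.finditer(regex, "MKNASAW")])
```

The example pattern `N-{P}-[ST]-{P}` is the N-glycosylation site pattern; `re.finditer` reports only non-overlapping hits, so a scanner that needs overlapping matches would have to restart the search one position after each hit.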
Composition bias

  • SAPS:   SAPS - Statistical Analysis of Protein Sequences. Alignments and analyses of composition bias.
Other services

  • Proteome:   Comparative analysis of protein coding sequences of completed genomes.
  • InterPro:   Integrated resource of protein domains and functional sites.
  • Sequence Alerting System:   The sequence alerting system in its present form will search each day in several databases for news on (homologues of) "your" sequence and will inform you by email if it has detected a new relative.
  • PSORT:   Prediction of protein sorting signals and localisation sites in amino acid sequences.


Services, analysing nucleotide sequences
  • GRAIL:     GRAIL (Gene Recognition and Assembly Internet Link) is a DNA sequence analysis tool. The GenQuest sequence comparison server is designed for rapid and sensitive comparison of DNA and protein sequences to existing DNA and protein sequence databases. Full database entries of any sequence found in the course of a search are retrieved.
  • GenQuest:     Running BLAST, FASTA or a full dynamic programming alignment of nucleotide sequences against Nucleotide and protein databases.
  • Splice site predictions:     The Center for Biological Sequence Analysis (CBS, Copenhagen, Denmark) offers a service for predicting intron splice sites in human and Arabidopsis thaliana DNA.
  • Gene recognition:     The gene recognition algorithm PROCRUSTES (USC, USA) is based on spliced alignment: it explores all possible exon assemblies and finds the multi-exon structure that best fits a related protein.
  • Collection of prediction services from Weizmann:     E-mail server for analysis of uncharacterized sequences at the Weizmann Institute.


Services, protein structure prediction
Collection of tools

  • Documented collection of prediction services:   Overview and links to services for predicting secondary structure, solvent accessibility, homology modelling and threading (MRC, Cambridge, England).
  • List of prediction services:   A list of useful protein prediction servers (Univ. Stockholm, Sweden).
  • PredictProtein:   Multiple sequence alignment (MAXHOM); prediction of secondary structure (PHDsec), solvent accessibility (PHDacc), transmembrane helices (PHDhtm), transmembrane topology (PHDtopology); and threading (PHDthreader).
Secondary structure

  • PHDsec:   Multiple alignment-based neural network system.
    Accuracy: > 72% (+/-10%, one standard deviation), higher for more reliably predicted residues. Evaluated by cross-validation on 720 unique proteins; comparisons to other methods based on identical sets.
  • NSSP:   Multiple alignment-based nearest-neighbour method.
    Accuracy: > 71%. Evaluated on > 200 unique proteins.
  • SOPM:   Multiple alignment-based method combining various other prediction programs.
    Accuracy: > 70%. Evaluated on 100 unique proteins.
  • DSC:   Multiple alignment-based program using statistics.
    Accuracy: 70%. Evaluated on standard set of 126 unique proteins (comparisons to other methods based on identical sets).
  • SSPRED:   Multiple alignment-based program using statistics.
    Accuracy: > 70%. Evaluated on 70 unique proteins (no comparison based on identical sets to other methods).
  • MultiPredict:   Multiple alignment-based method using physicochemical information from a set of aligned sequences and statistical secondary structure decision constants.
    Accuracy: > 65%. Evaluated on 13 unique proteins.
  • PSA:   The PSA server analyzes amino acid sequences to predict secondary structures and folding classes.
  • NNPREDICT:   Single-sequence based neural network prediction.
    Accuracy: > 65%. Evaluated on pairwise similar proteins.
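
The accuracy figures quoted for the secondary structure servers above are not defined in this listing. Assuming they are per-residue three-state accuracies (helix, strand, other; often called Q3), the measure can be sketched as follows. The function name `q3_accuracy` and the H/E/L state alphabet are illustrative choices, not part of any listed server.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are equal-length strings over a three-state alphabet,
    here H (helix), E (strand) and L (other).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty input")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    print(q3_accuracy("HHHEELLL", "HHHEEELL"))
```

A single percentage hides differences between test sets, which is why the entries above also note how many proteins each method was evaluated on and whether comparisons used identical sets.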
Sec Str from CD

  • K2d:   Algorithm for the estimation of the percentages of protein secondary structure from UV circular dichroism spectra using a Kohonen neural network with a 2-dimensional output layer. You can either use k2d via a web server or get the program and run it on your machine.
Solvent accessibility

  • PHDacc:   Multiple alignment-based neural network system.
    Accuracy: > 75% (+/-10%, one standard deviation), higher for more reliably predicted residues. Evaluated by cross-validation on 720 unique proteins; comparisons to other methods based on identical sets.
HTM + signal pep

  • PHDhtm:   Multiple alignment-based neural network system predicting the locations of transmembrane helices.
    Accuracy: > 95% (+/-10%, one standard deviation), higher for more reliably predicted residues. Evaluated by cross-validation on 132 proteins; comparisons to other methods based on identical sets.
  • TMAP:   Single sequence-based statistical prediction of the locations of transmembrane helices.
    Accuracy: > 95%. Evaluated on 28 proteins WITHOUT cross-validation.
  • PHDtopology:   Refinement of PHDhtm by dynamic programming and prediction of topology (orientation of N-term with respect to membrane).
    Accuracy: for > 85% of all proteins all helices and topology are predicted correctly. Evaluated by cross-validation on 132 proteins; comparisons to other methods based on identical sets.
  • TMpred:   Single sequence-based prediction of location and topology for helical transmembrane proteins using statistics and similarity matrices.
  • DAS:   Single sequence-based prediction of location for helical transmembrane proteins.
  • TopPred2:   Single sequence-based prediction of topology for helical transmembrane proteins.
  • Signalp:   Neural network prediction of presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive and Gram-negative prokaryotes, and eukaryotes.
Coiled-coils

  • COILS:   Single sequence-based prediction for coiled-coil regions using statistical patterns of coiled-coil proteins in the database.
O-glycosylation sites

  • NetOglyc:   Neural network predictions of mucin type O-glycosylation sites in mammalian proteins.
Contact prediction

Homology modelling

  • SWISS-MODEL:   An automated knowledge-based protein modelling server with first-approach and optimise modes (Peitsch M.C. Protein Modelling by E-mail. Bio/Technology 13:658-660 (1995)).
Threading

  • TOPITS:   Prediction-based threading detecting the fold type and aligning a protein of unknown structure and a protein of known structure for low levels of sequence identity (< 25%).
    Accuracy: < 30%, i.e., less than 30% of the predicted first hits are true remote homologues. Evaluated by cross-validation on 89 unique protein structures.
  • T3P2:   Prediction-based threading detecting the fold type and aligning a protein of unknown structure and a protein of known structure for low levels of sequence identity (< 25%).
  • PSCANN:   Threading method combining sequence and structure profiles. Performance accuracy: more likely to recognise similar folds than simple sequence alignment.

