An analysis of the Candida albicans genome database for soluble secreted proteins using computer-based prediction algorithms.

abstract

We sought to identify all genes in the Candida albicans genome database whose deduced proteins would likely be soluble secreted proteins (the secretome). While certain C. albicans secretory proteins have been studied in detail, more data on the entire secretome is needed. One approach to rapidly predict the functions of an entire proteome is to utilize genomic database information and prediction algorithms. Thus, we used a set of prediction algorithms to computationally define a potential C. albicans secretome. We first assembled a validation set of 47 C. albicans proteins that are known to be secreted and 47 that are known not to be secreted. The presence or absence of an N-terminal signal peptide was correctly predicted by SignalP version 2.0 in 47 of 47 known secreted proteins and in 47 of 47 known non-secreted proteins. When all 6165 C. albicans ORFs from CandidaDB were analysed with SignalP, 495 ORFs were predicted to encode proteins with N-terminal signal peptides. In the set of 495 deduced proteins with N-terminal signal peptides, 350 were predicted to have no transmembrane domains (or a single transmembrane domain at the extreme N-terminus) and 300 of these were predicted not to be GPI-anchored. TargetP was used to eliminate proteins with mitochondrial targeting signals, and the final computationally-predicted C. albicans secretome was estimated to consist of up to 283 ORFs. The C. albicans secretome database is available at http://info.med.yale.edu/intmed/infdis/candida/
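The sequential filtering described in the abstract can be sketched as a simple funnel over per-ORF prediction flags. This is only an illustrative sketch, not the authors' pipeline: it assumes the outputs of the prediction tools (signal peptide, transmembrane domains, GPI anchoring, mitochondrial targeting) have already been collected into records, and the `OrfPrediction` class, its field names, and the `secretome_funnel` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrfPrediction:
    """Hypothetical record of precomputed predictions for one ORF."""
    orf_id: str
    has_signal_peptide: bool   # assumed to come from a tool such as SignalP
    tm_domains: int            # number of predicted transmembrane domains
    tm_at_n_terminus: bool     # True if the only TM domain is at the extreme N-terminus
    gpi_anchored: bool         # predicted GPI anchor
    mitochondrial: bool        # assumed to come from a tool such as TargetP


def passes_tm_filter(p: OrfPrediction) -> bool:
    # Keep proteins with no TM domains, or a single TM domain at the
    # extreme N-terminus (which may simply be the signal peptide itself).
    return p.tm_domains == 0 or (p.tm_domains == 1 and p.tm_at_n_terminus)


def secretome_funnel(preds):
    """Apply the filters in the order given in the abstract.

    Returns the surviving records and the count remaining after each stage.
    """
    stages = [
        ("signal_peptide", lambda p: p.has_signal_peptide),
        ("no_internal_tm", passes_tm_filter),
        ("not_gpi", lambda p: not p.gpi_anchored),
        ("not_mito", lambda p: not p.mitochondrial),
    ]
    remaining = list(preds)
    counts = {"input": len(remaining)}
    for name, keep in stages:
        remaining = [p for p in remaining if keep(p)]
        counts[name] = len(remaining)
    return remaining, counts
```

Applied to the full ORF set, the per-stage counts would correspond to the funnel reported in the abstract (6165 ORFs, then 495, 350, 300, and up to 283), provided the input flags match the tools' actual predictions.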

authors

publication date

January 1, 2003

published in

Yeast (Chichester, England) Journal

keywords

Algorithms
Candida albicans
Databases, Genetic
Fungal Proteins
Genome, Fungal
Open Reading Frames
Protein Sorting Signals
Proteome
Software Design
Solubility

Digital Object Identifier (DOI)

https://doi.org/10.1002/yea.988

PubMed ID

12734798

start page

595

end page

610

volume

20

number

7
