Computational Molecular Biology, 2014, Vol. 4, No. 7 doi: 10.5376/cmb.2014.04.0007
Received: 05 Aug., 2014 Accepted: 21 Sep., 2014 Published: 22 Oct., 2014
Meinken et al., 2014, FunSecKB2: a fungal protein subcellular location knowledgebase, Computational Molecular Biology, Vol.4, No.6, 1-17 (doi: 10.5376/cmb.2014.04.0007)
FunSecKB2 is an improved and updated version of the fungal secretome and subcellular proteome (i.e., protein subcellular location) knowledgebase. The fungal protein sequence data were retrieved from UniProtKB and consist of nearly 2 million entries, with 167 species having a complete proteome. Protein subcellular locations were assigned based on curated information and on predictions from seven computational tools: SignalP, WoLF PSORT, Phobius, TargetP, TMHMM, FragAnchor, and PS-Scan. Secreted proteins, i.e., secretomes, along with 15 other subcellular proteomes were predicted. Users can search the database using several types of identifiers, gene names, or keyword(s). A subcellular proteome from a species can be searched or downloaded. BLAST searching against the whole fungal protein data set or the secretomes is available. Community annotation of subcellular locations based on experimental evidence is also supported. A primary analysis revealed that the secretome size of a fungal species is one of the determining factors of its lifestyle. Gene Ontology and protein domain analyses revealed that fungal secretomes contain a large number of hydrolases, peptidases, oxidoreductases, and lyases, which may have potential applications in the bio-processing of chemical wastes or in biofuel production. The database provides an important and rich resource for members of the fungal research community who are looking for protein subcellular location information or performing comparative subcellular proteome analysis.
Fungi play important roles in nature and in our daily life. In nature, fungal species serve as decomposers of biomass, which is critical for carbon and nutrient cycling. In our daily life, edible mushrooms are well-known examples of fungi. Saccharomyces cerevisiae, known as baker’s yeast, is widely used in winemaking, baking, and brewing. Some fungi are also known as producers of drugs, such as antibiotics. Fungal species are also important pathogens of insects, animals, humans, and plants.
Table 1 Evaluation of prediction accuracies of fungal protein subcellular locations
We also compared the accuracy of mitochondrial protein prediction by WoLF PSORT and TargetP. The Matthews correlation coefficient (MCC) values were 0.67 for WoLF PSORT and 0.56 for TargetP. Using both tools increased the specificity of mitochondrial protein prediction from 0.93 with WoLF PSORT alone to >0.98 when both were used. However, using both tools did not improve the MCC value because prediction sensitivity decreased. We therefore selected WoLF PSORT for assigning mitochondrial proteins. Users should nevertheless be aware that a prediction is more reliable when both WoLF PSORT and TargetP predict the protein to be mitochondrial than when only one of them does.
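The trade-off described above can be illustrated with a minimal sketch that computes sensitivity, specificity, and MCC from binary labels, and then repeats the calculation for a consensus call that requires two tools to agree. The labels below are invented toy values, not the evaluation data behind Table 1, and the names `tool_a`, `tool_b`, and `consensus` are hypothetical stand-ins for the two predictors and their agreement.

```python
import math

def count_outcomes(truth, predicted):
    """Count true/false positives and negatives for paired binary labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp, fp, tn, fn

def metrics(truth, predicted):
    """Return (sensitivity, specificity, MCC) for one set of predictions."""
    tp, fp, tn, fn = count_outcomes(truth, predicted)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Invented toy labels (1 = mitochondrial); NOT data from the paper.
truth  = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
tool_a = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tool_b = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

# Consensus: call a protein mitochondrial only if both tools agree.
consensus = [a and b for a, b in zip(tool_a, tool_b)]

sens_a, spec_a, mcc_a = metrics(truth, tool_a)
sens_c, spec_c, mcc_c = metrics(truth, consensus)
print(f"tool_a:    sens={sens_a:.2f} spec={spec_a:.2f} MCC={mcc_a:.2f}")
print(f"consensus: sens={sens_c:.2f} spec={spec_c:.2f} MCC={mcc_c:.2f}")
```

Because the consensus positives are a subset of each tool's positives, requiring agreement can only remove false positives (raising specificity) and true positives (lowering sensitivity). Whether MCC rises or falls depends on which effect dominates. In this toy example it falls.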
Table 2 Summary of some major subcellular locations of proteins in different fungal species. Data for other subcellular locations of fungal proteins are in Supplementary Table 1
Genome sizes, and thus proteome sizes, vary widely among fungal species. However, it should be noted that in the database, as shown in Table 2, the total number of proteins for a given species is not necessarily its proteome size, but rather the collection of all proteins available from that species. For example, the reference proteome of Saccharomyces cerevisiae as compiled by UniProtKB consists of only 6,621 proteins, whereas our database contains a total of 79,093 proteins under the name Saccharomyces cerevisiae, which evidently include proteins obtained from multiple strains. The subcellular distributions of fungal proteins were estimated from the pooled data for each of the phyla Ascomycota, Basidiomycota, and Microsporidia. Interestingly, we found that the nucleus represents the largest compartment for protein destination: 39.2% of proteins in Ascomycota, 39.2% in Basidiomycota, and 57.4% in Microsporidia were predicted to be located in the nucleus. Mitochondria represent another large compartment for protein targeting: 19.5% in Ascomycota, 21.1% in Basidiomycota, and 16.7% in Microsporidia were located in mitochondria. Approximately 18% to 21% of proteins are located in the cytosol or cytoplasm. The proportion of the secretome varies from 0.3% to 10.5% with an average of 4.6% in Ascomycota, from 1.9% to 7.4% with an average of 4.4% in Basidiomycota, and from 0.5% to 1.7% with an average of 0.9% in Microsporidia. However, the secretome here is limited to curated secreted proteins and highly likely secreted proteins, so these numbers represent a lower bound on a species' secretome. Including proteins predicted as likely secreted and weakly likely secreted would increase the secretome size substantially, but would also increase the number of false positives, i.e., non-secreted proteins in the set.
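As a rough illustration of how such per-phylum ranges and averages can be derived, the sketch below computes the secretome proportion for each species and summarizes it by phylum. The records are invented placeholders, not entries from Table 2, and the averaging rule (an unweighted mean of per-species percentages) is an assumption, since the text does not state how the averages were calculated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (phylum, species, total_proteins, secreted_proteins).
# Placeholder values for illustration only; NOT taken from the database.
records = [
    ("Ascomycota", "species_A", 10000, 500),
    ("Ascomycota", "species_B", 8000, 240),
    ("Basidiomycota", "species_C", 12000, 600),
    ("Microsporidia", "species_D", 2000, 20),
]

def secretome_summary(rows):
    """Return {phylum: (min %, max %, mean %)} of per-species secretome share."""
    by_phylum = defaultdict(list)
    for phylum, _species, total, secreted in rows:
        by_phylum[phylum].append(100.0 * secreted / total)
    return {p: (min(v), max(v), mean(v)) for p, v in by_phylum.items()}

summary = secretome_summary(records)
for phylum, (low, high, avg) in summary.items():
    print(f"{phylum}: {low:.1f}% to {high:.1f}%, average {avg:.1f}%")
```

A pooled estimate (total secreted proteins divided by total proteins per phylum) would weight large proteomes more heavily and can give a different average, so the choice of rule matters when comparing phyla.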
Figure 1 Relationship between proteome size and secretome size in fungal species having different lifestyles
2.4 Functional analysis of fungal secreted proteins
Table 3 Gene Ontology (GO) classification of fungal secreted proteins
We further categorized the functions of the predicted secreted fungal proteins using the rpsBLAST tool to search the Pfam database with a cutoff E-value of 1e-10. Among a total of 93,430 predicted secreted proteins, 43,953 protein sequences (about 47%) had a Pfam match, and a total of 880 protein families were detected. A summary of the Pfam analysis covering the 33 most highly encoded secreted protein families in fungi is shown in Table 4. A complete list can be downloaded (http://proteomics.ysu.edu/publicaiton/data/). The top 10 highly encoded secreted protein families in fungi were eukaryotic aspartyl protease, carboxylesterase family, FAD binding domain containing family, subtilase family, glycosyl hydrolase family 61, glycosyl hydrolases family 28, glycosyl hydrolases family 18, GMC oxidoreductase, serine carboxypeptidase, and glycosyl hydrolase family 3. The proteases identified here, such as aspartyl protease, subtilase, and other peptidase families, are likely to be required for synergistic degradation of the proteins present in the various growth media or substrate materials in the environment (Druzhinina et al. 2012; Girard et al. 2013). The GO analysis and the functional domain analysis are consistent in showing that these proteins are mainly involved in biodegrading complex biomolecules, including carbohydrates, proteins, lipids, and other molecules.
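The filtering and counting step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes the rpsBLAST results were saved in the BLAST+ tabular format with the default twelve columns (E-value in the eleventh), it keeps only the single best-scoring family per protein (an assumption, since the text does not say how multi-domain proteins were counted), and the query and family identifiers are invented placeholders.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the text

def best_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Keep, per query protein, the lowest-E-value hit at or below the cutoff."""
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

def family_counts(best):
    """Count how many proteins were assigned to each family."""
    return Counter(subject for subject, _evalue in best.values())

# Invented example rows (placeholder IDs), default tabular column order:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
sample = [
    "protA\tfamX\t45.0\t300\t160\t5\t1\t300\t1\t310\t1e-50\t180",
    "protA\tfamY\t30.0\t200\t140\t3\t1\t200\t1\t210\t1e-12\t70",
    "protB\tfamX\t40.0\t280\t165\t4\t1\t280\t1\t300\t2e-30\t120",
    "protC\tfamZ\t25.0\t150\t110\t2\t1\t150\t1\t160\t1e-5\t40",
]

best = best_hits(sample)
counts = family_counts(best)
print(f"{len(best)} of 3 proteins have a match; families: {dict(counts)}")
```

Sorting `counts.most_common()` would give a ranking of families by number of assigned proteins, which is the kind of summary a table of highly encoded families reports.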
Table 4 Highly encoded secreted protein families in fungi
3 Discussion
References
http://dx.doi.org/10.1016/j.jprot.2014.03.001
Bendtsen J.D., Jensen L.J., Blom N., et al. 2004a, Feature based prediction of non-classical and leaderless protein secretion, Protein Eng Des Sel, 17: 349-356
http://dx.doi.org/10.1093/protein/gzh037
Bendtsen J.D., Nielsen H., von Heijne G., et al. 2004b, Improved prediction of signal peptides: SignalP 3.0, J Mol Biol, 340: 783-795
http://dx.doi.org/10.1016/j.jmb.2004.05.028
Bouws H., Wattenberg A., and Zorn H., 2008, Fungal secretomes-nature's toolbox for white biotechnology, Appl. Microbiol. Biotechnol., 80: 381-388
http://dx.doi.org/10.1007/s00253-008-1572-5
http://dx.doi.org/10.1186/1471-2164-11-584
Brown N.A., Antoniw J., and Hammond-Kosack K.E., 2012, The predicted secretome of the plant pathogenic fungus Fusarium graminearum: a refined comparative analysis, PLoS One, 7: e33731
http://dx.doi.org/10.1371/journal.pone.0033731
http://dx.doi.org/10.1016/j.bbapap.2013.01.039
Choi J., Park J., Kim D., et al. 2010, Fungal secretome database: integrated platform for annotation of fungal secretomes, BMC Genomics, 11: 105
http://dx.doi.org/10.1186/1471-2164-11-105
http://dx.doi.org/10.1007/s00726-013-1649-z
De Castro E., Sigrist C.J., Gattiker A., et al. 2006, ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins, Nucleic Acids Res., 34: W362-365
Do Vale L.H., Gómez-Mendoza D.P., Kim M.S., et al. 2012, Secretome analysis of the fungus Trichoderma harzianum grown on cellulose, Proteomics, 12: 2716-2728
http://dx.doi.org/10.1002/pmic.201200063
http://dx.doi.org/10.1111/j.1574-6968.2012.02665.x
Emanuelsson O., Brunak S., von Heijne G., et al. 2007, Locating proteins in the cell using TargetP, SignalP and related tools, Nat. Protoc., 2: 953-971
http://dx.doi.org/10.1038/nprot.2007.131
http://dx.doi.org/10.1002/pmic.201200228
Girard V., Dieryckx C., Job C., et al. 2013, Secretomes: the fungal strike force, Proteomics, 13: 597-608
http://dx.doi.org/10.1002/pmic.201200282
Horton P., Park K.-J., Obayashi T., et al. 2007, WoLF PSORT: protein localization predictor, Nucleic Acids Res., 35: W585-587
http://dx.doi.org/10.1093/nar/gkm259
Jung Y.H., Jeong S.H., Kim S.H., et al. 2012, Secretome analysis of Magnaporthe oryzae using in vitro systems, Proteomics, 12: 878-900
http://dx.doi.org/10.1002/pmic.201100142
Käll L., Krogh A., and Sonnhammer E.L.L., 2007, Advantages of combined transmembrane topology and signal peptide prediction - the Phobius web server, Nucleic Acids Res., 35: W429-432
http://dx.doi.org/10.1093/nar/gkm256
Krogh A., Larsson B., von Heijne G., et al. 2001, Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes, J. Mol. Biol., 305: 567-580
http://dx.doi.org/10.1006/jmbi.2000.4315
http://dx.doi.org/10.5598/imafungus.2012.03.01.09
Lee S.A., Wormsley S., Kamoun S., et al. 2003, An analysis of the Candida albicans genome database for soluble secreted proteins using computer-based prediction algorithms, Yeast, 20: 595-610
http://dx.doi.org/10.1002/yea.988
Lowe R.G., and Howlett B.J., 2012, Indifferent, affectionate, or deceitful: lifestyles and secretomes of fungi, PLoS Pathogens, 8: e1002515
http://dx.doi.org/10.1371/journal.ppat.1002515
Lum G., and Min X.J., 2011, FunSecKB: the fungal secretome knowledgebase, Database (Oxford), 2011, doi: 10.1093/database/bar001
http://dx.doi.org/10.1093/database/bar001
Lum G., and Min X.J., 2013, Bioinformatic protocols and the knowledge-base for secretomes in fungi, In: Gupta V.K., Tuohy M.G., Ayyachamy M., Turner K.M. and O’Donovan A. (eds), Laboratory Protocols in Fungal Biology: Current Methods in Fungal Biology, Springer, pp 545-557
http://dx.doi.org/10.1007/978-1-4614-2356-0_54
Lum G., Meinken J., Orr J., et al. 2014, PlantSecKB: the plant secretome and subcellular proteome knowledgebase, Comput. Mole. Biol., 4: 1-17
Martinez D., Challacombe J., Morgenstern I., et al. 2009, Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion, Proc Natl Acad Sci U S A, 106: 1954-1959
http://dx.doi.org/10.1073/pnas.0809575106
http://dx.doi.org/10.1038/nbt967
http://dx.doi.org/10.1038/nbt0704-899b
http://dx.doi.org/10.1038/nbt0704-899a
McCarthy F.M., Wang N., Magee G.B., et al. 2006, AgBase: a functional genomics resource for agriculture, BMC Genomics, 7: 229
http://dx.doi.org/10.1186/1471-2164-7-229
Meinken J., and Min X.J., 2012, Computational prediction of protein subcellular locations in eukaryotes: an experience report, Comput. Mole. Biol., 2: 1-7
Melhem H., Min X.J., and Butler G., 2013, The impact of SignalP 4.0 on the prediction of secreted proteins, IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2013): The 10th annual IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, Singapore, pp.16-22 (doi: 10.1109/CIBCB.2013.6595383)
Min X.J., 2010, Evaluation of computational methods for secreted protein prediction in different eukaryotes, J. Proteomics Bioinform., 3: 143-147
http://dx.doi.org/10.1371/journal.pone.0049904
Mueller O., Kahmann R., Aguilar G., et al. 2008, The secretome of the maize pathogen Ustilago maydis, Fungal Genet. Biol., 1: S63-S70
http://dx.doi.org/10.1016/j.fgb.2008.03.012
Murphy C., Powlowski J., Wu M., et al. 2011, Curation of characterized glycoside hydrolases of fungal origin, Database (Oxford), 2011, doi: 10.1093/database/bar020
http://dx.doi.org/10.1093/database/bar020
Paper J.M., Scott-Craig J.S., Adhikari N.D., et al. 2007, Comparative proteomics of extracellular proteins in vitro and in planta from the pathogenic fungus Fusarium graminearum, Proteomics, 7: 3171-3183
http://dx.doi.org/10.1002/pmic.200700184
Petersen T.N., Brunak S., von Heijne G., et al. 2011, SignalP 4.0: discriminating signal peptides from transmembrane regions, Nature Methods, 8: 785-786
http://dx.doi.org/10.1038/nmeth.1701
http://dx.doi.org/10.1128/AEM.00252-11
Poisson G., Chauve C., Chen X., et al. 2007, FragAnchor: a large scale all Eukaryota predictor of Glycosylphosphatidylinositol-anchor in protein sequences by qualitative scoring, Genomics Proteomics Bioinform., 5: 121-130
http://dx.doi.org/10.1016/S1672-0229(07)60022-9
Powers-Fletcher M.V., Jambunathan K., Brewer J.L., et al. 2011, Impact of the lectin chaperone calnexin on the stress response, virulence and proteolytic secretome of the fungal pathogen Aspergillus fumigatus, PLoS One, 6: e28865
http://dx.doi.org/10.1371/journal.pone.0028865
Ribeiro D.A., Cota J., Alvarez T.M., et al. 2012, The Penicillium echinulatum secretome on sugar cane bagasse, PLoS One, 7: e50571
http://dx.doi.org/10.1371/journal.pone.0050571
http://dx.doi.org/10.1186/1754-6834-6-115
Sigrist C.J.A., Cerutti L., de Castro E., et al. 2010, PROSITE, a protein domain database for functional characterization and annotation, Nucleic Acids Res., 38: 161-166
http://dx.doi.org/10.1093/nar/gkp885
http://dx.doi.org/10.1093/nar/gkt1140
Tjalsma H., Bolhuis A., Jongbloed J.D., et al. 2000, Signal peptide-dependent protein transport in Bacillus subtilis: a genome-based survey of the secretome, Microbiol. Mol. Biol. Rev., 64: 515-547
http://dx.doi.org/10.1128/MMBR.64.3.515-547.2000
http://dx.doi.org/10.1016/j.fgb.2008.07.014
Weber S.S., Parente A.F.A., Borges C.L., et al. 2012, Analysis of the secretomes of Paracoccidioides mycelia and yeast cells, PLoS One, 7: e52470
http://dx.doi.org/10.1371/journal.pone.0052470
Wymelenberg A.V., Sabat G., Martinez D., et al. 2005, The Phanerochaete chrysosporium secretome: database predictions and initial mass spectrometry peptide identifications in cellulose-grown medium, J. Biotechnol., 118: 17-34
http://dx.doi.org/10.1016/j.jbiotec.2005.03.010
Yajima W., and Kav N.N., 2006, The proteome of the phytopathogenic fungus Sclerotinia sclerotiorum, Proteomics, 6: 5995-6007
http://dx.doi.org/10.1002/pmic.200600424
Other articles by authors
. John Meinken
. David K. Asch
. Kofi A. Neizer-Ashun
. Guang-Hwa Chang
. Chester R. Cooper Jr.
. Xiang Jia Min