Research Article

ProtSecKB: The Protist Secretome and Subcellular Proteome Knowledgebase  

Brian Powell¹, Vamshi Amerishetty², John Meinken²,³, Geneva Knott⁴, Feng Yu¹, Chester Cooper²,⁴, Xiang Jia Min²,⁴
¹ Department of Computer Science & Information Systems, Youngstown State University, Youngstown, OH 44555, USA
² Center for Applied Chemical Biology, Youngstown State University, Youngstown, OH 44555, USA
³ Center for Health Informatics, University of Cincinnati, Cincinnati, OH 45267-0840, USA
⁴ Department of Biological Sciences, Youngstown State University, Youngstown, OH 44555, USA
Computational Molecular Biology, 2016, Vol. 6, No. 4   doi: 10.5376/cmb.2016.06.0004
Received: 19 Sep., 2016    Accepted: 01 Nov., 2016    Published: 14 Dec., 2016
© 2016 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:

Powell B., Amerishetty V., Meinken J., Knott G., Yu F., Cooper C., and Min X.J., 2016, ProtSecKB: the protist secretome and subcellular proteome knowledgebase, Computational Molecular Biology, 6(4): 1-12

Abstract

Kingdom Protista contains a large group of eukaryotic organisms with diverse lifestyles. We developed the Protist Secretome and Subcellular Proteome Knowledgebase (ProtSecKB) to host curated and predicted subcellular location information for all protist proteins. The protist protein sequences were retrieved from UniProtKB and consist of 1.97 million entries from 7,024 species, of which 101 species (127 organisms, counting multiple strains) have complete proteomes. Protein subcellular locations were assigned from curated information and from predictions made with a set of well-evaluated computational tools. The database can be searched using several different types of identifiers, gene names, or keyword(s). Secretomes and other subcellular proteomes can be searched or downloaded. BLAST searching against the complete set of protist proteins or secretomes is available. Protein family analysis of secretomes from representative protist species, including Dictyostelium discoideum, Phytophthora infestans, and Trypanosoma cruzi, showed that species with different lifestyles differ drastically in the protein families present in their secretomes, which may determine their lifestyles. The database provides an important resource for the protist and biomedical research community. The database is available at http://bioinformatics.ysu.edu/secretomes/protist/index.php.

Keywords
Computational Prediction; Protist; Protista; Secreted Protein; Secretome; Signal Peptide; Subcellular Location; Subcellular Proteome; Lifestyle

1 Introduction

Protists consist of a large number of diverse eukaryotic organisms that are not classified into the kingdoms of Fungi, Plantae, or Animalia (Foissner, 1999, 2006; Slapeta et al., 2005).  Some protists are parasites of animals and humans, such as Plasmodium falciparum causing malaria, and many others cause similar diseases in other vertebrates (D'Acremont et al., 2010).  The oomycete Phytophthora infestans causes late blight in potato and tomato plants (Nowicki et al., 2012). Understanding the metabolism of these protists and their roles in ecology may allow these diseases to be treated more efficiently.

 

In eukaryotes, proteins are synthesized within a cell and then transported to different subcellular locations, including the extracellular space or matrix, to perform their biological functions. Identifying and analyzing protein subcellular locations is an important part of annotating a eukaryotic proteome. The term secretome is often used to describe the set of proteins secreted outside of a cell (Lum and Min, 2011). The parasite P. falciparum causes malaria by replicating inside the red blood cells of infected individuals. Secreted proteins of P. falciparum have been identified and experimentally examined (Przyborski and Lanzer, 2004; Hiller et al., 2004; Van Ooij et al., 2008). These secreted proteins are potential drug targets for the treatment of malaria (Bhatt, 2012).

 

Classical eukaryotic secreted proteins contain a secretory signal peptide at the N-terminus (von Heijne, 1990). Classical secreted proteins of eukaryotes can be predicted accurately with the computational protocols we developed, which combine multiple prediction tools (Min, 2010). We have therefore constructed secretome databases for fungi, plants, and animals (Lum and Min, 2011; Lum et al., 2014; Meinken et al., 2014; Meinken et al., 2015). In this work, we describe the Protist Secretome and Subcellular Proteome Knowledgebase (ProtSecKB, http://bioinformatics.ysu.edu/secretomes/protist/index.php). The database will serve as a useful resource for the community working with protist organisms in biomedical research.

 

2 Methods of Database Construction

2.1 Data collection

The protist protein sequences were retrieved from the UniProtKB/Swiss-Prot dataset and the UniProtKB/TrEMBL dataset (release 2016-02) (http://www.uniprot.org/downloads) using our in-house script. As proteins in the Kingdom Protista are actually not labeled as “Protist” or “Protista”, we retrieved all entries belonging to “Eukaryota” but not further classified as “Fungi”, “Metazoa”, or “Viridiplantae”. The UniProtKB/Swiss-Prot dataset contains manually annotated and reviewed protein sequences. The UniProtKB/TrEMBL dataset contains computationally analyzed protein sequences. The combined protist dataset consisted of a total of 1,970,022 protein entries with 8,661 and 1,961,361 entries retrieved from the Swiss-Prot dataset and the TrEMBL dataset, respectively. The identifier mapping data including UniProt accession number (AC), UniProt ID, RefSeq accession number, and gi number were retrieved from the UniProt ID mapping data file. All data used in the database construction and analysis can be downloaded from the website at http://proteomics.ysu.edu/publication/data/ProtSecKB/.
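The taxonomic filter described above can be sketched as follows. This is a minimal illustration, not the in-house script itself: it assumes the lineage is available as a semicolon-separated string of the kind found in UniProtKB flat-file OC lines, and the function name `is_protist_lineage` is hypothetical.

```python
# Kingdom-level labels that exclude an entry from the protist set.
EXCLUDED_KINGDOMS = {"Fungi", "Metazoa", "Viridiplantae"}


def is_protist_lineage(lineage):
    """Return True for a Eukaryota lineage outside Fungi, Metazoa, and Viridiplantae.

    `lineage` is assumed to be a semicolon-separated taxonomy string,
    optionally ending with a period.
    """
    taxa = {taxon.strip().rstrip(".") for taxon in lineage.split(";")}
    return "Eukaryota" in taxa and not (taxa & EXCLUDED_KINGDOMS)


# Illustrative lineages (abbreviated).
examples = [
    "Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Haemosporida.",
    "Eukaryota; Fungi; Dikarya; Ascomycota.",
    "Eukaryota; Metazoa; Chordata.",
    "Bacteria; Proteobacteria.",
]
for lineage in examples:
    print(is_protist_lineage(lineage), lineage)
```

Only the first lineage passes the filter, since it belongs to Eukaryota and carries none of the three excluded labels.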

 

2.2 Prediction of protein subcellular locations

Similar approaches using the same set of prediction tools were employed by our group in the construction of FunSecKB (Lum and Min, 2011), FunSecKB2 (Meinken et al., 2014), PlantSecKB (Lum et al., 2014), and MetazSecKB (Meinken et al., 2015), so we only briefly introduce the tools here. For detailed information, the relevant reference for each tool or the introduction by Lum and Min (2013) can be consulted. The software tools used in this work include SignalP (version 4.0), TargetP, Phobius, WoLF PSORT, TMHMM, and PS-Scan. In brief, SignalP 4.0 was used for secretory signal peptide prediction (Petersen et al., 2011). We also included prediction information from SignalP 3.0 (Bendtsen et al., 2004), as it provides more accurate cleavage site prediction than SignalP 4.0 (Petersen et al., 2011). Phobius is a combined signal peptide and transmembrane topology predictor (Käll et al., 2007). TargetP predicts the presence of N-terminal targeting sequences: a signal peptide (SP), chloroplast transit peptide (cTP), or mitochondrial targeting peptide (mTP) (Emanuelsson et al., 2007). TMHMM predicts the presence of transmembrane helices and their orientation relative to the membrane (in/out) (Krogh et al., 2001). PS-Scan was used to scan the PROSITE database (http://www.expasy.org/tools/scanprosite/) to identify ER-targeted proteins (Prosite: PS00014) (Sigrist et al., 2010). WoLF PSORT predicts multiple subcellular locations including cytosol, cytoskeleton, ER, extracellular (secreted), Golgi apparatus, lysosome, mitochondria, nuclear, peroxisome, plasma membrane, and vacuolar membrane (Horton et al., 2007). Because none of these programs yet offers protist-specific parameters, the default parameters for eukaryotes or fungi, where available, were used, based on our previous evaluation (Min, 2010). We used the following procedure to assign a protein subcellular location.
The annotated subcellular location in UniProtKB and our manual curation take precedence over computational prediction. Thus, only proteins without an annotated subcellular location are subjected to computational assignment. However, the prediction information generated by all the tools is available for all proteins. It should be noted that some proteins may have more than one subcellular location.

 

Membrane proteins: A membrane protein is a protein with one or more transmembrane domains predicted by TMHMM. However, if only one transmembrane domain is predicted, it is located within the N-terminal 70 amino acids, and a signal peptide is also predicted by SignalP 4.0, then the protein is not counted as a membrane protein.
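The membrane-protein rule can be expressed as a small function. This is a sketch under stated assumptions: the TMHMM output is assumed to have been parsed into a list of (start, end) residue positions, "located within the N-terminus 70 amino acids" is interpreted as the helix ending at or before residue 70, and the function name is hypothetical.

```python
def is_membrane_protein(tm_helices, has_signal_peptide, n_term_window=70):
    """Apply the membrane-protein rule to one protein.

    tm_helices: list of (start, end) positions of predicted transmembrane
        helices (1-based, assumed parsed from TMHMM output).
    has_signal_peptide: True if SignalP 4.0 predicts a signal peptide.
    n_term_window: length of the N-terminal region (70 residues in the text).
    """
    if not tm_helices:
        return False
    # A single helix inside the N-terminal window, combined with a predicted
    # signal peptide, is treated as the signal peptide itself rather than a
    # genuine transmembrane domain.
    if len(tm_helices) == 1 and has_signal_peptide:
        _start, end = tm_helices[0]
        if end <= n_term_window:
            return False
    return True


print(is_membrane_protein([(5, 27)], has_signal_peptide=True))    # False
print(is_membrane_protein([(5, 27)], has_signal_peptide=False))   # True
print(is_membrane_protein([(5, 27), (120, 142)], True))           # True
```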

 

Mitochondrial proteins: Assignment of mitochondrial proteins was based on WoLF PSORT prediction. If a predicted mitochondrial protein is also classified as a membrane protein, it is further classified as a mitochondrial membrane protein.

 

ER proteins: Proteins predicted to contain a signal peptide by SignalP 4.0 and an ER target signal (Prosite: PS00014) by PS-Scan were treated as luminal ER proteins.

 

Secretomes: A secretome is the set of all proteins secreted by a species. There were four subcategories of secreted proteins. Curated secreted proteins are “reviewed” entries from the UniProtKB/Swiss-Prot dataset whose subcellular location is annotated as “secreted”, “extracellular”, or “cell wall”, together with secreted proteins collected manually from recent literature by our curators. “Highly likely secreted” proteins are predicted to have a secretory signal peptide by at least three of the four predictors (SignalP 4.0, Phobius, TargetP, and WoLF PSORT) and are not classified in any of the above categories. “Likely secreted” proteins are predicted to have a secretory signal peptide by two of the four predictors, and “Weakly likely secreted” proteins are predicted to have a secretory signal peptide by one of the four predictors. We recommend combining curated and highly likely secreted proteins as the secretome of a species (see the accuracy evaluation section).
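The four secreted-protein subcategories amount to a vote count over the four predictors. The sketch below illustrates that logic; the function name, argument names, and dictionary layout are hypothetical, and it assumes that proteins already assigned to the membrane or ER categories are excluded from the predicted classes for all three vote levels.

```python
def secretion_category(curated_secreted, predictor_calls,
                       is_membrane=False, is_er=False):
    """Return the secreted-protein subcategory for one protein, or None.

    curated_secreted: True if annotation or manual curation marks the
        protein as secreted.
    predictor_calls: dict mapping predictor name to True/False for a
        predicted secretory signal peptide (or secreted location).
    """
    if curated_secreted:
        return "curated secreted"
    if is_membrane or is_er:
        return None  # already assigned to another category
    votes = sum(bool(call) for call in predictor_calls.values())
    if votes >= 3:
        return "highly likely secreted"
    if votes == 2:
        return "likely secreted"
    if votes == 1:
        return "weakly likely secreted"
    return None


calls = {"SignalP 4.0": True, "Phobius": True, "TargetP": True, "WoLF PSORT": False}
print(secretion_category(False, calls))  # highly likely secreted
```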

 

Proteins in other subcellular locations: Other subcellular locations - including cytosol (cytoplasm), cytoskeleton, Golgi apparatus, lysosome, nucleus, peroxisome, plasma membrane and vacuole - were predicted by WoLF PSORT. It should be noted that we did not predict the category of plastid proteins and all entries in this category were from UniProtKB curation.

 

2.3 Prediction accuracy evaluation of protein subcellular locations

The prediction tools we chose above were based on our previous evaluation (Min, 2010). To further evaluate the prediction accuracy of each subcellular location in this dataset, we retrieved protein entries having an annotated, unique subcellular location from UniProtKB/Swiss-Prot dataset. Proteins having multiple subcellular locations, labeled as “fragment”, not starting with “M”, or having a length < 70 amino acids were excluded.  Proteins with a subcellular location having a term including “By similarity”, “Probable”, or “Potential” were excluded. The prediction accuracy for each subcellular location was evaluated using prediction sensitivity (Equation 1), specificity (Equation 2) and Matthews Correlation Coefficient (MCC) (Equation 3).

 

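$$\text{Sensitivity (\%)} = \frac{TP}{TP + FN} \times 100 \qquad (1)$$

$$\text{Specificity (\%)} = \frac{TN}{TN + FP} \times 100 \qquad (2)$$

$$\text{MCC (\%)} = \frac{(TP \times TN - FP \times FN) \times 100}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (3)$$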

 

TP is the number of true positives, FN is the number of false negatives, FP is the number of false positives, and TN is the number of true negatives. The MCC takes into account true and false positives and negatives and is generally regarded as a balanced measure, with +1 representing a perfect prediction and 0 meaning no better than random chance (Matthews, 1975). The dataset contains a total of 2,407 proteins. For each category, the number of actual positives equals TP plus FN and the number of actual negatives equals FP plus TN (Table 1).
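The three measures can be computed directly from a confusion matrix, as in the sketch below. The counts in the usage example are hypothetical and are not taken from Table 1. Note also that the MCC is returned here on the -1 to +1 scale used when values are quoted in the text, rather than multiplied by 100 as in Equation 3.

```python
import math


def evaluate(tp, fp, tn, fn):
    """Return (sensitivity %, specificity %, MCC) for one subcellular location."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts, for illustration only.
sn, sp, mcc = evaluate(tp=80, fp=10, tn=90, fn=20)
print(f"Sn = {sn:.1f}%, Sp = {sp:.1f}%, MCC = {mcc:.2f}")
```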

 

3 Results

3.1 Prediction accuracy

3.1.1 Mitochondrial proteins

The prediction accuracy results for each subcellular location are shown in Table 1. As both TargetP and WoLF PSORT can predict mitochondrial proteins, we evaluated the prediction accuracy of these two tools both individually and combined (Table 1a). When an individual tool was used, WoLF PSORT prediction showed a much higher sensitivity but a slightly lower specificity than TargetP prediction. Thus, the MCC value was higher using WoLF PSORT (0.53) than using TargetP (0.32). If only positives predicted by both tools were used, the specificity was slightly increased but the sensitivity decreased. In contrast, including positives predicted by either tool increased the sensitivity but decreased the specificity resulting in a lower MCC value (0.50) than using WoLF PSORT alone. Thus, we based our predictions for mitochondrial proteins on WoLF PSORT alone.

 

 

Table 1 Prediction accuracy evaluation of protist protein subcellular locations

Note: TP: true positives; FP: false positives; TN: true negatives; FN: false negatives. Sn: sensitivity; Sp: specificity; MCC: Matthews correlation coefficient. Secreted: predicted by 4 predictors; HLS: highly likely secreted, predicted by 3 out of 4 predictors; LS: likely secreted, predicted by 2 out of 4 predictors; WLS: weakly likely secreted, predicted by 1 out of 4 predictors 

 

3.1.2 Secreted proteins

Our previous evaluation showed that secreted protein prediction accuracy can be improved by removing transmembrane proteins and ER resident proteins (Min, 2010). As we employed four tools (SignalP, TargetP, WoLF PSORT, and Phobius) for predicting secreted proteins or secretory signal peptides, we had to determine which predictions should be included in the secretome set. After removing transmembrane proteins and ER proteins, the proteins predicted to be secreted were divided into four categories: (1) Secreted: predicted by 4 predictors; (2) Highly likely secreted (HLS): predicted by 3 out of 4 predictors; (3) Likely secreted (LS): predicted by 2 out of 4 predictors; and (4) Weakly likely secreted (WLS): predicted by 1 out of 4 predictors. The dataset consisted of 146 curated secreted proteins as positives and 2,261 proteins located in other subcellular locations as negatives. The accuracy results are shown in Table 1b.

 

As expected, when only entries predicted by all four tools were counted as positives, the prediction specificity was highest but the sensitivity was lowest. Conversely, when all entries predicted by any of the four tools were counted as positives, the specificity decreased while the sensitivity increased. Based on the MCC values, the most accurate secretome prediction (MCC 0.71) includes entries predicted to be secreted by at least three of the four predictors, with a specificity of 96.2% and a sensitivity of 89.0% (Table 1b). Thus, we recommend including only curated secreted proteins and highly likely secreted proteins when estimating the secretome size of a species. It should be noted that entries predicted by 4 of 4 tools and by 3 of 4 tools were both assigned to the highly likely secreted category in the database.
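As a consistency check, the reported MCC can be approximately recovered from the reported sensitivity and specificity together with the class sizes (146 positives and 2,261 negatives). The confusion-matrix counts below are not taken from Table 1; they are back-calculated by rounding and should be read as an assumption.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient on the -1 to +1 scale."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


positives, negatives = 146, 2261          # class sizes stated in the text
sensitivity, specificity = 0.890, 0.962   # reported for the >= 3 of 4 rule

# Back-calculated counts (assumption: rounded to the nearest integer).
tp = round(sensitivity * positives)
fn = positives - tp
tn = round(specificity * negatives)
fp = negatives - tn

print(tp, fn, tn, fp)                  # 130 16 2175 86
print(round(mcc(tp, fp, tn, fn), 2))   # 0.71
```

Under these rounded counts the computed value agrees with the MCC of 0.71 quoted for the three-of-four rule.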

 

3.1.3 Proteins in other subcellular locations

The prediction accuracy results for proteins located in cytoplasm, cytoskeleton, ER, Golgi apparatus, lysosome, nucleus, peroxisome, plasma membrane, and vacuole are shown in Table 1c. The cytoplasm subset also includes cytosol proteins, as these two terms are used interchangeably in the UniProtKB annotation. The annotated cytoskeleton entries are also annotated as cytoplasm in UniProtKB; however, in our evaluation cytoskeleton proteins were not counted in the cytoplasm subset. Plasma membrane proteins are annotated as “cell membrane” in UniProtKB, so cell membrane proteins were retrieved for evaluating the plasma membrane category. The prediction accuracies for these subcellular locations varied considerably. Predictions of proteins located in cytoplasm, cytoskeleton, and nucleus were relatively accurate, with MCC values of 0.54, 0.46, and 0.49, respectively. The specificities for cytoskeleton, ER, and peroxisome predictions were high (> 98%), but the sensitivities were low (< 50%). No positives were predicted for proteins localized in Golgi, lysosome, or vacuoles. These results show that the predictors need to be trained with protist-specific proteins for protist protein subcellular location prediction.

 

3.2 Overview of subcellular proteome distribution in different species

ProtSecKB contains a total of 1.97 million protein sequences from 7,024 protist species. Of these, 101 unique species have complete proteomes; because some of them have multiple strains, this amounts to 127 organisms with complete proteomes. The main categories of subcellular proteomes (highly likely secreted and likely secreted, cytoplasm, plasma membrane, mitochondrial, and nuclear proteins) for species having complete proteomes are summarized in Table 2. Curated secreted proteins, ER proteins, and other categories are not included but can be obtained from the website mentioned above in the Data collection section (Table 1). Few proteins in protist species have curated subcellular locations. The curated secreted proteins were mainly from D. discoideum, with 113 proteins. D. discoideum is a soil-living amoeba belonging to the phylum Amoebozoa and is commonly referred to as slime mold (Bakthavatsalam and Gomer, 2010). We also curated 29 secreted proteins in P. falciparum, a protozoan parasite causing malaria in humans (Singh et al., 2009; Soni et al., 2016).

 

The species in kingdom Protista have quite variable proteome sizes, from about 5,000 proteins in P. falciparum to over 50,000 in Trypanosoma cruzi, a parasitic euglenoid protozoan causing Chagas' disease in humans (Bern et al., 2011) (Table 2). The distribution of subcellular proteomes varied tremendously among species, with the nucleus, cytoplasm, and mitochondria representing the larger subcellular compartments. Across species, 14.4% to 77.0% of proteins were located in the nucleus, 7.4% to 40.3% in mitochondria, 4.6% to 35.4% in cytoplasm, and 0.8% to 15.2% were secreted. On average for all protist species with complete proteomes, approximately 44% of proteins were located in the nuclear compartment, 22% in mitochondria, 17% in cytoplasm, and 6% secreted outside the plasma membrane of the cell (Table 2).

 

 

Table 2 Summary of the protein distribution in some major subcellular locations in different protist species

Abbreviation: HLS: highly likely secreted; LS: likely secreted; Cyt: cytoplasm (or cytosol); Plasm: plasma membrane; Mt mem: mitochondrial membrane; Mt non-m: mitochondrial non-membrane; Nuc mem: nuclear membrane; Nuc non-m: nuclear non-membrane; Sec: secretome

 

3.3 Comparative protein family analysis of protist secretomes

Complete comparative evolutionary analyses of protist secretomes or other sub-proteomes were beyond the scope of this study. As complete secretome or other sub-proteome sequences can be downloaded directly from our database, researchers can carry out further detailed comparative studies of these sub-proteomes in the species of interest to them. We did, however, perform an rpsBLAST search against the Pfam database for all curated secreted, highly likely secreted, and likely secreted proteins (Table 2). Here we include only the Pfam families found in the highly likely secreted and curated secreted protein sets of three species to demonstrate the functional diversity of secreted proteins in protists (Table 3).

 

 

Table 3 Comparison of protein families in secretomes of three protist species having different lifestyles

The table contains only protein families having 6 or more members in a species. A complete list can be found in Supplementary Table 3.

  

The three species were D. discoideum, a soil-living amoeba; P. infestans, a plant pathogen; and T. cruzi, a human parasite. D. discoideum had 832 secreted proteins, 388 of them with a Pfam assignment; P. infestans had 1,748 secreted proteins, 583 of them with a Pfam assignment; and T. cruzi had 4,122 secreted proteins, 1,599 of them with a Pfam assignment (Table 3). Protein families having at least 6 members are listed in Table 3, and the complete list can be downloaded (Supplementary Table 3). Among the protist species, not only the total numbers of secreted proteins but also the protein families present and the number of members in each family were vastly different (Table 3). For example, D. discoideum had 52 secreted proteins with the DUF3430 domain (unknown function) and 44 secreted proteins with the carbohydrate-binding domain CBM49, while the other two species had no members of these families at all. As expected, P. infestans had a large number of secreted proteins in families such as Elicitin, RXLR phytopathogen effector protein, necrosis inducing protein (NPP1), phytotoxin PcF protein, and trypsin, which may be related to its lifestyle as a plant pathogen (Meijer et al., 2014). T. cruzi, not surprisingly for a human parasite, had 345 Mucin-like glycoprotein, 198 BNR repeat-like domain, and 102 Peptidase_M8 (Leishmanolysin) proteins, among others, in its secretome, while the other two species had none in those categories. These secreted proteins may play an important role in the invasion and infection of humans by T. cruzi and in causing Chagas' disease (Costa et al., 2016).
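The per-species family tabulation described above (families with at least 6 secreted members) can be sketched with a simple counter. The input layout, function name, and protein identifiers below are hypothetical; in practice the mapping would come from parsed rpsBLAST hits against Pfam.

```python
from collections import Counter


def family_counts(protein_to_families, min_members=6):
    """Count secreted proteins per Pfam family and keep families with
    at least `min_members` members.

    protein_to_families: dict mapping a protein ID to the Pfam families
        assigned to it (assumed parsed from rpsBLAST output).
    """
    counts = Counter()
    for families in protein_to_families.values():
        counts.update(set(families))  # count each protein once per family
    return {family: n for family, n in counts.items() if n >= min_members}


# Toy input with made-up protein IDs, for illustration only.
toy = {f"prot{i}": ["Elicitin"] for i in range(7)}
toy["prot7"] = ["NPP1", "NPP1"]
print(family_counts(toy))  # {'Elicitin': 7}
```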

 

4 Discussion

We constructed ProtSecKB to provide a resource of curated and predicted subcellular locations of protist proteins. As none of the tools we selected was specifically trained for protists, the prediction accuracies were lower than those obtained in other eukaryotes, including fungi, plants, and animals (Lum and Min, 2011; Lum et al., 2014; Meinken et al., 2014; Meinken et al., 2015). However, our evaluation using curated protein subcellular locations showed that the prediction specificities for nearly all subcellular locations except the nucleus were > 90%, and in particular, prediction of secreted proteins had an MCC value of 0.71 with 89.0% sensitivity and 96.2% specificity (Table 1). We therefore conclude that the prediction of secreted proteins is relatively reliable. Other tools are also available as web servers, including the Cell-PLoc servers (Chou and Shen, 2008) and some others (Meinken and Min, 2012). These tools and their related publications can be found at our website (http://bioinformatics.ysu.edu/tools/subcell.html) (Meinken and Min, 2012). Because standalone versions are not available for some tools, such as Cell-PLoc, and others are too slow to process large datasets, we were not able to use them for our data processing. However, we suggest that users apply these tools to obtain a second prediction for proteins of interest, as our experience shows that using multiple tools improves prediction specificity.

 

Recently, our research group has made efforts to improve the prediction accuracy of subcellular locations for plant proteins (Neizer-Ashun et al., 2015), fungal proteins (Munyon et al., 2015), and animal/human proteins (Khavari, 2016) using various statistical algorithms. The results were mixed across subcellular locations, methods, and groups of eukaryotic proteins. However, some of the algorithms were promising in improving prediction accuracy. When enough experimental data on protist protein subcellular locations become available, a specific tool will need to be implemented for protist protein subcellular location prediction.

 

ProtSecKB contains 101 unique protist species with complete proteomes; some of them have multiple strains, giving a total of 127 organisms with complete proteomes. Each subcellular proteome of each species can be searched and downloaded for detailed comparative analysis. As an example of the usage of the database, our protein family analysis of three species with different lifestyles suggested that the secretome of each species may play an important role in determining its lifestyle (Table 3). We have also implemented a curation tool, accessible through ProtSecKB, that allows the community to manually curate subcellular locations of protist proteins supported by experimental evidence. We anticipate that the database will help the protist research community design further experiments to characterize protist proteins and understand protist biology, particularly that of protist pathogens of plants, humans, and animals.

 

Authors' contributions

XM and CC conceived the work; BP, VA, and JM implemented the database; GK curated proteins; XM, BP, and FY analyzed the data; XM, BP, JM, and CC prepared the manuscript. All authors read and approved the final manuscript.

 

Acknowledgements

BP was supported by the College of Graduate Studies, Youngstown State University (YSU). VA and JM were supported by the Center for Applied Chemical Biology, YSU. The work was also supported by a Research Professorship to XM from YSU.

 

References

Bakthavatsalam D., and Gomer R.H., 2010, The secreted proteome profile of developing Dictyostelium discoideum cells, Proteomics, 10: 2556-2559

https://doi.org/10.1002/pmic.200900516

 

Bendtsen J.D., Nielsen H., von Heijne G., and Brunak S., 2004, Improved prediction of signal peptides: SignalP 3.0, Journal of Molecular Biology, 340(4): 783-795

https://doi.org/10.1016/j.jmb.2004.05.028

 

Bern C., Kjos S., Yabsley M.J., and Montgomery S.P., 2011, Trypanosoma cruzi and Chagas' disease in the United States, Clinical Microbiology Reviews, 24(4): 655-681

https://doi.org/10.1128/CMR.00005-11

 

Bhatt T.K., 2012, Malaria Parasite ‘Secretome’: A Potential Drug Target, Research and Reviews: Journal of Computational Biology, 1(2): 1-5

 

Costa R.W., da Silveira J.F., and Bahia D., 2016, Interactions between Trypanosoma cruzi secreted proteins and host cell signaling pathways, Frontiers in Microbiology, 7: 388

https://doi.org/10.3389/fmicb.2016.00388

 

D'Acremont V., Lengeler C., and Genton B., 2010, Reduction in the proportion of fevers associated with Plasmodium falciparum parasitaemia in Africa: a systematic review, Malaria Journal, 9: 240

https://doi.org/10.1186/1475-2875-9-240

 

Emanuelsson O., Brunak S., von Heijne G., and Nielsen H., 2007, Locating proteins in the cell using TargetP, SignalP and related tools, Nature Protocols, 2(4): 953-971

https://doi.org/10.1038/nprot.2007.131

 

Foissner W., 1999, Protist diversity: estimates of the near-imponderable, Protist, 150: 363-368

https://doi.org/10.1016/S1434-4610(99)70037-4

 

Foissner W., 2006, Biogeography and dispersal of micro-organisms: a review emphasizing protists, Acta Protozoologica, 45: 111-136

 

Hiller N.L., Bhattacharjee S., van Ooij C., Liolios K., Harrison T., Lopez-Estrano C., and Haldar K., 2004, A host-targeting signal in virulence proteins reveals a secretome in malarial infection, Science, 306(5703): 1934-1937

https://doi.org/10.1126/science.1102737

 

Horton P., Park K.J., Obayashi T., Fujita N., Harada H., Adams-Collier C.J., and Nakai K., 2007, WoLF PSORT: protein localization predictor, Nucleic Acids Research, 35(suppl 2): W585-W587

https://doi.org/10.1093/nar/gkm259

 

Käll L., Krogh A., and Sonnhammer E.L., 2007, Advantages of combined transmembrane topology and signal peptide prediction—the Phobius web server, Nucleic Acids Research, 35(suppl 2): W429-W432

https://doi.org/10.1093/nar/gkm256

 

Khavari S., 2016, Predicting human and animal protein subcellular location, Thesis for Master of Science in Mathematics, Advisor: Chang G-H, Youngstown State University, pp 1-68

 

Krogh A., Larsson B., Von Heijne G., and Sonnhammer E.L., 2001, Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes, Journal of Molecular Biology, 305(3): 567-580

https://doi.org/10.1006/jmbi.2000.4315

 

Lum G., Meinken J., Orr J., Frazier S., and Min X.J., 2014, PlantSecKB: the plant secretome and subcellular proteome knowledgebase, Computational Molecular Biology, 4(4)

 

Lum G., and Min X.J., 2011, FunSecKB: the fungal secretome knowledgebase, Database, bar001

https://doi.org/10.1093/database/bar001

 

Lum G., and Min X.J., 2013, Bioinformatic protocols and the knowledge-base for secretomes in fungi, In Laboratory Protocols in Fungal Biology (pp. 545-557), Springer New York

https://doi.org/10.1007/978-1-4614-2356-0_54

 

Matthews B.W., 1975, Comparison of the predicted and observed secondary structure of T4 phage lysozyme, Biochimica et Biophysica Acta (BBA)-Protein Structure, 405(2): 442-451

https://doi.org/10.1016/0005-2795(75)90109-9

 

Meijer H.J., Mancuso F.M., Espadas G., Seidl M.F., Chiva C., Govers F., and Sabidó E., 2014, Profiling the secretome and extracellular proteome of the potato late blight pathogen Phytophthora infestans, Molecular and Cellular Proteomics, 13(8): 2101-2113

https://doi.org/10.1074/mcp.M113.035873

 

Meinken J., Asch D.K., Neizer-Ashun K.A., Chang G.H., Cooper C.R. Jr., and Min X.J., 2014, FunSecKB2: a fungal protein subcellular location knowledgebase, Computational Molecular Biology, 4(4)

 

Meinken J., Walker G., Cooper C.R., and Min X.J., 2015, MetazSecKB: the human and animal secretome and subcellular proteome knowledgebase, Database, bav077

https://doi.org/10.1093/database/bav077

 

Meinken J., and Min X.J., 2012, Computational prediction of protein subcellular locations in eukaryotes: an experience report, Computational Molecular Biology, 2(1): 1-7

https://doi.org/10.5376/cmb.2012.02.0001

 

Min X.J., 2010, Evaluation of computational methods for secreted protein prediction in different eukaryotes, Journal of Proteomics and Bioinformatics

 

Munyon J.D., Min X., Khavari S., and Chang G.H., 2015, Prediction of subcellular locations for fungal proteins, Proceeding of the Joint Statistics Meeting 2015 (JSM2015), pp. 2497-2508

 

Neizer-Ashun K., Yu F., Meinken J., Min X., and Chang G.H., 2015, Prediction of plant protein subcellular locations, 7th International Conference on Bioinformatics and Computational Biology (BICoB 2015), Honolulu, Hawaii, USA, pp. 91-96

 

Nowicki M., Foolad M.R., Nowakowska M., and Kozik E.U., 2012, Potato and tomato late blight caused by Phytophthora infestans: an overview of pathology and resistance breeding, Plant Disease, 96(1): 4-17

https://doi.org/10.1094/PDIS-05-11-0458

 

Petersen T.N., Brunak S., von Heijne G., and Nielsen H., 2011, SignalP 4.0: discriminating signal peptides from transmembrane regions, Nature methods, 8(10): 785-786

https://doi.org/10.1038/nmeth.1701

 

Przyborski J., and Lanzer M., 2004, The malarial secretome, Science, 306: 1897-1898

https://doi.org/10.1126/science.1107072

 

Sigrist C.J.A., Cerutti L., Castro E.D., Langendijk-Genevaux P.S., Bulliard V., Bairoch A., and Hulo N., 2010, Prosite, a protein domain database for functional characterization and annotation, Nucleic Acids Research, 38(suppl_1): 161-166

 

Singh M., Mukherjee P., Narayanasamy K., Arora R., Sen S.D., Gupta S., Natarajan K., and Malhotra P., 2009, Proteome analysis of Plasmodium falciparum extracellular secretory antigens at asexual blood stages reveals a cohort of proteins with possible roles in immune modulation and signaling, Molecular and Cellular Proteomics, 8(8): 2102-2118

https://doi.org/10.1074/mcp.M900029-MCP200

 

Slapeta J., Moreira D., and López-García P., 2005, The extent of protist diversity: insights from molecular ecology of freshwater eukaryotes, Proceedings of the Royal Society B: Biological Sciences, 272(1576): 2073-2081

https://doi.org/10.1098/rspb.2005.3195

 

Soni R., Sharma D., and Bhatt T.K., 2016, Plasmodium falciparum secretome in erythrocyte and beyond, Frontiers in Microbiology, 7: 194

https://doi.org/10.3389/fmicb.2016.00194

 

Van Ooij C., Tamez P., Bhattacharjee S., Hiller N.L., Harrison T., Liolios K., Kooij T., Ramesar J., Balu B., Adams J., Waters A.P., Janse C.J., and Haldar K., 2008, The malaria secretome: from algorithms to essential function in blood stage infection, PLoS Pathogens, 4(6): e1000084

https://doi.org/10.1371/journal.ppat.1000084

 

von Heijne G., 1990, The signal peptide, The Journal of Membrane Biology, 115(3): 195-201

https://doi.org/10.1007/BF01868635
