TY - JOUR
T1 - A thorough analysis of the contribution of experimental, derived and sequence-based predicted protein-protein interactions for functional annotation of proteins
AU - Makrodimitris, Stavros
AU - Reinders, Marcel
AU - van Ham, Roeland
PY - 2020
Y1 - 2020
N2 - Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these incomplete networks are still useful for genome-wide function prediction. We used two network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species: Saccharomyces cerevisiae, Escherichia coli, Arabidopsis thaliana and Solanum lycopersicum (tomato). The classifiers had reasonable performance in the well-studied yeast, but performed poorly in the other species. We showed that this poor performance can be considerably improved by adding edges predicted from various data sources, such as text mining, and that associations from the STRING database are more useful than interactions predicted by a neural network from sequence-based features.
AB - Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these incomplete networks are still useful for genome-wide function prediction. We used two network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species: Saccharomyces cerevisiae, Escherichia coli, Arabidopsis thaliana and Solanum lycopersicum (tomato). The classifiers had reasonable performance in the well-studied yeast, but performed poorly in the other species. We showed that this poor performance can be considerably improved by adding edges predicted from various data sources, such as text mining, and that associations from the STRING database are more useful than interactions predicted by a neural network from sequence-based features.
UR - http://www.scopus.com/inward/record.url?scp=85096816818&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0242723
DO - 10.1371/journal.pone.0242723
M3 - Article
C2 - 33237964
AN - SCOPUS:85096816818
SN - 1932-6203
VL - 15
SP - 1
EP - 18
JO - PLoS ONE
JF - PLoS ONE
IS - 11
M1 - e0242723
ER -