CogNet: Classification of Gene Expression Data based on ranked Active-Subnetwork-Oriented KEGG Pathway Enrichment Analysis
Malik Yousef1,2, Ege Ülgen3 and Osman Ugur Sezerman3
1Department of Information Systems, Zefat Academic College, Zefat, 13206, Israel.
2Galilee Digital Health Research Center (GDH), Zefat Academic College, Israel
3Department of Biostatistics and Medical Informatics, School of Medicine, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
In this study we propose a new computational approach named CogNet, which uses biological knowledge as a function for grouping genes for the tasks of ranking and classification. The pathfindR tool serves as the biological grouping function, allowing the main algorithm to rank the results of active-subnetwork-oriented KEGG pathway enrichment analysis. Although the main aim of the current tool is not to improve on the results of existing tools, CogNet outperforms a similar approach called maTE while achieving performance similar to that of another comparable tool, SVM-RCE. CogNet was tested on 13 gene expression datasets covering a variety of diseases.
CogNet provides a list of significant KEGG pathways, together with their genes, that are able to separate the classes in the data. This list can serve biology researchers for in-depth analysis and better interpretation of the role of KEGG pathways in the data or the case being studied. As future work, we plan to extend CogNet to explore the effectiveness of different combinations of KEGG pathways in the data; in the current version, each KEGG pathway is treated individually.
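The idea of scoring each pathway individually and ranking pathways by how well their genes separate the classes can be illustrated with a minimal sketch. This is not the CogNet implementation: pathfindR is not called, the pathway-to-gene mapping is supplied by hand, and a leave-one-out nearest-centroid classifier is swapped in as the scoring model. All names (`rank_pathways`, `loo_accuracy`, the toy genes and pathways) are hypothetical and chosen only for illustration.

```python
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(samples, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier
    (an assumed stand-in for the scoring model, not the authors' choice)."""
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        cents = {
            cls: centroid([s for s, l in train if l == cls])
            for cls in {l for _, l in train}
        }
        pred = min(cents, key=lambda c: sq_dist(x, cents[c]))
        correct += pred == y
    return correct / len(samples)


def rank_pathways(expr, labels, pathways):
    """Score each pathway on its own genes only and sort by score, best first.

    expr:     dict mapping gene name -> list of expression values (one per sample)
    labels:   list of class labels, one per sample
    pathways: dict mapping pathway name -> list of gene names (the grouping)
    """
    scores = []
    for name, genes in pathways.items():
        present = [g for g in genes if g in expr]
        if not present:
            continue  # no measured genes for this pathway
        samples = list(zip(*(expr[g] for g in present)))
        scores.append((name, loo_accuracy(samples, labels)))
    return sorted(scores, key=lambda t: t[1], reverse=True)


# Toy data (fabricated for illustration only): 3 control and 3 case samples.
expr = {
    "G1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "G2": [2.0, 2.1, 1.9, 6.0, 6.2, 5.9],
    "G3": [3.0, 1.0, 2.0, 3.1, 0.9, 2.1],
    "G4": [1.0, 2.0, 3.0, 1.1, 2.1, 2.9],
}
labels = ["ctrl", "ctrl", "ctrl", "case", "case", "case"]
pathways = {"pathway_A": ["G1", "G2"], "pathway_B": ["G3", "G4"]}

ranked = rank_pathways(expr, labels, pathways)
for name, score in ranked:
    print(name, round(score, 2))
```

In this toy setting the pathway whose genes differ between the two classes is ranked first, and the output keeps the gene grouping visible, which is the kind of interpretability the ranked pathway list is meant to offer.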
Keywords: Gene Expression Data Analysis; Integrative Gene Selection; Knowledge Bases; Prior Knowledge; Classification
Availability and implementation
The KNIME workflow implementing CogNet is available at https://malikyousef.com under "Bioinformatics Tools".