Intelligent Engineering Systems through Artificial Neural Networks, Volume 16
29 Developing New Variants of the Self-Organizing Feature Map Algorithms
Advances in sequencing technology have led to an explosion in the amount of available sequence data. The structures and functions of proteins encoded by protein-coding genes are traditionally determined using time-consuming laboratory methods, but the sheer volume of data has rendered these methods impractical at large scale. In our attempt to construct automated methods for structural and functional annotation of the human genome and comparative genomes, we developed new variants of self-organizing feature map algorithms for identifying transmembrane proteins. We focus on these proteins because, in the human genome, membrane proteins account for roughly one third of protein-coding genes, and transmembrane proteins in particular play crucial roles ranging from signal transduction and cell communication to energy metabolism. However, determining the structure of a membrane protein pushes the limits of laboratory methods, to say nothing of determining its complete function. We therefore use machine learning techniques to predict transmembrane proteins from polypeptide structure information alone. Features useful for this task are identified through a detailed analysis of various physicochemical properties of proteins. The new variants of self-organizing feature map algorithms are designed to handle noisy, class-imbalanced, complex biomedical data. We evaluate the effectiveness of our classifier in discriminating residues belonging to transmembrane segments from those belonging to non-transmembrane segments, and compare our results with those of other popular classifiers such as decision trees and support vector machines.
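For orientation, the standard Kohonen self-organizing map on which such variants build can be sketched as follows. This is the textbook algorithm only, not the new variants developed in the chapter, whose details the abstract does not give. The function names, the 1-D map layout, the linear decay schedules, and the default parameters are all illustrative assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the map unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def som_update(weights, x, lr, sigma):
    """Apply one standard Kohonen update in place on a 1-D map; return the BMU index."""
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        # Gaussian neighborhood on the map: units near the BMU move more.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        for d in range(len(w)):
            w[d] += lr * h * (x[d] - w[d])
    return bmu


def train_som(data, n_units=4, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM; learning rate and neighborhood width decay linearly (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.1)  # keep the neighborhood width positive
        for x in rng.sample(data, len(data)):  # visit samples in shuffled order
            som_update(weights, x, lr, sigma)
    return weights
```

In a residue-level setting, each input vector `x` would hold the physicochemical features computed for one residue, and trained units would afterwards be labeled as transmembrane or non-transmembrane. How the chapter's variants adapt this scheme to noise and class imbalance is not specified here.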