Intelligent Engineering Systems through Artificial Neural Networks, Volume 20
42 Rule Visualization of Protein Motif Sequence Data for Secondary Structure Prediction
Protein secondary structure prediction has long been a well-studied research problem in bioinformatics. In previous papers, we presented a rule-based data mining method called RT-RICO (Relaxed Threshold Rule Induction from Coverings) that addressed this problem. On the standard datasets RS126 and CB396, our method surpassed the accuracy, or Q3 score, that had been reported for other computational methods for protein secondary structure prediction. The success of our rule-based method supported the belief that there are meaningful statistical relationships between any secondary structure position and its neighboring amino acids. However, because of the vast number of rules generated by RT-RICO, potentially useful information within a rule set was difficult to identify. Herein we discuss the results of examining those RT-RICO rules using an existing association rule visualization tool, modified to account for the non-Boolean characterization of protein secondary structure.
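For readers unfamiliar with the Q3 score mentioned above, the following is a minimal sketch of how it is conventionally computed: the percentage of residues whose three-state label is predicted correctly. The function name `q3_score`, the one-letter state codes (H for helix, E for strand, C for coil), and the toy sequences are assumptions made for illustration; they are not taken from RT-RICO or from the RS126 and CB396 datasets.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label is predicted correctly.

    Both arguments are strings of per-residue state codes. The codes
    H (helix), E (strand), and C (coil) are an illustrative assumption.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state matches the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical toy example: one of ten residues is mispredicted.
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"
print(q3_score(predicted, observed))  # prints 90.0
```

This sketch scores a single sequence; reported Q3 values for a dataset are typically aggregated over all of its residues or averaged over its proteins, and the aggregation used in the original experiments is not stated here.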