Prognosis of muscular dystrophy with extrinsic and intrinsic descriptors through ensemble learning

Muscular dystrophy is a neuromuscular disorder that impairs the function of the muscles used for locomotion. Large deletion and duplication mutations in the gene sequences give rise to these muscular dystrophies. Any heritable change can serve as input to computational studies such as pattern recognition and classification models. Mutated gene sequences are generated by applying the positional cloning approach to the reference cDNA sequence, using mutational information from the Human Gene Mutation Database (HGMD). The extrinsic and intrinsic descriptors of the mutated gene sequence are indispensable for identifying the disease. This work describes a computational approach to building a disease classification model by extracting exonic and intronic descriptors from the mutated gene sequences through a combined learning technique. An ensemble hybrid model is developed using the LibD3C classifier. The hybrid model achieved an accuracy of 98.3% in diagnosing the neuromuscular disorder based on deletion and insertion/duplication mutations. Furthermore, this paper analyzes ensemble-learning classifiers built on features related to synonymous and nonsynonymous mutations to detect muscular dystrophy using the same data set. Experiments showed high accuracy for the models built with the LibD3C classifier, which indicates that ensemble learning is effective for predicting the disease. To the best of our knowledge, the models established here are the first to explore a scheme of disease prediction through pattern recognition on nucleic acid sequences and their associated mutations.
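
As a rough illustration of the sequence-generation step described above, the following Python sketch applies a deletion or a duplication to a reference cDNA string and computes a simple base-composition descriptor from the result. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names, the 1-based coordinate convention, the toy reference sequence, and the choice of base composition as a descriptor are all hypothetical, and the sketch does not reproduce HGMD record parsing or the LibD3C classifier.

```python
from collections import Counter


def apply_deletion(reference: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 1-based position `start` (assumed convention)."""
    if start < 1 or length < 0 or start - 1 + length > len(reference):
        raise ValueError("deletion falls outside the reference sequence")
    i = start - 1
    return reference[:i] + reference[i + length:]


def apply_duplication(reference: str, start: int, length: int) -> str:
    """Repeat the `length`-base segment at 1-based `start` immediately after itself."""
    if start < 1 or length < 0 or start - 1 + length > len(reference):
        raise ValueError("duplication falls outside the reference sequence")
    i = start - 1
    segment = reference[i:i + length]
    return reference[:i + length] + segment + reference[i + length:]


def composition_descriptor(sequence: str) -> dict:
    """Fraction of each base in the sequence (a hypothetical, deliberately simple descriptor)."""
    counts = Counter(sequence)
    total = len(sequence) or 1  # avoid division by zero for an empty sequence
    return {base: counts.get(base, 0) / total for base in "ACGT"}


# Toy reference sequence, invented for illustration only.
reference = "ATGGCCTTAGGC"
deleted = apply_deletion(reference, start=4, length=3)
duplicated = apply_duplication(reference, start=4, length=3)
features = composition_descriptor(deleted)
```

Feature dictionaries of this kind, one per mutated sequence, could then be assembled into a table and passed to whichever ensemble classifier is being evaluated.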