Constraint Governed Association Rule Mining for Identification of Strong SNPs to Classify Autism Data
- Title
- Constraint Governed Association Rule Mining for Identification of Strong SNPs to Classify Autism Data
- Creator
- Umme Salma M.; Rajashekar P.K.M.
- Description
- Autism is a heterogeneous neurodevelopmental disorder found across all age groups. More patients are now being diagnosed with autism, yet public awareness of the condition remains low. This has prompted many researchers to study autism and its characteristics in depth. Studying the behavior and characteristics of autistic patients is important for diagnosing the level of autism. Classifying the associations among different characteristics of autistic patients at the gene level using machine learning techniques can give important insight to doctors and caretakers. Research is being carried out to identify the genes responsible for autism. Changes in gene sequence may lead to different characteristics in different people. Genotypic research can therefore reveal well-defined insight into the various characteristics of autistic patients and their associations with genes. Single Nucleotide Polymorphism (SNP) data, being high in features, indicate human genome variability and are associated with the identification of traits for many human diseases, including autism. The main aim of the proposed work is to identify the SNP sequences responsible for carrying autistic traits. This paper explores the application of the Constraint Governed Association Rule Mining (CGARM) technique to SNP data for dimensionality reduction, thereby selecting the strong, predominant SNP features that are relevant enough to accomplish classification with high accuracy. The research work is carried out in two stages. In the first stage, CGARM was used to choose significant SNP features, resulting in dimensionality reduction. In the second stage, classification was carried out by passing the selected features to an Artificial Neural Network (ANN) algorithm. The main advantage of the proposed work is its ability to reduce the dimensions without compromising quality; that is, using CGARM, strong SNPs were selected by applying various constraints such as syntactical, semantical, and dimensionality constraints, resulting in higher accuracy. The CGARM technique is applied to autism data collected from the National Center for Biotechnology Information (NCBI) repository. The data comprise 118 features, of which CGARM identified 22 predominant SNPs. Applying the forward selection method then yielded the top 17 features, which were given as input to the ANN. 10-fold cross-validation resulted in 76.9% accuracy, which was found to be 50% more than that obtained with the original features. The proposed work reduced the dimensionality by 85% and provided 76.9% accuracy using only 15% of the features. © 2020 IEEE.
- Source
- Proceedings of the 2020 IEEE International Conference on Communication, Computing and Industry 4.0, C2I4 2020
- Date
- 2020-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- Artificial Neural Network (ANN); Association Rule Mining (ARM); Autism; Constraint Governed Association Rule Mining (CGARM); Dimensionality reduction; Machine Learning; SNP
- Coverage
- Umme Salma M., Christ Deemed to be University, Department of Computer Science, Bangalore, India; Rajashekar P.K.M., Christ Deemed to be University, Department of Computer Science, Bangalore, India
- Rights
- Restricted Access
- Relation
- ISBN: 978-172818312-1
- Format
- Online
- Language
- English
- Type
- Conference paper
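
The description above outlines a first stage in which association rules, filtered by constraints, pick out a small set of strong SNP features. The following is a minimal sketch of that general idea only, not the authors' CGARM implementation: it scores rules of the form "SNP has genotype g, therefore class is the target" by support and confidence, and caps the number of selected features as a stand-in for a dimensionality constraint. The function name `select_snps_by_rules`, the thresholds, the SNP names, and the toy genotype data are all hypothetical; the syntactical and semantical constraints, the forward selection step, and the ANN classifier are not modelled here.

```python
from collections import Counter


def select_snps_by_rules(rows, labels, target, min_support, min_confidence, max_features):
    """Rank features by their strongest rule (feature == value) -> target.

    rows: list of dicts mapping a feature name to a genotype string
    labels: list of class labels aligned with rows
    Returns at most max_features feature names, strongest rule first.
    """
    n = len(rows)
    antecedent = Counter()  # occurrences of each (feature, value) pair
    joint = Counter()       # occurrences of the pair together with the target class
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            antecedent[(feature, value)] += 1
            if label == target:
                joint[(feature, value)] += 1

    best = {}  # feature -> (confidence, support) of its best qualifying rule
    for (feature, value), count in antecedent.items():
        support = joint[(feature, value)] / n
        confidence = joint[(feature, value)] / count
        if support >= min_support and confidence >= min_confidence:
            if (confidence, support) > best.get(feature, (0.0, 0.0)):
                best[feature] = (confidence, support)

    # Highest confidence first, then highest support, then name for determinism.
    ranked = sorted(best, key=lambda f: (-best[f][0], -best[f][1], f))
    return ranked[:max_features]  # cap on the number of features kept


# Hypothetical toy data: three SNPs, six samples.
rows = [
    {"snp_a": "AA", "snp_b": "CT", "snp_c": "GG"},
    {"snp_a": "AA", "snp_b": "CC", "snp_c": "GG"},
    {"snp_a": "AA", "snp_b": "CT", "snp_c": "GT"},
    {"snp_a": "AG", "snp_b": "CC", "snp_c": "GG"},
    {"snp_a": "AG", "snp_b": "CT", "snp_c": "GT"},
    {"snp_a": "GG", "snp_b": "CC", "snp_c": "GG"},
]
labels = ["case", "case", "case", "control", "control", "control"]

selected = select_snps_by_rules(rows, labels, "case", 0.3, 0.6, 2)
print(selected)
```

On this toy data, `snp_c` is dropped because none of its genotypes reaches the assumed confidence threshold. As a check on the figures quoted in the description, 17 of 118 features is about 14.4%, which is consistent with the stated reduction of roughly 85%.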
Citation
Umme Salma M.; Rajashekar P.K.M., “Constraint Governed Association Rule Mining for Identification of Strong SNPs to Classify Autism Data,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 25, 2025, https://archives.christuniversity.in/items/show/20659.