EMPIRICAL ANALYSIS FOR CLASSIFICATION AND PREDICTION OF PROTEIN FAMILY USING MACHINE LEARNING
Authors: Rashmi TS, Veena M R, Kamaraj R and Jyothi NM

ABSTRACT
Proteins are fundamental to life, and understanding their structures is crucial for deciphering their functions. Although efforts to date have unveiled around 100,000 unique protein structures, this represents only a small fraction of the vast protein sequence space. The laborious and time-consuming process of determining a protein's structure has been a bottleneck. To bridge this gap and enable large-scale structural bioinformatics, computational methods are essential. The challenge of predicting a protein's three-dimensional structure from its amino acid sequence, known as the 'protein folding problem', has persisted for over five decades. Existing methods have limitations, especially when no structurally similar proteins are available as references. Recently, a groundbreaking machine learning approach was introduced that can consistently predict protein structures with atomic accuracy, even in cases with no structural homologs. This approach leverages both physical and biological knowledge about protein structure and incorporates multiple sequence alignments into the design of the machine learning algorithm. This research focuses on an empirical analysis of protein structure and on the classification and prediction of protein family using a machine learning algorithm, which attained a high accuracy of 95%.
Keywords: Protein family, DNA, protein sequence, K-Nearest Neighbors
Publication date: 15/12/2023
https://ijbpas.com/pdf/2023/December/MS_IJBPAS_2023_DECEMBER_SPCL_1073.pdf
https://doi.org/10.31032/IJBPAS/2023/12.12.1073