ABSTRACT
Title: Tree-kNN: A Tree-Based Algorithm for Protein Sequence Classification
Authors: Khaddouja Boujenfa, Nadia Essoussi, Mohamed Limam
Keywords: Pair-wise alignment, multiple alignment, protein classification, kNN classifier, similarity measures.
Issue Date: February 2011
Abstract:
Phylogenomic classification of protein sequences attempts to categorize a given protein within the evolutionary context of its entire family. It involves four main steps: selection of homologous sequences, multiple sequence alignment, phylogenetic tree construction, and tree-based classification. This approach assumes that the tree used as the basis for classification is correct. Because sequence alignment is the first step in tree construction, the accuracy of the alignment should affect the topology of the phylogenetic tree. This work proposes a kNN tree-based algorithm for protein classification, Tree-kNN, which uses a phylogenetic tree estimated from pair-wise and multiple alignment approaches. We compare the classification performance of Tree-kNN with that of an existing method, TreeNN. The results show that Tree-kNN outperforms TreeNN. On four datasets, both algorithms achieve better classification performance with pair-wise alignment than with multiple alignment.
Page(s): 961-968
ISSN: 0975-3397
Source: Vol. 3, Issue 02
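
The tree-based kNN step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how Tree-kNN measures closeness or resolves votes, so everything here is an assumption made for illustration: the tree is a plain dictionary of weighted edges, closeness is the path-length (patristic) distance between leaves, and the class is chosen by majority vote among the k nearest labeled leaves. The names `patristic_distances` and `tree_knn_classify` and the toy tree are hypothetical and are not taken from the paper.

```python
from collections import Counter


def patristic_distances(tree, source):
    """Return the path-length distance from `source` to every node.

    `tree` maps each node to a dict {neighbor: branch_length}. Because a
    tree has exactly one path between two nodes, a simple traversal is
    enough; no shortest-path algorithm is required.
    """
    dist = {source: 0.0}
    stack = [source]
    while stack:
        node = stack.pop()
        for neighbor, length in tree[node].items():
            if neighbor not in dist:
                dist[neighbor] = dist[node] + length
                stack.append(neighbor)
    return dist


def tree_knn_classify(tree, labels, query, k=3):
    """Assumed rule: majority vote among the k labeled leaves nearest to `query`."""
    dist = patristic_distances(tree, query)
    # Sort labeled leaves by distance; the leaf name breaks exact ties.
    ranked = sorted(labels, key=lambda leaf: (dist[leaf], leaf))
    votes = Counter(labels[leaf] for leaf in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy tree (hypothetical): two clades joined at "root"; "q" is the query leaf.
tree = {
    "root": {"n1": 1.0, "n2": 1.0},
    "n1": {"root": 1.0, "a1": 0.2, "a2": 0.3, "q": 0.1},
    "n2": {"root": 1.0, "b1": 0.2, "b2": 0.4},
    "a1": {"n1": 0.2},
    "a2": {"n1": 0.3},
    "q": {"n1": 0.1},
    "b1": {"n2": 0.2},
    "b2": {"n2": 0.4},
}
labels = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

predicted = tree_knn_classify(tree, labels, "q", k=3)
```

In this toy example the query sits in the same clade as `a1` and `a2`, so two of its three nearest labeled leaves carry class `A`. In practice the tree would come from an alignment and tree-building pipeline rather than a hand-written dictionary.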
|