Argonne National Laboratory

A Knowledge-Based Voting Algorithm for Automated Protein Functional Annotation

Title: A Knowledge-Based Voting Algorithm for Automated Protein Functional Annotation
Publication Type: Report
Year of Publication: 2005
Authors: Yu, GX, Glass, EM, Karonis, NT, Maltsev, N
Series Title: Proteins: Structure, Function, and Bioinformatics
Date Published: 10/2005
Institution: Wiley-Liss, Inc.
Other Numbers: ANL/MCS-P1197-0904

Automated annotation of high-throughput genome sequences is an early and indispensable step toward a comprehensive understanding of the dynamic behavior of living organisms. It is, however, often error-prone, because the underlying algorithms in current analysis systems rely mainly on simple similarity analysis and lack guidance from biological rules. We present here a knowledge-based protein annotation algorithm. Our objectives are to reduce annotation errors, improve annotation confidence, and categorize the confidence levels and explicitly associate them with the annotations. The algorithm consists of two major components: a knowledge system, called "RuleMiner," and a voting procedure. The knowledge system provides biological rules and functional profiles for each function, which guide the functional annotation of sequences. The voting procedure, relying on the knowledge system, is designed to make judgments that are as unbiased as possible when assigning functions from complicated and sometimes conflicting information. We applied this algorithm to ten bacterial genomes and observed significantly improved annotation confidence. We also note the limitations of the algorithm and the potential for future improvement.
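
The abstract does not spell out how the voting procedure works, so the following is only an illustrative sketch of the general idea: evidence for candidate functions is filtered by a rule, the remaining evidence is tallied, and the winning assignment is tagged with a confidence category. Everything in it is an assumption made for illustration and is not taken from the report: the function name `vote`, the representation of a rule as a per-function minimum score, the use of score sums as votes, and the thresholds for "high", "medium", and "low".

```python
from collections import defaultdict


def vote(hits, min_score, high=0.75, medium=0.5):
    """Pick a function by weighted vote and label the confidence.

    hits      -- list of (function, score) pairs, one per piece of evidence
                 (hypothetical input format)
    min_score -- dict mapping a function to the minimum score its evidence
                 must reach; a hypothetical stand-in for a biological rule
    high, medium -- assumed thresholds on the winner's share of all votes
    """
    totals = defaultdict(float)
    for function, score in hits:
        # Evidence that fails the rule for its function casts no vote.
        if score > 0 and score >= min_score.get(function, 0.0):
            totals[function] += score

    if not totals:
        return None, "unassigned"

    # The winner is the function with the largest summed score.
    best = max(totals, key=totals.get)
    share = totals[best] / sum(totals.values())

    if share >= high:
        label = "high"
    elif share >= medium:
        label = "medium"
    else:
        label = "low"
    return best, label


if __name__ == "__main__":
    evidence = [("kinase", 0.9), ("kinase", 0.8), ("phosphatase", 0.3)]
    rules = {"phosphatase": 0.5}
    print(vote(evidence, rules))
```

In this sketch the confidence label travels with the assignment, which mirrors the stated goal of explicitly associating a confidence category with each annotation; the actual rules and profiles supplied by RuleMiner are described in the report itself.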