A. Rodriguez, D. Sulakhe, E. Marland, V. Nefedova, G. X. Yu, and N. Maltsev, "GADU - Genome Analysis and Database Update Pipeline," Preprint ANL/MCS-P1029-0203, February 2003.
Realizing the enormous scientific potential of exponentially growing biological information requires high-throughput automated computational environments that integrate large amounts of genomic and experimental data and powerful tools for knowledge discovery and data mining. To assist high-throughput analysis of genomes, we have developed the Genome Analysis and Database Update (GADU) system. GADU efficiently automates the major steps of genome analysis: data acquisition, data analysis by a variety of tools and algorithms, and data storage and annotation. We are developing a TeraGrid technology-based backend for large-scale computations using GADU. GADU can function in either an automated or an interactive mode via a Web-based user interface. Monitoring programs track every operation in GADU and report the status of each process. This architecture ensures GADU's robust performance and allows simultaneous processing of a large number of sequenced genomes regardless of their size.