Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation

Title: Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation
Publication Type: Report
Year of Publication: 2002
Authors: Foster, IT, Voeckler, J, Wilde, M, Zhao, Y
Date Published: 05/2002
Other Numbers: ANL/MCS-P954-0502
Abstract

Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures. We hypothesize that explicit representation of these procedures can enable documentation of data provenance, discovery of available methods, and on-demand data generation (so-called "virtual data"). To explore this idea, we have developed the Chimera virtual data system, which combines a virtual data catalog, for representing data derivation procedures and derived data, with a virtual data language interpreter that translates user requests into data definition and query operations on the database. We couple the Chimera system with distributed "Data Grid" services to enable on-demand execution of computation schedules constructed from database queries. We have applied this system to two challenge problems, the reconstruction of simulated collision event data from a high-energy physics experiment and the search of digital sky survey data for galactic clusters, with promising results.

PDF: http://www.mcs.anl.gov/papers/P954.pdf