NASA scientist to give presentations at NCSA
released October 19, 2005
On Oct. 27, Bruce Barkstrom, head of the Atmospheric Sciences Data Center at the NASA Langley Research Center, will be visiting NCSA and will give two presentations in Room 1040 of the NCSA Building.
At 11 a.m., Barkstrom will discuss "The NASA Langley ASDC: Introduction to Large, Successful Earth Science Data." He will give an overview of the Atmospheric Sciences Data Center, one of the eight NASA Data Centers developed as part of the Earth Observing System (EOS) Data and Information System (EOSDIS). This collection of federated data centers has been in successful operation for more than a decade. The collection houses several petabytes of Earth science data and distributes a data volume equivalent to about 10 percent of the total archive each year. ASDC serves as an example of the system as a whole. It contains slightly more than 1 PB and provided about 114 TB of data to more than 12,000 data users last year. The data are stored in files (not databases), with ASDC having about 20 million files stored on robotic tapes. Most of the files have about 300 metadata fields that can be used to search for the files in a variety of ways: by parameter, by time interval, and by geographic location. While many people expect NASA data to be images, little of the data in ASDC has that form. Rather, the HDF file structures containing the data must accommodate the "data world view" of the scientists who have provided the production algorithms. For CERES data, the instantaneous data in a file consist of atmospheric columns that may or may not contain cloud layers. For MISR data, the instantaneous data consist of multi-color, stereoscopic movies. Barkstrom will illustrate key features of the data and of the data center's operations. He will also include a bit of ASDC development history and challenges for the future.
Barkstrom's second talk at 2 p.m. will focus on "Maturity Models for Future Data Archives." A recent National Research Council report recommended development of metrics that could be used for climate data records. Such records provide the fundamental data needed for understanding the Earth's climate and for improving models of its future behavior. Barkstrom will describe a useful approach to such metrics, in which the maturity of climate data records for archival is assessed by placing the data and archive within a three-axis framework: scientific (or measurement) maturity, preservation maturity, and societal benefit. For each axis, the metrics are divided into attributes that are ranked on a scale from 0 to 5. This approach allows the metrics to assess archival readiness for attributes that come from very different spaces.
To make the maturity modeling approach practical for strategic management of archives containing climate data records, it is helpful to recognize that each axis of the maturity model corresponds to archive accounts. Scientific maturity and preservation maturity are both "debit" accounts, while the societal benefit axis may well be a "credit" account. Barkstrom will provide an outline of the cost modeling that is needed for this kind of strategic management, noting that rigorous work involves interesting (and computationally challenging) research in stochastic planning and scheduling. This research is also likely to be useful in managing the large-scale, distributed computational systems usually discussed under the guise of grid computing.
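For readers who want a concrete picture of the three-axis framework described above, the following is a minimal sketch and not Barkstrom's actual model. The class name, the attribute names in the usage note, the restriction to integer scores, and the use of a simple average as a per-axis summary are all illustrative assumptions; the announcement specifies only the three axes and the 0 to 5 scale.

```python
from dataclasses import dataclass, field

# The three axes named in the talk description.
AXES = ("scientific", "preservation", "societal_benefit")


@dataclass
class MaturityAssessment:
    """Hypothetical container: axis name -> {attribute name -> score 0..5}."""

    scores: dict = field(default_factory=lambda: {axis: {} for axis in AXES})

    def rate(self, axis, attribute, score):
        """Record a score for one attribute on one axis."""
        if axis not in AXES:
            raise ValueError(f"unknown axis: {axis}")
        # Assumption: scores are whole numbers on the 0-to-5 scale.
        if not (isinstance(score, int) and 0 <= score <= 5):
            raise ValueError("score must be an integer from 0 to 5")
        self.scores[axis][attribute] = score

    def axis_summary(self, axis):
        """Mean score for an axis, or None if nothing has been rated.

        Averaging is an illustrative choice, not part of the source.
        """
        values = list(self.scores[axis].values())
        return sum(values) / len(values) if values else None
```

As a usage example, rating two made-up attributes with `rate("scientific", "calibration", 4)` and `rate("scientific", "validation", 2)` gives an `axis_summary("scientific")` of 3.0, while an axis with no ratings returns `None`.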