Framework for processing LSST astronomy data undergoing first annual challenge
released 07.11.06
Contact
Trish Barker
NCSA Public Information Specialist
tlbarker@ncsa.uiuc.edu
217.265.8013
URBANA, IL
The Large Synoptic Survey Telescope (LSST) won't begin operation until 2013, but researchers are already rehearsing for the massive volume of data the telescope will produce. As the leader of two LSST data-management teams, the National Center for Supercomputing Applications (NCSA) is coordinating the current LSST Data Challenge, the first annual test of the planned end-to-end astronomy cyberenvironment that will meet the challenge of transferring, processing, storing, and sharing the terabytes of data LSST will produce every night.
The telescope's comprehensive, time-lapse imaging will provide an unprecedented census of the solar system, including transient objects like comets and potentially hazardous near-Earth asteroids. LSST's repeated sweeps of the sky will also help to reduce noise, allowing astronomers to home in on fainter and fainter objects. Because light from more distant objects takes longer to reach Earth, seeing farther into the universe also means seeing further into the past. LSST also aims to discover the nature of "dark energy," the enigma that is causing the expansion of the universe to accelerate.
It's estimated that the LSST will generate 15 terabytes of raw data and more than 100 terabytes of processed data every night. The raw data will move from the telescope to a nearby base camp, where near-real-time processing will occur in order to provide feedback to the telescope to optimize imaging and to promptly alert the astronomy community to interesting observations. The raw data will then be transmitted to the archive center for thorough processing, with the processed data stored and disseminated to the astronomy research community.
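To give a sense of scale for the nightly raw-data volume quoted above, the short Python sketch below estimates the average network rate needed to move it. The 15-terabyte figure comes from the estimate above; the 24-hour transfer window, the decimal definition of a terabyte, and the function name `required_gbps` are assumptions made for illustration and are not part of the LSST design.

```python
TB = 10**12  # bytes per terabyte (decimal definition, an assumption)


def required_gbps(terabytes: float, window_hours: float) -> float:
    """Average link rate, in gigabits per second, needed to move
    `terabytes` of data within `window_hours`."""
    bits = terabytes * TB * 8          # total bits to transfer
    seconds = window_hours * 3600      # length of the transfer window
    return bits / seconds / 1e9        # bits per second -> gigabits per second


# 15 TB of raw data per night, assuming a hypothetical 24-hour window
# before the next night's data arrives.
print(f"{required_gbps(15, 24):.2f} Gbit/s")  # prints "1.39 Gbit/s"
```

Under these assumptions the raw data alone calls for a sustained rate of roughly 1.4 gigabits per second; a shorter window raises the requirement proportionally.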
In the current Data Challenge, three sites in the National Science Foundation-funded TeraGrid network are standing in for the telescope (Texas Advanced Computing Center), the base camp (San Diego Supercomputer Center, or SDSC), and the archive center (NCSA). Data will be transferred from site to site and processed along the way in order to evaluate the design of the prototype data management system. This prototype integrates grid technologies with components developed by NCSA's partners at the LSST Corporation, the National Optical Astronomy Observatory (NOAO), the Stanford Linear Accelerator Center, and the University of Washington.
"The challenge mimics the data transport and processing as it will happen in real life once the telescope is operating," says Cristina Beldica, project manager for NCSA's LSST effort.
First, the challenge will test the data replication software that transfers data from site to site. This Data Service (DS) software, developed at NOAO, builds on SDSC's Storage Resource Broker (SRB) software. Then the basic functionality proposed for the data processing pipeline will be evaluated using prototype science codes and "resource consumers" that model how actual algorithms would consume compute cycles. Middleware components developed by NCSA and its partners will stitch these codes together to mimic the actions and applications of the final pipeline.
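The idea of a "resource consumer" can be illustrated with a minimal Python sketch: a placeholder stage that burns compute cycles in place of a real algorithm, chained with other stages into a pipeline. This is not the challenge's actual middleware or code. The function names, stage names, and iteration counts below are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]


def make_resource_consumer(name: str, iterations: int) -> Stage:
    """Build a stand-in stage that performs `iterations` of busy work
    instead of running a real science algorithm."""
    def stage(record: Dict) -> Dict:
        start = time.perf_counter()
        acc = 0
        for i in range(iterations):    # busy loop standing in for real computation
            acc += i * i
        elapsed = time.perf_counter() - start
        out = dict(record)             # copy so the input record is not mutated
        out["history"] = record.get("history", []) + [name]
        out["cpu_seconds"] = record.get("cpu_seconds", 0.0) + elapsed
        return out
    return stage


def run_pipeline(stages: List[Stage], record: Dict) -> Dict:
    """Pass one record through each stage in order."""
    for stage in stages:
        record = stage(record)
    return record


# Hypothetical stage names and workloads.
pipeline = [
    make_resource_consumer("calibrate", 200_000),
    make_resource_consumer("detect_sources", 500_000),
]
result = run_pipeline(pipeline, {"image_id": 1})
print(result["history"], f"{result['cpu_seconds']:.3f} s")
```

Because each placeholder has the same call shape as a real stage would, the surrounding pipeline logic can be exercised and timed before the production algorithms exist.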
Information gathered through the challenge will guide the team's further development of the LSST data management system.
NCSA (National Center for Supercomputing Applications) is a unique state-federal partnership to develop and deploy national-scale cyberinfrastructure that advances science and engineering. Located at the University of Illinois at Urbana-Champaign, NCSA is one of the leading National Science Foundation-supported supercomputing centers. Additional support comes from the state of Illinois, the University of Illinois, private sector partners, and other federal agencies. For more information, see http://www.ncsa.uiuc.edu/.
The TeraGrid, sponsored by the National Science Foundation Office of Cyberinfrastructure, is a partnership of people and a comprehensive collection of resources and services that enables and accelerates discovery in U.S. science and engineering research. Through coordinated grid middleware, policy, and high-performance network connections, TeraGrid integrates a distributed set of high-capability computational, data management, and visualization resources to make U.S. research more productive. TeraGrid's Science Gateway collaborations and education and mentoring programs interconnect and broaden scientific communities.