CISTO News

Editor
Mike Hollis

Associate Editor
Jarrett Cohen

Consultants
Lara Clemence
Jim Fischer
Nick Burke

FEATURE ARTICLES

MERRA Project to Reconstruct Last 30 Years of Earth’s Climate and Weather

By Jarrett Cohen

MERRA, the largest application ever hosted by the NASA Center for Computational Sciences (NCCS), will soon be running on an augmented “Discover” Linux cluster. The Modern Era Retrospective-analysis for Research and Applications, as MERRA is formally known, will consume 544 processors of the NCCS’s Linux Networx Custom Supersystem for 18 to 24 months.

As its name implies, a retrospective analysis, or reanalysis, starts in the past and marches forward in time to computationally reconstruct atmospheric conditions. Reanalyses are commonly undertaken by climate and weather centers, which use the same data assimilation systems that they apply to their forward-looking forecasts.

MERRA is an endeavor of the Global Modeling and Assimilation Office (GMAO) at Goddard Space Flight Center. One of the GMAO’s primary functions is to support NASA Earth Observing System (EOS) satellite instrument teams and field experiments with assimilated data products in near real-time. The GMAO also uses models and assimilation systems to document and understand climate variability and predictability.

Traditionally, weather centers have an archive of analyses built up from their real-time systems, which are continually changing as upgrades occur. In contrast, the GMAO “can go back with a reanalysis and have a consistent system over time for doing the processing,” said Michael Bosilovich, GMAO meteorologist and MERRA co-principal investigator with fellow GMAO meteorologist Siegfried Schubert.

With MERRA, GMAO scientists particularly hope to gain new insights into Earth’s water cycle and understand how it changes with underlying climate variability. Research community access to the resulting data will expand the prospects for advances.

Data assimilation systems pair a computer model with an analysis that ingests, calibrates, and quality controls observations and then feeds them to the model. “Observational products are discontinuous in space and time,” Bosilovich said. “Where we don’t have data, the model fills in the gaps.” Together, he explained, model and analysis provide globally gridded meteorological information that is uniform and reliable.

The GMAO is currently doing production-scale testing of the next version of its Goddard Earth Observing System-5 Data Assimilation System (GEOS-5 DAS), planned for MERRA. GEOS-5 DAS incorporates the GEOS-5 atmospheric general circulation model and the Gridpoint Statistical Interpolation (GSI) analysis. GSI was developed by the National Weather Service’s National Centers for Environmental Prediction (NCEP) with GMAO contributions.

TOTAL PRECIPITABLE WATER
This visualization shows Total Precipitable Water (TPW) in centimeters over the United States on July 10, 2004. The data come from an experimental reanalysis of Summer 2004 running the GEOS-5 Data Assimilation System (DAS) at the 1/2-degree resolution being used for MERRA.

Reanalysis at a new scale

By using the GEOS-5 DAS, MERRA is leveraging the GMAO’s most capable analysis tool to date, as well as the researchers developing it.

MERRA will produce a comprehensive record of Earth’s climate and weather from 1979, the beginning of the Earth-observing satellite era, up to the day it finishes sometime in 2009. MERRA will cover twice as much time as the first GEOS reanalysis, which the predecessor Data Assimilation Office conducted with the GEOS-1 assimilation system back in 1993.

In addition, MERRA will have a horizontal resolution of 1/2-degree—roughly one model grid point every 55 kilometers—and 72 vertical levels. That compares to the 1-degree resolution used by the most recent large-scale reanalysis, the Japanese 25-year Reanalysis Project that ended in March 2006. Bosilovich said that finer resolution better locates observational inputs geographically, improving precipitation and other details in the analysis.
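
The "roughly 55 kilometers" figure can be checked with a short calculation. The sketch below is illustrative only: it assumes a spherical Earth with a mean radius of 6,371 km (a value not given in the article) and measures east-west spacing at the equator, where grid points are farthest apart. The function name `grid_spacing_km` is hypothetical.

```python
import math

# Assumption: spherical Earth with mean radius 6,371 km (not from the article).
EARTH_RADIUS_KM = 6371.0


def grid_spacing_km(resolution_deg, latitude_deg=0.0):
    """Approximate east-west distance between adjacent grid points.

    Spacing shrinks toward the poles by the cosine of the latitude.
    """
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    return resolution_deg * km_per_degree * math.cos(math.radians(latitude_deg))


half_degree_km = grid_spacing_km(0.5)  # MERRA's 1/2-degree grid
one_degree_km = grid_spacing_km(1.0)   # the 1-degree grid it is compared to

print(f"1/2-degree spacing at the equator: {half_degree_km:.1f} km")
print(f"1-degree spacing at the equator:   {one_degree_km:.1f} km")
```

Under these assumptions the 1/2-degree grid works out to between 55 and 56 kilometers at the equator, consistent with the figure quoted above, and the spacing narrows at higher latitudes.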

The NCCS computational offerings drove the GMAO’s decision to alter their original plans for 1-degree resolution and 36 levels. “Since the computational capability is there, we are running MERRA at this higher resolution,” said Gi-Kong Kim, GMAO production group lead. “Five years ago, such high spatial resolution was inconceivable.”

MERRA also will have fine temporal resolution, with much of the diagnostic data produced every hour. Other diagnostics will be released at 3-hour intervals, which is typical for recent reanalyses. Among MERRA’s more than 300 diagnostic variables are temperature, moisture, wind, surface pressure, and, unique among reanalyses, fields for the chemistry transport community. Hourly diagnostics will resolve the diurnal (daily) cycle of minimum and maximum values more precisely than existing 3- and 6-hourly reanalysis data products. They also will allow close study of individual weather events that can pop up within a few hours.

“Since it appears early in production, I am especially interested to see the Presidents’ Day Storm of February 1979,” Bosilovich said. This storm spread record-breaking snow amounts from the Ohio Valley to the Mid-Atlantic, with snowfall rates greater than 4 inches per hour around Washington, DC. At the time, weather models failed to predict the storm’s intensity. Bosilovich also mentioned the 1988 Central U.S. droughts and floods, the 1993 Midwest floods, and several major El Niños as events of interest.

Besides increased spatial and temporal resolution, the variety and number of observations that MERRA must assimilate are much greater than in the previous reanalysis. Observation sources include satellites, ground stations, weather balloons, ships, and aircraft. “GSI can handle a lot more observations per day than in the past,” Bosilovich said. He said this capability is especially important for assimilating 151 channels from the Atmospheric Infrared Sounder (AIRS) instrument on board the EOS Aqua satellite. AIRS represents 40 percent of the observations to be assimilated since Aqua’s 2002 launch. For the Aqua period, the GSI will read in 8 million observations per day for potential inclusion in the assimilation. Through quality control and other thinning, 5 million of those finally get assimilated into GEOS-5.

EVOLUTION OF OBSERVING SYSTEMS
The number of Earth observations to be assimilated has increased dramatically over the last few decades. The panels demonstrate the evolution of observing systems from 1973 (pre-satellite) to 1979 (TIROS Operational Vertical Sounder [TOVS]) to 1987 (add Special Sensor Microwave Imager [SSMI] and several TOVS) to 2006 (add Atmospheric Infrared Sounder [AIRS] and several each of TOVS and SSMI). The headers list the number of observation points for an example 6-hour period during each year.

Production strategies

These new scale factors translate into substantial computing and storage requirements during MERRA’s production phase. For example, doubling the spatial resolution increases processing needs eight-fold. Because the factors multiply rather than add, combining them makes requirements grow very quickly.
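
A back-of-the-envelope sketch shows how such factors compound. It rests on assumptions that the article does not state: that the eight-fold figure comes from doubling the grid in both horizontal directions and halving the time step (2 x 2 x 2), and that cost scales linearly with the number of vertical levels. The function `relative_cost` is hypothetical, and the result is a rough estimate rather than a GMAO figure.

```python
def relative_cost(horizontal_refinement, level_ratio=1.0):
    """Rough relative processing cost after refining a model grid.

    Assumption (not from the article): refining the horizontal grid by a
    factor r multiplies work by r**3 (two horizontal dimensions plus a
    proportionally shorter time step), and cost is linear in the number
    of vertical levels.
    """
    return horizontal_refinement ** 3 * level_ratio


# Doubling horizontal resolution alone, as in the article's example.
doubling_only = relative_cost(2)

# Original plan (1 degree, 36 levels) versus MERRA (1/2 degree, 72 levels).
planned_to_final = relative_cost(2, level_ratio=72 / 36)

print(f"Doubling resolution only: {doubling_only:.0f}x")
print(f"Doubling resolution and levels: {planned_to_final:.0f}x")
```

Under these assumptions, moving from the original 1-degree, 36-level plan to 1/2 degree and 72 levels costs about sixteen times as much processing, before counting the larger observation volume or the hourly output.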

As the MERRA design expanded, it became clear that no current supercomputer could analyze all 30 years in sequence and finish within the 18-to-24-month window. To meet that timetable, MERRA must analyze about 30 “data days” per wall-clock day. Consequently, the GMAO divided the reanalysis into three concurrent streams of 10 years each. By running on 128 Discover processors, each stream can analyze 10 to 11 data days per day. An additional 128 processors will run supporting experiments.
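
The throughput arithmetic can be sketched as follows. This is an illustration, not the GMAO's planning tool: it assumes an average year of 365.25 days and ignores downtime, reruns, and start-up overlap between streams. The function `wall_clock_days` is hypothetical.

```python
# Assumption: average calendar year length, leap days included.
DAYS_PER_YEAR = 365.25


def wall_clock_days(years, streams, data_days_per_stream_per_day):
    """Wall-clock days needed to analyze `years` of data split across streams."""
    total_data_days = years * DAYS_PER_YEAR
    return total_data_days / (streams * data_days_per_stream_per_day)


# Three concurrent streams, each analyzing 10 to 11 data days per day.
slow_case = wall_clock_days(30, streams=3, data_days_per_stream_per_day=10)
fast_case = wall_clock_days(30, streams=3, data_days_per_stream_per_day=11)

# A single stream at the same per-stream rate, for comparison.
single_stream = wall_clock_days(30, streams=1, data_days_per_stream_per_day=10)

print(f"Three streams: {fast_case:.0f} to {slow_case:.0f} wall-clock days")
print(f"One stream:    {single_stream:.0f} wall-clock days")
```

At these rates, three streams need roughly a year of uninterrupted computing, which leaves margin inside the 18-to-24-month window, while a single stream at the same rate would take about three years.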

All MERRA output will reside in the NCCS mass storage system. With an estimate of 15 gigabytes of output per data day, expected total data volume is 165 terabytes.
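
The storage estimate follows from the same kind of arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and an average year of 365.25 days, neither of which the article specifies. The function `total_output_tb` is hypothetical.

```python
# Assumptions: decimal units and an average calendar year.
GB_PER_TB = 1000.0
DAYS_PER_YEAR = 365.25


def total_output_tb(years, gb_per_data_day):
    """Total output volume in terabytes for a reanalysis spanning `years`."""
    data_days = years * DAYS_PER_YEAR
    return data_days * gb_per_data_day / GB_PER_TB


merra_output_tb = total_output_tb(30, gb_per_data_day=15)
print(f"Estimated MERRA output: {merra_output_tb:.0f} TB")
```

With 30 years at 15 gigabytes per data day, the total comes to a little over 164 terabytes, in line with the roughly 165 terabytes quoted above.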

“The NCCS has done a very good job in having its production systems ready,” Kim said. “The GMAO and NCCS have been working together very closely to keep the computing requirements updated.”

The original plan was to run MERRA on “Explore,” a 1,152-processor SGI Altix 3700 system. After test runs showed GEOS-5 performing faster on Discover, the GMAO decided to take advantage of the cluster’s expansion to 2,560 processors this summer (see “Discover Cluster Expands with New Processors and Visualization Capabilities” in this issue). They also wanted to ensure that the reanalysis would be completed on one supercomputer. “Given the length of the run, Explore’s service contract might run out,” Bosilovich said. Citing a large weather center that had to stop a reanalysis because the computer went away, he stressed the need for “continuity in computing systems and a stable platform.”

The Discover expansion includes not only 1,024 new processors but also an additional 120 terabytes of online disk, which will better support MERRA production assessment and post-processing. Specialized Visualization Nodes enable close monitoring of the output streams. To speed data transfer between the mass storage system and Discover, the NCCS technical staff increased the network pipes to 10 gigabits per second. Staff also reconfigured the mass storage system to accommodate MERRA input/output, which is particularly write-intensive, explained Harper Pryor, CISTO Programs Development Manager (SAIC).

For the GMAO, Kim said one of the most challenging aspects of preparing for MERRA production has been collecting, identifying, and preparing all the input observations. Over several months, the GMAO made a “data sweeper” run of the reanalysis at 2-degree resolution using Explore. “The purpose was not so much to validate the scientific product but to ensure that the input observations would not give us surprises while we do production,” Kim said. “If we encountered GEOS-5 system crashes due to problems related to input observations, we investigated and then designed solutions for production.” To finish as quickly as possible, the sweeper also used multiple streams.

The final preparatory step is validation studies, which began running on Discover and Explore during July. “We are making sure the science has been applied properly,” Bosilovich said. With the water cycle receiving prime attention, one of the comparative data sets is from the Goddard Laboratory for Atmospheres’ Global Precipitation Climatology Project. Additional validation data come from remotely sensed data products (including MODIS, the Moderate Resolution Imaging Spectrometer on Terra and Aqua), other reanalyses, and ground stations. Such variety will enable verifying correct large-scale and regional circulation in the atmosphere.

Review of validation results is the responsibility of an External Users Group of scientists from universities, other climate and weather centers, and NASA field centers. “Rather than determining on our own when we are or are not ready to start production, we assembled an External Users Group to evaluate community needs,” Bosilovich said.

MERRA VALIDATION STUDIES
MERRA validation studies compare merged satellite and in situ observations from the Global Precipitation Climatology Project (GPCP) to reanalysis data. The panels show the differences in monthly mean precipitation (in millimeters per day) between GPCP and data from the GEOS-5 DAS and three completed reanalyses: the National Centers for Environmental Prediction Reanalysis-2 (NCEP R2), the European Centre for Medium-Range Weather Forecasts 40 Year Reanalysis (ERA-40), and the Japanese 25-year Reanalysis Project (JRA 25). GEOS-5 has a mean of zero and the smallest standard deviation, indicating the best agreement with GPCP. Of particular note is GEOS-5’s improved tropical precipitation bias compared to the reanalyses.

Sharing the results

The broader Earth science community will have online access to a subset of MERRA data through the Goddard Earth Sciences Data and Information Services Center (GES DISC), formerly the Goddard Distributed Active Archive Center (DAAC). The GMAO estimates the available data will equal 55 terabytes, including the chemistry transport fields.

The NCCS will funnel the data to GES DISC before archiving to tape, so scientists will have access to provisional data while MERRA is running. In collaboration with the GMAO, GES DISC is building a disk-based data storage and distribution system with a Web user interface. Kim said that planned capabilities include search using a variety of parameters, on-the-fly subsetting, and visualization to aid data search and comparison.

MERRA has a range of potential applications. For one, a researcher may pick case studies from the historical record. Because MERRA spans several generations of Earth-observing platforms, the GMAO will be able to look at climate variability and predictability from a uniform perspective. This research is part of their contribution to the U.S. Climate Change Science Program. Modeling groups could also use the data to drive mesoscale/regional models or global chemistry models. “Transferring a scientific product to the community has been our goal since writing the proposal,” Bosilovich said. “That will be exciting to see in action.”

MERRA funding comes from NASA’s Research, Education, Applications Solutions Network (REASoN) and Modeling, Analysis, and Prediction (MAP) programs.

http://gmao.gsfc.nasa.gov/research/merra
http://disc.gsfc.nasa.gov

NASA Curator: Pamela Ricks
NASA Official: Phil Webster, CISTO Chief
Last Updated: Thursday, 06-Dec-2007 10:41:56 EST