Lead Story

An opportunity to grow the UFS community and the RRFS

Winter 2024

The Rapid Refresh Forecast System (RRFS) is NOAA’s next-generation high-resolution, rapidly-updating ensemble prediction system that is underpinned by the Finite Volume Cubed Sphere (FV3) dynamical core of the Unified Forecast System (UFS). The RRFS has been in development over the past 5-7 years as part of a major collaborative effort among numerous organizations in NOAA and academia, along with an ongoing partnership with the DTC.

The RRFS must meet or exceed the performance of the current operational high-resolution deterministic and ensemble systems. Accordingly, the RRFS features many ambitious capabilities that set it apart from the present era of high-resolution Numerical Weather Prediction (NWP) systems, such as a large 3-km domain covering all of North America (Fig. 1). So far, its overall performance has been quantitatively promising in the cool season (Fig. 2), but the same cannot be said for warm-season convective precipitation. In these scenarios, the RRFS tends to produce storms that are too intense and have a high bias in precipitation (Fig. 3).

Figure 1. The 3-km North American computational domain for RRFS.

Figure 2. Bias (dotted) and RMSE (solid) of 24-h forecast of upper air temperature over the period December 2022-February 2023 comparing the operational HRRR (red) to RRFS (blue).

Figure 3. Frequency bias by precipitation threshold comparing HRRR (red) and RRFS-A (gray) for 3-h accumulation intervals for a 48-h forecast period from 1 April to 31 August 2023. Figure taken from Carley et al. (2024).
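For readers less familiar with the metric in Figure 3, frequency bias at a given threshold is the number of forecast events divided by the number of observed events, so values above 1 indicate that the model produces the event too often. The following is a minimal sketch only: the function name and the sample accumulations are invented for illustration and are not taken from the RRFS evaluation.

```python
def frequency_bias(forecast, observed, threshold):
    """Frequency bias = (forecast events) / (observed events) at a threshold.

    An "event" is a value at or above the threshold. A result above 1
    means the forecast produces the event more often than it was observed.
    """
    forecast_events = sum(1 for value in forecast if value >= threshold)
    observed_events = sum(1 for value in observed if value >= threshold)
    if observed_events == 0:
        return float("nan")  # undefined when the event was never observed
    return forecast_events / observed_events


# Made-up 3-h accumulations (mm) at eight grid points, for illustration only.
forecast_mm = [0.0, 2.5, 12.0, 30.0, 0.5, 26.0, 8.0, 1.0]
observed_mm = [0.0, 3.0, 9.0, 27.0, 0.0, 11.0, 6.0, 2.0]

for threshold_mm in (1.0, 10.0, 25.0):
    print(threshold_mm, frequency_bias(forecast_mm, observed_mm, threshold_mm))
```

In this toy sample the bias grows with the threshold (1.0, 1.5, 2.0), which is the kind of pattern a high bias in heavy precipitation would produce.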

In spring and summer 2023, the NOAA National Severe Storms Laboratory (NSSL) ran several versions of the Model for Prediction Across Scales (MPAS) using configurations similar to RRFS. The results were impressive, with performance exceeding that of the RRFS for key convective forecast fields. In light of these results, and the continued struggles to improve RRFS performance for convective prediction, NOAA leadership requested a study to review the efforts to address the challenge in the RRFS and recommend a path forward.

As a part of this study, a large number of homogeneous and idealized convective simulations were conducted to identify the source of the RRFS bias. FV3 [1] solutions were compared to solutions from two well-known convective models, Cloud Model 1 (CM1) and the Advanced Research core of the Weather Research and Forecasting model (WRF-ARW). CM1 [2] and WRF-ARW were modified to resemble the FV3 configuration as closely as possible. The FV3 was set up to use RRFS settings, and all models used the “Kessler” microphysics scheme.

The results shown are from an environment similar to the southeast U.S. summertime environment (moderate Convective Available Potential Energy [CAPE] and low vertical shear). Figure 4 displays the squall-line solutions after 5 hours. The two most noticeable features are the differences in cold pool size (related to the amount of precipitation that evaporates) and the size of the color-filled “updraft objects” (see caption for object criteria). The FV3 solution shows a broader cold pool with larger storm objects than CM1 (WRF-ARW not shown). Even with a homogeneous environment and very simple microphysics, kernel density estimates (Fig. 4, far right panel) from the accumulated rainfall at each grid point show that FV3 produces many more points with moderate to heavy rainfall above 50 mm. This behavior is very consistent with the full-physics NWP results. It strongly suggests that the dynamical core in FV3 is behaving in a fundamentally different manner than CM1 or WRF. FV3 uses a “D” grid staggering, which has about half the resolution of the “C” grid staggering used in CM1, WRF, and MPAS. This likely results in larger storms and excessive rainfall. Unfortunately, the fundamental grid discretization of a model is a core design component that is not straightforward to change.

Figure 4. (a) Horizontal cross-sections from the squall line at 5 hours. The gray-shaded regions are the cold pool (perturbation theta less than -1 K) and the solid-colored regions indicate storm objects identified as regions where the composite reflectivity > 35 dBZ and vertical velocity above 700 mb is at least 2 m/s. (b) Kernel density estimates of the accumulated precipitation over the 6-h period from the squall lines using the three models (low-shear, moderate CAPE). Figure taken from Carley et al. (2024).
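The object criteria in the Figure 4 caption can be sketched as a simple mask-and-label step. This is an illustrative sketch rather than the study's actual code. It assumes two-dimensional fields are already available, with the vertical velocity reduced beforehand to a single value per column above 700 mb, and it assumes 4-connected neighbors define an object. The function name and sample grids are hypothetical.

```python
from collections import deque


def label_storm_objects(reflectivity_dbz, updraft_ms, dbz_min=35.0, w_min=2.0):
    """Group grid cells meeting both criteria into connected objects.

    A cell qualifies when composite reflectivity exceeds dbz_min and the
    updraft value is at least w_min. Returns a list of objects, each a
    list of (row, col) cells, using 4-connectivity (an assumption here).
    """
    n_rows, n_cols = len(reflectivity_dbz), len(reflectivity_dbz[0])
    mask = [[reflectivity_dbz[i][j] > dbz_min and updraft_ms[i][j] >= w_min
             for j in range(n_cols)] for i in range(n_rows)]
    seen = [[False] * n_cols for _ in range(n_rows)]
    objects = []
    for i in range(n_rows):
        for j in range(n_cols):
            if not mask[i][j] or seen[i][j]:
                continue
            # Breadth-first search collects every cell connected to (i, j).
            queue, cells = deque([(i, j)]), []
            seen[i][j] = True
            while queue:
                r, c = queue.popleft()
                cells.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if inside and mask[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            objects.append(cells)
    return objects


# Made-up 4 x 5 fields, for illustration only.
reflectivity = [[40, 42, 10, 10, 10],
                [38, 20, 10, 45, 50],
                [10, 10, 10, 10, 48],
                [10, 10, 10, 10, 10]]
updraft = [[3.0, 2.0, 0.0, 0.0, 0.0],
           [1.0, 5.0, 0.0, 4.0, 2.5],
           [0.0, 0.0, 0.0, 0.0, 6.0],
           [0.0, 0.0, 0.0, 0.0, 0.0]]

storm_objects = label_storm_objects(reflectivity, updraft)
print([len(cells) for cells in storm_objects])  # object sizes in grid cells
```

Comparing the distribution of object sizes between models is one simple way to quantify the "larger storm objects" described in the text.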

With the source of the convective storm bias identified and promising MPAS results in hand, the study recommends that version 2 of RRFS transition to the MPAS dynamical core. MPAS features a more favorable C-grid staggering for RRFS applications, has a limited-area capability, and presents an exciting opportunity to grow the UFS community.

[1] We employed FV3 SOLO for these simulations. FV3 SOLO is GFDL’s simplified version of the dynamical core. SOLO’s dynamical core is nearly identical to the one used in the RRFS model.

[2] “Cloud Model-1” (CM1) is a numerical model for idealized studies and is considered the standard for convective storm research models.  It has been cited in more than 350 peer-reviewed published articles in the last decade in 30 different journals.

NOAA’s New Fire Weather Testbed and DTC Fire Weather Verification

Autumn 2023

With wildfires increasingly impacting society and ecosystems spanning local to global scales, NOAA has expanded its strategic investments and plans to address wildfire-related hazards. In 2023, NOAA established a new Fire Weather Testbed (FWT), a joint effort between three of NOAA’s Line Offices: NOAA Research (Oceanic and Atmospheric Research), NOAA Satellites (National Environmental Satellite, Data, and Information Service), and the National Weather Service. Housed within NOAA’s Global Systems Laboratory in Boulder, Colorado, the FWT will comprise physical and virtual facilities for conducting evaluations and experiments to facilitate the transfer of new technologies and applications into operational platforms as quickly as possible. 

The FWT will convene a broad range of fire-related communities across agencies and jurisdictions, including decision makers, researchers, and operational fire weather forecasters to improve and tailor tools, applications, products, and information. A primary focus centers on understanding the information needs of stakeholders involved with wildland fire before, during, and after the flames to create a fire-ready nation: one that is fire-adapted and fire-resilient.

One of the unique goals of the FWT will be a thorough “User Needs Assessment” of the fire-weather community to better understand the needs and gaps impacting the front-line firefighters and decision makers across the many timelines (instantaneous to seasonal) pertinent to fire management. The FWT is also hiring multiple Social and Behavioral Scientists to ensure that relevant social science is incorporated throughout the evaluation process, from designing and facilitating evaluations to ensuring user feedback is incorporated into new tools and technologies. Social science will guide the development of new tools so that they are effective in helping operational decision-makers understand often complex weather forecasts to make well-informed decisions amidst uncertainties typical in wildland fire situations.  

One of the many significant challenges to improving fire-weather forecasts is that wildland fires often begin in complex terrain with very sparse observations. This combination of mountainous geography, varying vegetation types, and lack of observations makes it difficult to observe, much less accurately model and skillfully forecast, weather near wildfires. To successfully model weather in these complex environments, we need high-quality data and verification. The FWT will work closely with other NOAA Testbeds and Proving Grounds to facilitate the evaluation and transition of state-of-the-art wildfire weather-related products and tools.

Fire weather forecasting includes many aspects requiring verification. Upper left: Fire behavior, spread, and emissions are dependent on local and regional weather including humidity, temperature, wind, radiation, vegetation (fuel), and topography. Upper right: Skillful forecasts of upslope winds (flow from left to right), transport winds (flow from right to left), and mixing height forecasts are critical to estimate dispersion and transport from wildland fires. Lower left: New ensemble forecasts, such as the Warn-On-Forecast Smoke system developed by the National Severe Storms Laboratory, can provide probabilistic guidance for values-at-risk, such as the location of a parade (X) that has a 50% chance of experiencing problematic ground-level smoke concentrations. Lower right: Verification of forecasts using remotely sensed data (inset shows the Suomi NPP Aerosol Index) and in situ observations (main) will improve our ability to predict and communicate impactful fire weather-related conditions that ultimately protect life and property.
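As a rough illustration of the probabilistic guidance mentioned in the caption, an ensemble-based probability at a location can be estimated as the fraction of members that exceed a threshold. The sketch below is not the Warn-On-Forecast Smoke system's method. The function name, the eight member values, and the threshold are all hypothetical and chosen only to show the arithmetic.

```python
def exceedance_probability(member_values, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    if not member_values:
        raise ValueError("need at least one ensemble member")
    exceeding = sum(1 for value in member_values if value > threshold)
    return exceeding / len(member_values)


# Hypothetical ground-level smoke concentrations at one site from eight
# ensemble members; the threshold is an arbitrary illustrative value.
smoke_at_site = [12.0, 48.0, 60.0, 20.0, 75.0, 33.0, 41.0, 9.0]
probability = exceedance_probability(smoke_at_site, threshold=35.0)
print(probability)  # 4 of 8 members exceed the threshold
```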

The FWT and the Developmental Testbed Center (DTC) will collaborate to ensure that the most advanced verification methods, techniques, and technologies are being tested and implemented in fire weather-based forecasting within the Unified Forecast System (UFS). For example, the DTC is currently working on adding verification capabilities to the enhanced Model Evaluation Tools (METplus) software system, as well as testing methods within the Short-Range Weather (SRW) App. The DTC will build on antecedent air quality evaluation efforts to include the ability to assess Rapid Refresh Forecast System (RRFS)-Smoke forecasts of PM2.5, PM10, and aerosol optical depth within METplus. Once verification of these fields is finalized, METplus use-cases, or examples, will be added to the SRW App for future use by the general modeling community, including the FWT. Tools such as the Method for Object-based Diagnostic Evaluation (MODE) will be used by the Short Range, Medium Range, Subseasonal to Seasonal (S2S), and the Seasonal Forecast System (SFS) to evaluate precursor fields such as precipitation and atmospheric moisture. MODE will also help evaluate fire parameters such as spread and the impact of short-term atmospheric phenomena.

The DTC provided two training series on how to configure METplus for fire weather in November 2023.  The recordings and presentations can be found on the DTC website.

A long-term vision of Hierarchical System Development for UFS

Summer 2023

Despite significant improvements in model development and computational resources, forecast issues still exist in numerical weather prediction (NWP) and Earth system models (ESMs), including the Unified Forecast System (UFS). Hierarchical System Development (HSD) (Ek et al. 2019) is an efficient approach for model development that provides the community with multiple entry points for research efforts, spanning simple processes to complex systems. It accelerates research-to-operations-to-research (R2O2R) by facilitating interactions between a broad scientific community and the operational community.

HSD is critical for research because it enables the research community to have multiple entry points into development that reflect their interests.

The Developmental Testbed Center (DTC), in collaboration with the Earth Prediction Innovation Center (EPIC), prepared a white paper describing a long-term vision of HSD for the UFS and a plan for its phased implementation. This paper outlines proposed hierarchical axes relevant to the UFS, HSD capabilities that currently exist in the UFS, and recommendations for future development by EPIC aligned with each axis. A survey on HSD for UFS was distributed to a broad research and operational community to collect valuable insight and feedback on participants' experience with, and use of, HSD tools, testing infrastructure, current and future capability needs, and perceived gaps to help inform the future HSD within the UFS. The results were used to inform the white paper and prioritize the proposed recommendations.

Several unique perspectives exist for this topic, where hierarchical systems have been defined by axes such as model complexity (Bony et al. 2013), model configuration (Jeevanjee et al. 2017), or principles of large-scale circulations (Maher et al. 2019). Building upon these perspectives, the white paper introduces four main axes that can be applied to the UFS framework. The first is sample size, where the common approach during the model issue-identification stage is to start from one case study and expand to multiple similar cases to determine any systematic biases. The second is hierarchy of scales, spanning coarse-to-fine grid spacing and global-to-regional scales, including dynamic downscaling techniques such as nesting and variable-resolution modeling. The third is simulation realism, which refers to the simplification of models to help improve theoretical understanding of atmospheric processes and interactions. The fourth is mechanism/interaction denial, where the model can be configured such that any given mechanism(s) can be turned off and replaced with data models to examine the impact of specific mechanisms on atmospheric or other model component processes.

Key recommendations, as well as level of effort and necessity within each of the axes, were then prioritized to help inform the future progress of the UFS HSD (Fig. 1). To address sample size, a compilation of case studies that represent model issues is recommended to be developed and maintained to help facilitate development and improvement of modeling systems. These cases should span from single cases to multiple similar cases, and from short-term runs to long-term runs. The priority for hierarchy of scales is to include the current capabilities of nesting in publicly released applications and to establish sub-3-km capabilities. Mechanism/interaction denial recommendations include continued enhancements to the Common Community Physics Package Single Column Model (CCPP-SCM), capabilities for removing feedback in a coupled model system, such as CDEPS, and continued development of data assimilation. A highly configurable framework for idealized simulations is desired for simulation realism.

Figure 1. Axes for UFS HSD with proposed recommendations. The numbers listed above each axis represent the level of effort and necessity, respectively. The level of effort was estimated subjectively based on the current existence of capabilities from the community or developers. The necessities were evaluated based on the priority rankings from the survey results.

The vision and ongoing development for the HSD testing framework by the EPIC team, in collaboration with the UFS community, is to accelerate the research and development capabilities of the UFS and to facilitate the day-to-day development work for the UFS. This will increase the readiness of the UFS and its applications for releases and deployments.


Bony, S., and Coauthors, 2013: Monograph on Climate Science for Serving Society: Research, Modelling and Prediction Priorities.

Ek, M., and Coauthors, cited 2023: Hierarchical System Development for the UFS. [Available online at]

Jeevanjee, N., P. Hassanzadeh, S. Hill, and A. Sheshadri, 2017: A perspective on climate model hierarchies. Journal of Advances in Modeling Earth Systems, 9, 1760-1771.

Maher, P., and Coauthors, 2019: Model Hierarchies for Understanding Atmospheric Circulation. Reviews of Geophysics, 57, 250-280.

CCPP Goes Operational

Spring 2023

The Hurricane Analysis and Forecast System (HAFS) v1 has been approved for operational implementation for the 2023 hurricane season by NOAA NCEP and, with it, the Common Community Physics Package (CCPP) will be deployed operationally for the first time. This is a major milestone for this software infrastructure. The first set of requirements for the CCPP was established in 2017, following numerous discussions by the Physics Interoperability Team, which was assembled under the auspices of the Earth System Prediction Capability (ESPC, now Interagency Council for Advancing Meteorological Services [ICAMS]).

The HAFS version 1 and the CCPP go operational.

Over the last six years, the CCPP was established with two major components: the CCPP Physics, a library of physical parameterizations, and the CCPP Framework, the infrastructure that connects the library to host models. The Framework has undergone extensive development to meet the needs of the research and operational communities. One example is the static build, in which a number of suites can be provided at compile time, resulting in the auto-generation of suite-specific physics interfaces to the model (called caps), to be included inside of the model executable. This approach gives the flexibility required by the research and development community (by providing the ability to choose a menu of suites at runtime), while being efficient in computational memory use and timing.

The benefits of the CCPP go beyond those provided by the Framework. The CCPP Physics contains a number of physical parameterizations that can be assembled into suites. This approach enabled the UFS Hurricane Application team to experiment with different designs for HAFS, finally settling on two configurations to satisfy the needs of the National Hurricane Center (NHC). The two HAFS implementations use different physics suite configurations: HAFS-A uses the GFDL single-moment microphysics and HAFS-B uses the Thompson double-moment microphysics. The configurations also differ in customizations to the convective and planetary boundary layer (PBL) schemes and in the frequency of calls to the radiation parameterization.

HAFS model-simulated hurricane using the CCPP.

The DTC Visitor Program also played an important role in supporting community engagement with the development of physics for HAFS. Dr. Andrew Hazelton, from the University of Miami Cooperative Institute for Marine and Atmospheric Studies (CIMAS) and affiliated with NOAA’s Atlantic Oceanographic and Meteorological Laboratory, was the recipient of a DTC award for the improvement of PBL representation for better forecasts of tropical cyclone structure and large-scale steering (see Winter 2023 DTC Newsletter).

The DTC is responsible for the code management of the CCPP repositories, and co-manages the UFS fork of CCPP Physics on GitHub. This fork, which is used by those contributing physics intended for the UFS, is directly connected to the UFS Weather Model and used in all UFS Applications. The DTC reviews git repository pull requests for bug fixes and innovations, and connects with experts to obtain additional input. The DTC also represents the CCPP component on the UFS Weather Model code management team, ensuring that changes in CCPP are well coordinated with the rest of the model. Physics contributions from the community, such as those from Dr. Hazelton, are critical to improving the UFS skill. Given the distributed nature of the UFS development, with contributions from the NOAA Weather Service and Research Laboratories, as well as NCAR and academia, a robust code management strategy is critical to foster collaborations while maintaining agility.

In summary, the CCPP has come a long way since its inception, laid a strong foundation for research and development (R&D) and transitions, and achieved readiness to support operational implementations of the UFS Weather Model and its applications. We are looking forward to seeing HAFS v1 go live and contributing to the suite of models used by the NHC to create its forecasts.

For more news related to CCPP, see the Community Connections article: A Forward-looking Virtual Get-together of the CCPP Community.

Using the CCPP SCM as a Teaching Tool

Winter 2023

The climate graduate programs at George Mason University offer an Earth System Modeling course. The course is divided into two subtopics, theory and practicum. The theoretical session offers lectures introducing students to the physical and dynamical components of an Earth system model, their interactions, and how these components are used to predict the behavior of weather and climate. When I became the instructor of the Earth System Modeling course, I added a module to the practicum session that provides students the technical skills to contribute to the development of an Earth system model. For this module, I opted for the single-column model (SCM) approach. Given the number of Earth system models developed in the U.S. alone, coupled with our aim to familiarize our students with more than one model, the Earth system model and the SCM were selected to come from two modeling groups.

As a researcher, I have experience running various Earth system models, yet never had the opportunity to work with an SCM. My decision to select the CCPP SCM, developed by the DTC as a simple host model for the Common Community Physics Package (CCPP), was influenced by my current work with the NOAA Unified Forecast System (UFS), which uses the CCPP for the majority of physical parameterizations in the atmospheric component. I was further motivated by the detailed user and technical guide that accompanies the public release of the CCPP-SCM code.

I was cautiously optimistic about successfully porting the code to the Mason high-performance computing (HPC) clusters, which are not among the preconfigured platforms on which the CCPP-SCM code has been tested. If even one step in the list of instructions fails, it can cause a domino effect on the subsequent steps. To my surprise, step after step was successfully completed. The biggest challenge in porting the code was building the three libraries that are part of the UFS hpc-stack package. Thankfully, the developers of the UFS hpc-stack have done an excellent job in providing a system for building the software stack. Building the libraries required an entire day of suspense, yet the successful completion was well worth the wait.

In addition to the relatively easy process of porting and compiling the code, there are other attractive elements in the CCPP-SCM framework that expand its appeal as a teaching tool. It offers a relatively large library of physical parameterizations (or physics suites) that have been scientifically validated, and provides a variety of pre-processed forcing data. These allow students to design experiments to understand the behavior of physical parameterizations in different environments, and explore the limitations of the approach.

Students discussing an instructional slide

Following the developer instructions, which I adapted to work with Mason’s HPC cluster, students quickly installed their own copy of CCPP-SCM and were ready to work on the practical application. The goal of the assignment was to understand the similarities and differences between the behavior of a cloud parameterization scheme when tested over land and ocean environmental conditions. The variety of observations included with the package allows students to focus on the science without spending time on finding the data sets required to drive the SCM.

The outcome of the assignment exceeded my expectations. Students set up their own numerical experiments without any help from me. This was a rewarding experience for me as the instructor and for the students, who gained confidence that they can master a model that allows them to zoom into the complexity of an Earth system model. Next, students will learn how to run a full Earth system model, using the NCAR CESM for that purpose.

SIMA: Constructing a Single Atmospheric Modeling System for Addressing Frontier Science Topics

Autumn 2022

The System for Integrated Modeling of the Atmosphere (SIMA) project aims to unify existing NCAR community atmosphere modeling efforts across weather, climate, chemistry, and geospace research. NCAR scientists, in partnership with the atmospheric and geospace sciences research community, are developing a SIMA framework and infrastructure that enables simulations of atmospheric processes and atmospheric interactions with other components of the coupled Earth system ranging from the surface to the ionosphere, and across scales from cloud-resolving weather to decadal climate studies. One of SIMA’s goals is to enhance atmospheric and Earth-system modeling applications for frontier-science problems. An example is the sub-seasonal to seasonal predictability of tropical cyclone formation, which requires the capability to represent convection-permitting scales over the tropics coupled to an Earth System Model (ESM). This can extend to investigating the role of aerosols in tropical cyclone formation. Another frontier-science application is quantifying the impact of biomass burning on air quality, atmospheric chemistry, and weather from local to global scales. This requires the capability to represent fires on convection-permitting scales and detailed chemistry and aerosol processes. Additional frontier-science application examples are described in the SIMA Vision document.

SIMA will allow NCAR to shift from using a complex modeling ecosystem composed of several atmosphere models, each with their own specific application (e.g., Weather Research and Forecasting (WRF) and Model for Prediction Across Scales (MPAS) standalone atmosphere models for weather research, Community Atmosphere Model (CAM) for climate research, Whole Atmosphere Community Climate Model eXtension (WACCM-X) for thermosphere and ionosphere research) into a single modeling system that can be configured for a range of applications (Figure 1). In November 2021, SIMA version 1 was released to the community. This initial version of SIMA includes development of a CAM configuration that contains regionally-refined grids over the Arctic and Greenland and development of high-resolution capability of WACCM-X and one-way coupling between WACCM-X and a geomagnetic grid mesh for magnetohydrodynamics calculations. For SIMA v1, atmospheric chemistry input, emissions data, and chemistry code have been modified to be compatible with unstructured grid meshes and regional refinement of grids. Atmospheric chemistry simulation output for a CAM configuration with regional refinement over the contiguous US was made available via the Geoscience Data Exchange site. A model-independent chemistry module, which enhances the flexibility of prescribing the chemical constituents and reactions, was released in a box model configuration for testing and classroom teaching.

NCAR atmospheric modeling ecosystem in the mid-2010s and the anticipated structure under SIMA in the mid-2020s.

When SIMA is mature, it will provide functionality for interoperable model components, including physics and chemistry schemes, as well as dynamical cores, by using the Common Community Physics Package (CCPP). A suite of physics parameterizations from WRF and CAM is being modified to be CCPP compliant and thus will become available as part of SIMA. Recently, software engineers and scientists at NCAR, the DTC, and NOAA held a series of discussions about the usefulness of CCPP for the physics suite in CAM as well as any future SIMA needs that CCPP does not yet address. In summary, the CCPP should improve the efficiency of future CAM code development and maintenance, which will enable other modeling system advancements. CCPP is easy to build upon, is explicit, and its implementation facilitates the detection of bugs and inconsistencies. At the same time, it is understood that the CCPP framework will not do everything needed or desired for the modeling system, and additional steps will need to be taken to achieve specific requirements.

One major achievement in developing SIMA is implementing the MPAS dynamical core in CAM, giving CAM new functionality to resolve convective motions with a non-hydrostatic dynamical core. Several tests are being conducted using a global MPAS mesh at 60-km grid spacing with regional refinement to 3-km grid spacing over a specified region. An example application from the NSF-funded EarthWorks project demonstrates the capability of predicting precipitation amounts over the Pacific Northwest region of the US (Figure 2).

Wet-season (November-March) average precipitation rate (mm/day) over the western U.S. for 1999-2004. Left panel shows results from CESM-MPAS at 3-km grid spacing, middle panel observations from PRISM on a 4-km grid, and right panel results from WRF at 4-km grid spacing. CESM-MPAS has a small underestimation compared to the observations, while WRF tends to overestimate precipitation rate. The probability distributions of daily precipitation show that CESM-MPAS captures the PDF better than WRF, especially for more extreme precipitation. From X. Huang et al. (2022) in Geosci. Model Dev.

During the past year, the SIMA governance structure has been broadened to establish a SIMA steering committee, a SIMA Project Lead, a SIMA Scientific and Technical Co-Leads group, and a SIMA external advisory panel. Under this expanded structure, SIMA is taking a two-pronged approach to continued infrastructure development. The first is continuing to produce capabilities already identified as important. For example, work is planned to enable simulations with the SIMA-provided configuration that uses the non-hydrostatic MPAS dynamical core in CAM at a global grid spacing of 3.75 km, which would provide the ability to perform subseasonal to seasonal forecasts in an Earth System Model. Other high-priority targets include refactoring CAM physics to be compliant with CCPP; developing an online, flexible regridding tool to improve input preprocessing for desired model grids; and enhancing the Model Independent Chemistry Module while implementing it in CAM.

The second path for SIMA development is to identify and pursue a frontier-science application that will guide infrastructure development. NCAR has asked its staff to propose projects that require SIMA to develop additional functionality that can be used in investigations aimed at understanding processes or predictability from local to regional to global scales and at synthesizing cross-disciplinary science. The SIMA developments for this science application should open the door for many other groups to apply SIMA for advancing their own science interests.

SIMA leadership is looking forward to deeper engagement with the community to further advance the single atmospheric modeling system and conduct exciting, new science. See more information about SIMA at

Advances in the Rapid-Refresh Forecast System as Seen in NOAA’s Hazardous Weather Testbed’s Spring Forecasting Experiment

Summer 2022

The Rapid Refresh Forecast System (RRFS) is a critical component of NOAA’s Unified Forecast System (UFS) initiative, which has been in development for several years and is planned for operational implementation in late 2024. The RRFS will provide the NWS an hourly updating, high-resolution ensemble capability that uses a state-of-the-art, convective-scale ensemble data assimilation and forecasting system with the Finite-Volume Cubed Sphere Dynamical core (FV3). Further, the RRFS will greatly simplify NCEP’s model production suite by subsuming several regional modeling systems such as the North American Mesoscale model (NAM), the Rapid Refresh (RAP), and the High-Resolution Rapid Refresh (HRRR), which is a significant step forward for the UFS vision to unify development efforts around a simplified and streamlined system. Since 2018, Spring Forecasting Experiments (SFEs) in NOAA’s Hazardous Weather Testbed have played an important role in evaluating convective-scale FV3-based model configurations for severe weather forecasting applications. Each year, more and more model configurations utilizing the FV3 dynamical core have been evaluated during the SFE, and SFE 2022 was no exception. Multiple agencies contributed 59 FV3-based model configurations to the 2022 SFE, up from 24 in 2021 and 10 in 2020. This increase is in part due to multiple groups, such as the University of Oklahoma’s Multi-scale data Assimilation and Predictability (MAP) group, running ensembles to determine how to best configure a future RRFS.

Feedback was provided to the developers through multiple methods during the SFEs. Formal evaluations were conducted, asking participants to subjectively evaluate convection-allowing model and ensemble performance in forecasting severe convective weather such as tornadoes, hail, and winds. Feedback was also collected on the specific aspects of model performance that the developers were interested in, such as how well models using different data assimilation schemes depicted ongoing storms an hour into the forecast. In 2022, for the first time in the SFE, blinded evaluations were conducted so participants did not know which model configuration was used. Blinding the evaluations and randomly displaying the configurations removed any bias participants had toward or away from certain configurations based on name alone.
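The blinding step described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the procedure the SFE organizers actually used: the function name blind_configurations and the neutral "Config N" labels are hypothetical. The idea is to shuffle the configurations, show participants neutral labels, and keep the label-to-name key private until the ratings are collected.

```python
import random


def blind_configurations(names, seed=None):
    """Shuffle configuration names and hide them behind neutral labels.

    Returns (labels, key). `labels` is what participants see; `key` maps
    each neutral label back to the real name and stays with the organizers.
    The function name and label scheme are hypothetical.
    """
    rng = random.Random(seed)      # a seed makes the shuffle reproducible
    shuffled = list(names)         # copy so the caller's list is untouched
    rng.shuffle(shuffled)          # random display order
    key = {f"Config {i + 1}": name for i, name in enumerate(shuffled)}
    return list(key), key


labels, key = blind_configurations(["HREF", "RRFSp2e", "HRRRv4"], seed=0)
print(labels)  # neutral labels only; the real names stay in `key`
```

After the evaluation, the organizers would use the key to map each rating back to the configuration that produced the forecast.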

Spring Forecasting Experiments (SFEs) in NOAA’s Hazardous Weather Testbed. Photo credit NOAA/James Murnan.


Feedback from these subjective evaluations gave developers clues as to which elements to target to improve forecast performance. For example, in SFE 2021, participants noted that storms in some configurations were overly circular in nature, indicating strong isolated updrafts. Developers were able to adjust the configurations in the off-season, and the issue was not flagged in SFE 2022. Subjective evaluations can also point developers to the most promising avenues for new model developments. In SFE 2021, a Valid-Time-Shifting (VTS) approach in the RRFS configurations contributed by the OU MAP group was tested against a more traditional data assimilation method. The subjective evaluations indicated that the VTS improved the subsequent forecasts, as did objective verification performed after the SFE by the MAP group. Therefore, for SFE 2022, all MAP RRFS configurations used VTS assimilation, and the focus shifted to determining which observations a VTS approach should be applied to for the greatest forecast benefit.

SFEs have often encompassed comparisons between currently operational model configurations and the next generation of model guidance. In SFE 2022, deep-dive comparisons were conducted between the High-Resolution Ensemble Forecast System (HREF) and the RRFS prototype 2 ensemble (RRFSp2e), and the High-Resolution Rapid-Refresh version 4 (HRRRv4) and the RRFSp2e Control member (RRFSp2 Control). These comparisons considered not only the typical fields utilized for severe weather, but also the environmental mean fields for the ensembles, and upper-air fields for the deterministic comparison. Results from these comparisons revealed which aspects of the guidance were performing better in the newer iterations of the models, and which aspects are still best-depicted by the operational guidance (Figures 1 and 2).

Participant evaluations of the HREF and RRFSp2e forecasts of mean 2-m Temperature, mean 2-m Dewpoint, mean SBCAPE, and probabilities of UH exceeding the 99.85th percentile

Answers to the question, “Which model configuration performed best for this field?”, in which participants were asked to select at least two of the five fields presented to evaluate.


Over the years that the SFE has evaluated FV3-based configurations, evolving toward the future RRFS, we have seen great improvement in the configurations contributed to the SFE. Results from SFE 2022 show the skill of the RRFS and its control member approaching the skill of the HREF and the HRRR. These advancements would not be possible without the dedicated efforts of a community of developers implementing feedback from participants across the meteorological enterprise who contribute their evaluations to the SFE each year.

Burkely T. Gallo. Photo credit NOAA/James Murnan.

The Local Land-Atmosphere Coupling (LoCo) Project

Building Tools and Knowledge to Facilitate R2O and Improved Weather and Climate Prediction

Spring 2022


Over the last two decades, the hydrometeorological community has made significant progress identifying, understanding, and quantifying the land-atmosphere (L-A) interactions that influence Earth’s water and energy cycles. Under the Global Energy and Water Exchanges (GEWEX) project, scientists from around the world have been studying coupled model development and improved observations of the global water and energy cycles to improve prediction of weather and climate. The GEWEX structure is composed of four focused, small-group panels of scientists. The four panels, tasked with understanding Earth’s water cycle and energy fluxes at and below the surface and in the atmosphere, cover global datasets (GDAP), atmospheric processes and models (GASS), hydroclimate applications (GHP), and land models and L-A interactions (GLASS). The GLASS LoCo working group is focused specifically on local L-A coupling (LoCo; Santanello et al. 2018), and has developed and applied coupled metrics to Earth system model development.

The LoCo project set out to develop integrative, process-level metrics to quantify complex L-A interactions and feedback that can be applied to models and observations. Specifically, the “LoCo process chain” (see figure 1) describes the water and energy-cycle pathways that connect soil moisture to clouds and precipitation via surface heat and moisture fluxes, and the evolution of the planetary boundary layer (PBL). Over the last 15 years, quantitative metrics have been developed by the LoCo working group that address specific links in this process chain. This led to the development of LoCo community resources such as Coupling Metrics “Cheat Sheets” and the Coupling Metrics Toolkit (CoMeT) to encourage the model-development communities to use these metrics.

Figure 1. Schematic of the LoCo process chain describing the components of L-A interactions linking soil moisture to precipitation and ambient weather (T2m, Q2m), where SM represents soil moisture; EFsm is the evaporative fraction sensitivity to soil moisture; PBL is the PBL characteristics (including PBL height); ENT is the entrainment flux at the top of the PBL; T2m and Q2m are the 2-m temperature and humidity, respectively; and P is precipitation. Citation: Bulletin of the American Meteorological Society 99, 6; 10.1175/BAMS-D-17-0001.1
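To make the first link of the process chain concrete, the Python sketch below computes the evaporative fraction, EF = LE / (H + LE), where LE and H are the surface latent and sensible heat fluxes, and then estimates its sensitivity to soil moisture as a least-squares slope. This is a minimal sketch under stated assumptions: the function names and the sample values are hypothetical, and the regression slope is one simple way to estimate the sensitivity, not necessarily the definition used in the LoCo cheat sheets or in CoMeT.

```python
def evaporative_fraction(latent, sensible):
    """EF = LE / (H + LE), with both fluxes in the same units (e.g. W m-2)."""
    total = latent + sensible
    if total == 0:
        raise ValueError("H + LE is zero; evaporative fraction is undefined")
    return latent / total


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line fitted to y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical sample values, for illustration only.
soil_moisture = [0.10, 0.20, 0.30]   # volumetric soil moisture (m3 m-3)
latent_flux = [100.0, 200.0, 300.0]  # LE (W m-2)
sensible_flux = [300.0, 200.0, 100.0]  # H (W m-2)

ef = [evaporative_fraction(le, h) for le, h in zip(latent_flux, sensible_flux)]
ef_sm = least_squares_slope(soil_moisture, ef)  # change in EF per unit soil moisture
print(ef, ef_sm)
```

A larger slope indicates that surface flux partitioning responds more strongly to soil moisture, which is the kind of coupling signal the first link in the chain is meant to capture.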

A key feature of LoCo metrics is that they address multiple components of the coupled system as opposed to traditional “one at a time” approaches to model evaluation (see figure 2). Offline, or uncoupled model development (as was typically performed in the past for land-surface models) is suboptimal because it ignores the interaction and feedback with other components of the system (i.e., the atmosphere). Although LoCo metrics are more complex and require multiple observation inputs, which are sometimes difficult to obtain (e.g. PBL profiles), the payoff of their application is significant for a clearer understanding of the coupled processes in the models, and quantitatively assessing how new model physics, datasets, and development cycles impact those processes, including their positive and negative feedbacks. As a result, LoCo metrics are an ideal, though still underutilized, resource for the research-to-operations community because they can serve a beneficial role in facilitating the transfer of scientific knowledge and understanding to model development and, ultimately, operations.


Figure 2. LoCo metrics across temporal scales (x axis), relationship to the LoCo process chain along the y axis, and statistical vs process-based nature (elliptical outlines). Green background shading indicates land surface related states and fluxes, while blue indicates PBL and atmospheric variables. Citation: Bulletin of the American Meteorological Society 99, 6; 10.1175/BAMS-D-17-0001.1

To expand the reach of LoCo, the GEWEX community is participating in projects and outreach efforts across weather and climate modeling centers. These efforts include serving a major role in a NOAA Climate Process Team (CPT) project called CLASP (Coupling of Land and Atmospheric Subgrid Parameterizations) that convenes five climate modeling centers (NOAA, NCAR, NASA, DOE, and GFDL) and focuses on improving the L-A communication of heterogeneity in their respective global climate modeling (GCM) systems. LoCo has also been considered by the numerical weather prediction (NWP) community including NCEP, and is collaborating with DTC to incorporate LoCo metrics into their evaluation and Hierarchical System Development activities (via METplus). Although adoption has been slow, partly due to operational constraints at some of these centers, it is widely recognized that integrated and process-level metrics are an essential tool for the future improvement of Earth system models.

Engaging Forecast Community in UFS Model Development: UFS Forecasters Workshops 2020-21

Winter 2022

With the advent of the Unified Forecast System (UFS), operational model development at the National Weather Service (NWS) is increasingly becoming a collective effort that includes contributions from multiple NOAA labs, national agencies, and university research groups. The NWS forecasters are the primary stakeholders of the UFS’ operational forecast products, and therefore, continued engagement with them is important for staying abreast of the real-world performance of the model and its utility in society. Feedback from forecasters is another key driver that informs the NOAA funding offices on model development priorities. For the purposes of these modeling-oriented workshops, the NWS Center and Regional Offices served to represent the broader state, local, and private forecast community.

Figure: Engaging stakeholders in the UFS: Feedback from forecasters and other stakeholders helps us better understand model performance and errors, and thereby define development priorities. For successful community model development and transition to operations, close coordination between (1) the UFS research and development efforts, (2) the operational model development and implementation organizations (NCEP’s Environmental Modeling Center and Central Operations), and (3) the forecasting community is essential.


The NWS Office of Science and Technology Integration (OSTI) Modeling Program Division conducted three workshops for forecasters during 2020-21. The key objective was to identify forecasters’ top priorities and modeling gaps, particularly focusing on the UFS Medium Range Weather/Subseasonal to Seasonal (MRW/S2S) global and Short Range Weather (SRW) regional Applications (See more on UFS Applications here).

The first workshop (held on November 16, 2020) solicited forecasters’ concerns, which were organized under a set of topics including convection, winds and terrain issues, precipitation, floods and hydrology, tropical cyclones, visibility, marine/coastal issues, temperature and air quality, and space weather. These concerns were later synthesized and shared with the UFS model developers and evaluators. The follow-up workshops held on January 29, 2021 and February 11, 2021 further addressed these concerns as relevant to the evolution of the UFS Medium Range Weather (MRW) and Short Range Weather (SRW) Applications, respectively. These two follow-up workshops, led by the UFS Application leads, along with operational model developers and evaluators at the NCEP EMC, focused on connecting the forecasters’ concerns with model-specific issues and known biases.

The key outcome of the workshops is a finalized list of forecasters' requests for the UFS MRW and SRW Applications that consists of a total of 23 issues classified under 7 major topics: 1) Surface temperature and moisture, 2) Precipitation, 3) Convection, 4) Winds, 5) Tropical cyclones, 6) Marine waves, winds, and sea ice, and 7) Space weather.

Overall, the workshops elevated the UFS community to a new level of synergetic coordination that strengthened the link between model development and forecasting challenges. Discussions between the forecasters and the modelers proved fruitful in that many forecast issues that appeared as independent, disconnected items at the beginning of the meeting could later be tied to a few underlying model issues. Some of the development priorities that emerged are the Boundary Layer (BL) over the land and ocean, air-sea coupling, land-surface processes, land initialization, microphysics and sea ice, and marine winds and waves. The Model Evaluation Group (MEG) at the NCEP EMC played a pivotal role in bringing some of the key underlying issues to the forefront.

While the above list is a starting point, the UFS community and NOAA Program Offices are committed to continuing the conversation and facilitating opportunities for ongoing engagement between the modeling and the forecaster communities.

The contributing author team includes
Hendrik Tolman, NOAA / OSTI,
Deepthi Achuthavarier, NOAA / OSTI,
Geoffrey Manikin NOAA / EMC,
Jason Levit, NOAA/ EMC,
Linden Wolf, NOAA / OSTI, and
Farida Adimi, NOAA / OSTI.

An EPIC Journey Toward Open Innovation and Development to Advance Operational Numerical Weather Prediction Systems

Autumn 2021

On April 26, NOAA announced that Raytheon Intelligence & Space (RI&S) would be the development partner that would unite the community in developing the most user-friendly and user-accessible Earth modeling system in the world. This marks the end of a two-year planning process to establish the Earth Prediction Innovation Center (EPIC), and an exciting starting point for community members from academia, industry and government to work together to enable the most accurate and reliable operational numerical forecast model in the world.

The vision of EPIC can be traced back to 16 years ago when our beloved colleague and leader, Dr. William Lapenta, was the acting director of the Environmental Modeling Center, as part of his effort to reshape NOAA’s culture and stand up a path for the United States to reclaim and maintain international leadership in the development of operational numerical weather prediction systems. This vision has been embraced warmly by the weather enterprise, as well as NOAA leadership and U.S. lawmakers. In 2018, the National Integrated Drought Information System Reauthorization Act instructed NOAA to establish EPIC to accelerate community-developed scientific and technological enhancements into the operational applications for numerical weather prediction (NWP). NOAA’s Weather Program Office has started a journey to formally implement EPIC as a program since then. An overview of EPIC’s seven investment areas is shown in Figure 1. 

Figure 1. EPIC’s Seven Investment Areas. Light blue areas stand for functions of the program management team at NOAA WPO. Dark blue areas represent services provided in the contract led by RI&S.

The NOAA-RI&S partnership will take us to a next level toward seamless integration of the numerical modeling community across boundaries, to fulfill EPIC’s mission as catalyst for community research and modeling system advances that continually inform and accelerate advances in our nation’s operational forecast modeling systems, as illustrated in Figure 2. EPIC is working on setting up a public-facing virtual community model development platform populated with the Unified Forecast System codes, supporting datasets and test cases, providing community support in forms of online tutorials, community workshops, and a dedicated service desk, and implementing Continuous Integration/Continuous Delivery (CI/CD) pipelines on cloud and on-premises HPCs to enable Agile development, security, and operations (DevSecOps) processes that can greatly accelerate the infusion and testing of new ideas and innovations for model improvements and enhancements. 

DTC’s experience and expertise in code management and user and developer support for the UFS weather model code base and applications are indispensable assets to the community. EPIC looks forward to working closely with DTC to offload these responsibilities, so that DTC can focus its resources and expertise on testing and evaluation to accelerate hierarchical testing of model physics and final transition to operations. More importantly, EPIC and DTC will join forces to allow a much faster rate of innovation from research to operations. An EPIC journey will continue to advance operational numerical weather prediction systems through open innovation and development for years to come.

Figure 2. EPIC as catalyst for community research and modeling system advances.

Accelerating Progress of the Unified Forecast System through Community Infrastructure

Summer 2021

The goals of the Unified Forecast System (UFS) are ambitious: construct a unified modeling system capable of replacing dozens of independently developed and maintained operational prediction systems, while simultaneously paving the way for researchers in NOAA labs and the broader NWP community to access and use that system, improve it, and contribute their own innovations. 

Toward these goals, the UFS Research-to-Operations (UFS-R2O) project began contributing to the UFS just over a year ago, aiming to deliver several systems ready to move into operational mode under a unified framework, including a global coupled ensemble-based system for medium-range and sub-seasonal to seasonal prediction (Global (Ensemble) Forecast System - G(E)FS); an hourly-updating, ensemble-based, high-resolution short-range prediction system (Rapid Refresh Forecast System - RRFS); and a hurricane prediction system (Hurricane Analysis and Forecast System - HAFS).

Developing multiple modeling applications under a single, coordinated project is a new way of doing business for NOAA & NWS, and one that is starting to pay off. A key driver of success for the first year of the UFS-R2O project is a commitment to leveraging community infrastructure packages that form the backbone of the entire system. Infrastructure packages span the applications listed above and provide core functions: model coupling, interface between atmospheric physics and dynamics, data assimilation, hierarchical model development and testing, post-processing, and model verification. The infrastructure packages are backed by teams with substantial expertise, each focused on providing robust solutions to some of the most complex challenges in building Earth system models. Examples of community-developed infrastructure used in UFS include the Model Evaluation Tools verification system (METplus), the Earth System Modeling Framework (ESMF), the Community Mediator for Earth Prediction Systems (CMEPS), the Community Data Models for Earth Prediction Systems (CDEPS), the Common Community Physics Package (CCPP), and the Joint Effort for Data assimilation Integration (JEDI).

Use of community infrastructure software has accelerated progress within the UFS. One way this has happened is through greater sharing and reuse of code. Another way is through specialized, focused teams solving complex problems in a generalized way such that multiple applications reap the benefits. For example, the UFS has teamed with ESMF to provide a unified coupling framework capable of supporting a range of different UFS application requirements, from single-component configurations to fully coupled. ESMF has been leveraged in a collaborative effort between NCAR and NOAA to develop the CMEPS (a coupler to handle information exchange across different earth system models) as a shared coupler and the new CDEPS (a data model functionality). Both CMEPS and CDEPS are used in NCAR’s Community Earth System Model, and in the UFS Medium Range Weather and Hurricane applications. This approach represents a substantial consolidation of effort. Code optimizations, as well as problems resolved within one system, have been immediately leveraged by another. The use of CMEPS and CDEPS in multiple contexts and by a large user base has increased their flexibility (needed for experimentation by the research community) and robustness (required for reliability when run in operational environments).

Use of the CCPP has accelerated the progress of coupling atmospheric physics and dynamics. The CCPP is a collection of atmospheric physical parameterizations (CCPP Physics) and a framework that couples the physics for use in Earth system models. The CCPP Physics is designed to contain operational and developmental parameterizations for weather through seasonal prediction timescales. Today the CCPP is used in the UFS and in the Navy’s next-generation model NEPTUNE (Navy Environmental Prediction sysTem Utilizing the NUMA corE). Additionally, the CCPP Framework is being extended for use with the NCAR Model for Prediction Across Scales (MPAS) and Community Atmosphere Model (CAM). Because it enables host models to assemble parameterizations in flexible suites, and is distributed with a single-column model that permits tests in which physics and dynamics are decoupled, the CCPP facilitates hierarchical system development and is appropriate for both research and operations.

The infrastructure includes not only the physics, model, and coupling framework, but also the pre-processing, post-processing, and verification and diagnostics tools. METplus is the UFS verification and validation tool that draws contributions from within the community, including academia, laboratories, and operational centers. It has a full suite of traditional statistics and is now being expanded to support diagnostics used in model development at all spatial and temporal scales. These enhancements are being driven by the findings of the 2021 DTC UFS Metrics Workshop, held in February 2021. METplus is also being integrated into the UFS application workflows to be run after the Unified Post Processor (UPP) interpolates the model output onto standard grids. Finally, it has also been extended to leverage the output of JEDI for evaluations using the observation dataset ingested by JEDI.
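As an illustration of what "traditional statistics" means here, the Python sketch below computes two of the most common ones, mean error (bias) and root-mean-square error (RMSE), for matched forecast-observation pairs. It is a plain-Python sketch for readers, not METplus code and not the METplus API; the function names and the sample values are hypothetical.

```python
import math


def mean_error(forecasts, observations):
    """Bias: the average of (forecast - observation) over matched pairs."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def rmse(forecasts, observations):
    """Root-mean-square error over matched forecast-observation pairs."""
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical matched pairs, e.g. 2-m temperature in degrees Celsius.
forecasts = [2.0, 4.0, 6.0]
observations = [1.0, 4.0, 8.0]

print(mean_error(forecasts, observations))  # negative means the forecast runs low on average
print(rmse(forecasts, observations))        # typical error magnitude, same units as the input
```

Bias shows whether a model runs systematically high or low, while RMSE summarizes the overall size of the errors; the two are usually read together because errors of opposite sign can cancel in the bias.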

Overall, the interdependency of the infrastructure packages is beginning to show a maturity and robustness that will lead to efficient model improvement for the UFS over the next few years. The second year of the UFS-R2O project is staged to continue to accelerate the development of an efficient, flexible, and well-supported cross-cutting infrastructure.

UFS Metrics Workshop Refines Key Metrics

Spring 2021

The Developmental Testbed Center (DTC), in collaboration with the National Oceanic and Atmospheric Administration (NOAA) and the Unified Forecast System's Verification and Validation Cross-Cutting Team (UFS-V&V), hosted a three-day workshop to identify key verification and validation metrics for UFS applications. The workshop was held remotely 22-24 February 2021. Registration for the event totaled 315 participants from across the research and operational community. 

The goal of this workshop was to identify and prioritize key metrics to use during the evaluation of UFS research products, guiding their transition from research to operations (R2O). Because all UFS evaluation decisions affect a diverse set of users, workshop organizers invited members of the government, academic, and private sectors to participate. This outreach resulted in the participation of scientists not only from NOAA, but also from the National Center for Atmospheric Research (NCAR), National Aeronautics and Space Administration (NASA), US National Ice Center (USNIC), seventeen universities, seven commercial entities, and seven international forecast offices and universities. Ten NOAA research labs were represented, as well as all of the National Weather Service’s National Centers for Environmental Prediction (NCEP), five Regional Headquarters, and ten Weather Forecast Offices (WFOs). Rounding out the government organizations were Department of Defense (DOD) and Department of Energy (DOE) entities, along with several state government Departments of Environmental Protection.

In preparation for the workshop, a series of three pre-workshop surveys was distributed to interested parties between October 2020 and February 2021. The surveys included questions pertaining to fields and levels, temporal and spatial metadata, sources of truth (i.e. observations, analyses, reference models, climatologies), and preferred statistics. The results were then used to prepare the list of candidate metrics, including their metadata, for curation by the workshop breakout groups.

Keynote speakers, including Drs. Ricky Rood, Dorothy Koch, and Hendrik Tolman, along with the Workshop Co-Chairs, kicked off the workshop. In the R2O process, metrics are used to advance innovations toward progressively higher readiness levels through a series of stages and gates as the tools are assessed and vetted for operations. Presentations included 1) a discussion of R2O stages and gates; 2) the results of the pre-workshop surveys; and 3) how the workshop would proceed. Online instantaneous surveys were used throughout the workshop to gather quantitative input from the participants. During the first two days, the breakout groups refined the results of the pre-workshop surveys. At the end of the second day, the participants were invited to fill out 13 online surveys (listed below) to prioritize the metrics for the full R2O stages and gates. On the last day, breakout groups discussed numerous ways to assign metrics to the R2O gates.

Three prominent themes emerged from the workshop.  The first suggested that metrics used in the near term need to be tied to observation availability and should evolve as new observations become available.  A second emphasized that metrics should be relevant to the user and easy to interpret.  Lastly, the results of the ranking polls tended to place the sensible weather and upper-air fields as top priorities, leaving fields from the components of a fully coupled system (i.e. marine, cryosphere, land) nearer the bottom. Given this outcome, a tiger team of experts will be convened to help the UFS V&V team complete the consolidation and synthesis of the results to ensure component fields are also included across all gates.

These summary activities are wrapping up. The UFS V&V group is working with the chairs of other UFS working groups and application teams to finalize the metrics. The organizers intend to schedule a wrap-up webinar in mid-June to update the community on the final metrics.  For more information and updates on the synthesis work, please visit the DTC UFS Evaluation Metrics Workshop website. Additionally, the Verification Post-Processing and Product Generation Branch at EMC has begun developing evaluation plans for the final R2O gate, transition to operations. Look for updates at the EMC Users Verification website.

The Workshop Organizing Committee included Tara Jensen (NCAR and DTC), Jason Levit (NOAA/EMC), Geoff Manikin (NOAA/EMC), Jason Otkin (UWisc CIMSS), Mike Baldwin (Purdue University), Dave Turner (NOAA/GSL), Deepthi Achuthavarier (NOAA/OSTI), Jack Settelmaier (NOAA/SRHQ), Burkely Gallo (NOAA/SPC), Linden Wolf (NOAA/OSTI), Sarah Lu (SUNY-Albany), Cristiana Stan (GMU), Yan Xue (OSTI), and Matt Janiga (NRL).

2020 HFIP Annual Meeting Highlights

Winter 2021

The three-day HFIP Annual Meeting 2020 was held virtually on November 17-19, 2020. Approximately 130 participants from NOAA line offices, DTC, NCAR, and university partners participated in the meeting. NOAA/NWS/OSTI Modeling Program Division Director Dr. Dorothy Koch kicked off the meeting with welcoming remarks. Day one focused on HFIP programmatic updates, a discussion of forecasters' needs and current activities supporting operations and a summary of the results from the forecasters' HFIP display survey.  Day two consisted of a review of the current state of operational modeling capabilities and results from the 2020 hurricane season real-time experiments, as well as a special panel discussion on the operational modeling challenges faced by both forecasters and the developers that was helmed by panelists from the UK Met Office and ECMWF.  The final day opened with updates on the HFIP-funded external research, followed by discussions on the development of the next-generation Hurricane Analysis and Forecast System (HAFS). As is evident from the Agenda, each day was tightly packed with engaging presentations that represented the diverse interests of the field. NOAA/AOML Hurricane Research Division Director Dr. Frank Marks concluded the meeting with a summary and recommendations.

The primary objective of this meeting was to assess the progress made and challenges identified in achieving the HFIP goals, as documented in the 2019 HFIP Strategic plan. This plan was developed under the Weather Research and Forecasting Innovation Act, Section 4. To meet the plan’s objectives, six key strategies were developed: 1) advance the operational HAFS; 2) improve probabilistic guidance; 3) enhance communication of risk and uncertainty; 4) support dedicated high-performance computing allocation; 5) R2O enhancement; and 6) broaden expertise and expand interaction with the external community.

HAFS global nest presented at HFIP Annual Meeting 2020 courtesy of Andrew Hazelton.


Throughout the meeting, each key strategy's many successes were enumerated. The 2020 hurricane season marked the second year of successful real-time HAFS testing with four different HAFS model configurations. Improved graphical products, the tropical roadmap, eight to nine ongoing social and behavioral science projects, and AWIPS/ATCF improvements contributed to the success of the HFIP strategies. As for computing, 60M h/month is dedicated to development, testing, and evaluation of HAFS. The HFIP continues to support its external university partners by funding three research proposals in 2020.

To address the implementation of HAFS, which is planned for 2022, the meeting convened a discussion of the design and testing of potential 2021 HAFS configurations that could run within the operational HPC resources. The HAFS session attendees recommended maintaining real-time test capability on Research and Development High-Performance Computing (RDHPC) systems (Jet, Hera, and Orion) to test the potential configuration. Other recommendations offered during the meeting included performing an evaluation of 2020 operational problem cases, evaluation of observational impact, evaluation of and improvements to the wind radii metrics, and data display capacity on AWIPS II.

2020 HFIP Annual Meeting participants

The Expanding METplus Community

Autumn 2020

Verification and validation activities are critical to the success of modeling and prediction efforts ongoing at organizations around the world.  Having reproducible results via a consistent framework is equally important for model developers and users alike.  The Model Evaluation Tools (MET) was developed over a decade ago by the Developmental Testbed Center (DTC) and expanded to the METplus framework with a view towards providing a consistent platform delivering reproducible results.

The METplus system is an umbrella verification, validation, and diagnostic tool at the core of the DTC testing and evaluation capability (Brown et al. 2020). It is also supported by a community of thousands of users from both U.S. and international organizations. These tools are designed to be highly flexible to allow for quick adaptation to meet evaluation and diagnostic needs. A suite of Python wrappers has been implemented to facilitate fast set-up and implementation of the system, and to enhance the pre-existing plotting capabilities.

Over the past few years, METplus has been driven by the needs of the Unified Forecast System (UFS) community, the U.S. Air Force, and the National Center for Atmospheric Research (NCAR) laboratories and collaborators. Many organizations across the community have joined with the DTC core partners in contributing to METplus development, fostering a more robust and dynamic framework for the entire Earth-system modeling community to use. During the past year, several leading organizations in verification and statistics research have joined the METplus contributors’ community.

The Naval Research Lab (NRL) started transitioning their capability to METplus during 2019 and began directly collaborating with METplus developers in late July 2020. They intend to contribute methods for the data assimilation and ensemble communities. Dr. Elizabeth Satterfield, a research scientist and the METplus transition coordinator at NRL, stated “By leveraging community-based tools, we can build a unified verification framework which is consistent across the suite of Navy atmospheric models and also be consistent with our operational partners. Such a framework allows us to make use of more modern verification metrics, including feature or process-based metrics, that can assist with identification and diagnosis of specific sources of model error. In addition, this framework will allow NRL to better tailor our verification products to inform data assimilation and model development, as well as the needs of the end user.  Finally, a consistent model verification framework aids in collaboration with other U.S. partners (e.g. NCEP, JEDI) who are employing the same community tools.”

Similarly, after months of discussion and project development activities, the Met Office in the United Kingdom has also begun transitioning their verification and diagnostics capability to MET. The project is called NG-Ver, for Next Generation Verification. Adapting MET to work with unstructured grids is a key requirement and a good reason for collaborating. Beyond the operational implementation, the collaboration will eventually focus on integrating innovative methods for verification and diagnostics while generalizing METplus support of file formats. Dr. Marion Mittermaier, the manager of the model diagnostics and novel verification methods section at the Met Office, stated in a brief to her staff, “METplus offers a wide variety of highly configurable open source verification tools, which the Met Office can contribute to in the fullness of time. Given the open source nature of MET/METplus, it also enables the wider UM partnership to contribute more readily to getting common verification tools established across the partnership for the evaluation of model releases, especially regional configurations.” She also added, “I am particularly excited to see the collaboration formalized at last and look forward to working with DTC scientists and developers. We have many common areas of interest and having a joint framework for leveraging new tools will accelerate the availability of these to the user community.”

Example of a feature-based diagnostic evaluation of CESM simulated precipitation using METplus.

Finally, the DTC has been coordinating with NCAR’s unified Earth-system modeling initiative, called System for Integrated Modeling of the Atmosphere (SIMA), for integrating METplus into the SIMA framework as a verification, validation, and diagnostics tool. The National Science Foundation (NSF) community is the target for the framework, and the activities directly support the NCAR/NOAA Memorandum of Agreement, signed in 2019, to work collaboratively on developing a common framework for Earth-system modeling. METplus was recently demonstrated to provide verification and diagnostics for weather-scale prediction (~1 deg) from the SIMA Community Atmosphere Model (CAM). Andrew Gettelman, the SIMA climate lead, said "METplus can provide new ways for us to look at cross scale models and do weather verification even on climate scale model output to help us improve processes important for extreme weather events."

Ultimately, these additions to the METplus community have substantially boosted the DTC’s ability to provide testing and evaluation capability to make Research-to-Operations more efficient and provide evidence-based decisions. The DTC looks forward to continuing to expand METplus capability through these collaborations.

Brown et al., 2020: The Model Evaluation Tools (MET): More than a decade of community-supported forecast verification. Bull. Amer. Meteor. Soc. Early Online Release. DOI:


UFS Medium-Range Weather Application Launched

Team Depth and Broad Engagement = Success

Summer 2020

The community aspect of NOAA’s Unified Forecast System (UFS) is off to a strong start with the release of the UFS Medium-Range Weather Application v1.0 on 11 March 2020.  The planning and preparations for this release were truly a community effort that convened a multi-institutional team of scientists and software engineers from NOAA’s Environmental Modeling Center (EMC), NOAA research laboratories (Global Systems Laboratory [GSL], National Severe Storms Laboratory [NSSL], Physical Sciences Laboratory [PSL], and Geophysical Fluid Dynamics Laboratory [GFDL]), Cooperative Institutes (Cooperative Institute for Research in Environmental Sciences [CIRES] and Cooperative Institute for Research in the Atmosphere [CIRA]), the Developmental Testbed Center (DTC), the National Center for Atmospheric Research (NCAR) and George Mason University (GMU).  This multi-institutional Release Team was assembled in September 2019, and charged with developing a streamlined project plan for the first public release of UFS, and then overseeing and executing this project plan. The aim was to build a well documented UFS modeling system that the community can download, set up, and run in multiple computing environments.  

The UFS is undergoing rapid development across multiple fronts to achieve its vision of meeting the needs for applications spanning from local to global domains and predictive time scales from sub-hourly analyses to seasonal predictions; therefore, an important first step was to define the scope of this initial release.  The team quickly converged on a plan that focused on global configurations for four different resolutions and two supported physics suites: the operational GFSv15 suite and an experimental physics suite under development for GFSv16.  To provide the community with some flexibility on which forecast cycles the model could be run, the team decided it would be important to include the capability to initialize the model using more widely available GRIB2 output.  The team also prioritized the portability of this complex software system, testing on multiple platforms, providing a robust, user-friendly workflow, assembling documentation, and establishing a support mechanism.

To address the portability priority, a small team overhauled the build system for the NCEP libraries, an area that has been an ongoing issue since the DTC first started working with EMC to make their operational codes more accessible to the community.  The NCEP libraries are the underpinnings of everything from the model code to the pre- and post-processing software.   Improvements to the build system also included the model itself.  While there’s always room for further improvement, the outcome is a package that is straightforward to build on different computing platforms, as well as using different compilers.  These platforms include NOAA’s research HPC, NCAR’s Cheyenne, and TACC’s Stampede2, as well as generic MacOS and Linux systems.  The team went one step further by establishing pre-configured platforms, which are platforms where all the required libraries for building the UFS community release are available in a central place.

To meet the needs of a robust, user-friendly workflow, the Release Team selected the Common Infrastructure for Modeling the Earth (CIME), a Python-based scripting infrastructure developed through a multi-agency collaboration (NSF, DOE, NOAA). CIME, which now supports four distinct Earth modeling systems (CESM, E3SM, NORESM, and UFS), provides the user with the ability to create experiments by invoking only four commands.

To establish a solid foundation from which to build upon, and lighten the load for future releases, the team developed version-controlled documentation that is stored with the code and can be continuously updated.  For ease of navigation, the documentation can be displayed electronically and is easily searchable.  Building this framework and assembling all the pieces was a significant undertaking that relied on contributions from a number of subject-matter experts.

With an eye towards engaging the broader community in all aspects of the UFS, the Release Team selected community forums as the best approach for providing user support.  These forums are publicly viewable and users can post new topics and responses to existing topics by becoming registered users.  Bugs and deficiencies in the documentation brought to light by postings to the UFS forums are catalogued and will be addressed in future releases.

The feedback from the community about the UFS MRW Application v1.0 release has been very positive.  Bugs in the code and deficiencies in the documentation brought to light by postings to the UFS forums are being catalogued, as well as feedback collected through the Graduate Student Test set up by the UFS Communications and Outreach Working Group, and work is underway to address these issues through the release of UFS MRW Application v1.1 in the coming months.  In addition, planning and preparations are underway for the release of the Short-Range Weather Application v1.0 later this year, which will provide the community with the capability to run a Stand-Alone Regional configuration of the UFS-Atmosphere model.

More information

AMS Webinar – UFS MRW Application 1.0:

UFS MRW Application Users Guide:

In Memoriam: Bill Lapenta

Spring 2020

The DTC community mourns the passing of William “Bill” Lapenta, Ph.D. Bill was the Acting Director of NOAA’s Office of Weather and Air Quality (OWAQ), the office within NOAA’s Oceanic and Atmospheric Research that supports world-class weather and air quality research. He was also the guiding force and energy behind the Earth Prediction Innovation Center (EPIC), whose goal is to launch the U.S. forward as the world leader in numerical weather prediction through public-academic-private partnerships. He was committed to conquering the Research to Operations divide. Bill’s connections to the DTC date back to its early days. While the Director of EMC, Bill served as a DTC Management Board member and eventually transitioned to serving as the lead for the DTC Executive Committee when he became the NCEP Director.

Bill had already prepared his presentation on EPIC for the American Meteorological Society Annual Meeting in January in Boston. DaNa Carlis presented on his behalf, followed by remarks from Acting NOAA Administrator Neil Jacobs. In his presentation, Bill illustrated how public awareness of modeling was raised when the European model predicted Hurricane Sandy would make a hard left turn into the NE U.S. He shared EPIC’s goal to advance numerical guidance skill, reclaim and maintain international leadership in NWP and improve the research to operations transition process. 

Bill then outlined how EPIC would fulfill this goal - by leveraging the weather enterprise and existing resources within NOAA, enabling scientists and engineers to effectively collaborate, strengthening NOAA’s ability to undertake research projects, and creating a community global weather research modeling system. 

Bill knew it was important to establish strong partnerships with academia, the private sector, and other federal agencies that share common goals and values, and that open communication would connect leadership, programs, and scientists across organizational boundaries to deliver the best forecasts possible to America. Bill’s energy and leadership to bridge organizations inspires us to carry on with his EPIC vision.

Bill Lapenta

An Overview of the Earth Prediction Innovation Center (EPIC)

Autumn 2019

The Earth Prediction Innovation Center, or “EPIC,” will advance Earth system modeling skills, reclaim and maintain international leadership in Earth system prediction, and improve the transition of research to operations (R2O) and operations to research (O2R) within NOAA by working closely with partners across the weather enterprise. 

EPIC’s legislative language is included as an amendment to the Weather Research and Forecasting Innovation Act (WRFIA) of 2017 (Public Law 115-25) in the National Integrated Drought Information System Reauthorization (NIDISRA) of 2018 (Public Law 115-423). The law states that EPIC will “accelerate community-developed scientific and technological enhancements into the operational applications for numerical weather prediction (NWP).” To achieve this goal, EPIC will 

  • leverage available NOAA resources and the weather enterprise to improve NWP; 
  • enable scientists and engineers to effectively collaborate; 
  • strengthen NOAA’s ability to perform research that advances weather forecasting skills; and
  • develop a community model that is accessible by the public, is computationally flexible, uses innovative computing strategies and methods for hosting or managing all or part of the system, and is located outside of secure NOAA systems.

EPIC builds on the Next-Generation Global Prediction System (NGGPS), which supports the design, development, and implementation of a global prediction system. The NGGPS will address growing service demands and increase the accuracy of weather forecasts out to 30 days. The goal of NGGPS is to expand and accelerate critical weather forecasting R2O by accelerating the development and implementation of current global weather prediction models, improving data assimilation techniques, and improving software architecture and system engineering.

A critical component of EPIC is its support for a community-developed, coupled Earth modeling system known as the Unified Forecast System (UFS). EPIC will serve as the interface with the community (both internal and external), aid in the advancement of scientific innovations to the UFS, and facilitate improvements in the R2O process by providing access to NOAA’s operational modeling code for co-development outside of the NOAA firewall. EPIC will enhance the research and development process by providing access to the UFS using a cloud-based infrastructure for development. EPIC will allow community members to conduct research and development through multiple architectures, whether they are cloud-based environments or traditional high-performance computing environments. 

EPIC is managed in the Office of Weather and Air Quality (OWAQ) within NOAA’s Oceanic and Atmospheric Research (OAR) Line Office. An EPIC Vision Paper was released that outlines seven core investment areas, including software engineering, software infrastructure, user support services, cloud-based high-performance computing, scientific innovation, management and planning, and external engagement. NOAA also signed a Memorandum of Agreement (MoA) with the National Center for Atmospheric Research (NCAR) to support infrastructure development for a UFS community model. 

The EPIC Community Workshop, hosted by OWAQ and held 6-8 August 2019, was attended by over 180 members of the community. The workshop provided an opportunity for members of the weather enterprise to participate in EPIC’s strategic direction, especially sharing ideas about potential business models, governance structures, priority areas of funding, and how to initiate EPIC. Community members recommended that EPIC be located external to NOAA and exist in a physical location. Community members agreed that the highest priority funding areas are user support services, computing resources, and software engineering. Community members also developed EPIC mission and vision statements, which are below: 

       Community-developed Mission: Advance Earth system modeling skill, reclaim and maintain international leadership in Earth system prediction and its science, and improve the transition of research into operations.

       Community-developed Vision: Create the world’s best community modeling system, of which a subset of components will create the world’s best operational forecast model.

As EPIC progresses, the program is dedicated to fostering a collaborative community environment; providing transparent and frequent program updates; and being responsive to the needs of the community. 


For Further Reading: 
Legislative Language
The Unified Forecast System
Next-Generation Global Prediction System (NGGPS)
Earth Prediction Innovation Center (EPIC)
* View the NOAA-NCAR MoA, EPIC Vision Paper, and EPIC Community Workshop Strategy, Summary and Recommendations PowerPoint on the EPIC Webpage. Check back frequently for program updates, additional materials, and ways to get involved.
EPIC Community Workshop Article

For Questions Please Contact:

DaNa Carlis, PhD, PMP–OWAQ Program Manager for EPIC and NGGPS,
Krishna Kumar, PhD–OWAQ Program Coordinator for EPIC,
Leah Dubots–OWAQ Pathways Intern supporting EPIC,

Earth Prediction Innovation Center (EPIC) Workshop, Boulder, Colorado, Aug 6-8, 2019

NOAA and NCAR partner on new modeling framework


Spring 2019

NCAR and NOAA are each adopting a unified approach to coupled environmental modeling, where success for both efforts is critically dependent on community contributions.  At the end of January 2019, NCAR and NOAA signed a Memorandum of Agreement (MOA) to develop a shared infrastructure that encourages the broader community to engage in improving the Nation’s weather and climate modeling capabilities. Collaborating on the development of a common infrastructure will reduce duplication of effort and create common community code repositories through which future research advances can more easily benefit the operational community.  NOAA will also be able to leverage NCAR experience to provide community access and support for NOAA’s operational models and tools. The MOA focuses on seven key elements of common infrastructure for the NOAA Unified Forecast System (UFS) and the NCAR Unified Community Model (UCM).

Coupling between Components - NCAR and NOAA have already developed an initial design for a new framework for coupling component models. This common mediator framework will ultimately accommodate evolving model coupling strategies, facilitating community research contributions and accelerating the transition of research into operations.

Coupling within a Component - A flexible framework for a common interface that allows interoperability / integration of physics packages within component models offers many near term and longer term benefits.  NCAR and NOAA are initially focusing on implementing such an interface for the community atmospheric models. The Common Community Physics Package (CCPP) developed by the DTC’s Global Model Test Bed (GMTB) through NGGPS funding, which is being implemented in NOAA’s atmospheric models, has paved the way for a collaborative approach with NCAR’s Community Physics Framework (CPF).

“This new framework streamlines the entire process and gives both researchers and forecasters the same tools across the weather enterprise to accelerate the development of forecast models,” said NOAA Assistant Secretary of Commerce for Environmental Observation and Prediction, Neil Jacobs, Ph.D.

Workflow - Workflow in this context refers to all the infrastructure, code and datasets needed to configure, build and run an end-to-end forecast system utilizing a coupled model for a specific application. NCAR’s well-documented and user-friendly workflow infrastructure known as CIME (Common Infrastructure for Modeling the Earth) and its Case Control System (CCS) make coupled Earth System Modeling easily accessible in the face of increasing complexity. NCAR and NOAA have already begun exploring a common workflow infrastructure using a CIME/CCS-based approach to provide a portable workflow for the larger community.

Quality Assurance Testing - Testing is critical to ensuring software quality and that code performs as expected.  Given CIME already contains elements of quality assurance testing, NCAR and NOAA are planning to adapt CIME and the CCS to provide a common testing framework for both research and operations.

Forecast Verification - Using the same toolkits in research and operations, as well as weather and climate, reduces duplication of work and accelerates the transition of innovations from research to operations. Supporting sustained improvement of coupled models will require tools that provide approaches more relevant to research, as well as considering output relevant to coupled processes.  The Model Evaluation Tools (MET), a community-supported software package, is a comprehensive set of tools for diagnostic evaluation of atmospheric models that can be expanded to address environmental component models beyond weather such as ocean, waves, and sea-ice. This expansion will take advantage of existing evaluation packages for other component models.

“By combining NCAR's community modeling expertise with NOAA's excellence in real-time operational forecasting, this agreement will accelerate our ability to predict weather and climate in ways that are vital for protecting life and property,” said Antonio Busalacchi, president of the University Corporation for Atmospheric Research, which manages NCAR on behalf of the National Science Foundation. “This will enable the nation to produce world-class models that are second to none, generating substantial benefits for the American taxpayer.”

Software Repository Management - Effective community software development requires open access repositories.  All infrastructure and supporting code developed under this MOA will reside in open-access GitHub repositories. The management of these repositories will enable collaborative development across the wider research community and include governance, quality assurance, and workflow tools.

User and Developer Support - A robust infrastructure for providing user and developer support is key to engaging the broader community in advancing the capabilities of both NCAR’s UCM and NOAA’s UFS. This infrastructure will leverage existing practices and protocols developed for the NCAR’s community models (CESM, WRF and MPAS), as well as support efforts provided by the DTC for NOAA’s community codes (GSI/EnKF, HWRF, UPP and CCPP) and MET, to provide active and passive user support for the UFS and UCM.

The result of this MOA will be a state-of-the-art, well-documented and easy-to-use modeling system. Coordinating existing and ongoing investments and governance between NOAA and NCAR ensures alignment with unified coupled community modeling, sets joint priorities and leverages resources.

Weather Prediction Center Meteorologist Andrew Orrison uses weather model data

“Building a Weather-Ready Nation by Transitioning Academic Research to NOAA Operations” Workshop

Spring 2018

The “Building a Weather-Ready Nation by Transitioning Academic Research to NOAA Operations” Workshop was held at the NOAA Center for Weather and Climate Prediction in College Park, Maryland, on November 1-2, 2017. NOAA and UCAR organized the meeting that drew more than one hundred participants from universities, government laboratories, operational centers, and the private sector. Members of the organizing committee included Reza Khanbilvardi, City College of New York, Chandra Kondragunta, NOAA/OAR, Jennifer Mahoney, NOAA/ESRL, Fred Toepfer, NOAA/NWS, and Hendrik Tolman, NOAA/NWS. John Cortinas, NOAA/OAR, and Bill Kuo, UCAR, served as Co-Chairs. A draft of the Workshop Report is available here.

The workshop was designed to inform the academic community about NOAA’s transition policies and processes and to encourage the academic community to actively participate in transitioning research to improve NOAA’s weather operations. It was also an opportunity to strengthen engagement between the research and operational communities.

The first day of the workshop consisted of a series of invited presentations on the policies, needs, requirements, gaps, successes, and challenges of NOAA transitions. During the second day, working groups discussed issues and made recommendations to improve the process of transitioning Research to Operations (R2O). In particular, the discussions led to several interesting suggestions on the participation of the academic community in NOAA R2O activities:

  1. NOAA needs to recognize the academic community has a different rewards system from that of an operational organization. Scientific publication is critical for the career advancement of university professors and students. Therefore, the academic community will be much more interested in research that can lead to publication.

  2. The availability of computing resources is critical for successful R2O in weather and climate modeling. Given the limited NOAA computing resources and the challenges of obtaining security clearance to use NOAA computing facilities, an alternative solution is needed. Making NOAA operational models, data, and computing resources available through the cloud is an attractive solution.

  3. The academic community cannot work for free. Therefore, appropriate funding to support their participation in R2O is critical. Good examples include the Hurricane Forecast Improvement Project (HFIP), Next Generation Global Prediction System (NGGPS), and Joint Technology Transfer Initiative (JTTI) announcements of opportunities.

Through NOAA support, several students were invited to participate in the workshop. One Ph.D. student from the University of Maryland shared that her eyes were opened to real-life challenges and issues that are confronting our field, something she was not able to learn from her classes. Interacting with these enthusiastic next-generation scientists, who are not afraid to tackle the challenging problems in our field, was the most rewarding part of the workshop.

Many participants commented that the workshop was informative, stimulating and productive, as it allowed the academic community to have a direct dialogue with NOAA on research to operations transition issues. The participants agreed that it would be desirable to have such a workshop once every two years.


Evaluation of New Cloud Verification Methods

Winter 2017
“These metrics can provide very succinct information about many aspects of forecast performance without having to resort to complicated, computationally expensive techniques.”

The DTC has been tasked by the US Air Force to investigate new approaches to evaluate cloud forecast predictions. Accurate cloud forecasts are critical to the Air Force national intelligence mission because clouds can mask key targets, obscure sensors, and are a hazard to Remotely Piloted Aircraft. This work will help forecast users and developers understand the characteristics of these predictions and suggest ways to make the predictions more accurate.

Clouds have significant impacts on many kinds of decisions. Among other applications, accurate cloud forecasts are critical to the national intelligence mission. Clouds can mask key targets, obscure sensors, and are a hazard to Remotely Piloted Aircraft. The locations of clouds, as well as other cloud characteristics (e.g., bases, tops), are difficult to predict because clouds are three-dimensional and they form and dissipate quickly at multiple levels in the atmosphere. In addition, cloud predictions integrate across multiple components of numerical weather prediction systems. Evaluation of cloud predictions is not straightforward for many of the same reasons.

The DTC has been tasked by the US Air Force to investigate new approaches to evaluate cloud forecast predictions that will help forecast users and developers understand their characteristics and suggest ways to make the predictions more accurate.

The DTC effort, in collaboration with staff at the Air Force’s 557th Weather Wing, focuses on testing a variety of verification approaches. Traditional verification methods for continuous and categorical forecasts (e.g., Mean Error, Mean Absolute Error, Gilbert Skill Score, Probability of Detection) provide a baseline evaluation of quality. Spatial methods (e.g., the Method for Object-based Diagnostic Evaluation [MODE]) and field deformation approaches provide greater diagnostic information about cloud prediction capabilities. New distance metrics characterize the distances between forecast and observed cloud features. This evaluation will help identify new tools, including a cloud-centric NWP index, to consider for implementation and operational application in the Model Evaluation Tools (MET) verification software suite.

For the evaluation, the team is focusing initially on forecast and observed total cloud amount (TCA) fractional cloud coverage datasets for six cloud products:

  • WorldWide Merged Cloud Analysis (WWMCA) developed by the Air Force;
  • A WWMCA reanalysis (WWMCAR) product that includes latent observations not included in the real-time version of WWMCA;
  • Forecasts (out to 72 h) of TCA from the USAF Global Air Land Weather Exploitation Model (GALWEM) which is the Air Force implementation of the United Kingdom’s Unified Model;
  • TCA forecasts (out to 72 h) from the NCEP Global Forecast System (GFS) model;
  • Bias-corrected versions of the GALWEM and GFS predictions (GALWEM-DCF and GFS-DCF); and
  • Short-term TCA predictions (out to 9 h) from the Advective Cloud model (ADVCLD).

Datasets used in the evaluation were from one-week periods for each of four seasons.

Methods and results

Results of the application of the various verification methods indicated that continuous approaches are not very meaningful for evaluating cloud predictions, particularly due to the discontinuous nature of clouds. In contrast, categorical approaches can provide information that is potentially quite useful, particularly when applied to thresholds that are relevant for AF decision-making (e.g., overcast, clear conditions), and when the results are presented using a multivariate approach such as the performance diagrams first applied by Roebber (WAF, 2009). The MODE spatial method also shows great promise for diagnosing errors in cloud predictions (e.g., size biases, displacements). However, more effort is required to identify optimal configurations of the MODE tool for application to clouds for AF decision-making.
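As a rough illustration of the categorical approach described above, the following Python sketch thresholds paired forecast and observed cloud amounts into a 2x2 contingency table and computes Probability of Detection, success ratio, frequency bias, Critical Success Index, and Gilbert Skill Score; the first four are the quantities combined on a performance diagram. The function names, the sample values, and the 80% threshold are hypothetical choices made for this example; they are not taken from the DTC evaluation or from the MET software.

```python
import math


def contingency_table(forecast, observed, threshold):
    """Count hits, false alarms, misses, and correct negatives at one threshold."""
    hits = false_alarms = misses = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_yes = f >= threshold
        o_yes = o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif f_yes:
            false_alarms += 1
        elif o_yes:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Standard categorical scores derived from a 2x2 contingency table."""
    n = hits + false_alarms + misses + correct_negatives
    # Number of hits expected by chance, used by the Gilbert Skill Score.
    hits_random = _ratio((hits + false_alarms) * (hits + misses), n)
    return {
        "pod": _ratio(hits, hits + misses),
        "success_ratio": _ratio(hits, hits + false_alarms),
        "frequency_bias": _ratio(hits + false_alarms, hits + misses),
        "csi": _ratio(hits, hits + false_alarms + misses),
        "gss": _ratio(hits - hits_random,
                      hits + false_alarms + misses - hits_random),
    }


# Hypothetical total cloud amount values (percent) at eight grid points.
forecast_tca = [90, 85, 20, 95, 10, 70, 88, 5]
observed_tca = [95, 30, 15, 90, 85, 60, 92, 0]

# Hypothetical "mostly cloudy or overcast" event threshold of 80 percent.
table = contingency_table(forecast_tca, observed_tca, threshold=80)
scores = categorical_scores(*table)
print(table)
print(scores)
```

Repeating the calculation for several thresholds (for example, one near clear and one near overcast) gives one point per threshold on a performance diagram, with success ratio on one axis and Probability of Detection on the other.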

Initial testing of field deformation methods indicated that these approaches are potentially useful for evaluation of cloud forecasts. Field deformation methods evaluate how much a forecast would have to change in order to best match the observed field. Information about the amount and type of deformation required can be estimated, along with the resulting reduction in error.

The results also indicated that, in general, cloud amount forecasts lend themselves to verification through binary image metrics because a cloud’s presence or absence can be ascertained through categories of cloud amount thresholds. These metrics can provide very succinct information about many aspects of forecast performance in this context without having to resort to complicated, computationally expensive techniques. For example, Baddeley’s ∆ metric gives a useful overall summary of how well two cloud-amount products compare in terms of size, shape, orientation and location of clouds. The Mean Error Distance (MED) gives meaningful information about misses and false alarms, but is sensitive to small changes in the field. In addition to the distance metrics, a geometric index that measures three geometric characteristics (area, connectivity, and shape) could potentially provide additional useful information, especially when the cloud field is not too complex (i.e., is composed of a small number of features).
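To make the idea of a distance metric on binary cloud fields concrete, the sketch below computes a brute-force mean error distance in Python: the average, over the event cells of one field, of the distance to the nearest event cell in the other field. It assumes distances in grid-cell units on a small regular grid, and the function names and sample fields are hypothetical. Published definitions and the MET implementation may differ in naming, argument order, and efficiency, so this should be read as an illustration of the concept only.

```python
import math


def event_cells(field, threshold):
    """Return (row, col) indices where the field meets or exceeds the threshold."""
    return [
        (i, j)
        for i, row in enumerate(field)
        for j, value in enumerate(row)
        if value >= threshold
    ]


def mean_distance_to_set(points, target):
    """Average distance from each cell in `points` to its nearest cell in `target`.

    Returns NaN if either set is empty. Brute force, suitable only for small grids.
    """
    if not points or not target:
        return math.nan
    total = 0.0
    for (pi, pj) in points:
        total += min(math.hypot(pi - ti, pj - tj) for (ti, tj) in target)
    return total / len(points)


# Hypothetical binary cloud masks (1 = cloud present) on a 2 x 4 grid.
observed = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
forecast = [
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

obs_cells = event_cells(observed, threshold=1)
fcst_cells = event_cells(forecast, threshold=1)

# Averaging over observed cells penalizes observed cloud far from any forecast
# cloud (miss-like errors); averaging over forecast cells penalizes forecast
# cloud far from any observed cloud (false-alarm-like errors).
miss_like = mean_distance_to_set(obs_cells, fcst_cells)
false_alarm_like = mean_distance_to_set(fcst_cells, obs_cells)
print(miss_like, false_alarm_like)
```

Because the measure is asymmetric, reporting both directions separates the two error types. Its dependence on nearest-cell distances also shows why a small change in the field, such as a single isolated cloudy cell, can shift the value noticeably.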

Ongoing and future efforts

Ongoing efforts on this project are focused on extending the methods to global cloud amounts (the initial work focused on North America), and further refinements and tests of the methods.  For example, MODE configurations are being identified in collaboration with the AF 557th Weather Squadron, to ensure the configurations are relevant for AF decision-making.  In addition, canonical evaluations (i.e., with “artificial” but realistic cloud distributions) of the distance metrics [1] are being examined to determine if any unknown biases or poor behavior exist that would influence the application of these methods. As these extensions are completed, a set of tools will be identified that provide meaningful – and complete – information about performance of TCA forecasts.  Further efforts will focus on other cloud parameters such as cloud bases and tops.

The canonical evaluations only apply to the distance metrics, not all of the methods.

Community Modeling Workshop Outcome

Summer 2017
“The most common feedback from the workshop participants noted the increase in transparency within the EMC and NOAA at large, the increasing effort to engage the entire community, and the general sense of positive momentum of the community coming together to embrace the opportunity to use NGGPS as a foundation to build a true community modeling resource for the Nation.”

DTC Article on NOAA Community Modeling Workshop and SIP Working Group meetings

The NOAA Community Modeling Workshop and meetings of the Strategic Implementation Plan (SIP) Working Groups were held 18-20 April 2017 at the NOAA Center for Weather and Climate Prediction in College Park, Maryland. The goal of the meetings was to seek engagement with the Earth system science community to form and shape the nascent unified modeling community being built upon the Next Generation Global Prediction System (NGGPS), and to consider how to best execute shared infrastructure, support, management, and governance. Other topics addressed included identifying “best practices,” discussing how a community-based unified modeling system will actually work, and evolving and coordinating between SIP/NGGPS Working Groups (WGs). A complete set of documents for the meeting, including the agenda, participant list, presentations, and summary reports, is available on the workshop webpage. For more information on the SIP effort, see the “Director’s Corner” article in the Winter 2017 issue of DTC Transitions.

The NOAA Community Modeling Workshop, which ran from 18 April through noon on 19 April, was designed to interact with the broader model R&D community. As such, this portion was completely open to the public, and included a dial-in capability for the plenary sessions. The opening talks set the stage by describing the approach and goals of the Next Generation Global Prediction System (NGGPS), and by summarizing the SIP and its goals and objectives. These opening talks were followed by a panel discussion of senior leaders from the weather enterprise, including the Directors of NWS and NOAA Research, the UCAR President, and senior leaders from academia, the private sector, NASA, the National Science Foundation, and DoD (Navy). Each was asked to provide their perspective on three items:

  1. What aspects of a NOAA-led community effort to develop a next-generation unified modeling system would your organization and sector find advantageous? In other words, how do you think your organization/sector would benefit?
  2. For which parts of a community unified modeling effort would your organization or sector be best able (and most likely) to contribute? In other words, what do you feel is the best role for your organization/sector to play?
  3. From the perspective of your organization or sector, what do you see as the greatest challenges to be overcome (or barriers that must be broken down) to make this a successful community enterprise?

The remainder of the presentations were panel discussions featuring co-chairs from 12 active SIP WGs, each of whom provided their perspective on the ongoing activities of their WG and the overall effort to migrate the NGGPS global model, under development within NOAA, into a community-based unified modeling system.

The workshop concluded on the morning of 19 April with a series of parallel break-out groups. Based on what they saw and heard during the presentations, each group was asked to identify two categories of items:

  1. Best practices: What are the major things that we’re getting right?
  2. Gaps: What are the major things that we’re missing, or heading down the wrong track?

Note: Reports from these breakout sessions can be found in the workshop summary.

The SIP Working Group meeting, which ran from the afternoon of 19 April through the end of 20 April, consisted of a series of meetings between the various SIP Working Groups (WG) aimed at advancing the technical planning within each WG and ensuring that this technical planning is well-coordinated across WGs.  These meetings, also referred to as Cross-WG meetings, were also designed to identify areas of overlap vs. gaps between the WGs, and to help facilitate technical exchange.

Each WG was asked to provide (1) an overall assessment of the effectiveness of the workshop, (2) a summary of “immediate needs” they felt had to be addressed as soon as possible to ensure success in the long term, and (3) the most important items on the “critical path,” that is, those upon which others depended. A summary of the “immediate needs” and “critical path” items is provided in the SIP meeting summary, which includes the full reports from each WG.

The overall consensus of the meeting participants for both portions of the workshop was very positive, with the most common feedback noting the increase in transparency within the Environmental Modeling Center and NOAA at large, the increasing effort to engage the entire community, and the general sense of positive momentum of the community coming together to embrace the opportunity to use NGGPS as a foundation to build a true community modeling resource for the Nation.

Expanding Capability of DTC Verification

Winter 2016

Robust testing and evaluation of research innovations is a critical component of the Research-to-Operations (R2O) process and is performed for NCEP by the Developmental Testbed Center (DTC). At the foundation of the DTC testing and evaluation (T&E) system is the Model Evaluation Tools (MET), which is also supported for the community through the DTC. The verification team within the DTC has been working closely with DTC teams as well as the research and operational communities (e.g., the NOAA HIWPP program and NCEP/EMC, respectively) to enhance MET to better support both internal T&E activities and testing performed at NOAA Centers and Testbeds.

METv5.1 was released to the community in October 2015. It includes a multitude of enhancements to the already extensive capabilities. The additions can be grouped into new tools, enhanced controls over pre-existing capabilities, and new statistics. It may be fair to say there is something new for everyone.

New tools:  Sometimes through the development process, user needs drive the addition of new tools. This was the case for the METv5.1 release. The concept of automated regridding within the tools was first brought up during a discussion with the Science Advisory Board. The concept was embraced as a way to make the Mesoscale Model Evaluation Testbed (MMET) more accessible to researchers and was added. The MET team took it one step further and not only added the capability to all MET tools that ingest gridded data but also developed a stand-alone tool (regrid_data_plane) to facilitate regridding, especially of NetCDF files. 

For those who use or would like to use the Method for Object-based Diagnostic Evaluation (MODE) tool in MET, a new tool (MODE-Time Domain or MTD) that tracks objects through time has been developed. In the past, many MET users have performed separate MODE runs at a series of forecast valid times and analyzed the resulting object attributes, matches and merges as functions of time in an effort to incorporate temporal information in assessments of forecast quality. MTD was developed as a way to address this need in a more systematic way. Most of the information obtained from such multiple coordinated MODE runs can be obtained more simply from MTD. As in MODE, MTD applies a convolution field and threshold to define the space-time objects. It also computes the single 3D object attributes (e.g. centroid, volume, and velocity) and paired 3D object attributes (e.g. centroid distance, volume ratio, speed difference).

To address the needs of the Gridpoint Statistical Interpolation (GSI) Data Assimilation community tool, the DTC Data Assimilation Team and MET team worked together to develop a set of tools to read the GSI binary diagnostic files. The files contain useful information about how a single observation was used in the analysis by providing details such as the innovation (O-B), observation values, observation error, adjusted observation error, and quality control information. When MET reads GSI diagnostic files, the innovation (O-B; generated prior to the first outer loop) or analysis increment (O-A; generated after the final outer loop) is split into separate values for the observation (OBS) and the forecast (FCST), where the forecast value corresponds to the background (O-B) or analysis (O-A). This information is then written into the MET matched pair format. Traditional statistics (e.g., Bias, Root Mean Square Error) may then be calculated using the MET Stat-Analysis tool. Support for ensemble-based DA methods is also included. Currently, three observation types are supported: Conventional, AMSU-A, and AMSU-B.
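
The splitting step described above amounts to simple arithmetic: since the departure is OBS minus FCST, the forecast value is the observation minus the departure. The sketch below illustrates this and the statistics that follow from the resulting matched pairs. The function names and sample values are hypothetical, and the code does not read actual GSI diagnostic files or reproduce the MET file format.

```python
import math


def split_departures(obs_values, departures):
    """Recover (forecast, observation) pairs from O-B or O-A departures."""
    # departure = OBS - FCST, therefore FCST = OBS - departure.
    return [(o - d, o) for o, d in zip(obs_values, departures)]


def mean_error(pairs):
    """Bias, defined here as the mean of forecast minus observation."""
    return sum(f - o for f, o in pairs) / len(pairs)


def rmse(pairs):
    """Root mean square error of forecast minus observation."""
    return math.sqrt(sum((f - o) ** 2 for f, o in pairs) / len(pairs))


obs = [280.0, 285.0, 290.0]    # hypothetical temperature observations (K)
o_minus_b = [1.0, -2.0, 1.0]   # hypothetical innovations (O-B)

pairs = split_departures(obs, o_minus_b)
print(pairs, mean_error(pairs), rmse(pairs))
```

The same function applies to analysis increments (O-A), in which case the recovered forecast value corresponds to the analysis rather than the background.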

Enhanced Controls: Working with DTC teams and end users usually provides plenty of opportunities to identify optimal ways to enhance existing tools. One example of this occurred during the METv5.1 release. Finer controls of thresholding were added to several tools to allow for more complex definitions of events used in the formulation of categorical statistics. This option is useful if a user would like to look at a particular subset of data without computing multi-categorical statistics (e.g. the skill for predicting precipitation between 25.4 mm and 76.2 mm). The thresholding may also now be applied to the computation of continuous statistics. This option is useful when assessing model skill for a sub-set of weather conditions (e.g. during freezing conditions or cloudy days as indicated by a low amount of incoming shortwave radiation).
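
The two uses of thresholding described above can be sketched in a few lines. The first function counts contingency-table entries for an event defined by a range (25.4 mm up to, but not including, 76.2 mm), and the second computes a bias only over the pairs that satisfy a condition. All names and sample values are hypothetical; in particular, applying the condition to the observed value is an assumption made for this illustration, and MET's configuration syntax is not shown.

```python
def in_range(lower, upper):
    """Build an event test that is true when lower <= value < upper."""
    return lambda value: lower <= value < upper


def contingency_counts(fcst, obs, is_event):
    """Count hits, misses, false alarms, and correct negatives."""
    counts = {"hits": 0, "misses": 0, "false_alarms": 0, "correct_negatives": 0}
    for f, o in zip(fcst, obs):
        if is_event(f) and is_event(o):
            counts["hits"] += 1
        elif is_event(o):
            counts["misses"] += 1
        elif is_event(f):
            counts["false_alarms"] += 1
        else:
            counts["correct_negatives"] += 1
    return counts


def conditional_bias(fcst, obs, condition):
    """Mean of forecast minus observation over pairs whose observation meets the condition."""
    diffs = [f - o for f, o in zip(fcst, obs) if condition(o)]
    if not diffs:
        raise ValueError("no pairs satisfy the condition")
    return sum(diffs) / len(diffs)


# Hypothetical precipitation amounts (mm).
precip_fcst = [10.0, 30.0, 50.0, 80.0, 26.0]
precip_obs = [5.0, 40.0, 20.0, 70.0, 30.0]
counts = contingency_counts(precip_fcst, precip_obs, in_range(25.4, 76.2))

# Hypothetical temperatures (K); bias only where the observation exceeds 300 K.
temp_fcst = [299.0, 301.0, 304.0]
temp_obs = [298.0, 302.0, 306.0]
warm_bias = conditional_bias(temp_fcst, temp_obs, lambda t: t > 300.0)
print(counts, warm_bias)
```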

MET’s Conditional Continuous Verification. Above panel shows the geographic bias of temperature at surface weather stations in 0.5 degree increments from -4 in purple to +4 in red. The mean bias over the entire dataset is -0.43 K.

MET’s Conditional Continuous Verification. Above panel shows bias for all stations observed to be greater than 300 K. The mean bias for the warmer temperatures shows a greater cold bias of -0.87 K.

Another example includes combining several tools such as Gen_Poly_Mask and Gen_Circle_Mask into a more generalized tool, Gen_Vx_Mask. The Gen_Vx_Mask tool may be run to create a bitmap verification masking region to be used by the MET statistics tools. This tool enables the user to generate a masking region once for a domain and apply it to many cases. The ability to compute the union, intersection, or symmetric difference of two masks was also added to Gen_Vx_Mask to provide finer control over a verification region. Gen_Vx_Mask now supports the following types of masking definitions:

  1. Polyline (poly) masking reads an input ASCII file containing Lat/Lon locations. This option is useful when defining geographic sub-regions of a domain.
  2. Circle (circle) masking reads an input ASCII file containing Lat/Lon locations and, for each grid point, computes the minimum great-circle arc distance in kilometers to those points. This option is useful when defining areas within a certain radius of radar locations.
  3. Track (track) masking reads an input ASCII file containing Lat/Lon locations of a “track” and, for each grid point, computes the minimum great-circle arc distance in kilometers to that track. This option is useful when defining the area within a certain distance of a hurricane track.
  4. Grid (grid) masking reads an input gridded data file and defines the mask using its grid definition. This option is useful when using a model nest to define the corresponding area of the parent domain.
  5. Data (data) masking reads an input gridded data file, extracts the specified field, and applies a threshold to define the mask. This option is useful when thresholding topography to define a mask based on elevation or when thresholding land use to extract a particular category.
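
The circle masking idea and the mask-combination options listed above can be sketched as follows. The sketch uses the haversine formula with an assumed mean Earth radius, a coarse one-degree grid, and two made-up radar locations; the function names are hypothetical, and this is an illustration of the concept rather than the Gen_Vx_Mask implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle arc distance in kilometers (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def circle_mask(grid_points, centers, radius_km):
    """Grid points whose minimum distance to any center is within radius_km."""
    return {
        pt for pt in grid_points
        if min(great_circle_km(pt[0], pt[1], c[0], c[1]) for c in centers) <= radius_km
    }


# Coarse one-degree grid of (lat, lon) points; purely illustrative.
grid = [(lat, lon) for lat in range(30, 41) for lon in range(-105, -94)]

# Two hypothetical radar sites, each with a 150 km radius.
mask_a = circle_mask(grid, [(35.0, -100.0)], 150.0)
mask_b = circle_mask(grid, [(35.0, -99.0)], 150.0)

# Combining two masks, mirroring the union, intersection,
# and symmetric difference options.
union = mask_a | mask_b
intersection = mask_a & mask_b
symmetric_difference = mask_a ^ mask_b
print(len(union), len(intersection), len(symmetric_difference))
```

Track masking follows the same pattern, with the track points playing the role of the centers.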

Additional examples of enhanced controls include the user being able to define a rapid intensification / rapid weakening event for a tropical cyclone in a more generic way with TC-Stat. This capability was then included in the Stat-Analysis tools to allow for identification of ramp events for renewables or extreme change events for other areas of study.

“The MET team strives to provide the NWP community with a state-of-the-art verification package where MET incorporates newly developed and advanced verification methodologies.”

New Statistics: In support of the need for expanded probabilistic verification capability of both regional and global ensembles, the MET team added a “climo_mean” specification to the Grid-Stat, Point-Stat, and Ensemble-Stat configuration files. If a climatological mean is included, the Anomaly Correlation is reported in the continuous statistics output. If a climatological or reference probability field is provided, the Brier Skill Score and Continuous Ranked Probability Score are reported in the probabilistic score output. Additionally, the decomposition of the Mean Square Error field was included in the continuous statistics computations. These options are particularly useful to the global NWP community and were added to address the needs of the NCEP/EMC Global Climate and Weather Prediction Branch.
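
For readers unfamiliar with these scores, the sketch below shows how a climatological mean and a reference probability enter the calculations. It uses the centered form of the anomaly correlation and the standard definition of the Brier Skill Score relative to a reference forecast; the function names and sample values are hypothetical, and the exact formulation used by MET should be checked against its documentation.

```python
import math


def anomaly_correlation(fcst, obs, climo):
    """Centered correlation between forecast and observed anomalies."""
    fa = [f - c for f, c in zip(fcst, climo)]
    oa = [o - c for o, c in zip(obs, climo)]
    fa_mean = sum(fa) / len(fa)
    oa_mean = sum(oa) / len(oa)
    cov = sum((x - fa_mean) * (y - oa_mean) for x, y in zip(fa, oa))
    var_f = sum((x - fa_mean) ** 2 for x in fa)
    var_o = sum((y - oa_mean) ** 2 for y in oa)
    return cov / math.sqrt(var_f * var_o)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, ref_probs, outcomes):
    """Skill relative to a reference (e.g., climatological) probability."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(ref_probs, outcomes)


# Hypothetical values with a constant climatological mean.
climo = [10.0, 10.0, 10.0, 10.0]
fcst = [11.0, 12.0, 9.0, 8.0]
obs = [12.0, 14.0, 8.0, 6.0]
ac = anomaly_correlation(fcst, obs, climo)

# Hypothetical probability forecasts against a 50% reference probability.
probs = [0.9, 0.1, 0.8, 0.2]
outcomes = [1, 0, 1, 0]
bss = brier_skill_score(probs, [0.5] * 4, outcomes)
print(ac, bss)
```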

In conclusion, the MET team strives to provide the NWP community with a state-of-the-art verification package. “State-of-the-art” means that MET will incorporate newly developed and advanced verification methodologies, including new methods for diagnostic and spatial verification but also will utilize and replicate the capabilities of existing systems for verification of NWP forecasts. We encourage those in the community to share your requirements, ideas, and algorithms with our team so that MET may better serve the entire verification community. Please contact us at

U.S. Air Force Weather Modeling and the DTC

Spring 2016
During the next decade, the Air Force will be working with our national and international modeling partners toward a goal of consolidated capabilities

The longstanding mission of the U.S. Air Force Weather (AFW) enterprise is to maximize America’s power through the exploitation of timely, accurate, and relevant weather information, anytime, everywhere. To meet this mission, the Air Force has operated a broad range of numerical weather models to analyze and predict environmental parameters that impact military operations. The internally developed Global Spectral Model (GSM, a separate effort from the NCEP GSM), implemented in the early 1980s, was the first operational model run by the Air Force. The GSM was replaced by the Relocatable Window Model (RWM) in 1990, and in the late ‘90s the Mesoscale Model 5 (MM5) went into operations. In 2006, the Weather Research and Forecasting (WRF) model became the mainstay for Air Force operations and has remained so for most of the last decade. On 1 Oct 2015, the new Global Air-Land Weather Exploitation Model (GALWEM), based on the United Kingdom Met Office’s (UKMO) Unified Model, was implemented as the Air Force’s primary weather model to meet the warfighter’s global requirements. (The figure below provides a timeline of USAF weather model evolution.)

Timeline of Air Force weather model evolution

The Air Force is a Charter member of the DTC, as well as a number of other interagency partnerships working toward a shared goal of rapidly and cost effectively advancing U.S. weather modeling capabilities. These include the National Unified Operational Prediction Capability (NUOPC), the National Earth System Prediction Capability (ESPC), and the Joint Center for Satellite Data Assimilation (JCSDA). The science insertion, validation studies, and user product improvements developed through these partnerships have benefited AFW significantly.

Sample of the USAF GALWEM model output used by AFW.

The Air Force’s contributions to the DTC have focused on verification and tuning of the WRF model for a range of domains around the world; support to the standardization, documentation, and baseline management of the Gridpoint Statistical Interpolation (GSI) data assimilation system used by all of the DTC partners; and enhancing the community Model Evaluation Tools (MET) to more effectively verify clouds and other aviation parameters and to improve ensemble verification techniques.

During the next decade, the Air Force will be working with our national and international modeling partners toward a goal of consolidated capabilities to assimilate, analyze and predict parameters critical to military operations in a single model solution, independent from the model(s) used downstream of the DA system. The converged solution is expected to improve overall efficiency and reduce costs while streamlining new science insertion. We look forward to working closely with the DTC to achieve this goal.


NOAA Selects GFDL’s Dynamical Core

Autumn 2016

In August 2014, numerical weather prediction modelers attended a workshop to discuss dynamical core requirements and attributes for the NGGPS, and developed a battery of tests to be conducted in three phases over 18 months. Six existing dynamical cores were identified as potential candidates for NGGPS.

During Phase 1, a team of evaluators ran benchmarks to look at performance, both meteorological and computational, and the stability of the core. The performance benchmark measured the speed of each candidate model at the resolution run currently in National Centers for Environmental Prediction (NCEP) operations, and at a much higher resolution expected to be run operationally within 10 years. They also evaluated the ability of the models to scale across many tens of thousands of processor cores.

Assessment of the test outcomes from Phase 1 resulted in the recommendation to reduce the candidate pool to two cores, NCAR’s Model for Prediction Across Scales (MPAS) and GFDL’s Finite-Volume on a Cubed Sphere (FV3), prior to Phase 2.

In Phase 2, the team evaluated the two remaining candidates on meteorological performance using both idealized physics and the operational GFS physics package. Using initial conditions from operational analyses produced by NCEP’s Global Data Assimilation System (GDAS), each dynamical core ran retrospective forecasts covering the entire 2015 calendar year at the current operational 13 km horizontal resolution. In addition, two cases, Hurricane Sandy in October 2012 and the May 18-20, 2013 tornado outbreak in the Great Plains, were run with enhanced resolution (approximately 3 km) over North America. The team assessed the ability of the dynamical cores to predict severe convection without a deep convective parameterization, using operational initial conditions and high-resolution orography.

The results of Phase 2 tests showed that GFDL’s FV3 satisfied all the criteria, had a high level of readiness for operational implementation, and was computationally highly efficient. As a result, the panel of experts recommended to NOAA leadership that FV3 become the atmospheric dynamical core of the NGGPS. NOAA announced the selection of FV3 on July 27, 2016.

Phase 3 of the project, getting underway now, will involve integrating the FV3 dynamical core with the rest of the operational global forecast system, including the data assimilation and post-processing systems. See results, modeling_nggps_implementation_atmdynamics.

Contributed by Jeff Whitaker.

Hindcast of the 2008 hurricane season, simulated by the FV3-powered GFDL model at 13 km resolution.

NGGPS Dynamical Core: Phase 1 Evaluation Criteria

  • Simulate important atmospheric dynamical phenomena, such as baroclinic and orographic waves, and simple moist convection
  • Restart execution and produce bit-reproducible results on the same hardware, with the same processor layout (using the same executable with the same model configuration)
  • High computational performance (8.5 min/day) and scalability to NWS operational CPU processor counts needed to run 13 km and higher resolutions expected by 2020
  • Extensible, well-documented software that is performance portable
  • Execution and stability at high horizontal resolution (3 km or less) with realistic physics and orography
  • Evaluate level of grid imprinting for idealized atmospheric flows

Phase 2 Evaluation Criteria

  • Plan for relaxing the shallow atmosphere approximation (deep atmosphere dynamics) to support tropospheric and space-weather requirements.
  • Accurate conservation of mass, tracers, total energy, and entropy, which have particular importance for weather and climate applications.
  • Robust model solutions under a wide range of realistic atmospheric initial conditions, including strong hurricanes, sudden stratospheric warmings, and intense upper-level fronts with associated strong jet-stream wind speeds using a common (GFS) physics package
  • Computational performance and scalability of dynamical cores with GFS physics
  • Demonstrated variable resolution and/or nesting capabilities, including physically realistic simulations of convection in the high-resolution region
  • Stable, conservative long integrations with realistic climate statistics
  • Code adaptable to NOAA Environmental Modeling System (NEMS)/Earth System Modeling Framework (ESMF)
  • Detailed dycore (dynamical core) documentation, including documentation of vertical grid, numerical filters, time-integration scheme, and variable resolution and/or nesting capabilities.
  • Performance in cycled data assimilation tests to uncover issues that might arise when cold-started from another assimilation system
  • Implementation plan including costs

The need for a Common Community Physics Package

Summer 2016

While national modeling centers can benefit from the expertise in the broader community of parameterization developers, the social and technical barriers to a community researcher implementing and testing a new parameterization or set of parameterizations (a physics suite) in an operational model are high.

Physical parameterization codes are often implemented so that they are strongly linked to a particular model dynamical core, with dependencies on grid structure, prognostic variables, and even time-stepping scheme. Dependencies amongst schemes are also common. For example, information from a deep convection scheme may be needed in a gravity wave drag scheme. While the dependencies are generally justified based on computational efficiency arguments, they complicate the replacement of parameterizations and of suites, marginalizing tremendous scientific talent.

To address these difficulties, and to engage the broad community of physics developers in the Weather Service’s Next-Generation Global Prediction System (NGGPS), the DTC’s Global Model Test Bed (GMTB) is participating in developing the Common Community Physics Package (CCPP). The schematic (below) shows the DTC’s proposed modeling meta-structure for NGGPS, with the CCPP shown in the gray box. Specific parameterizations in the CCPP shown here are for example only; other parameterizations or sets of parameterizations could be displayed in the blue boxes.

Although requirements are sure to evolve depending on priorities and funding, an initial set is in place to inform the CCPP design. They reflect the following vision for the CCPP: (1) a low barrier to entry for physics researchers to test their ideas in a sandbox, (2) a hierarchy of testing capabilities, ranging from unit tests to global model tests, (3) a set of peer-reviewed metrics for validation and verification, and (4) a community process by which new or modified parameterizations become supported within the CCPP. We recognize that an easier technical implementation path for a physical parameterization does not replace the scientific expertise necessary to ensure that it functions correctly or works well as part of a suite. A test environment intended to ease that process is also under development at GMTB, beginning with a single-column model linked to the GFS physics.

The low barrier to entry implies a highly modular code and clear dependencies. Dependencies, and the interface between physics and a dynamical core, will be handled by a thin “driver” layer (dark green box in the schematic). Variables are defined in the driver, and exchanged between model dynamics and various parameterizations. The current driver, being used by the NGGPS dynamical core test participants to run their models with the GFS physics suite, is a descendant of the National Unified Operational Prediction Capability (NUOPC) physics driver. Going forward it will be called the Interoperable Physics Driver. Continuing NUOPC input is critical to success, and the Driver development is proceeding with the NUOPC physics group’s knowledge and input.

The DTC is uniquely qualified to fulfill a leading role in physics development and community support, and the emerging CCPP is a critical element to bridge research and operations. The result will be a capability for operational centers to more rapidly adopt codes that reflect evolving scientific knowledge, and an operationally relevant environment for the broad community of physics developers to test ideas.

NITE: NWP Information Technology Environment

Summer 2015

Over the years, the DTC has put in place several mechanisms to facilitate the use of operational models by the general community, mostly by supporting operational codes (for data assimilation, forecasting, postprocessing etc.) and organizing workshops and tutorials.

By stimulating the use of operational codes by the research community, composed of universities, NCAR, and government laboratories, several new NWP developments have been transitioned to NCEP operations. However, in spite of the relative success of the DTC, there are still significant gaps in the collaboration between the research and operational groups. The NITE project focuses on infrastructure design elements that can be used to facilitate this collaborative environment.

During the past year, the DTC received funding from NOAA to create a design for an infrastructure to facilitate development of NCEP numerical models by scientists both within and outside of EMC. Requirements for NITE are based on a survey of potential users and developers of NCEP models, information obtained during site visits to the NOAA Environmental Modeling Center, the UK Meteorological Office, and the European Centre for Medium-Range Weather Forecasts, discussions with focus groups, and reviews of various existing model development systems.

The NITE design has been developed with the following goals in mind: 

  • modeling experiments easier to run;
  • a single system available to NCEP and collaborators;
  • results relevant for R2O;
  • reproducibility and records of experiments; and
  • general to any NCEP modeling suite.

The following elements are included in the system design:

Data management and experiment database: Scientists need access to input datasets (model and observations), a mechanism for storing selected output from all experiments, and tools for browsing, interrogating, subsetting, and easily retrieving data. To facilitate sharing information, key aspects of the experiment setup, such as provenance of source code and scripts, configuration files, and namelist parameters, need to be recorded in a searchable database.

“NWP Information Technology Environment (NITE): an infrastructure to facilitate development of NCEP numerical models.”

Source code management and build systems: Source code repositories for all workflow components need to be available and accessible to the community. Fast, parallel build systems should be implemented to efficiently build all workflow components of a suite before experiments are conducted.

Suite definition and configuration tools: All configurable aspects of a suite are abstracted to files that can be edited to create the experiments. Predefined suites are provided as a starting point for creating experiments, with scientists also having the option to compose their own suites.

Scripts: The scripting is such that each workflow component (e.g., data assimilation) is associated with a single script, regardless of which suite is being run.

Workflow automation system: The workflow automation system handles all job submission activity. Hence, the scripts used to run workflow components do not contain job submission commands.

Documentation and training: Documentation and training on all workflow components and suites are readily available through electronic means.

In addition to the elements above, standardized tools for data visualization and forecast verification need to be available to all scientists.

Next steps for NITE:  Modernization of the modeling infrastructure at NCEP is very important for community involvement with all NCEP suites, and with the Next Generation Global Prediction System (NGGPS) in particular. The recommended implementation approach for NITE includes several phases, to minimize disruption to operational systems, and limit implementation costs, while providing useful, incremental capabilities that will encourage collaboration. Ongoing discussions between EMC and DTC, especially in the context of NGGPS infrastructure modernization, will likely lead to NITE implementation in the coming years.


NITE design: a software infrastructure

DTC: The Next Ten Years

Winter 2015

The transition of research advances into operations (abbreviated as R2O), particularly those operations involving numerical weather prediction, satellite meteorology, and severe weather forecasting, has always been a major challenge for the atmospheric science community.

With a preeminent mission to facilitate R2O in mind, NOAA and NCAR established the DTC in 2003. Since then, the DTC has worked toward this goal in three specific ways: by providing community support for operational NWP systems, by performing testing and evaluation of promising NWP innovations, and by promoting interactions between the research and operational NWP communities via workshops, a newsletter, and a robust visitor program. Early DTC activities, which were primarily focused on evaluation of opportunities afforded by the then-new Weather Research and Forecasting model (WRF), included the testing and evaluation of two WRF model dynamic cores (one developed at NCAR and the other at EMC), rapid refresh applications, and a real-time high resolution winter forecast experiment. As a neutral party not involved with the development of either core, the DTC played a vital, independent role in these tests, especially their planning, their evaluation, and the provision of statistical results to all parties.

In its other role, that of community support, the DTC began providing users of the operational NMME model with documentation, tutorials, and help desk access in 2005. Since then, this DTC activity has grown in extent and complexity, and today also includes community support for the HWRF end-to-end tropical cyclone prediction system, the Unified Post Processor (UPP), Gridpoint Statistical Interpolation (GSI) and GSI ensemble hybrid data assimilation systems, and the Model Evaluation Tools (MET) verification system. In April 2015, the DTC will host its first Nonhydrostatic Multiscale Model on the B-grid (NMMB) tutorial at College Park, MD. Since its inception, the DTC has in fact organized or co-sponsored 27 community workshops, and has hosted 49 visitor projects selected on the basis of their potential to facilitate interaction between the operational and research NWP communities. The accompanying figures illustrate the distribution and evolution of DTC visitors and users of DTC-supported systems.

“The DTC has organized or co-sponsored 27 community workshops and has hosted 49 visitor projects.”

These activities have so far been primarily focused on regional and national weather modeling. Now, with continued advances in computing technology, global operational NWP using nonhydrostatic models at cloud-permitting resolution is within reach. With this possibility in mind, all major international operational centers are actively developing advanced global models. The United States National Weather Service, for example, initiated a major R2O project in 2014 to develop a Next-Generation Global Prediction System (NGGPS) that would reach mesoscale resolution. The boundary between regional and global modeling at these scales becomes murky indeed, and previous work of the DTC (testing of model physics in regional models, for example) becomes very relevant to global models as well. Recognizing this opportunity, the DTC Executive Committee unanimously voted earlier this year to expand the DTC’s scope to include global modeling. This decision marks a change that will have a profound impact on the direction of the DTC for the next ten years. Here, I offer my perspective on what, in this new context, the DTC should be focusing on in the future.

Storm-scale NWP. While significant progress has been made in NWP over the past decade, society’s expectations have often exceeded improvements. An excellent example is the recent January blizzard forecast for New York City, for which the inability to adequately convey forecast uncertainties in numerical guidance was widely recognized. In a previous but related report, the UCAR Community Advisory Committee for NCEP (or UCACN) pointed out that NCEP does not have an operational ensemble prediction system at convection-permitting (that is, storm-scale) resolution. The development and operation of a prediction system of this kind is a major undertaking, with significant computing demands and challenging scientific and technical issues. Among them are questions concerning initial condition perturbations, model perturbations, calibration, post-processing, and verification, just to name a few. These are also areas of active research attracting the interest of a significant fraction of the 24,000 registered WRF users. Since convection-resolving ensemble prediction is in fact a theme that cross-cuts all its current task areas, the DTC should be well positioned to facilitate R2O toward this end that is useful to both operations and research.

Unified modeling. From an R2O perspective, it is highly beneficial to reduce the number of operational systems, thereby allowing the research community to focus on a smaller number of systems. Unified modeling (UM), which seeks to limit the proliferation of competing modeling elements, has been recognized worldwide as the most cost-effective approach to deal with the increased number and complexity of numerical weather, climate and environmental prediction systems at all space and time scales. A UM framework also allows sharing of modeling efforts (e.g., improvements in physical parameterizations) across different modeling systems. The UCACN has urged NCEP to migrate toward a UM approach for its future model development, and has suggested an interim goal of reducing NCEP modeling systems to only two: a global weather and climate system (GFS/CFS) and a very-high-resolution convection-resolving system. With nesting capability, the global high-resolution nonhydrostatic model planned for the NGGPS project could be a suitable candidate for a UM framework at NCEP. It is true that migration toward UM is a significant challenge for any operational center, involving as it does a major culture change in addition to numerous technical issues. In its capacity for testing and evaluation, the DTC can help facilitate such a transition at NCEP.

“When fully developed, the global system will be an earth modeling system with fully coupled atmosphere, ocean, ice, land, waves, and aerosol components.”

Earth system modeling. When fully developed, the NGGPS will be an earth modeling system with fully coupled atmosphere, ocean, ice, land, waves, and aerosol components. The interactions between these components will require compatibility within the NOAA Environmental Modeling System (NEMS) and the Earth System Modeling Framework (ESMF). The NGGPS is expected to provide improved forecasts at a wide range of time scales, from a few hours to 30 days. For this major undertaking to be successful, the community at large will have to contribute at every step of its development. The DTC can encourage and facilitate these contributions to NGGPS code development by managing that code in a way that allows effective access by external developers, and by performing independent testing and evaluation of system upgrades proposed by the external community.

NWP IT Environment. For each NWP system it supports, the DTC typically maintains a community repository separate from the repository maintained at operational centers. Maintaining a separate community repository is a mixed blessing. On the one hand, a separate repository shields operations from potentially problematic code changes that have not been fully tested. On the other hand, ensuring proper synchronization between the two repositories (a necessary step if the research community is to take advantage of the latest developments at operational centers) becomes a greater challenge. Taking advantage of experience at other operational centers (e.g., ECMWF and UKMO), the DTC in collaboration with EMC has started exploring the possibility of developing an NWP IT Environment (NITE) concept for community support for operational systems. The basic idea of NITE is to maintain an IT infrastructure at the operational center itself (i.e., at EMC) that supports the development, testing, and evaluation of operational models by scientists both within and outside the center. Given the complexity of the NGGPS system, maintaining duplicate systems (repositories) for its many modeling components is neither feasible nor cost effective. This leaves a NITE infrastructure as perhaps the only viable option. The DTC should continue to work with EMC to support NITE development, with the potential of a profound impact on how R2O in NWP is conducted for the coming decade.

Microphysics, from Water Vapor to Precipitation

Summer 2014

NCAR-RAL has a long track record of transitioning numerical weather prediction (NWP) model cloud microphysical schemes from research to operations.

Beginning in the 1990s, a scheme by Reisner et al (1998) was created within MM5 (Fifth-Generation Penn State/NCAR Mesoscale Model) but also transitioned to the Rapid Update Cycle (RUC) model. A few years later, the scheme was modified and updated for both MM5 and RUC by Thompson et al (2004). Then, as the Rapid Refresh (RAP) model was replacing the RUC, an entirely rewritten microphysics scheme by Thompson et al (2008) was created for operational use in the Weather Research and Forecast (WRF) and RAP models. A primary goal of each of these efforts was to improve upon the explicit prediction of supercooled liquid water and aircraft icing while also improving quantitative precipitation forecasts (QPF) and surface sensible weather elements such as precipitation type.

The established pathway for transition to operations for the Thompson et al (2008) microphysics scheme is greatly facilitated through the WRF code repository and a continual collaboration with the Global Systems Division (GSD) of NOAA’s Earth System Research Laboratory (ESRL), especially the team led by Stan Benjamin. Various improvements to the scheme are rapidly implemented into prototype operations at NOAA-GSD for further testing before they eventually transition to the fully operational RAP model at the National Centers for Environmental Prediction (NCEP) Environmental Modeling Center (EMC).

The two-panel figure shows a 48-hour forecast of model lowest level radar reflectivity valid at 0000 UTC 02 Feb 2011 made by the WRF-ARW (top panel) model and NEMS-NMMB model (bottom panel).

A more recent DTC effort has included the testing and evaluation of the Thompson et al (2008) microphysics scheme in the Hurricane WRF (HWRF) model to see if it improves tropical cyclone track and intensity forecasts. During development, the scheme’s developers had not previously worked in the area of tropical cyclone prediction, but focused instead on mid-latitude weather. The current test may reveal potential improvements to tropical storm prediction or shortcomings in the microphysics scheme that could lead to future improvements.

A second DTC effort is the incorporation of the Thompson et al (2008) microphysics scheme into NCEP’s NEMS-NMMB (NOAA Environmental Modeling System-Nonhydrostatic Multiscale Model on B-grid) model, which is also the current North American Model (NAM). As the NAM transitions to higher and higher resolution, the potential use of alternative microphysics schemes is being considered. To achieve this goal, a number of structural code changes were made to the NEMS-NMMB model so that it accepts the larger number of water species used by the Thompson et al (2008) scheme, as compared to the number of species in the operational microphysics scheme. However, the code within the microphysical module itself differs only minimally from the existing WRF code, which greatly facilitates future WRF-code transitions to NEMS-NMMB.

The two-panel figure above shows a 48-hour forecast of model lowest level radar reflectivity valid at 0000 UTC 02 Feb 2011 made by the WRF-ARW (top panel) model and NEMS-NMMB model (bottom panel). Particularly evident in a comparison of the two model cores are sporadic low-value dBZ forecasts seen in broad areas of the NMMB and to a much lesser degree in the WRF, suggesting a much greater presence of drizzling clouds in the NMMB. Also shown in the figure at the beginning of the article (page 1) is the WRF-predicted explicit precipitation type with blue/pink/green shades representing snow, graupel, and rain, respectively, along with an overlay of colored symbols to represent the surface weather observations of various precipitation types. The notable lack of graupel observations vis-à-vis forecasts likely reflects deficiencies of automated observations.

AMS, Thompson et al. 2008, and 2014,

Keeping up with Model Testing & Evaluation Advances: New Verification Displays

Winter 2014
As numerical model predictions and functions proliferate and move toward ever higher resolution, verification techniques and procedures must also advance and adapt.


The ability to consolidate and integrate numerous verification results that are increasingly differentiated in intent and type largely depends on the effectiveness of graphical displays. In response to these needs, several new kinds of displays have recently been added to the DTC and MET arsenal, or are in process of development and assessment at the DTC.

An example is the regional display of verification scores in the figure above, where results from relatively long verification periods at point locations are shown (in this case, dewpoint temperature bias at surface observation sites). Although time resolution is sacrificed, these plots represent an important way to assess topographic, data density, and other geographic effects on model accuracy. In the first figure, for instance, the clusters of red symbols (portraying too-high dewpoints) in the mountains of Colorado and along the east coast offer clues useful for assessing model inaccuracies. The opposite tendency (low-biased dewpoints, or too-dry forecasts) is pronounced over Texas and Oklahoma, and in the Central Valley of California. The figure below is an example of new utilities used by the Ensemble Task to compute and display ensemble-relevant verification results. In this case, it is one way to present the spread-skill relationship, an important characteristic of ensemble systems. As is commonly seen, these particular CONUS-based ensemble members display an under-dispersive relationship; creating ensemble systems that accurately represent the natural variability remains difficult.
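A spread-skill comparison can be illustrated with a minimal sketch. It assumes the forecasts are held as plain lists (one list of member values per case) with one verifying observation per case; the function name and the toy numbers are hypothetical, and this is not the MET implementation. For a well-calibrated ensemble the average spread is roughly equal to the RMSE of the ensemble mean, so a ratio well below 1 points to under-dispersion.

```python
import math
from statistics import mean, variance


def spread_and_skill(member_forecasts, observations):
    """Return (spread, rmse) aggregated over all cases.

    member_forecasts: list of cases, each a list of member values.
    observations: one verifying value per case.
    """
    # Spread: square root of the case-averaged ensemble (sample) variance.
    avg_variance = mean(variance(members) for members in member_forecasts)
    spread = math.sqrt(avg_variance)

    # Skill: RMSE of the ensemble mean against the observations.
    sq_errors = [
        (mean(members) - obs) ** 2
        for members, obs in zip(member_forecasts, observations)
    ]
    rmse = math.sqrt(mean(sq_errors))
    return spread, rmse


# Hypothetical toy data: three cases, four members each (illustrative only).
forecasts = [[1.0, 1.2, 0.8, 1.0], [2.0, 2.1, 1.9, 2.0], [0.5, 0.4, 0.6, 0.5]]
obs = [2.0, 3.0, 1.5]

spread, rmse = spread_and_skill(forecasts, obs)
ratio = spread / rmse  # values well below 1 suggest under-dispersion
print(f"spread={spread:.3f} rmse={rmse:.3f} ratio={ratio:.3f}")
```

In this toy case the members cluster tightly around a mean that is far from the observations, so the ratio is small, which is the under-dispersive signature described above.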


Among ongoing and future product directions are display options for time series evaluation of forecast consistency, in particular for “revision series” of hurricane track locations (figure below). The objective of this kind of graphic is to examine the consistency of a model’s track prediction with its own prior forecasts at the same location and time. For many users, this consistency in forecasts through time is a desirable quality; if updated forecasts change much or often, a user may believe they are of low quality, possibly even random. For instance, in the figure, the model shows consistent updates in the Caribbean, and inconsistent (zigzagging) ones as the storm moves northward. These latter forecasts of hurricane location might thus be considered less reliable.
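One simple way to quantify such a revision series is sketched below. It assumes the forecast storm positions valid at a single time are available as (latitude, longitude) pairs ordered by initialization time; the function names, the Earth-radius constant, and the sample positions are hypothetical and do not represent the DTC display utility. Large or erratic distances between successive forecasts indicate an inconsistent series.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def great_circle_km(p1, p2):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def revision_series(positions):
    """Distances between successive forecasts valid at the same time.

    positions: (lat, lon) forecasts for one valid time, ordered from the
    earliest initialization to the latest.
    """
    return [great_circle_km(a, b) for a, b in zip(positions, positions[1:])]


# Hypothetical forecasts of storm position for a single valid time.
positions = [(25.0, -75.0), (25.2, -75.1), (24.7, -74.6), (25.4, -75.3)]
revisions = revision_series(positions)
print([round(d, 1) for d in revisions])
```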

Evaluating WRF performance over time

Autumn 2014
As modifications and additions are made to WRF code and released to the community, users often ask, “Is WRF really improving?”

Time series plot of 2m T (C) bias across CONUS domain over the warm season for WRF versions 3.4 (green), 3.4.1 (blue), 3.5 (red), 3.5.1 (orange), and v3.6 (purple). Median values of distribution are plotted with 99% confidence intervals. The gray boxes around forecast hour 30 and 42 correspond to the times shown in next figure.

This is a hard question to answer, largely because “WRF” means something different to each user with a specific model configuration for their application. With the numerous options available in WRF, it is difficult to test all possible combinations, and resulting improvements and/or degradations of the system may differ for each particular configuration. Prior to a release, the WRF code is run through a large number of regression tests to ensure it successfully runs a wide variety of options; however, extensive testing to investigate the skill of the forecast is not widely addressed. In addition, code enhancements or additions that are meant to improve one aspect of the forecast may have an inadvertent negative impact on another.

In an effort to provide unbiased information regarding the progression of WRF code through time, the DTC has tested one particular configuration of the Advanced Research WRF (ARW) dynamic core for several releases of WRF (versions 3.4, 3.4.1, 3.5, 3.5.1, and 3.6). For each test, the end-to-end modeling system components were the same: WPS, WRF, the Unified Post Processor (UPP) and the Model Evaluation Tools (MET). Testing was conducted over two three-month periods (a warm season during July-September 2011 and a cool season during January-March 2012), effectively capturing model performance over a variety of weather regimes. To isolate the impacts of the WRF model code itself, 48-h cold start forecasts were initialized every 36 h over a 15-km North American domain.
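The effect of the 36-h initialization interval can be seen in a short sketch. Because 36 h is one and a half days, successive cycles alternate between 00 and 12 UTC, so both cycles are sampled. The start and end dates below are assumptions chosen to span the warm-season period; the exact first and last cycles used in the tests are not stated here, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def initialization_times(start, end, step_hours=36):
    """Return forecast initialization times from start to end (inclusive)."""
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += timedelta(hours=step_hours)
    return times


# Assumed bounds for the warm-season test period (July-September 2011);
# the exact first and last cycles used by the DTC are not stated in the text.
warm_start = datetime(2011, 7, 1, 0)
warm_end = datetime(2011, 9, 30, 12)

inits = initialization_times(warm_start, warm_end)
init_hours = sorted({t.hour for t in inits})  # cycles alternate 00 and 12 UTC
print(len(inits), init_hours)
```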

The particular physics suite used in these tests is the Air Force Weather Agency (AFWA) operational configuration, which includes WSM5 (microphysics), Dudhia/RRTM (short/long wave radiation), M-O (surface layer), Noah (land surface model), YSU (planetary boundary layer), and KF (cumulus). To highlight the differences in forecast performance with model progression, objective model verification statistics are produced for surface and upper air temperature, dew point temperature and wind speed for the full CONUS domain and 14 sub-regions across the U.S. Examples of the results (in this case, 2 m temperature bias) are shown in the figures. A consistent cold bias is seen for most lead times during the warm season for all versions (figure on page 1). While there was a significant degradation in performance during the overnight hours with versions 3.4.1 and newer, a significant improvement is noted for the most recent version (v3.6). Examining the distribution of 2 m temperature bias spatially by observation site (figure below), it is clear that for the 30-hour forecast lead time (valid at 06 UTC), v3.6 is noticeably colder over the eastern CONUS. However, for the 42-hour forecast lead time (valid at 18 UTC), v3.4 is significantly colder across much of the CONUS. For the full suite of verification results, please visit: WRF Version Testing website at

The four-panel figure shows average 2 m temperature (C) bias by observation station over the warm season for WRF version 3.4 (left) and 3.6 (right) at forecast hour 30 (top) and 42 (bottom).

Mesoscale Model Evaluation Testbed


Spring 2013

The DTC provides a common framework for researchers to demonstrate the merits of new developments through the Mesoscale Model Evaluation Testbed (MMET).

Established in the Fall of 2012, MMET provides initialization and observation data sets for several case studies and week-long extended periods that can be used by the entire numerical weather prediction (NWP) community for testing and evaluation. The MMET data sets also include baseline results generated by the DTC for select operational configurations.

To date, MMET includes nine cases that are of interest for the National Centers for Environmental Prediction/Environmental Modeling Center (NCEP/EMC). A brief description of each case, along with access to the full data sets is available at http:// Researchers are encouraged to run several case studies spanning multiple weather regimes to illustrate the versatility of a new innovation for operational use.

“Researchers are encouraged to run several case studies to illustrate the versatility of the system.”

One particular case available in MMET is 28 February 2009, when nearly 7 inches of snow fell in Memphis, TN. A squall line marched through the Southeast along the leading edge of a cold front, prompting three tornado and several high-wind reports. The next two days (1-2 March), snow fell from Atlanta to New York, dropping up to a foot of snow in some areas. The figure above shows the two-day precipitation accumulation. This case is of interest to NCEP/EMC because the North American Mesoscale (NAM) model quantitative precipitation forecast valid 1 March shifted precipitation too far north, missing a rain/snow mix in Georgia and falsely predicting snow in western parts of the Carolinas.

If improved forecast accuracy is demonstrated through objective verification results with MMET cases, the technique can be submitted for further extensive testing by the DTC.

Community users can nominate innovations for more extensive DTC testing by filling out the nomination form ( mmet/candidates/form_submission. php).

As MMET continues to mature, additional cases will be made available to broaden the variety of available events in the collection. Submissions for additional cases to be included in MMET are accepted at: http://www. submission.php. For more information on the testing protocol process defined to accelerate the transition of mesoscale modeling techniques from research to operations, please see testing_protocol.pdf.

Comments and questions regarding MMET or any stage of the testing protocol process can be directed to Jamie Wolff (


Summer 2013

As the 2013 hurricane season continues in the North Atlantic and eastern North Pacific basins, a newly minted HWRF model is providing forecasts for the National Hurricane Center (NHC) on a new machine and with significant code additions. On July 24, the operational HWRF went live on the Weather and Climate Operational Supercomputing System (WCOSS). A research version for testing continues in use on the jet computers at the NOAA ESRL Global Systems Division. New, more efficient code promises to provide quicker processing, allowing timely forecasts and the opportunity to use more sophisticated physics routines.

HWRF simulated satellite image of TC Dorian

This year’s HWRF has several new features and options. Among the most significant are:

1. New data assimilation options. The HWRF can now assimilate wind information from the tail Doppler radar (TDR) on hurricane flight aircraft.

2. Use of a hybrid data assimilation system, which allows better use of observations to initialize the model.

3. Increased code efficiency, which allows physics packages to run at 30-second intervals as compared to last year’s 90 seconds.

“Ambitious plans for HWRF in 2014 and beyond include new data and multiple moving nests.”

Additionally, this year’s HWRF public release, for the first time, supports idealized tropical cyclone simulations and hurricane forecasts in basins beyond the eastern North Pacific and North Atlantic.

The DTC conducts testing and evaluation of HWRF, and also serves as an accessible repository for HWRF code. Software version control assures that HWRF developers at EMC, GSD, and other operational and research institutions obtain consistent results. Particular attention has been paid to facilitate the inclusion of new research developments into the operational configuration of all HWRF components. For instance, updated model routines for the Princeton Ocean Model for Tropical Cyclones (POM-TC), developed at the University of Rhode Island, can be seamlessly introduced.

Ambitious plans for the HWRF in 2014 and beyond include code allowing multiple moving nests in the same run, additional targets for data assimilation (dropsondes, etc.), and post-landfall forecasts of ocean surge, waves, precipitation, and inundation.

See the HWRF v3.5a public release announcement in this issue.

SREF and the Impact of Resolution and Physics Changes

Autumn 2013

As operational centers move inexorably toward ensemble-based probabilistic forecasting, the role of the DTC as a bridge between research and operations has expanded to include testing and evaluation of ensemble forecasting systems.

In 2010 the ensemble task area in the DTC was designed with the ultimate goal of providing an environment in which extensive testing and evaluation of ensemble-related techniques developed by the NWP community could be conducted. Because these results should be immediately relevant to the operational centers (e.g., NCEP/EMC and AFWA), the planning and execution of these DTC evaluation activities has been closely coordinated with the operational centers. All of the specific components of the ensemble system have been subject to evaluation, including ensemble design, post-processing, products, and verification. More information about the DTC Ensemble Task organization and goals can be found at:

"It appears that finer resolution improves SREF forecast performance more than changes in microphysics.”

Recently, efforts of the DTC Ensemble team have included evaluation of the impact that changes in the National Centers for Environmental Prediction/Environmental Modeling Center (NCEP/EMC) Short-Range Ensemble Forecast (SREF) configuration have had on its performance. The focus has been on two areas: the impact of increased horizontal resolution and the impact due to changes in the model microphysical schemes. In an initial experiment, SREF performance using 16 km horizontal grid spacing (the current operational setting) was compared with the performance of SREF with potential future horizontal grid spacing of 9 km. In the second experiment the focus was on changes in microphysical parameterizations.

In the current operational version of SREF only one microphysical scheme (Ferrier) is used. That version has now been compared with results from an experimental ensemble configuration that includes two other microphysics options (called WSM6 and Thompson). Although these preliminary tests have used only SREF members from one WRF core (WRF-ARW), future tests will add NMMB members into the analysis. The sets of comparison ensemble systems each consisted of seven members: a control, and two pairs of three members with varying initial perturbations. This preliminary study was performed over the transition month of May 2013, and over the continental US domain. By good fortune, the time period captured one of the most active severe weather months in recent history, promising an interesting dataset for future in-depth studies.

Verification for the set of runs was performed using the DTC’s Model Evaluation Tools (MET) for both single-value and probabilistic measures aggregated over the entire month of study. Some of the relevant results are illustrated in the accompanying figures, each of which displays arithmetic means from the corresponding ensemble system. The first figure shows box plots of bias-corrected root mean square error (BCRMSE) with analysis and two lead times for 850 mb temperature for the operational 16 km SREF (yellow), a parallel configuration with a different combination of microphysics (red), and the experimental 9 km setting (purple). For this preliminary run, it appears that finer resolution improves SREF forecast performance more than changes in microphysics. Indeed, the pairwise differences between the 16 km and 9 km SREF forecasts in the second figure represent a comparison for the 24 hr lead time that is statistically significant, albeit for a limited data sample. Additional detailed analyses of an expanded set of these data are under way.
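For readers unfamiliar with BCRMSE, the following minimal sketch shows the usual definition, the square root of the mean squared error minus the squared mean error, so that a constant bias does not count against the forecast. The function name and the sample temperatures are hypothetical, and the code is an illustration rather than the MET implementation.

```python
import math
from statistics import mean


def bcrmse(forecasts, observations):
    """Bias-corrected RMSE: sqrt(MSE - ME**2)."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    mean_error = mean(errors)                 # bias (ME)
    mse = mean(e * e for e in errors)         # mean squared error
    # Guard against tiny negative values from floating-point round-off.
    return math.sqrt(max(mse - mean_error ** 2, 0.0))


# Hypothetical 850 mb temperatures (K): the forecast is about 1 K too warm
# on average, with a smaller case-to-case error on top of that bias.
fcst = [274.0, 276.5, 271.5, 280.0]
obs = [273.0, 275.0, 271.0, 279.0]

rmse = math.sqrt(mean((f - o) ** 2 for f, o in zip(fcst, obs)))
print(f"RMSE={rmse:.3f}  BCRMSE={bcrmse(fcst, obs):.3f}")
```

Here most of the RMSE comes from the mean warm bias, so the BCRMSE is much smaller than the RMSE; the two are equal only when the mean error is zero.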

The Community Leveraged Unified Ensemble in the NOAA/Hazardous Weather Testbed Spring Forecasting Experiments


The Community Leveraged Unified Ensemble, or CLUE, is an unprecedented collaboration between academic and government research institutions to help guide NOAA’s operational environmental modeling at the convection-allowing scale. The CLUE is produced during the annual NOAA Hazardous Weather Testbed (HWT) Spring Forecasting Experiment (SFE), where the primary goal is to document performance characteristics of experimental Convection-Allowing Modeling systems (CAMs). The HWT SFE is co-organized by NOAA’s National Severe Storms Laboratory (NSSL) and the Storm Prediction Center (SPC).

Since 2007, the number of CAM ensembles examined in the HWT has increased dramatically, going from one 10-member CAM ensemble in 2007 to six CAM ensembles in 2015 that totaled about 70 members. With these large and complex datasets, major advances were made in creating, importing, processing, verifying, and providing analysis and visualization tools. After the 2015 SFE it was clear that progress toward identifying optimal CAM ensemble configurations was being inhibited by the contributions of independently designed CAM systems created with different research goals. This made it difficult to analyze performance characteristics. Furthermore, a December 2015 report by the international UCACN Model Advisory Committee, charged with developing a unified NOAA modeling strategy to advance the US to world leadership in numerical modeling, recommended that:

  1. The NOAA environmental modeling community requires a rational, evidence-driven approach towards decision-making and modeling system development,
  2. A unified collaborative strategy for model development across NOAA is needed, and
  3. NOAA needs to better leverage the capabilities of the external community.

In the spirit of these recommendations and recognizing the need for more controlled experiments, SFE organizers developed the concept of the CLUE system. Beginning with the 2016 SFE, the ensemble design effort was much more coordinated. All collaborators agreed on a set of model specifications (e.g., model version, grid-spacing, domain size, physics, etc.). Forecasts contributed by each group could then be combined to form one large, carefully designed ensemble, which constitutes the CLUE. The CLUE design for each year has been built around already existing funded projects led by external HWT collaborators. Thus, HWT partners can run experimental systems to meet the expectations of their funded projects, and at the same time contribute to something much bigger.

During the last three years of the SFEs, the CLUE configurations have enabled experiments focused around several aspects of CAM ensemble design including: impact of single, mixed, and stochastic physics, data assimilation strategies, impact of multi-model vs. single model, forecast skill of FV3, microphysics sensitivities, and impact of ensemble size. Collaborators have included the Center for Analysis and Prediction of Storms at the University of Oklahoma (OU), the National Center for Atmospheric Research, the University of North Dakota, the Multi-scale Data Assimilation and Predictability Laboratory at OU, NSSL, NOAA’s Global Systems Division of the Earth System Research Laboratory, and NOAA’s Geophysical Fluid Dynamics Laboratory.

The Developmental Testbed Center (DTC) has been a major contributor to the CLUE effort. To date, two DTC Visitor Program projects have involved examining the impact of radar data assimilation and mixed versus single physics using CLUE data. Furthermore, DTC has led much of the configuration design and verification of CLUE stochastic physics experiments. Finally, CLUE data is being used in a Model Evaluation Tool development project directed towards providing the ability to produce a CAM verification scorecard. Ultimately, the CLUE is a positive step towards improving US modeling and is already providing helpful insight for designing future operational systems, impacting broad sectors of the weather enterprise including NOAA’s efforts to develop a Weather-Ready Nation.

Hazardous Weather Testbed Spring Forecasting Experiments. Photo by James Murnan, NSSL.

Hierarchical Model Development and Single Column Models


Earth system models connect the atmosphere, ocean, and land, and depend on proper representations of dynamics and physics, initial conditions, and interactions of these processes to predict future conditions. Standard meteorological variables are used to validate typical numerical weather prediction models but are gross measures of these countless interactions and limit their usefulness for guiding model improvement. Some fraction of error in these metrics can be the result of specific physical parameterizations, but it can be difficult to trace the source. One solution is to isolate these parameterizations – compare them with something measurable. These process-level metrics can help us begin to understand and then address the systematic biases in a given parameterization before we can consider the root causes of systematic biases in a more fully-coupled model.

Single Column Model (SCM) testing is part of the hierarchical model development approach taken by the Global Model Test Bed (GMTB) under the Developmental Testbed Center (DTC). DTC/GMTB is a joint effort between NCAR and NOAA/ESRL, in collaboration with their external research-to-operations partners, and led by personnel in NCAR/RAL/JNT and NOAA/ESRL/GSD. SCMs are an excellent way to evaluate the performance of a set of model physics because many physical processes primarily interact in the vertical, with horizontal transport handled by the dynamics. Here, the model physical parameterizations are connected (as a column) and are provided with the necessary initial conditions and lateral forcing to investigate the evolution of the profile. SCM forcing may be from model, observational (e.g., from field programs), or idealized/synthetic data sets, to explore the response of the physics in different conditions, as well as to “stress test” parameterizations. In addition, the computational resources required to run an SCM are orders of magnitude smaller than those for a fully coupled model, so an SCM may run in seconds on a laptop. SCMs with options to turn various parameterizations on and off then allow for the examination of the interactions of those parameterizations, e.g., land plus surface-layer turbulence plus atmospheric boundary layer.
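To make the column idea concrete, here is a deliberately tiny toy, not the GMTB SCM: a single stand-in "parameterization" (constant-diffusivity vertical mixing) acting on one temperature column, driven by a prescribed surface heat flux. All names, the grid, and the parameter values are hypothetical choices for illustration. Even this toy shows why such tests are cheap and why budgets are easy to check in a column: the column-mean warming must equal the heat supplied at the surface.

```python
def step_column(temps, k_diff, dz, dt, surface_flux=0.0):
    """Advance a column temperature profile by one explicit time step.

    temps: layer temperatures (K), index 0 is the lowest layer.
    k_diff: constant eddy diffusivity (m^2/s), a stand-in for a PBL scheme.
    surface_flux: kinematic heat flux into the lowest layer (K m/s).
    """
    n = len(temps)
    # Upward fluxes at layer interfaces; index i is the bottom of layer i.
    flux = [0.0] * (n + 1)
    flux[0] = surface_flux                   # prescribed surface forcing
    for i in range(1, n):
        flux[i] = -k_diff * (temps[i] - temps[i - 1]) / dz
    flux[n] = 0.0                            # closed model top
    # Each layer warms by the convergence of the vertical flux.
    return [t + dt * (flux[i] - flux[i + 1]) / dz for i, t in enumerate(temps)]


# Hypothetical setup: 20 layers of 50 m, stable initial profile.
dz, dt, k_diff = 50.0, 60.0, 10.0            # k*dt/dz^2 = 0.24 (stable)
profile = [290.0 + 0.003 * dz * i for i in range(20)]
initial_mean = sum(profile) / len(profile)

for _ in range(60):                          # one hour of daytime heating
    profile = step_column(profile, k_diff, dz, dt, surface_flux=0.1)

final_mean = sum(profile) / len(profile)
print(f"column-mean warming: {final_mean - initial_mean:.3f} K")
```

Setting `surface_flux` to zero switches the forcing off, which is the toy analogue of turning a component off to see how the remaining physics behaves on its own.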

The question to answer is, “do we obtain the same performance when the parameterizations are run separately as we do when they are coupled?” A model can be tuned to obtain some required level of performance, but the more complex the system, the more likely it is that tuning accommodates a number of compensating errors rather than improving the model physics. What we are ultimately after is "getting the right answers for the right reasons," first testing a parameterization in isolation, then progressively adding parameterization interactions, up to an SCM. Using SCMs can enhance interactions with the Research-to-Operations (R2O) community, whose members often work on physics development but may lack the focus or the computer resources for fully coupled model runs, which could include data flow, data assimilation, model output post-processing, etc.

Note that at higher resolutions (model grid boxes on the order of 5-10 km or less), where circulations are induced between grid boxes, the evaluation of some physics (most notably convection and convective systems) requires at least a limited-area model to examine processes and identify systematic biases.  This is part of the hierarchy of model testing and development, in which the follow-on steps are regional, continental, and global-scale models, which have more traditional NWP metrics of performance.  Even so, one must still get the physics right, as judged by process-level metrics of performance.  We must “look under the hood” to see what is really going on if we are to make real improvements in the performance of Earth system and numerical weather prediction models.



Important local land-atmosphere interactions for conditions of daytime surface heating, where arrows indicate model processes for radiation, boundary-layer, and land.  Solid arrows indicate the direction of feedbacks that are normally positive (leading to an increase of the recipient variable).  Dashed arrows indicate negative feedbacks.  Two consecutive negative feedbacks make a positive feedback.
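The caption's rule about consecutive feedbacks generalizes: the net sign of a chain of feedbacks is the product of the signs of its links, so any even number of negative links gives a positive net feedback. A minimal sketch of that bookkeeping follows; the function name and the +1/-1 encoding are illustrative assumptions, not part of the original figure.

```python
def net_feedback_sign(link_signs):
    """Return +1 (positive) or -1 (negative) for a chain of feedback links.

    Each link is encoded as +1 (solid arrow, positive feedback) or
    -1 (dashed arrow, negative feedback).
    """
    sign = 1
    for s in link_signs:
        if s not in (1, -1):
            raise ValueError("each link must be +1 or -1")
        sign *= s
    return sign


if __name__ == "__main__":
    # Two consecutive negative feedbacks make a positive feedback.
    print(net_feedback_sign([-1, -1]))
```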


The Global Model Test Bed: Bringing the U. S. scientific community into NCEP global forecast model development

"Entraining a vibrant, diverse external community into NCEP global model development will bring broader dividends. ....Surely the U.S. can marshal its intellectual resources to do even better and create the world's best unified modeling system using the GMTB as a collaborative platform."

The DTC is at the core of an exciting new effort to more effectively bring the U. S. scientific community into the development of our national global weather forecast model, the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS).  This effort is part of NOAA’s ongoing Next Generation Global Prediction System (NGGPS) Program, which started in 2014.  NGGPS is a multimillion-dollar effort to support the implementation, testing, and refinement of community-driven improvements that aim to transform the GFS into a unified weather and seasonal forecast system (UFS) with world-leading forecast skill.

Up to now, the GFS has been developed primarily within NCEP’s Environmental Modeling Center (EMC) in College Park, MD.  This has naturally led to barriers to effective participation by the external community.  These barriers include a lack of documentation on how to run the model and on how the equations and parameterizations are implemented, limited model diagnostics and metrics, and the lack of a well-organized and accessible code repository.  Two further issues are software engineering that does not easily support the testing of major changes in the model physics and dynamics, and complications in accessing NOAA high-performance computing resources for model testing.

DTC’s Global Model Test Bed (GMTB), led by Ligia Bernardet and Grant Firl, is an ambitious project to make GFS/UFS model development much more user-friendly, catalyzing partnerships between EMC and research groups in national laboratories and academic institutions.  The GMTB aims to implement transparent and community-oriented approaches to software engineering, metrics, documentation, code access, and model testing and evaluation.

A first step in this direction, in collaboration with EMC, has been the design of an Interoperable Physics Driver and of standard interfaces for a Common Community Physics Package.  These software frameworks allow for easy interchange of dynamical cores or physical parameterizations.  For instance, the suite of physical parameterizations used in the GFDL or CESM climate models, or a new cumulus or microphysical parameterization, can be tried out within GFS.
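The value of a standard interface can be illustrated with a small sketch. This is not the actual Interoperable Physics Driver or Common Community Physics Package interface; it only assumes that every scheme takes the same column state and returns tendencies in the same form, so that a driver can call any suite of schemes without knowing what is inside them. The two schemes, the driver function, and all numbers are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, List[float]]     # variable name -> vertical profile
Scheme = Callable[[State], State]  # state in, tendencies (per second) out


def newtonian_cooling(state: State) -> State:
    """Hypothetical scheme A: relax temperature toward 250 K over one day."""
    return {"temperature": [(250.0 - t) / 86400.0 for t in state["temperature"]]}


def constant_heating(state: State) -> State:
    """Hypothetical scheme B: uniform heating of 1 K per day."""
    return {"temperature": [1.0 / 86400.0 for _ in state["temperature"]]}


def physics_driver(state: State, suite: List[Scheme], dt: float) -> State:
    """Call every scheme through the same interface and apply the summed
    tendencies over one time step of length dt (seconds).

    All schemes see the same input state; their tendencies are added together.
    """
    total = {name: [0.0] * len(profile) for name, profile in state.items()}
    for scheme in suite:
        for name, tendency in scheme(state).items():
            total[name] = [a + b for a, b in zip(total[name], tendency)]
    return {
        name: [value + dt * tend for value, tend in zip(profile, total[name])]
        for name, profile in state.items()
    }


if __name__ == "__main__":
    column = {"temperature": [300.0, 280.0, 260.0]}
    # Swapping or combining schemes only changes the suite list, not the driver.
    print(physics_driver(column, [constant_heating], dt=3600.0))
    print(physics_driver(column, [newtonian_cooling, constant_heating], dt=3600.0))
```

Because the driver depends only on the shared call signature, trying a new parameterization amounts to adding one function and listing it in the suite, which is the kind of interchange the frameworks described above are meant to make routine.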

At present, the GMTB supports the use of a single-column version of the global model, as well as global atmospheric simulations with a specified geographical distribution of sea-surface temperatures.  The single column is useful for running case studies that isolate particular physical processes, such as stable boundary layers or tropical oceanic deep cumulus convection.  Global hindcast simulations can be evaluated using the same set of metrics currently used at EMC for weather forecasts, which focus on forecast lead times of 10 days or less.  GMTB has already evaluated an alternative cumulus parameterization scheme within GFS using this approach, and may soon be testing alternative microphysical or boundary-layer parameterizations.

To realize the vision of a unified model that can be used out to seasonal timescales, the GFS must also be systematically tested at lower grid resolution in an ocean-coupled mode.  A ‘test harness’ of hindcast cases must be implemented for evaluating model performance in that setting, in which skill in forecasting modes of low-frequency variability such as ENSO, the Madden-Julian Oscillation, and the North Atlantic Oscillation, as well as land-atmosphere coupling, becomes paramount.  Metrics of two-week to six-month forecast skill must be agreed upon by EMC and the broader community and balanced with more typical measures of shorter-range weather forecast skill.  GMTB will need to implement both the test harness of coupled model simulations and the unified metrics suite.

Over the long term, GMTB will need to address a variety of other nontrivial challenges to be successful.  The most important is maintaining a close working relationship with EMC, such that the codes, metrics, and cases that EMC uses for evaluating new model developments for operational readiness are the same as those used by outside developers.  GMTB also needs streamlined access to dedicated high-performance computing, so that a new user can quickly begin modifying and running GFS without lengthy delays in obtaining the needed approvals and resources.  The above vision also makes GMTB responsible for serving as the help desk for outside GFS/UFS model developers, which will require sufficient trained staff and extensive improvement of model documentation.  GMTB will need to play an important role in model evaluation, promoting transparent, trusted decision-making about which model developments are ready to be considered for operational testing and implementation (though NCEP will have the final word on what gets implemented for operations).  Lastly, an important issue for the future scope of GMTB is whether and how to bring data assimilation, another key element of the forecast process, into this vision.

Entraining a vibrant, diverse external community into NCEP global model development will bring broader dividends.  More eyes will lead to more insight into model strengths and weaknesses, and young scientists will naturally learn about GFS and provide a talent pool for making it a world-leading model. The framework of interoperability could be broadened to include climate models such as CESM, allowing further cross-talk between the weather and climate modeling communities. The UK Met Office has demonstrated the strength of this approach; surely the U. S. can marshal its intellectual resources to do even better and create the world’s best unified modeling system using the GMTB as a collaborative platform.


Sample results for Hurricane Matthew from the GMTB evaluation of an alternative cumulus parameterization scheme, showing the daily averaged Upward Short Wave Radiative Flux (USWRF) and cloudiness for 2 Oct 2016 0Z. Left is the control GFS run, middle is the experimental run, and right is the difference in low cloud coverage between the two runs. Conclusion, the experimental run.