Bridges to Operations

Technical Aspects of Generalizing the UFS to use Multiple Dynamical Cores

Winter 2024

Following the scientific and programmatic discussions surrounding a potential shift of the Rapid Refresh Forecast System (RRFS) toward using the Model for Prediction Across Scales (MPAS) dynamical core (Carley et al. 2024), a Tiger Team was formed to scope out the technical work needed to add a second dynamical core (dycore) to the Unified Forecast System (UFS).  The Tiger Team focused on a two-pronged approach: scope out the inclusion of a generic new dycore in the UFS, and focus the majority of its work on the MPAS dycore. Similarly, since the drive for a new dycore comes from the RRFS team, the Tiger Team kept in mind the use of the MPAS dycore for all UFS Apps, while focusing primarily on the UFS Short Range Weather (SRW) App.

The Tiger Team collected input from UFS App leadership teams and from NSF NCAR. The connection with both the NSF NCAR Mesoscale and Microscale Meteorology (MMM) and Climate and Global Dynamics (CGD) Laboratories is relevant because MMM develops and uses the MPAS model, while CGD uses the MPAS atmospheric dycore in the Community Atmosphere Model (CAM), the atmospheric component of the Community Earth System Model (CESM). The solution proposed by the Tiger Team is similar to the one used by CGD. The vision is to use the MPAS dycore in the UFS without using the entire MPAS code available on GitHub, which comes with additional components such as a driver, a framework, and additional utilities. This arrangement will allow the UFS to retain core parts of its infrastructure, such as its workflow, connection with physics via the Common Community Physics Package (CCPP), Input/Output (I/O) strategy, post-processing capability, and product generation. While this approach is more costly initially, it will save resources in the long run, facilitate community engagement, and smooth the path for bringing innovations into NOAA operations.

The technical challenges of including the MPAS dycore in the UFS were estimated to be on the order of 13 full-time equivalent employees and can be grouped into the main areas below. Additional resources may be needed to support NSF NCAR for this collaboration and to conduct scientific testing. It should be noted that this level of effort corresponds to the initial integration of the MPAS dycore in the UFS, and does not represent the ongoing overhead of maintaining a dual dycore forecast system.

  • Generalization of the UFS atmospheric component.  Portions of code tie directly to the Finite-Volume Cubed-Sphere (FV3) dynamics, and these portions need to be generalized to support multiple dycores. The build system needs to be modified to accommodate this generalization.
  • Code management and testing. A code management plan needs to be devised jointly with NSF NCAR to manage the insertion and potential updates to the MPAS dycore. New regression tests need to be added to the UFS Weather Model to cover the new dycore.
  • Pre-processing. It will be necessary to integrate already-existing MPAS utilities to prepare initial condition and static files into the UFS. Tools to obtain/create new MPAS meshes will need to be available to the community.
  • Data assimilation. Significant work is needed to connect the Joint Effort for Data assimilation Integration (JEDI) with the UFS Weather Model and with RRFS in particular. That said, given that the JEDI interfaces are model agnostic and that the JEDI-MPAS capability already exists, there is no new cost generated by the dynamical core switch. It is assumed that no efforts will be made to integrate the MPAS dycore with the legacy Gridpoint Statistical Interpolation (GSI) data assimilation system.
  • Physics-dynamics coupling. While the CCPP offers model-agnostic interfaces, the substantial differences in physics dynamics coupling between FV3 and MPAS will demand some adjustments. Those pertain to where in the dycore the physics tendencies should be applied, differences in time-split versus process-split approaches, conversions toward the MPAS height-based coordinates, and to the development of MPAS-specific interstitial schemes. Additional effort will be needed to adapt existing stochastic processes.
  • Inter-component coupling. The MPAS National Unified Operational Prediction Capability (NUOPC) cap existing in CAM/CESM will be leveraged to expose the MPAS dycore geometry and domain decomposition in the cap of the UFS atmospheric component. Aspects of data memory allocation, run sequence, and import/export of fields will need to be addressed.
  • Input/Output and post-processing. Since MPAS outputs data on its native unstructured mesh, additional tools will be needed to convert the output to the desirable lat-lon grids. First, stand-alone tools can be used. Ultimately, to improve performance for operations, the Unified Post-Processor (UPP) and the UFS asynchronous I/O component need to be generalized to write out the desired products.
  • Workflow. The workflow(s) will have to be modified to include the MPAS-specific tasks.
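
The time-split versus process-split distinction listed under physics-dynamics coupling can be illustrated with a toy example. The sketch below is generic and is not taken from FV3, MPAS, or the CCPP: the function names, the scalar state, and the two tendency functions are hypothetical. It assumes the common usage in which process-split schemes all compute tendencies from the same input state, while time-split schemes each see the state already updated by the previous scheme.

```python
from typing import Callable, Sequence

Tendency = Callable[[float], float]  # hypothetical: maps a scalar state to d(state)/dt


def process_split_step(state: float, processes: Sequence[Tendency], dt: float) -> float:
    """All processes see the same input state; their tendencies are summed."""
    total_tendency = sum(process(state) for process in processes)
    return state + dt * total_tendency


def time_split_step(state: float, processes: Sequence[Tendency], dt: float) -> float:
    """Each process sees the state already updated by the previous process."""
    for process in processes:
        state = state + dt * process(state)
    return state


# Two made-up linear damping processes, for illustration only.
def damping_a(x: float) -> float:
    return -0.5 * x


def damping_b(x: float) -> float:
    return -0.25 * x


if __name__ == "__main__":
    print(process_split_step(2.0, [damping_a, damping_b], dt=1.0))
    print(time_split_step(2.0, [damping_a, damping_b], dt=1.0))
```

The two calls give different answers for the same inputs, which is why the point in the time step at which physics tendencies are applied has to be settled when a suite is moved between dycores.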

For more information about this effort, readers are referred to Wang et al. (2023).

METplus in Operations

Autumn 2023

During the first few years of the Developmental Testbed Center (DTC), the United States Air Force (USAF) put forth a vision for the DTC to develop a suite of verification tools that would provide reproducible results such that statistics and metrics could be shared across institutions. The USAF envisioned a suite of tools that would include both standard verification approaches and advanced diagnostic methodologies. In 2008, the DTC released the Model Evaluation Tools (MET) software package with the intent to provide state-of-the-science verification and diagnostic tools to the community. Over the past 15 years, an enhanced version of MET, called METplus, has become a software system collaboratively developed by the DTC and community partners.

Several operational entities are nearing completion of their research-to-operations (R2O) process to include METplus in their operational verification systems. NOAA’s Environmental Modeling Center (EMC) will be the first to make the transition to a system built using METplus. The first version of the EMC Verification System (EVS v1), which employs METplus to generate real-time model verification statistics, will become operational in fiscal year (FY) 2024. Since FY2022, EMC, DTC, and the National Centers for Environmental Prediction (NCEP) Central Operations (NCO) have collaborated to install METplus on the Weather and Climate Operational Supercomputer System (WCOSS2), working through software library issues to ensure METplus meets rigorous WCOSS2 security and installation standards.

Through highly collaborative meetings and frequent communication, EMC and DTC developed a suite of best practices to streamline the METplus installation process with NCO on WCOSS2, drastically reducing the time from the METplus software release to operational capability (e.g., from months to days). Ultimately, METplus allows EVS v1 to publish plots for most real-time EMC environmental forecast models to a web page, enabling anyone with internet access to see how EMC models are currently performing.

EVS version 2 is planned to be an extension of EVS v1 with additional metrics and capability. Development of the EVS v2 system will begin shortly after version 1 becomes operational. The current development plan includes options for the system to run outside of WCOSS2, perhaps in a cloud environment or on other machines.

Global (GFS) Headline Scores

The METplus governance partners are also developing operational systems powered by METplus. Besides a representative from NOAA labs and centers, the governance partners include NSF NCAR, USAF, United Kingdom Met Office (UKMO), Naval Research Laboratory (NRL), and Australia’s Bureau of Meteorology (BoM). How METplus is being incorporated into the USAF evaluation system is described in detail in the “Director’s Corner” of this newsletter. The USAF plans for METplus to become its cybersecure verification software stack that provides the standard for upgrading its models.

The Met Office decided in 2019 to adopt a MET-based system as the replacement for its operational verification system. A first complete instantiation of the current operational system capability will go into parallel testing ahead of operational implementation in mid-2024. The long-term plan is for Met Office scientists and software engineers to become integral members of the open-source development community beyond the DTC. To date, code contributions have been small but significant (e.g., the first inclusion of OpenMP), as have contributions of subject-matter expertise and beta testing for seven coordinated METplus releases.

Both NRL and Australia's BoM are in the beginning phases of testing METplus for their operational systems. NRL has worked both collaboratively and via funded projects with the NCAR node of DTC to develop support for their ensemble, atmospheric composition, and data assimilation systems. The BoM has been providing in-kind testing and code-management capability to all aspects of METplus to reach their near-term goal of a quick transition of METplus to operational use. In summary, it has taken 15 years, but the vision of a framework of consistent verification tools for the community is becoming more attainable as each institution adopts METplus for both operational use and system development.

Informing and Optimizing Ensemble Design in the RRFS

Summer 2023

The transition from current operational convection-allowing models (CAMs) to the Rapid-Refresh Forecast System (RRFS) involves a significant paradigm shift in CAM ensemble forecasting for the US research and operational communities. In particular, implementing a single-dycore and single-physics RRFS will allow a number of deterministic modeling systems to be retired, as well as the multi-dycore and multi-physics High-Resolution Ensemble Forecast (HREF) system. Such a change brings up a number of ensemble design questions that need to be addressed to create an RRFS ensemble that effectively addresses concerns about sufficient spread and forecast accuracy. Single-dycore and single-physics CAM ensembles are known to lack sufficient spread to account for all potential forecast solutions; therefore, a concerted research effort is necessary to evaluate potential options to address these concerns.

To this end, the DTC has been conducting two RRFS ensemble design assessments using both previous real-time data and retrospective, cold-start simulations in an operationally similar configuration. The first evaluation assessed time-lagging for the RRFS, with quantitative comparisons conducted between two RRFS-based ensembles run in real-time for the 2021 Hazardous Weather Testbed Spring Forecasting Experiment (HWT-SFE).

A nine-member time-lagged ensemble was created for a period of 13 days during the 2021 HWT-SFE by combining five members initialized at 00Z plus four members from the previous 12Z initializations. While the time-lagged members contain longer lead times, they still provide valuable predictive information, and so the 5+4 constructed time-lagged (TL) ensemble’s skill was compared with the full single-init (SI) ensemble, where all nine members came from the initialization at 00Z.

While the primary goal of a time-lagged ensemble is to derive additional predictive information from older initializations without the added cost of numerical integration, the results revealed that a set including TL members can contribute additional spread when compared to an SI ensemble of the same size, with little to no degradation in skill. This result was observed in most variables, atmospheric levels, and ensemble metrics, including evaluations of spread and accuracy, reliability, Brier score, and more. Almost all metrics exhibited a matched (and occasionally improved) level of skill and reliability between the TL and SI experiments, while many variables showed a statistically significant improvement to spread.
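
The 5+4 construction described above can be sketched in a few lines. This is a minimal illustration rather than the DTC's evaluation code: the data layout (one dictionary per member, keyed by lead hour), the function names, and the sample temperatures are all hypothetical, and spread is assumed here to mean the sample standard deviation of the members about the ensemble mean.

```python
from math import sqrt
from statistics import mean, stdev

LAG_HOURS = 12  # the lagged members come from the initialization 12 h earlier


def time_lagged_values(current_members, lagged_members, lead_hour, lag_hours=LAG_HOURS):
    """Collect member values that are valid at the same time.

    Each member is a dict mapping forecast lead hour -> forecast value.
    A lagged member must be read at a longer lead to match the valid time.
    """
    values = [member[lead_hour] for member in current_members]
    values += [member[lead_hour + lag_hours] for member in lagged_members]
    return values


def ensemble_spread(values):
    """Sample standard deviation of the members about the ensemble mean."""
    return stdev(values)


def ensemble_mean_rmse(ensemble_means, observations):
    """RMSE of the ensemble mean over a set of matched cases."""
    squared = [(f - o) ** 2 for f, o in zip(ensemble_means, observations)]
    return sqrt(mean(squared))


# Made-up 2-m temperatures (K), for illustration only.
members_00z = [{6: 280.0}, {6: 281.0}, {6: 282.0}, {6: 283.0}, {6: 284.0}]
members_prev_12z = [{18: 279.0}, {18: 281.0}, {18: 283.0}, {18: 285.0}]

if __name__ == "__main__":
    tl_values = time_lagged_values(members_00z, members_prev_12z, lead_hour=6)
    print(len(tl_values), mean(tl_values), ensemble_spread(tl_values))
```

Because the lagged members are read from forecasts that already exist, assembling the larger ensemble requires no additional model integration, only bookkeeping of valid times.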

In preparation for the first implementation of the RRFS, results were communicated to model developers and management at NOAA/GSL and NOAA/EMC, the principal laboratories working on the system. Due to computational constraints, and informed by our findings, a decision was made to pursue the use of time-lagging as part of the ensemble design of RRFSv1.

A second evaluation will explore the impact of initial condition (IC) uncertainty and stochastic physics in retrospective cold-start simulations. These experiments employ initial condition perturbations from GEFS forecasts and are run with and without stochastic physics, both to isolate the impact of each uncertainty representation method and its evolution as a function of forecast lead time, and to investigate potential optimizations. Given these results, in combination with the time-lagging experiments, we hope to help address the pressing need for operational systems to produce an adequate sample of potential outcomes while balancing the constraints of computational cost.

Time-Lagged ensemble paintball plot of 30 dBZ composite reflectivity thresholds. 00Z initialized members are shown in warm colors (mem1,6-9), while lagged members from the previous 12Z initialization are shown in cool colors (mem2-5). MRMS observations in black contours.

2-m temperature comparison of accuracy (root mean squared error; RMSE) and spread of the time-lagged ensemble vs single-init ensemble of the same size. While the single-init ensemble showed more accuracy in the first few evaluation hours, skill was often matched by the TL ensemble for the remainder of the period, while spread for the TL ensemble was consistently improved over the SI ensemble.

Initial Operational Implementation of the UFS Hurricane Application

Spring 2023

The initial operational implementation of the Hurricane Analysis and Forecast System (HAFS) for tropical cyclone (TC) forecasting was recently approved for operations by the NOAA National Centers for Environmental Prediction (NCEP) Central Operations (NCO) in advance of the 2023 Atlantic basin hurricane season. HAFS is the operational instantiation of the Unified Forecast System (UFS) hurricane application, with a focus on transitioning TC modeling research to operations. HAFS is an atmosphere-ocean-wave coupled TC forecast system, featuring convection-allowing high-resolution storm-following nests, vortex initialization, inner-core data assimilation, and TC-calibrated model physics.

The development of HAFS began in 2019, when increasingly complex configurations of HAFS (HAFSv0.0 through HAFSv0.3) were run and evaluated during the Hurricane Forecast Improvement Project (HFIP) real-time demonstration (see figure below), providing the groundwork for HAFSv1.0.

Timeline of HAFS development, adapted from EMC.

For the initial operational capability of HAFS (HAFSv1.0), two distinct configurations were selected to replace the existing regional operational hurricane forecast systems, Hurricane Weather Research and Forecast (HWRF) and Hurricanes in a Multi-scale Ocean-coupled Non-hydrostatic Model (HMON). The HAFSv1.0a (HFSA) configuration will replace HWRF, whereas the HAFSv1.0b (HFSB) configuration will replace HMON. Both HAFS configurations include a storm-centric domain with one moving nest (at 6-km and 2-km grid spacing, respectively), 81 vertical levels, and a 2-hPa model top. Additionally, both configurations employ four-dimensional ensemble variational (4DEnVar) data assimilation with warm-cycling vortex initialization (VI) and two-way HYbrid Coordinate Ocean Model (HYCOM) ocean coupling.

For HFSA, unique features include a slightly larger parent domain than HFSB, one-way wave (WAVEWATCH III, WW3) coupling, a greater maximum wind threshold for VI, and up to seven storms run for all global basins. The HFSB configuration does not include wave coupling, and runs up to five storms for National Hurricane Center (NHC) and Central Pacific Hurricane Center (CPHC) basins only. In order to provide additional diversity, HFSA and HFSB apply two different physics suites. The major differences between the HFSA and HFSB physics suites stem from the planetary boundary layer (PBL) and microphysics schemes, with the HFSB configuration employing an enhanced TC PBL option (Chen et al. 2022) and the Thompson double-moment microphysics scheme, as opposed to the GFDL single-moment microphysics for the HFSA configuration. Both HAFS configurations demonstrated improved track and intensity skill relative to the current operational hurricane models over a 3-year retrospective period covering all storms in the Atlantic and Eastern Pacific basins, leading to the operational implementation of HAFS as part of NOAA NCO.

Throughout this process, the DTC provided software management and community support to ensure that distributed development efforts for HAFS were achievable through strong code governance and a developer support system. In addition to their community engagement role, DTC also conducts independent testing and evaluation (T&E), which focuses on providing value-added evaluations for HAFS forecasts during the pre-implementation process. For HAFSv1.0, the DTC used the enhanced Model Evaluation Tools (METplus) to provide TC-centric evaluations such as track, intensity, rapid intensification, large-scale, and quantitative precipitation forecast (QPF) verification. These evaluations provided additional evidence and support for the HAFSv1.0 implementation into operations. For example, the evaluation provided confidence in the implementation of Thompson microphysics for one HAFS configuration based on the improved precipitation structure when compared to that produced by the GFDL microphysics.

To further support these efforts, the DTC Visitor Program is supporting three community principal investigators (PI) to work on projects aimed at the transition of research developments to operations and ultimately improved HAFS forecasts. Projects by Andrew Hazelton (Atlantic Oceanographic and Meteorological Laboratory (AOML)/Hurricane Research Division (HRD) and University of Miami (UM)/Cooperative Institute for Marine and Atmospheric Studies (CIMAS)), Mike Iacono and John Henderson (Atmospheric and Environmental Research), and Shaowu Bao (Coastal Carolina University) all aim to provide improvements and diversity to the physics parameterizations used in the HAFS configurations. As preparations for HAFSv2.0 are gearing up, these visitor projects, aligned with continued DTC T&E activities, are well positioned to impact the next HAFS implementation, planned for 2024. 

This significant milestone was achieved through a collaborative effort, led by NOAA NCEP’s Environmental Modeling Center (EMC), including active development and verification efforts from 

  • NOAA Atlantic Oceanographic and Meteorological Laboratory's Hurricane Research and Physical Oceanography Divisions,
  • NOAA Geophysical Fluid Dynamics Laboratory, 
  • NOAA National Hurricane Center,
  • Developmental Testbed Center, 
  • National Center for Atmospheric Research, 
  • Naval Research Laboratory, 
  • University of Oklahoma, 
  • University of Miami / Cooperative Institute for Marine and Atmospheric Studies, 
  • University of Maryland, 
  • State University of New York / University at Albany, and
  • University of Alabama Huntsville.

For more news related to HAFS, see the Lead Story article: The CCPP Goes Operational.

Informing NCEP Legacy Operational Model Retirement Through Scorecards

Winter 2023

NOAA is undertaking a massive, community-driven initiative to unify the NCEP operational model suite under the Unified Forecast System (UFS) umbrella. A key component of this effort is transitioning from the legacy systems to unified Finite-Volume Cubed-Sphere (FV3)-based deterministic and ensemble operational global and regional systems. For the UFS, the goal is to consolidate operational models around a common software framework, reduce the complexity of the NCEP operational suite, and maximize available HPC resources, which is especially imperative with a shift toward using ensemble-based operational systems. As such, a number of current operational systems are slated to be retired; however, before systems can be phased out, the upcoming systems need to perform on par with or better than the systems they are replacing.

To address this evaluation requirement, the DTC was charged with creating performance summary “scorecards” to inform model developers, key stakeholders, and decision makers on the retirement readiness of legacy systems, as well as highlight areas that can be targeted for improvement in future versions of UFS-based operational systems. Scorecards are used as a graphical synthesis tool that allows users to objectively assess statistically significant differences between two models (e.g., the UFS-based Rapid Refresh Forecast System (RRFS) and one of the operational systems) for user-defined combinations of forecast variables, thresholds, and levels for select verification measures.

This exercise focused on evaluating the UFS-based Global Forecast System (GFS) against the North American Mesoscale (NAM) model and Rapid Refresh (RAP) model, as well as the UFS-based Global Ensemble Forecast System (GEFS) against the Short-Range Ensemble Forecast (SREF) system. The eventual goal is to replace the NAM and RAP with the GFS for medium-range forecasting, and replace the SREF with the GEFS as a medium-range ensemble-based system. The scorecards were created with the METplus Analysis Suite using verification output from April 2021 - March 2022; verification output was provided by NOAA/EMC (special thanks to Logan Dawson and Binbin Zhou at EMC for facilitating the data transfer!). The provided verification output allowed for deterministic, ensemble, and probabilistic grid-to-grid and grid-to-point evaluations over four seasonal aggregations.
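
To make the idea of a scorecard cell concrete, the sketch below classifies one variable/level/lead-time combination from paired error scores of two models. It is a simplified stand-in and not necessarily how the METplus Analysis Suite determines significance: the function name, the labels, the sample numbers, and the choice of a normal-approximation 95% confidence interval on the mean paired difference are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def scorecard_cell(errors_a, errors_b, z=1.96):
    """Classify one scorecard cell from paired per-case errors (lower is better).

    Uses a normal-approximation confidence interval on the mean paired
    difference (model A minus model B). The cell is marked significant only
    when the interval excludes zero.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need at least two matched cases")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = mean(diffs)
    half_width = z * stdev(diffs) / sqrt(len(diffs))
    if mean_diff + half_width < 0:
        return "A better"
    if mean_diff - half_width > 0:
        return "B better"
    return "no significant difference"


if __name__ == "__main__":
    # Made-up per-case RMSE values for two hypothetical models.
    print(scorecard_cell([1.0, 1.1, 0.9, 1.0, 1.2, 0.8],
                         [2.0, 2.1, 1.9, 2.1, 2.2, 1.7]))
    print(scorecard_cell([1.0, 2.0, 1.0, 2.0],
                         [2.0, 1.0, 2.0, 1.0]))
```

A full scorecard repeats this classification over every combination of variable, threshold, level, and lead time and renders the labels as colored symbols. For skill scores where higher is better, the sign convention would be reversed.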

Key results indicated promising GFS performance against the NAM, but the RAP is still largely competitive. When evaluating the GFS against the NAM, precipitation is consistently better forecast in the GFS. For upper-air fields, the GFS generally performs as well as or better than the NAM; however, convective-season upper-air forecasts could be a target for improvements in the GFS, as the NAM was most competitive during this period. When evaluating the GFS against the RAP, surface-based and low-to-mid-level verification for the GFS is generally on par with or worse than the RAP. The GFS scores well aloft and with cold-season precipitation, but focused improvement should be directed toward warm-season precipitation. When considering GEFS versus SREF, the GEFS performance was slightly better overall in the fall and winter seasons, but worse in the spring and summer seasons. Results have been shared through presentations at the weekly NOAA/EMC Model Evaluation Group (MEG) meeting as well as a UFS Short-Range Weather/Convection-Allowing Model (SRW/CAM) Application Team meeting. Examples of the scorecards created during this evaluation are provided in Figures 1 and 2.


Figure 1. Scorecard of Gilbert Skill Score (GSS) for 3-h accumulated precipitation at specified thresholds and forecast lead times for the 00 UTC initializations over the period July 1, 2021 - Sept. 30, 2021. Results indicate that after the 12-h forecast lead time, when there are statistically significant differences, the GFS outperforms the NAM.

Figure 2. Scorecard of Continuous Ranked Probability Score (CRPS), bias, and root-mean-square error (RMSE) for 2-m temperature, 2-m relative humidity, U-component of the wind, V-component of the wind, pressure reduced to mean sea level (PRMSL), and total cloud for various forecast lead times for the 12/15 UTC initializations over the period April 1, 2021 - June 30, 2021. Results are mixed, with GEFS having slightly worse performance overall; however, GEFS does frequently outperform SREF at 06-, 24-, 30-, 48-, 54-, 72-, and 78-h forecast lead times.
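
For readers unfamiliar with the Gilbert Skill Score used in Figure 1, the sketch below computes it from a 2x2 contingency table of threshold exceedances, using the standard definition (also known as the Equitable Threat Score). The function name and the sample counts are illustrative only and are not taken from the evaluation.

```python
def gilbert_skill_score(hits, misses, false_alarms, correct_negatives):
    """Gilbert Skill Score from a 2x2 contingency table.

    GSS = (hits - hits_random) / (hits + misses + false_alarms - hits_random),
    where hits_random is the number of hits expected by chance.
    Returns NaN when the score is undefined (zero denominator).
    """
    total = hits + misses + false_alarms + correct_negatives
    if total == 0:
        return float("nan")
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        return float("nan")
    return (hits - hits_random) / denominator


if __name__ == "__main__":
    # Made-up counts for one precipitation threshold and lead time.
    print(gilbert_skill_score(hits=50, misses=25, false_alarms=25, correct_negatives=900))
```

A perfect forecast scores 1, and a forecast with no skill beyond chance scores 0, which is why differences in GSS between two models are a natural quantity to test for significance in a scorecard.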

Single-Precision Physics in CCPP

Autumn 2022

A constant struggle in NWP model design is the tradeoff between scientific improvements and computational cost. A method commonly used to balance that tradeoff is lowering some numerical calculations to single precision (or 32-bit calculations). Often, single-precision calculations have enough precision for physics, and they reduce disk storage, memory usage, and computation time. To apply single-precision calculations correctly, it is necessary to carefully evaluate and fine-tune algorithms, and perhaps in isolated places of the code, perform calculations in double precision (64 bits).
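
The need to keep isolated calculations in double precision can be illustrated with a toy accumulation. The sketch below is not CCPP code: the function names and numbers are hypothetical, and 32-bit arithmetic is emulated in pure Python by rounding each result to the nearest 32-bit float with the standard `struct` module. Near a value of 300 (a typical temperature in kelvin), adjacent 32-bit floats are about 3e-5 apart, so a smaller increment is lost on every addition.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate_single(start: float, increment: float, steps: int) -> float:
    """Add the increment repeatedly, rounding to 32 bits after every addition."""
    total = to_f32(start)
    inc = to_f32(increment)
    for _ in range(steps):
        total = to_f32(total + inc)
    return total


def accumulate_mixed(start: float, increment: float, steps: int) -> float:
    """Accumulate in 64 bits and round to 32 bits only once, at the end."""
    total = float(start)
    for _ in range(steps):
        total += increment
    return to_f32(total)


if __name__ == "__main__":
    # Made-up numbers: 1000 small increments applied to a value near 300.
    print(accumulate_single(300.0, 1e-5, 1000))
    print(accumulate_mixed(300.0, 1e-5, 1000))
```

In the single-precision loop every increment is rounded away and the total never moves, while the mixed-precision version retains the accumulated change. Identifying the few places where this kind of loss matters is part of the careful evaluation and tuning described above.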

This approach is already used in several operational models. For example, the NOAA operational RAP and HRRR models (which use the WRF model) primarily use single-precision physics and the ECMWF Integrated Forecasting System (IFS) model recently switched the bulk of its physics calculations from double to single precision. However, until recently, the DTC-hosted Common Community Physics Package (CCPP), which is directly used by the UFS modeling system, was missing this key capability.

Earlier this year, the US Naval Research Laboratory (NRL) successfully enabled a single-precision physics suite in the Navy Environmental Prediction System Using the NUMA CorE (NEPTUNE) model. They focused on developing the RAP software suite available via the CCPP, leveraging previous work done over the years by the broad community that developed the RAP model. This suite had previously been widely tested and carefully tuned to work well with most of the code in single precision, while retaining key parts of the code in double precision. NRL’s work corrected some imprecise calculations and troublesome constants, addressing issues that were not present in the WRF version of the same parameterizations. 

This technical achievement by NRL, which was made available to the CCPP authoritative code repository, paved the way for the DTC to make the single-precision RAP suite more widely available. The DTC generalized the code and added support to the UFS, which employs the FV3 dynamical core. Since the UFS already supported both single and double precision for dynamics, the logic of mixed precision already existed, at least conceptually. Therefore, most of the work was of a mechanical nature: fixing type mismatches in the code and in connections to coupling, libraries, and stochastic physics.

The outcome of this work is that the single-precision RAP suite now works technically in the UFS. So far it has only been tested at low resolution, and results indicate that it runs approximately 25% faster than at double precision. 

There is much work to be done before single-precision physics in CCPP can be considered a finished product. First, scientific validation of the RAP suite has to be conducted, which may reveal the need for additional code adjustments. Second, further development could reduce the computational cost: the UFS only reads and writes files with double-precision floating point, and some areas of the code may still convert between single and double precision unnecessarily. Finally, additional suites could be made to work in single precision. This work has laid the foundation for a budding capability on which the DTC and the community will be able to build, but only if agencies continue to invest in this development.

Creation of an Agile RRFS Prototype Testing Framework to Inform and Accelerate Operational Implementation

Summer 2022

Within the effort to unify the NCEP operational model suite, taking place under the Unified Forecast System (UFS) umbrella, a key area of interest is the evolution of legacy operational, convection-allowing systems to a new, unified Finite-Volume Cubed-Sphere (FV3)-based deterministic and ensemble storm-scale system called the Rapid Refresh Forecast System (RRFS). The ongoing transition from the existing NOAA NWP systems to the UFS is a major multi-year undertaking, with the RRFS targeted for initial operational implementation in late 2024. As part of the UFS, a new model-development paradigm is taking shape that focuses human resources and expertise from across the meteorological community on fewer systems, allowing for effective model development, and ultimately improved forecast skill, across the full NCEP modeling suite. In addition, simplification of the operational NCEP suite will optimize existing and future high-performance computer resources as well as reduce the overhead associated with maintaining multiple systems.

In order to successfully replace legacy regional prediction systems (i.e., the NAM nests, HREF, RAP, and HRRR) with a single regional, convection-allowing ensemble, a phased retirement approach will be employed to ensure that the RRFS performs on par with each convection-allowing model. Progression toward an eventual operational implementation of the RRFS requires coordinated development across several interconnected areas spanning the dynamical core, data assimilation, and chosen physics suite. Integral throughout this process is careful objective and subjective diagnostic analysis of forecast output in the forms of case studies and metrics as development of the RRFS evolves. In addition, continued engagement of model-development groups across NOAA, academia, and private industry, along with input from stakeholders and end users, is critical. To this end, a testing and evaluation framework is needed to incrementally assess various innovations from the community in a thorough and transparent manner, as model development is evaluated for potential inclusion in the RRFS.

The preliminary 3-km RRFS computational domain (shown in red), with initial plans for output grids (blue).


In preparation for this work, the DTC has been partnering and collaborating closely with teams including the UFS-R2O Short-Range Weather/Convection-Allowing Model (SRW/CAM) sub-project and several additional UFS-R2O cross-cutting teams to address the need of assessing RRFS performance.

As part of this initiative, the DTC recently completed a thorough comparison of the GFS versus the NAM and RAP using METviewer scorecards to inform the beginning stages of the legacy regional model retirement process. For the upcoming year, the DTC’s role in the UFS-R2O SRW/CAM sub-project will continue by establishing and exercising an agile benchmark testing framework through which output from end-to-end RRFS prototypes can be quickly verified using standard observational data and compared to verification of the operational CAM-based regional systems. This verification capability will be modeled after the coupled system, benchmark testing paradigm currently being used for global system evaluations at EMC, and will be run over important RRFS retrospective periods large enough for statistical significance, but small enough to allow for rapid prototyping and turnaround. The ability to iteratively evaluate RRFS prototypes against both observations and legacy, operational systems will provide model developers with not only a baseline for the current RRFS prototype, but also the ability to identify future development work and areas for potential innovation on a timely basis, crucial for an on-time delivery of the RRFS into operations in FY24. Regular evaluation of RRFS prototypes will also allow the community to participate through continuous engagement, facilitating collaboration from across the weather enterprise.

Extensive SRW App development has taken place over the past year, including integration of the advanced Model Evaluation Tools (METplus) into the App workflow, the result of several DTC testing and evaluation (T&E) projects. Using recent advances in the workflow, as well as new verification functionality established in these efforts, work on an RRFS agile testing framework will begin with the goal of providing this framework to the community by the spring of 2023. To this end, agile framework development will be contributed to the authoritative SRW App repository for future distribution through the develop branch as well as in an upcoming release of the SRW App. It will also facilitate evolving prototype T&E activities. The successful implementation of the RRFS will depend on using the framework to rapidly evaluate multiple prototypes, an effort that will rely on both DTC evaluation and collaborative engagement of the weather community in testing subsequent prototypes. One potential avenue for this kind of community engagement is the DTC Visitor Program.

Flow chart illustrating the model development and testing paradigm that the agile framework will be facilitating.

Physics Assessments in Support of the Upcoming GFS and GEFS 2024 Implementations

Spring 2022

The next operational implementations of the Global Forecast System (GFS) and Global Ensemble Forecast System (GEFS) are not scheduled until 2024, but work is underway to code and test a number of upgrades to these modeling systems. While innovations are planned in all aspects of the end-to-end system, the DTC has been particularly involved in supporting the development and improvement of the physics suite as a member of the Unified Forecast System (UFS) Research-to-Operations (R2O) physics subproject.

The GFS and GEFS are configurations of the UFS used for operational numerical prediction. Their 2024 operational implementations will use the Common Community Physics Package (CCPP) for the first time. To prepare for this, a CCPP-based configuration of the GFS using the current operational physics suite was created to serve as a baseline for future development. DTC conducted a thorough assessment of this baseline using process-oriented methods to highlight physical relationships responsible for forecast biases. For example, examination of the relationship between precipitable water and precipitation suggested there is still room for improvement in the triggering of deep convection over the central and eastern contiguous United States (CONUS). Additionally, a case study was used for an in-depth examination of a known low bias in convective available potential energy (CAPE) over the CONUS, which suggested a problem with the representation of soil moisture, resulting in reduced evaporation and an excessively dry planetary boundary layer (PBL; figure below).

Time-height plot of observed (left) and simulated (right) potential temperature (K, contours) and water vapor mixing ratio (g kg⁻¹, shaded). The thick black lines denote the PBL height. Figure courtesy of Xia Sun (CIRES at NOAA/GSL and DTC).

On top of this baseline, physical parameterizations were added to or improved in the CCPP, and then assessed individually and incrementally to determine their suitability for the upcoming implementation. The DTC staff supported physics developers in adding their innovations to CCPP, conducting experiments in one- and three-dimensional configurations, and analyzing results from their own runs as well as from runs conducted by developers and the NOAA Environmental Modeling Center (EMC).

DTC contributed to a number of evaluations of alternate gravity wave drag (GWD) parameterization configurations, including the assessment of the small-scale orographic GWD implemented in the CCPP by Michael Toy of NOAA Global Systems Laboratory (GSL). For example, DTC contributed kinetic-energy spectra evaluations of a C384 (approximately 25-km grid spacing) run conducted by EMC and C768 (approximately 13-km grid spacing) runs conducted by GSL that ascertained the new configuration did not adversely affect the canonical distribution of energy among various scales of motion. DTC also evaluated innovations in the surface layer, PBL and convective representations, and stochastic physics provided by Jongil Han of EMC and Lisa Bengtsson of NOAA Physical Sciences Laboratory. These evaluations, which on more than one occasion revealed bugs that were subsequently fixed by developers, contributed to the decision to adopt the innovations for the latest GFS/GEFS prototype, dubbed P8.

DTC testing and evaluation also identified innovations that are not yet ready for transition to operations. In particular, evaluations of multiple versions of the Rapid Radiative Transfer Model for General Circulation Models-Parallel (RRTMGP) radiation scheme revealed that this new radiative scheme, or its coupling with other physical processes, produces excessively warm temperatures over Antarctica. The DTC conducted an in-depth investigation using the CCPP Single-Column Model to simulate the Department of Energy Atmospheric Radiation Measurement West Antarctic Radiation Experiment case. Results suggest that the problem seen in the three-dimensional tests may stem from interactions between RRTMGP and the land-surface model and that RRTMGP is more responsive to low-level clouds than the currently operational RRTMG scheme, indicating a need for further investigation.

The integration of DTC testing and evaluation activities with development activities under the auspices of the UFS and UFS-R2O physics working groups represents a new and successful paradigm in cooperation. DTC assessed innovations at a rapid pace and in close collaboration with developers, providing actionable information to assist EMC and project leads in determining physics configurations for the upcoming GFS and GEFS implementation.

A Comprehensive Retrospective Evaluation of the Global Synthetic Weather Radar Products Using METplus

Winter 2022

Traditional radar product integrity suffers from gaps in coverage over large areas. While some geographical areas are fortunate to have ground- and/or satellite-based radar coverage, most areas around the globe do not have complete coverage from reliable radar networks. The use of radar-based products is advantageous for understanding current weather conditions and how weather systems may evolve over time. Convection and precipitating weather systems often impact critical missions for the United States Air Force (USAF); therefore, access to products that go beyond traditional radar output would be beneficial for mission planning and execution. To address this, the Massachusetts Institute of Technology Lincoln Laboratory (MITLL) has created the Global Synthetic Weather Radar (GSWR) product, which provides near-global coverage of radar-based outputs using advanced machine-learning techniques. To aid the USAF in making evidence-based decisions when using GSWR products, the Developmental Testbed Center (DTC) performed rigorous veracity testing to compare GSWR products against several ground-based radar outputs to ensure the products were developed properly. The evaluation was performed using the enhanced Model Evaluation Tools (METplus) and was largely conducted on the cloud via Amazon Web Services (AWS). Both traditional evaluation approaches and more advanced spatial approaches, including the application of MET’s Method for Object-based Diagnostic Evaluation (MODE), were employed.

Example illustrating the MODE objects created from GSWR (shaded) and OPERA (outlines) composite reflectivity (≥20 dBZ threshold) valid on 05 June 2020 14:13:28 UTC.


The DTC’s evaluation was based on GSWR analysis data spanning 15 May to 31 August 2020. These products included composite reflectivity, echo top, and vertically integrated liquid. GSWR products were compared against several observation and analysis datasets, including Multi-Radar/Multi-Sensor (MRMS) products, Next Generation Weather Radar (NEXRAD) products, and Operational Programme for the Exchange of Weather Radar Information (OPERA) products.

Valid time series plot of MODE centroid displacement (grid squares) for composite reflectivity (dBZ) with a threshold of ≥20 dBZ for all analysis times from 20200515-20200831 over the full European coverage domain. West-East displacement is in red and North-South displacement is in blue, where westward (eastward) and southward (northward) displacement is negative (positive). The vertical bars attached to the median represent the 95% CIs.


For all three variables evaluated, GSWR analyses showed the highest skill at the lowest thresholds evaluated and the lowest skill at the highest thresholds evaluated; vertically integrated liquid (VIL) typically showed lower skill than composite reflectivity and echo tops. Object-based evaluation provided value-added information beyond traditional measures. Key results from the object-based evaluation established that GSWR often produced fewer, larger composite reflectivity and echo top MODE objects when compared to the “truth” datasets, and that GSWR MODE objects typically exhibited weaker intensities and a southwest displacement. The results from this robust veracity testing activity were shared with the USAF and MITLL with the goal of providing information to support operational decisions and to aid the future development of GSWR products.
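
The centroid displacement used in the object-based results can be illustrated with a deliberately simplified sketch. MODE itself smooths the fields and identifies and matches objects before computing attributes; the example below skips all of that and only thresholds two tiny, made-up grids at 20 dBZ and compares the centroids of the points that remain. The grid values, the function names, and the assumption that row 0 is the southernmost row and column 0 the westernmost column are illustrative only and are not taken from the evaluation.

```python
THRESHOLD_DBZ = 20.0  # same threshold as in the figure captions


def centroid(field, threshold=THRESHOLD_DBZ):
    """Return the (x, y) centroid of all grid points at or above threshold."""
    points = [
        (x, y)
        for y, row in enumerate(field)
        for x, value in enumerate(row)
        if value >= threshold
    ]
    if not points:
        raise ValueError("no grid points meet the threshold")
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def centroid_displacement(analysis, reference):
    """Displacement (grid squares) of the analysis centroid from the reference.

    Negative dx means westward and negative dy means southward, assuming
    x increases eastward and y increases northward.
    """
    ax, ay = centroid(analysis)
    rx, ry = centroid(reference)
    return ax - rx, ay - ry


# Hypothetical composite reflectivity grids (dBZ); row 0 is the southern edge.
reference = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 25, 30, 0],
    [0, 0, 25, 30, 0],
]
analysis = [
    [0, 0, 0, 0, 0],
    [0, 22, 24, 0, 0],
    [0, 22, 24, 0, 0],
    [0, 0, 0, 0, 0],
]

dx, dy = centroid_displacement(analysis, reference)
print(f"west-east: {dx:+.1f}, north-south: {dy:+.1f} grid squares")
```

In this toy case both components are negative, which corresponds to a southwest displacement of the analysis object relative to the reference.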

Highlights of the 2021 NOAA Hazardous Weather Testbed Spring Forecasting Experiment

Autumn 2021

Although the COVID pandemic has precluded in-person experiments for nearly two years in NOAA’s Hazardous Weather Testbed, virtual Spring Forecasting Experiments (SFEs) have proven to be an effective way to maintain momentum in key research-to-operations activities.  SFEs are annual, 5-week severe-weather forecasting experiments led by NOAA’s Storm Prediction Center and National Severe Storms Laboratory in Norman, Oklahoma.  After pivoting to a virtual format for SFE 2020, which limited the scope of activities, the 2021 SFE featured a full slate of virtual experimental forecasting and model evaluations that were made possible through well-vetted virtual meeting tools, web-based drawing tools coupled with experimental model-guidance visualizations, and interactive model-comparison webpages.  Without physical space constraints, the 2021 SFE was able to accommodate a record number of participants: 130 forecasters, researchers, and students from around the world.  

The 2021 SFE was held 3 May – 4 June 2021.  The primary goals of the experiment included testing new severe-weather prediction tools, studying how end-users apply severe-weather guidance, and facilitating experiments for optimizing convection-allowing model (CAM) ensemble design to inform Unified Forecast System (UFS) development.  A comprehensive report on the 2021 SFE, Preliminary Findings and Results, was recently completed.  A few highlights from the experiment are described below.

Afternoon forecasting activities emphasized the use of NSSL’s Warn-on-Forecast System (WoFS) for providing short-term severe-weather guidance.  WoFS is an on-demand CAM ensemble system that features rapidly updating ensemble data assimilation, incorporating radar and satellite data every 15 minutes.  WoFS is designed to increase warning lead times for hazardous weather, with tentative plans to transition to operations for the NWS in the 2025-30 timeframe.  In one activity, two groups issued the same set of short-term severe-weather outlooks, but one group used WoFS and the other did not.  Each group had two expert forecasters whose outlooks were subjectively rated on a scale of 1-10.  Further, each group had several non-expert forecasters whose outlooks were combined to form consensus outlooks, which were also subjectively rated.  While there was little difference between the consensus WoFS and No-WoFS outlooks, the outlooks produced by expert forecasters using WoFS were rated significantly higher (Welch’s t test using α = 0.05) than those produced by experts who did not use WoFS (Figure 1).

Figure 1. Average subjective ratings of WoFS and No-WoFS forecasts from the 2021 SFE.
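
For readers unfamiliar with Welch’s t test, the sketch below shows how the test statistic and its approximate degrees of freedom are computed for two samples with possibly unequal variances. The ratings are invented for illustration and are not the SFE data; the function name is likewise hypothetical. Only the Python standard library is used, so the p-value is not computed here; in practice a statistics package (for example `scipy.stats.ttest_ind` with `equal_var=False`) would supply it.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Made-up subjective ratings on the 1-10 scale (NOT the experiment's data).
wofs_ratings = [7, 8, 6, 9, 7, 8]
no_wofs_ratings = [5, 6, 5, 7, 6, 4]

t_stat, dof = welch_t(wofs_ratings, no_wofs_ratings)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

For these made-up samples the degrees of freedom come out to 10, for which the two-sided critical value of Student’s t at α = 0.05 is about 2.228; a statistic larger than that in magnitude would be declared significant at that level.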

Model evaluation activities emphasized the 64-member Community Leveraged Unified Ensemble (CLUE), a framework for SFE collaborators to contribute CAMs for controlled experiments.  CLUE experiments examined data assimilation methods, strategies for single-model CAM ensemble design, and the impact of regional domain size on Day 2 model performance.  In one CLUE evaluation, configuration strategies for a Rapid Refresh Forecast System (RRFS) were examined.  The RRFS is a rapidly updating CAM ensemble that will use the UFS Short Range Weather Application and will subsume several regional models to simplify NOAA’s modeling suite.  Subjective evaluations indicated that a prototype RRFS from NOAA Global Systems Laboratory, which used stochastic physics and initial conditions from the operational HRRRDAS (High Resolution Rapid Refresh data assimilation system), performed quite well (Figure 2).  In fact, this configuration performed almost as well as HREFv3, NOAA’s current operational CAM ensemble, which continues to stand as a formidable baseline for experimental CAM ensembles.  NSSL maintains an archive of past and present CLUE datasets, which can be made available upon request (contacts are Adam Clark or Kent Knopfmeier).

Figure 2. Distributions of subjective ratings (1-10) by SFE participants of severe-weather fields over a mesoscale area of interest for the forecast hours 13-36 for HREF, GSL RRFS, RRFS Cloud, MAP RRFS, and MAP RRFS VTS.  The RRFS Cloud configuration was contributed by EMC and GSL and used stochastic physics, mixed physics, and cold start from GFS and GEFS initial and lateral boundary conditions.  The MAP RRFS and MAP RRFS VTS were contributed by the Multi-scale data Assimilation and Predictability (MAP) group at the University of Oklahoma, and used GSI hybrid EnVar data assimilation with initial and lateral boundary conditions from GFS and GEFS.  The runs labeled “VTS” also used a technique known as Valid Time Shifting to artificially increase the number of background ensemble members used for data assimilation.


METplus for Operational Verification and Diagnostics

Summer 2021

The idea of including the enhanced Model Evaluation Tools (METplus) in NOAA operations for the verification and validation of the Environmental Modeling Center (EMC) suite of environmental prediction models has been a decade in the making. METplus is the DTC-developed verification framework that spans a wide range of temporal (warn-on-forecast to climate) and spatial (storm to global) scales.  It is intended to be extensible through additional capabilities developed by the community, and the transition of this framework to NOAA’s Weather and Climate Operational Supercomputing System (WCOSS) was a major accomplishment. This partnership project between DTC, NOAA’s EMC, and the National Centers for Environmental Prediction (NCEP) Central Operations (NCO) is an ongoing engagement. Through its development, METplus has become an integral part of the verification work of NOAA’s modeling operations because of its robust software, suite of capabilities, ease of use, and documentation, and because it is continually evolving. METplus has made the desire for reliable, consistent statistical output a reality and will help the next generation of numerical models achieve higher forecast accuracy. As a result of this hand-in-hand partnership and cooperation to build and test the latest versions of METplus, numerical weather prediction (NWP) model developers need less time to assess how an NWP model can be improved. METplus provides the statistical and graphical verification output they need to diagnose shortcomings efficiently and optimize settings for the next generation of NWP models. METplus output has been tested against real-time datasets from NOAA and compared to in-house calculations for precision and ease of use.

DTC, EMC, and NCO worked collaboratively and iteratively to install METplus-3.1 and MET-9.1 on the WCOSS developmental system to test and optimize the software between August 2020 and March 2021, with the goal of installing the METplus system into real-time operations once WCOSS2 is available to EMC scientists.  METplus-4.0.0 is planned to be the foundation of the burgeoning EMC Verification System (EVS).  

In April 2021, the software was officially installed into 24/7 operations at NCO on WCOSS (soon to be WCOSS2). This milestone will now allow EMC to run operational METplus verification tasks and continue to build METplus into real-time Unified Forecast System (UFS) applications. This represents a major achievement in the UFS Research to Operations to Research (R2O2R) paradigm, as verification metrics developed within the UFS community now have a direct pathway to operations.

METplus Online Documentation screenshot

One part of the change METplus brings as an operational companion to numerical weather forecasting is in its support and documentation. METplus is fully supported: it has a User’s Guide for each of its components (with additional information available on the METplus Website). The documentation is continuously updated and refined as needed for operability and understanding. Configuration options and keywords can be easily searched in each guide, eliminating the need to bookmark a specific page for reference.  With community support through a GitHub Discussions board, all documented user issues in METplus will receive the direct attention of the METplus scientists, engineers, and the METplus community at large. If an opportunity for enhancing METplus is brought forward from these help sources, a METplus team member will create a GitHub issue, allowing users to track the progress.

METplus is not only a component of NOAA's Unified Forecast System (UFS) cross-cutting infrastructure but will also be an evaluation and diagnostic capability for NCAR's System for Integrated Modeling of the Atmosphere (SIMA). METplus is actively being developed by NCAR/Research Applications Laboratory (RAL), NOAA Global Systems Laboratory (GSL), EMC, several US Department of Defense agencies and departments, and the Unified Modeling partners led by the Met Office of the United Kingdom.  Finally, METplus is a community resource via the DTC and is open for community contributions to the transition of successful ideas from research to operations.

Dell is the latest addition to NOAA's weather and climate operational supercomputing system. This powerful Dell hums alongside NOAA's IBM and Cray computers at a data center in Orlando, Florida. The three systems combined in Florida and Virginia give NOAA 8.4 petaflops of total processing speed and pave the way for improved weather models and forecasts. (NOAA)

Growing the WPC HydroMeteorological Testbed: Immersive Forecasting and New Perspectives

Spring 2021

The Hydrometeorological Testbed at NOAA/NWS/NCEP Weather Prediction Center (WPC) is a naturalistic decision-making environment, a physical space, a collaboration space, and an insight-generating laboratory. We explore observations and models (Numerical Weather Prediction, Machine Learning and statistical models) in order to evaluate, validate and verify weather-forecasting procedures, tools, and techniques. 

We recently wrapped up our season-long Virtual Winter Weather Experiment (WWE) 2020-2021, during which we evaluated eight experimental Unified Forecast System (UFS) convection-allowing models (CAMs) and one machine-learned snow-to-liquid-equivalent technique (in the western US only) for snowfall forecasting using an immersive forecasting activity.  We asked participants to view model information and draw their own forecasts, rank models in a pre- and post-evaluation survey (both subjectively and objectively), and discuss how such guidance might influence their forecasts or forecast process. Participants enjoyed the immersive forecasting activity and appreciated the opportunity to explore these experimental data sets in a pseudo-operational way. We learned that predictability for the most common events was hit or miss; large-scale predictability at time scales of 60-84 h was still uncertain, and CAMs could not correct for this very well. However, the information contained in such forecasts was still useful and could be brought to bear in the forecast process, and thus could be meaningful in Impact Decision Support Services (IDSS). We continued to explore the predictability challenges by designing case studies focused on Days 3 and 2 in our retrospective, intensive forecasting sessions. This aspect of the forecast process was also considered valuable because it lets us begin to explore, in future experiments, notions of forecast consistency between model cycles and their interactions with forecast strategies, processes, and procedures. For more information on the WWE, contact Dr. Kirstin Harnos (kirstin.harnos at

A large part of our success comes from the participation of many NWS Weather Forecast Offices, regional centers, the Environmental Modeling Center, the Physical Sciences Laboratory, and our academic partners, as shown below:


Participant locations and number of sessions attended by Weather Forecast Offices, River Forecast Centers, National Centers, Academic Institution, Cooperative Institutes, National Labs, Region, or NOAA entity.

Our immersive forecasting activities will continue into the warm season for our Virtual Flash Flood and Intensive Rainfall Experiment (FFaIR). We will continue to utilize CAMs provided by the Center for the Analysis and Prediction of Storms, the Environmental Modeling Center, and the Global Systems Laboratory for the purpose of detecting and forecasting heavy and significant precipitation that may lead to flash flooding. We will do so through an operational product lens (i.e., the Excessive Rainfall Outlook) and a hybrid forecast product for 6-hour rainfall, which bridges the traditional Quantitative Precipitation Forecast guidance from WPC with the Mesoscale Precipitation Discussion product. We will forecast rainfall accumulations, rainfall rates, durations, and flooding in the Day 1 period, synthesizing many operational and experimental deterministic and ensemble CAM systems. We will continue to extract meaningful information from these systems under a variety of real-time forecasting scenarios during the peak of the warm season. For more information on FFaIR, contact Dr. Sarah Trojniak (sarah.trojniak at

We are looking forward to expanding the breadth of our knowledge as we seek collaboration with the social-science community. The information we produce informs the public we serve, from the methods we employ to solve physical science problems, to how we equip and prepare forecasters. Only through many different perspectives can we hope to capture a wide-angle view of forecast challenges to improve the predictions of precipitation that empower all of us to save lives and protect property.

We encourage researchers, forecasters, and emergency-support function personnel to reach out so we can work together and appreciate each other's challenges to better apply our various sciences, techniques, and approaches to empower life-saving and protective action against hazards, local and national. 

  1. NOAA/NWS/NCEP Weather Prediction Center
  2. Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado Boulder

Upcoming UFS Metrics Workshop

Winter 2021

The Developmental Testbed Center (DTC), in collaboration with the National Oceanic and Atmospheric Administration (NOAA) and the Unified Forecast System's Verification and Validation Cross-Cutting Team (UFS-V&V), is hosting a three-day workshop to identify key verification and validation metrics for UFS applications.  The workshop will be held remotely 22-24 February 2021. Approximately 275 participants from across the research and operational community have registered for this event.

The goal of this workshop is to identify and prioritize key metrics to apply during the evaluation of UFS research products and to guide their transition from research to operations (R2O).  Because all UFS evaluation decisions affect a diverse set of users, workshop organizers are encouraging members from government, academic, and private-sector organizations to participate in the workshop. In preparation for the workshop, a series of three pre-workshop surveys was distributed to interested parties. The results of the surveys have been used to prepare the discussion points of the breakout groups to streamline the metrics prioritization process.

The R2O funnel, in which metrics are used to advance innovations towards progressively higher readiness levels. (Image from Rood, Richard; Tolman, Hendrik, 2018: Organizing Research to Operations Transition Technical Report.)


The organizing committee is using the outcome of the 2018 DTC Community Unified Forecast System Test Plan and Metrics Workshop and the pre-workshop surveys to form the foundation of the workshop. Ricky Rood, Dorothy Koch, and Hendrik Tolman, along with the Workshop Co-Chairs, will kick off the meeting by providing the background and goals for the workshop. The first day features an opening plenary to curate the issues to be addressed in the breakout groups. The end of Day One and Day Two include breakout groups to allow for final community input into the application forecast challenges along with prioritization across all applications. The workshop will end on Day Three with a final set of breakout groups to discuss how to apply the prioritized metrics to the full R2O development stages and gates. A wrap-up plenary rounds out a robust agenda. The breakout groups will focus on the nine UFS applications along with key forecast challenges:

Model Applications

  • Short Range Weather (SRW)
  • Medium Range Weather (MRW)
  • Sub-Seasonal 
  • Seasonal  
  • Atmospheric Quality and Composition
  • Coastal
  • Hurricane
  • Marine and Cryosphere
  • Space Weather
  • Land/Hydrology

Additional Key Forecast Challenges

  • Aviation
  • High Impact Weather (beyond hurricanes)
  • Weather Extremes
  • Data Assimilation

The Workshop Organizing Committee includes Tara Jensen (NCAR and DTC), Jason Levit (NOAA/EMC), Geoff Manikin (NOAA/EMC), Jason Otkin (UWisc CIMSS), Mike Baldwin (Purdue University), Dave Turner (NOAA/GSL), Deepthi Achuthavarier (NOAA/OSTI), Jack Settelmaier (NOAA/SRHQ), Burkely Gallo (NOAA/SPC), Linden Wolf (NOAA/OSTI), Sarah Lu (SUNY-Albany), Cristiana Stan (GMU), Yan Xue (OSTI), Matt Janiga (NRL), and the entire UFS V&V Team.

Engaging the Community to Advance HWRF Physics Innovations

Tests conducted by DTC lead to operational implementation of physics innovations in HWRF

Autumn 2020

The Hurricane Weather Research and Forecasting system (HWRF), which is one of NOAA’s operational models used to predict the track, intensity, and structure of tropical cyclones, undergoes an upgrade cycle that is generally conducted on an annual basis.  Through the code management and developer support framework provided by the DTC, innovations from the research community that have been added to branches within the HWRF code repository serve as candidates for testing as part of these upgrade cycles. Scientists at NOAA’s Environmental Modeling Center (EMC) and the Developmental Testbed Center (DTC) frequently collaborate on the testing and evaluation (T&E) of changes to the HWRF physics schemes or data assimilation system in hopes of improving HWRF predictions.  For the 2020 HWRF upgrade cycle, the DTC focused on T&E of two potential upgrades to model physics stemming from DTC visitor projects: 1) upgrades to the cloud-overlap scheme used in the Rapid Radiative Transfer Model for General Circulation Models (RRTMG), made available by John Henderson and Michael Iacono of Atmospheric and Environmental Research (AER), and 2) the Mellor-Yamada-Nakanishi-Niino (MYNN) planetary boundary layer (PBL) scheme, based on work conducted by Dr. Robert Fovell (SUNY Albany) and a collaboration with Dr. Joseph Olson (NOAA Global Systems Laboratory).

Testing of these potential upgrades focused on thirteen tropical cyclones in the North Atlantic Ocean from the past three years that provided a mixture of storm characteristics and previous operational model performance. Preliminary results for the cloud-overlap upgrades (four of the thirteen storms) suggested that the improvement would not be sufficient to warrant operational implementation. The DTC communicated this feedback to AER and worked with them to further analyze the results. The analysis suggested that a different configuration of the cloud-overlap upgrades might perform better. After coordinating with EMC and AER, the DTC tested the revised cloud-overlap configuration. This second test demonstrated up to 4% improvement in the 3–5 day hurricane-track forecast, which was sufficient for EMC to transition the cloud-overlap changes into the 2020 operational HWRF. This process illustrates the important role of iterative testing and development when transitioning research to operations. Results from the MYNN experiment indicated the scheme was not yet ready for operational implementation. However, the results are informing additional changes to the code by the developers, which may further enhance the performance of the MYNN PBL in high-wind conditions for potential application within NOAA’s Unified Forecast System (UFS).

The change in tropical cyclone track forecast skill relative to the control (H20C; blue line) for the initial (H20R; black line) and final (H2R1; red line) cloud-overlap experiments. The number of forecast cycles verified at each lead time is shown at the top of the plot.
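
The article does not state how the change in track skill was computed. One common convention, assumed here, is the percentage reduction in mean track error of an experiment relative to the control at each lead time. The sketch below applies that convention to invented error values; the numbers, dictionary names, and function name are hypothetical and are not the HWRF results.

```python
def percent_improvement(control_error, experiment_error):
    """Percentage reduction in error relative to the control (positive = better)."""
    if control_error <= 0:
        raise ValueError("control error must be positive")
    return 100.0 * (control_error - experiment_error) / control_error


# Invented mean track errors (km) keyed by forecast lead time (h).
control_errors = {72: 200.0, 96: 250.0, 120: 300.0}
experiment_errors = {72: 196.0, 96: 240.0, 120: 291.0}

improvement = {
    lead: percent_improvement(control_errors[lead], experiment_errors[lead])
    for lead in control_errors
}

for lead, value in improvement.items():
    print(f"{lead:3d} h: {value:+.1f}% relative to control")
```

Under this convention, a statement such as "up to 4% improvement" would refer to the largest value across the lead times considered.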

Since these tests were conducted, the 2020 configuration of HWRF was finalized and implemented in operations. During the summer of 2020, DTC and EMC worked together to merge the final version of the code back to the trunk of the HWRF repository. This step enables researchers to add further innovations to the latest version of the code, ensuring that any scientific results are directly applicable to the operational HWRF, and positioning the community to contribute to the next HWRF implementation in early 2021.

CCPP Framework

Summer 2020

The Common Community Physics Package (CCPP) is a library of physical parameterizations distributed with a framework that enables its use in any host model that incorporates CCPP into its own structure. CCPP is currently used with NOAA’s Unified Forecast System (UFS) for experimental subseasonal-to-seasonal, medium- and short-range weather, air quality, and hurricane applications, with all physics development for the UFS transitioned to CCPP. The CCPP framework was originally developed by DTC, and is now being co-developed by DTC and NCAR. Both NCAR and the Naval Research Laboratory are adopting CCPP for use in their models. The DTC also distributes the CCPP Single Column Model (SCM), which allows physics experimentation with the CCPP in a simplified setting. The capabilities inherent in the CCPP and its SCM form an optimal collaborative infrastructure for use in a simple-to-more-complex Hierarchical System Development (HSD) approach to test and improve modeling systems, where such an approach can more easily identify systematic biases in models.

The CCPP’s interoperability, specifically, its ability to be used by a wide variety of host models, derives from the method used to communicate variables between the physics and the host models. All variables required by a physical parameterization must be accompanied by metadata, including their standard name, units, rank, etc. Similarly, to be CCPP-compliant, the host models must include metadata about their variables. The CCPP framework compares the variables in the physics against those in the host and automatically generates physics caps, which are software interfaces for communicating variables.

CCPP Architecture

The clearly defined software interfaces for communicating variables facilitate the use and development of the CCPP by the general community, while the interoperability aspects open the door for scientists at multiple organizations using diverse models to share code and work together on physics innovations. Desirable capabilities, such as scheme reordering in a physics suite, grouping schemes (calling one or more parameterizations from different parts of the host model), and subcycling (calling selected schemes at shorter time increments) make the CCPP framework suitable for use in research and operations, streamlining the R2O transition.

The physical parameterizations in the CCPP are typically used as sets by the host models, known as physics suites, and described using Suite Definition Files (SDFs). The CCPP framework permits a multi-suite build, in which multiple SDFs are selected at compile time and are available for use at runtime. This capability is appealing to both researchers and operational centers, since it enables flexibility, while maintaining high computational performance.
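
The ideas of suites, groups, and subcycling can be sketched in a few lines of Python. This is a conceptual illustration only: the dictionary layout, the `run_group` function, and the toy scheme functions are assumptions made for this example and do not reproduce the SDF format or the framework's generated code.

```python
def radiation(state):
    """Toy scheme: counts how often it is called."""
    state["radiation_calls"] += 1


def pbl(state):
    state["pbl_calls"] += 1


def microphysics(state):
    state["microphysics_calls"] += 1


# Hypothetical in-memory stand-in for a suite definition: named groups,
# each holding blocks of schemes with a subcycle count.
suite = {
    "name": "demo_suite",
    "groups": {
        "radiation": [{"subcycles": 1, "schemes": [radiation]}],
        "physics": [
            {"subcycles": 1, "schemes": [pbl]},
            {"subcycles": 2, "schemes": [microphysics]},  # called twice per step
        ],
    },
}


def run_group(suite, group_name, state, log):
    """Run every scheme of one group, in order, honoring subcycle counts."""
    for block in suite["groups"][group_name]:
        for _ in range(block["subcycles"]):
            for scheme in block["schemes"]:
                scheme(state)
                log.append(scheme.__name__)


state = {"radiation_calls": 0, "pbl_calls": 0, "microphysics_calls": 0}
log = []
# A host model could call each group from a different place in its time step.
run_group(suite, "radiation", state, log)
run_group(suite, "physics", state, log)
print(log)
```

Reordering schemes then amounts to editing the suite description rather than the host model, which is the flexibility the text attributes to the framework.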

The CCPP is in a state of active development. Parameterizations are continuously improved and new schemes and suites are being added to meet the needs of various projects. Collaborative development is stimulated through the use of open-source code accessible via GitHub. Public releases of the CCPP with its SCM can be found at and the public release of the UFS Medium-Range Weather Application using CCPP is described at

MET Collaboration

Spring 2020

The NOAA Environmental Modeling Center (EMC) and the Developmental Testbed Center (DTC) are collaborating on using the Model Evaluation Tools (MET) for the verification and validation of EMC’s suite of environmental prediction models, such as the Global Forecast System (GFS), the Global Ensemble Forecast System (GEFS), and the Rapid Refresh (RAP)/High Resolution Rapid Refresh (HRRR). Both centers are currently working towards creating an operational configuration of MET that can be implemented on NOAA’s Weather and Climate Operational Supercomputer (WCOSS) to be used in real time within a 24x7x365 operational environment.

To that end, EMC and DTC have worked with NCEP Central Operations (NCO) to install METplus 2.1 and MET 8.1 on the developmental component of WCOSS to test and optimize the software system, with the eventual goal of installing METplus 3.0 and MET 9.0 into real-time operations in calendar year 2020. Once installed, the software will enable EMC to create a suite of real-time verification systems that will provide statistics on EMC model performance to both internal and external customers. Additionally, the real-time verification statistics will be used to create graphics and displays with a cloud-based METViewer and METExpress user interface.

Image created using METplus for the GFSv15 vs GFSv16 500mb anomaly correlation comparison.

Advanced Physics Testing

Spring 2019

With funding from the Next Generation Global Prediction System (NGGPS) initiative and broad support from the community, the National Centers for Environmental Prediction (NCEP)/Environmental Modeling Center (EMC) recently replaced the dynamic core in its flagship operational model, the Global Forecast System (GFS). Version 15 of the GFS (GFSv15), implemented in operations on June 12, 2019, includes the Finite-Volume Cubed-Sphere (FV3) non-hydrostatic dynamical core in place of the long-running spectral hydrostatic core. This modeling system provides a fundamental early building block for the emerging Unified Forecast System (UFS) that is envisioned to be a full community-based Earth-System model.

The next major upgrade of the GFS, scheduled for 2021, is expected to include significant changes in model physics, posing the UFS community with a variety of challenges. Individual physical parameterizations need to be upgraded or replaced to produce superior forecast performance. Additionally, the physics suite needs to be well-integrated so that information is correctly transferred among parameterizations. Finally, the suite needs to run within the time available in the operational computing platform.

To address these challenges, three suites were identified as possible replacements for the GFS v15 suite (Suite 1). Suite 2 is the most similar to the operational suite, containing a single parameterization replacement, the planetary boundary layer (PBL) scheme. Suite 3 contains two parameterization replacements (the convective and microphysics schemes), harnessing development conducted at multiple research centers and universities, including Colorado State, Utah, NASA, NCAR, and EMC. Suite 4 contains five parameterization replacements, as it is derived from the operational RAP/HRRR modeling system, which was developed by NOAA Global Systems Division from years of community contributions through the WRF community modeling system for mesoscale applications.

Physics suite test configurations (terms are defined in detail below).

In addition to the differences in physics listed in the Table above, it should be noted that the forecasts by the various configurations differ in a few other aspects, including dynamics settings and computational platforms. Additionally, Suite 4 uses the Common Community Physics Package (CCPP) as a demonstration of the UFS’s new paradigm for integrating physics and dynamics.

Runs were conducted between December 2018 and February 2019. Each suite was applied in a total of 163 model initializations, and performance for 10-day forecasts was compared objectively and subjectively. The initializations included 16 high-impact events selected by EMC’s Model Evaluation Group (MEG) along with an additional 147 dates from all seasons in 2016 and 2017. The DTC’s Global Model Test Bed (GMTB) and EMC collaboratively conducted the model runs and the output was analyzed using EMC’s verification statistics database (VSDB - the basis for all model upgrade decisions), the Model Evaluation Tools (MET) package, and a comprehensive MEG evaluation. Additionally, GMTB produced diagnostic analyses focusing on tropical cyclones, precipitation characteristics, spectral decomposition, and boundary layer properties. These diagnostic and statistical summaries were examined by an impartial panel of experts to inform their formal recommendation for next steps to EMC. Consistent with the panel’s recommendation, EMC’s final decision was to use Suite 2 as the basis for developing a prototype configuration for the next GFS implementation (GFSv16). Specifically, EMC has configured this prototype with the PBL parameterization that distinguishes Suite 2, along with already planned upgrades to parameterizations for gravity wave drag, land, and atmospheric radiation - and a doubling of vertical levels with extension of the model upper boundary to the top of the mesosphere. Optimization and development of this prototype in a fully cycled system will proceed in coming months, in anticipation of an early 2021 operational implementation.

For more information about this test, visit the DTC website at

Terms defined
AA - Aerosol Aware
AW - Arakawa-Wu
Cu - cumulus cloud / convection
CS - Chikira-Sugiyama
EDMF - Eddy-Diffusivity Mass-Flux
GF - Grell-Freitas
GFDL - Geophysical Fluid Dynamics Laboratory
MG3 - Morrison-Gettelman 3
MYNN - Mellor-Yamada-Nakanishi-Niino
Noah - Noah Land-Surface Model
RUC - Rapid Update Cycle
SA - Scale Aware
SAS - Simplified Arakawa Schubert
TKE - Turbulent Kinetic Energy
Thompson - Thompson Scheme


Tests Conducted at DTC Lead to Operational Implementation of Innovations in Physics and Data Assimilation in HWRF

Summer 2018

The Hurricane Weather Research and Forecast system (HWRF) is one of NOAA’s operational models used to predict the track, intensity, and structure of tropical cyclones. Each winter, scientists at NOAA’s Environmental Modeling Center (EMC), the Developmental Testbed Center (DTC), and NOAA’s Hurricane Research Division (HRD) perform testing and evaluation (T&E) on possible changes to the HWRF physics schemes, dynamic core, and data assimilation system that have the potential to improve HWRF predictions. Many of these potential changes are innovations from the research community that have been added to branches within the HWRF code repository in the past year, with guidance from the DTC. These branches are then retrieved from the repository by EMC and DTC staff to perform annual T&E. This yearly upgrade cycle illustrates the seamless exchange of innovations from the research community to operational testing environments, which is facilitated by the code management and developer support provided by the DTC.

This year, the DTC effort focused on T&E of two potential upgrades to model physics. The first looked at upgrades to the Rapid Radiative Transfer Model for General Circulation Models (RRTMG) radiation scheme made available by John Henderson and Michael Iacono of Atmospheric and Environmental Research (AER) through the DTC Visitor Program. The second replaced the Scale-Aware Simplified Arakawa-Schubert (SASAS) cumulus scheme with the Grell-Freitas scheme, based on work by Georg Grell (NOAA’s Global Systems Division), Saulo Freitas (NASA), and Evelyn Grell (NOAA’s Physical Sciences Division) that was funded by the Hurricane Forecast Improvement Project (HFIP). The DTC also participated in several experiments led by HRD and EMC to determine the impact of assimilating additional data to improve the HWRF initial conditions.

Each of these potential upgrades was first tested individually by running retrospective HWRF forecasts on a subset of tropical cyclones from the past three years in the North Atlantic ocean. For these initial tests, EMC selected sixteen storms that provided a mixture of storm intensities, storm motion directions, and previous operational model performance. For the RRTMG radiation scheme upgrades, the DTC ran nine of the sixteen storms before EMC staff decided the forecast improvements (~4% for both track and intensity) merited including the changes in the 2018 version of HWRF. Results from the Grell-Freitas experiment indicated the scheme was not yet ready for operational implementation. However, the results are informing additional changes to the code by the developers, who are working with DTC and EMC staff to test an improved version of their scheme later this summer. For the data addition experiment, the DTC ran 2–3 storms for each additional data type, which helped EMC determine that wind data from the Stepped Frequency Microwave Radiometer and inner-core dropsondes should be assimilated into HWRF in 2018.

Once the 2018 configuration of HWRF is finalized, DTC and EMC will work together to merge the final version of the code back to the HWRF trunk. This step will enable researchers to add additional innovations to the latest version of the code, ensuring that any scientific results are directly applicable to the operational HWRF, which will position the community well for next year’s HWRF pre-implementation tests. With additional opportunities for transition of research to operations in upcoming versions of the model, DTC staff look forward to continuing to lend their expertise in code management, developer support and T&E to the community!


Mean Track Error - Atlantic Basin (land and water)
Figure 1. Mean track errors with respect to forecast lead time for HWRF with RRTMG radiation scheme upgrades (H18R, red line) and the HWRF control (H18C, black line) experiments. Pairwise differences (H18C minus H18R) are shown in blue with 95% confidence intervals. Solid blue circles indicate lead times with statistically significant differences. The number of cases at each lead time is shown in gray at the top of the figure.



Mean absolute intensity errors
Figure 2. Mean absolute intensity errors with respect to forecast lead time for HWRF with RRTMG radiation scheme upgrades (H18R, red line) and the HWRF control (H18C, black line) experiments. Pairwise differences (H18C minus H18R) are shown in blue with 95% confidence intervals. Solid blue circles indicate lead times with statistically significant differences. The number of cases at each lead time is shown in gray at the top of the figure.
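
The pairwise differences and confidence intervals shown in Figures 1 and 2 can be illustrated with a short Python sketch. The source does not state how the intervals were computed, so this example assumes a simple normal approximation (z = 1.96) on independent cases; the function name and the error values are made up for illustration and are not HWRF results.

```python
import math
import statistics


def paired_difference_ci(control, experiment, z=1.96):
    """Mean of (control - experiment) with an approximate 95% interval.

    Assumes the cases are independent and uses a normal approximation;
    operational verification may use a t-distribution or account for
    serial correlation between forecasts.
    """
    if len(control) != len(experiment):
        raise ValueError("control and experiment must have the same cases")
    diffs = [c - e for c, e in zip(control, experiment)]
    mean = statistics.fmean(diffs)
    half_width = z * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, mean - half_width, mean + half_width


# Hypothetical errors at one lead time, one value per forecast case.
control = [52.0, 61.0, 48.0, 70.0, 55.0, 66.0]
experiment = [50.0, 57.0, 47.0, 65.0, 53.0, 61.0]

mean, lower, upper = paired_difference_ci(control, experiment)
# The difference is flagged as significant when the interval excludes zero.
significant = lower > 0 or upper < 0
print(round(mean, 2), round(lower, 2), round(upper, 2), significant)
```

With the sign convention of the figures (control minus experiment), a positive difference whose interval excludes zero means the experiment had the smaller error at that lead time.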


DTC MET Verification Tutorial

Spring 2018

The DTC Verification team hosted a MET tutorial at NCAR January 31-February 2, 2018 in association with the semi-annual WRF tutorial. This event was the first in-residence MET tutorial since February 2015. There were 31 registered users, and several new DTC staff dropped in for pertinent lectures.  The tutorial included a half day of lectures on verification basics plus two and a half days of presentations focused on many of the MET tools, supplemented with practical sessions that demonstrated each tool.  The last day also included training on the METViewer database and display system, used for aggregating, stratifying and plotting statistics, and the newly developed MET+ Python wrappers.

METv7.0 was released on March 5th. For more information on MET capabilities, check out the MET Users’ Page:

Evaluation of the new hybrid vertical coordinate in the RAP and HRRR

Autumn 2017

The terrain-following sigma coordinate has been implemented in many Numerical Weather Prediction (NWP) systems, including the Weather Research and Forecasting (WRF) model, and has been used with success for many years. However, terrain-following coordinates are known to induce small-scale horizontal and vertical accelerations over areas of steep terrain due to the reflection of topography in the model levels.  These accelerations introduce error into the model equations and can impact model forecasts, especially as errors are advected downwind of major mountain ranges.

This is a cross-section plot of one set of cold-start RAP simulations. The figure highlights the reduction in spurious noise above the Rocky Mountains.

Efforts to mitigate this problem have been proposed, including Klemp’s smoothed hybrid coordinate, in which the sigma coordinate is transitioned to a purely isobaric vertical coordinate at a specified level.  Initial idealized tests using this new vertical coordinate showed promising results with a considerable reduction in small-scale spurious accelerations.
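
The transition from terrain-following to isobaric levels can be sketched in Python as follows. This is an illustration of the general hybrid sigma-pressure idea, not the RAP/HRRR configuration: the cubic blending function, the transition level `eta_c`, and the pressure values are all assumptions chosen for this example.

```python
def blend(eta, eta_c):
    """Weight of the terrain-following part of the coordinate.

    Illustrative cubic: 0 (with zero slope) at and above the transition
    level eta_c, rising smoothly to 1 (with unit slope) at the surface,
    eta = 1.
    """
    if eta <= eta_c:
        return 0.0
    h = 1.0 - eta_c
    t = (eta - eta_c) / h
    return 3 * t**2 - 2 * t**3 + h * (t**3 - t**2)


def pressure(eta, p_sfc, p_top=5000.0, p_ref=100000.0, eta_c=0.2):
    """Pressure (Pa) on coordinate surface eta for surface pressure p_sfc.

    Where blend() is zero the result depends only on eta, so the surface
    is isobaric; at eta = 1 the result equals p_sfc, so the lowest
    surface follows the terrain.
    """
    b = blend(eta, eta_c)
    return b * (p_sfc - p_top) + (eta - b) * (p_ref - p_top) + p_top


# Two columns: one over high terrain (low surface pressure), one near sea level.
for eta in (1.0, 0.6, 0.1):
    print(eta, pressure(eta, 85000.0), pressure(eta, 101000.0))
```

At `eta = 0.1`, above the assumed transition level, both columns give the same pressure, so the imprint of the terrain has vanished; at `eta = 0.6` the two columns still differ.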

Based on these preliminary findings, the DTC was tasked to test and evaluate both the hybrid vertical coordinate and the terrain-following sigma coordinate within the RAP and HRRR forecast systems to assess impacts on retrospective cold-start and real-time forecasts.

The DTC conducted several controlled cold-start forecasts and one cycled experiment with the 13 km RAP, initialized from the GFS. This sample included days with strong westerly flow across the western CONUS, favoring vertically propagating mountain wave activity.  In addition, one cycled, 3-km HRRR experiment was initialized from the non-hybrid coordinate RAP.  The only difference between these retrospective runs was the vertical coordinate.

This sample of forecasts indicated the hybrid vertical coordinate produced the largest impact at upper levels, where the differences in coordinate surfaces are most pronounced due to the reflection of terrain over mountainous regions.  As a result, wind speeds with the hybrid coordinate were generally increased near jet axes aloft as vertical and horizontal mixing of momentum decreased when compared with the terrain-following coordinate.  In addition, the depiction of vertical velocity at upper levels was greatly improved with reduced spurious noise and better correlation of vertical motion to forecast jet-like features.  A corresponding improvement was found in upper-level temperature, relative humidity, and wind speed verification when using the hybrid vertical coordinate.

The hybrid vertical coordinate will be implemented in the operational versions of RAPv4 and HRRRv3 in 2018.

This work was a collaborative effort between NOAA GSD, DTC, and NCAR MMM.


The 2016 Hurricane WRF System

Winter 2017

The community Hurricane Weather Research and Forecasting (HWRF) modeling system was upgraded to version 3.8a on November 21, 2016.  This release includes all components of the HWRF system: scripts, data preprocessing, vortex initialization, data assimilation, atmospheric and ocean models, coupler, postprocessor, and vortex tracker.  In addition to default operational features, the release includes capabilities to perform idealized tropical cyclone simulations run with alternate physics, and backwards compatibility for inner nest grid sizes.

The HWRF community modeling system currently has over 1300 registered users.  The public release includes updates to the user webpage, online practice exercises, datasets, and extensive documentation.  The release code is fully supported, with community support provided via the HWRF helpdesk,

Information about obtaining the codes, datasets, documentation and tutorials can be found at

Soil moisture sensitivity plots illustrating the new idealized capability. As soil moisture increases from left to right, the storm intensity increases (contoured values). Courtesy of Subramanian, 2016.

The NCEP 2016 operational implementation of HWRF and the HWRF v3.8a community release are compatible systems.  Starting in 2016, the default configuration runs with ocean coupling for all northern hemisphere oceanic basins, and uses Real-Time Ocean Forecast System (RTOFS) data for ocean initialization in the Eastern North Pacific Basin.  Two specific capabilities, a 40-member HWRF ensemble for the assimilation of Tail Doppler Radar (TDR) data that NCEP began running in 2015, and the addition of one-way wave coupling using WAVEWATCH III in 2016, are not currently supported for the general community.

Other notable upgrades in HWRF version 3.8a include:

  • Code upgrades, including WRF v3.8, GSI v3.5, and UPP v3.1.
  • Inner domain (d02, d03) sizes increased to 25ºx25º and 8.3ºx8.3º, respectively.
  • Reduced time step from 38 4/7 s to 30 s.
  • Data assimilation enabled by default for both Atlantic and Eastern North Pacific Basins.
  • Improved physics for all scales:
    • Cumulus parameterization updates, including enabling by default for all 3 domains and a new Scale Aware Simplified Arakawa-Schubert (SAS) scheme.
    • New GFS Hybrid-Eddy Diffusivity Mass Flux PBL scheme.
    • Updated momentum and enthalpy exchange coefficients (Cd/Ch).
    • Enhanced Idealized capability with landfall.
  • Enhanced products including simulated brightness temperatures for new satellite sensors in all basins.

DTC visitor contributes enhanced idealized capability

As noted in the HWRFv3.8a updates, an enhanced idealized capability to include simulated landfall using the GFDL slab land surface physics scheme is included in the v3.8a release. This capability was contributed through a successful DTC visitor project by Subashini Subramanian (Purdue University), “Developing Landfall Capability in Idealized HWRF for Assessing the Impact of Land Surface on Tropical Cyclone Evolution”.  The new feature introduces a namelist switch for allowing the landfalling capability, which specifies the type of land surface and an initial land-surface temperature to be used over land.  The default configuration introduces a homogeneous land surface that can be modified to account for heterogeneity. Additionally, the direction of land motion is a user-defined option. Work is underway to extend this capability to include other land-surface physics options.

The Unified Post Processor

Summer 2017

Post-processing is an essential but often overlooked component of numerical weather prediction and encompasses a broad range of concepts, methods, and tools to make raw model output more useful. The Unified Post Processor (UPP) can compute a variety of diagnostic fields, interpolate to pressure levels or specified (pre-defined or custom) grids, and de-stagger grids. Examples of the products include:

  • T, Z, humidity, wind, cloud water, cloud ice, rain, and snow on isobaric levels
  • SLP, shelter level T, humidity, and wind fields
  • Precipitation-related fields
  • PBL-related fields
  • Severe weather products (e.g., CAPE, vorticity, wind shear)
  • Radiative/Surface fluxes
  • Cloud related fields
  • Aviation products
  • Radar reflectivity products
  • Satellite look-alike products
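
Interpolation of model fields to pressure levels, mentioned above, can be illustrated with a generic Python sketch. The source does not describe the UPP's actual algorithm, so this example assumes plain linear interpolation in the logarithm of pressure; the function name and the sample column are hypothetical.

```python
import math


def interp_to_pressure(p_model, values, p_target):
    """Interpolate a column to p_target, linearly in ln(pressure).

    p_model must decrease strictly with index (surface to model top).
    Returns None when p_target lies outside the model column, rather
    than extrapolating.
    """
    for k in range(len(p_model) - 1):
        p_below, p_above = p_model[k], p_model[k + 1]
        if p_below >= p_target >= p_above:
            weight = (math.log(p_below) - math.log(p_target)) / (
                math.log(p_below) - math.log(p_above)
            )
            return values[k] + weight * (values[k + 1] - values[k])
    return None


# Hypothetical column: pressure in Pa and temperature in K on four model levels.
p_model = [100000.0, 85000.0, 70000.0, 50000.0]
temperature = [288.0, 280.0, 270.0, 252.0]

t_850 = interp_to_pressure(p_model, temperature, 85000.0)  # falls on a model level
t_600 = interp_to_pressure(p_model, temperature, 60000.0)  # between two levels
t_200 = interp_to_pressure(p_model, temperature, 20000.0)  # above the column top
print(t_850, t_600, t_200)
```

A production post-processor would also need rules for levels below the model surface or above the model top, which this sketch deliberately leaves out.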

The UPP produces GRIB1 and GRIB2 output files that can be used directly by a number of plotting packages and the Model Evaluation Tools (MET) verification package.

UPP Components version 3.1.

The UPP is used to post-process operational models such as the Global Forecast System (GFS), GFS Ensemble Forecast System (GEFS), North American Mesoscale (NAM), Rapid Refresh (RAP), High Resolution Rapid Refresh (HRRR), Short Range Ensemble Forecast (SREF), and Hurricane WRF (HWRF) applications.  The DTC serves as a bridge between operations and the community, and provides UPP software and support for the Weather Research and Forecasting (WRF) modeling core.  Since the UPP is used in operations, users can mimic the production of operational products through the community UPP distribution. Another advantage is its efficient handling of large datasets, because it is a parallelized code.

One of the more popular features among community users is the ability to leverage the Community Radiative Transfer Model (CRTM) to output synthetic satellite products.  Other favored features include vertical interpolations of certain products, such as radar reflectivity ¼ km above ground level (AGL), and the horizontal grid manipulation capability. In addition, users have recently leveraged UPP as a tool to post-process WRF simulations into GRIB output. Required fields can then be used as input to initialize another WRF simulation.

The DTC’s UPP team works directly with community developers to incorporate their contributions into the code base, and serves as a liaison to integrate new features into the operational code.  The UPP team also continues to expand and improve documentation to help the community use and contribute to the UPP software package.  Look for a new online tutorial coming later this year!

UPP v3.1 is the most recent version available, and was released in Fall 2016.  The next release can be expected in Summer or Fall of 2017.  More information can be found on the UPP website:

HWRF Training at Home and Abroad

Spring 2016

The DTC hurricane team has provided training opportunities to learn the Hurricane Weather Research and Forecasting (HWRF) system to both general users and active developers over the past several months.

The community HWRF modeling system (version 3.7a released in August 2015) is compatible with the NCEP 2015 operational implementation, which includes high-resolution deterministic tropical cyclone numerical guidance for all global oceanic basins. Due to the demonstrated skill and advanced capabilities of the HWRF model, there is a great deal of international interest for research and operational use. In order to meet these demands and foster collaborations, an HWRF tutorial was held at the Nanjing University of Information Science and Technology (NUIST) in Nanjing, China. DTC hurricane team members Ligia Bernardet and Christina Holt participated along with members of the Environmental Modeling Center (EMC) HWRF team. The tutorial, held 1-2 December 2015, attracted 84 participants and received positive feedback.

Following the China tutorial, the DTC co-hosted an HWRF tutorial with the EMC HWRF team in College Park, MD at the NOAA Center for Weather and Climate Prediction. This tutorial spanned three days from 25-27 January 2016. Tutorial attendees heard over 12 hours of lectures covering all aspects of the HWRF system, as well as enrichment lectures on the HWRF multi-storm modeling system, the HWRF ensemble prediction system, HYCOM ocean coupling, and forecast verification. Invited speakers participated from various institutions, including NCEP/EMC, University of Rhode Island (URI), AOML/HRD and DTC. In addition to lectures, students received 6 hours of hands-on practical sessions. The event was well received by participants, many of whom unexpectedly attended the tutorial remotely due to the 25+ inches of snow that fell over the DC area the weekend prior!

Presentations and materials for the College Park, MD and Nanjing, China tutorials are posted at:

In addition to the tutorials aimed at general users working with the publicly released code, the DTC also responded to developer requests for specialized training. To meet the needs of active developers working with the HWRF repository code, the DTC hosted two separate HWRF specific Python trainings; one in conjunction with the HFIP annual review meeting in Miami, FL, and a second joined to the HWRF tutorial in College Park, MD. Training materials and resources from the developer trainings are available at:

Typhoon Symposium and HWRF Tutorial--Nanjing, China group photo.

HWRF Operational Implementation and Public Release

Winter 2016

With the conclusion of the 2015 hurricane season, assessments of model performance indicate that the upgraded 2015 Hurricane WRF (HWRF) model provided superior forecast guidance to the National Hurricane Center (NHC), with marked improvements over the previous HWRF system.

The unified HWRF system, for which the DTC provides the operational codes to the research community, is a cornerstone of HWRF’s success.

The community HWRF modeling system was upgraded to version 3.7a on August 31, 2015.  This release includes all components of the HWRF system: scripts, data preprocessing, vortex initialization, data assimilation, atmospheric and ocean models, coupler, postprocessor, and vortex tracker (see Figure on the left).  Additionally, the capability to perform idealized tropical cyclone simulations is included (Figure in upper right). The HWRF community modeling system currently has over 1100 registered users.  The DTC provides resources for these users through updates to the user webpage, online practice exercises, datasets, and extensive documentation consistent with the latest release code.  With the HWRF v3.7a release, the HWRF helpdesk was migrated to a new tracking system, providing support for all aspects of the code.  Information about obtaining the codes, datasets, documentation, and tutorials can be found at the DTC HWRF user webpage:

The HWRF v3.7a public release is compatible with the NCEP 2015 operational implementation of HWRF.  The HWRF model consists of a parent domain and two storm-following, two-way interactive nest domains.  Starting with the 2015 operational season, the default HWRF horizontal resolution increased to 18/6/2 km (from 27/9/3 km), and NCEP expanded high-resolution deterministic tropical cyclone forecast numerical guidance to all global oceanic basins for operations.  Operationally, NCEP is running HWRF configurations with reduced complexity for global basins other than the Atlantic and Eastern North Pacific basins.  However, the HWRF public release includes flexibility and alternate configuration options, such as running with full complexity including atmosphere-ocean coupled mode with data assimilation for all oceanic basins.  Additionally, HWRF v3.7a maintains backwards compatibility to run the 27/9/3 km resolution.  One unsupported capability of the HWRF system is the use of an HWRF ensemble.

Improvements to the HWRF physics for the 2015 operational HWRF system demonstrate successful R2O transitions facilitated by the DTC.  The DTC acts as a conduit for code management and R2O by maintaining the integrity of the unified HWRF code and assisting developers with transitioning their innovations into the operational code.  Specifically, the DTC successfully facilitated R2O transitions for upgrades to radiation parameterization and PBL improvements that were implemented for the 2015 operational HWRF system.

Data Assimilation Study for TC Intensity

Summer 2015

The hybrid Ensemble Kalman Filter (EnKF)-Gridpoint Statistical Interpolation (GSI) data assimilation system was implemented at NCEP for its Global Forecast System (GFS) in May 2012.

Schematic illustration of the hybrid EnKF-GSI data assimilation procedure. Dashed line indicates the optional re-centering step in the hybrid system.

This implementation led to significant improvements to global forecasts, including those of tropical storms. It can be noted that this improvement occurred while most current operational regional applications still use global rather than regional ensembles in their hybrid system. To bridge this gap, the DTC investigated the improvement of tropical storm intensity forecasts by using a regional ensemble in the GSI-hybrid data assimilation system.

A complete hybrid EnKF-GSI for the Hurricane WRF (HWRF) system was developed for the regional ensemble experiments, and results were compared to those obtained with the 2014 HWRF system. A two-way hybrid system was set up based on the GFS data assimilation scheme, using the GSI deterministic analysis to re-center the ensemble members at each analysis time. This re-centering step was found to reduce the ensemble spread for tropical cyclone (TC) center locations and intensity, so a one-way hybrid system that skipped the re-centering step was also developed.
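
The re-centering step that distinguishes the two-way from the one-way hybrid system can be illustrated with a minimal Python sketch. It assumes the simplest form of re-centering, in which each member is shifted so that the ensemble mean equals the deterministic analysis; the function name and the two-element state vectors are hypothetical and far smaller than any real model state.

```python
def recenter(members, analysis):
    """Shift every ensemble member so the ensemble mean equals `analysis`.

    members: list of state vectors (lists of floats), one per member.
    analysis: deterministic analysis state vector of the same length.
    """
    n = len(members)
    mean = [sum(column) / n for column in zip(*members)]
    return [
        [x - m + a for x, m, a in zip(member, mean, analysis)]
        for member in members
    ]


# Hypothetical three-member ensemble with a two-variable state.
members = [[10.0, 1.0], [12.0, 3.0], [14.0, 5.0]]
analysis = [11.0, 2.0]

recentered = recenter(members, analysis)
new_mean = [sum(column) / len(recentered) for column in zip(*recentered)]
print(recentered)
print(new_mean)
```

A plain shift like this leaves each member's perturbation about the mean unchanged in state space. It does not model derived quantities such as storm center location or intensity, for which the study reports a reduction in spread. Skipping the call to `recenter` corresponds to the one-way configuration.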

Results showed that the operational system (Figure below, green) generated the lowest bias at the analysis time, but over time the bias showed a rapid “spin-down” from stronger to weaker wind forecasts than observed. (A similar spin-down issue was also noted using the 2015 HWRF system, but with smaller biases.)  The one-way hybrid system (red), which used a regional ensemble, performed better than the two-way hybrid system (blue), and also outperformed the 2014 operational configuration and the GSI hybrid system using GFS ensemble (without vortex initialization, cyan), for TC intensity forecasts beyond the 12-hour forecast lead time.

The DTC also performed experiments to further investigate the initial spin-down issue and found that it is related to an imbalance issue triggered by data assimilation. Experiments show that applying dynamic constraints could help ease such an imbalance. However, more research is required to find an optimal solution that reduces such imbalance-induced noise while still achieving desirable analysis increments.

Bias of (a) Maximum surface wind speed, and (b) Minimum sea level pressure for all the forecasts as a function of forecast lead time

Did You Know?

Autumn 2014

Researchers from the DTC plan to provide numerical model runs from a preliminary version of the North American Rapid Refresh Ensemble system (Pre-NARRE) to the Hydrometeorological Testbed of the Weather Prediction Center (HMT/WPC) during their current Winter Exercise. The DTC Ensemble Task will run the ensemble system (most likely on the NOAA hjet computing system) and post-process some of the results for HMT/WPC. Members of the ensemble (eight in total) will be produced from both WRF/RUC and NMMB dynamical cores, and will include different combinations of microphysical, planetary boundary layer, surface physics, convective parameterization, and initial and boundary condition options (as in the chart below). Although the WPC will evaluate the runs on the CONUS domain, the computational domain will be set to the larger existing RAP domain, at 13 km resolution out to 24-48h, depending on computing resources. One hoped-for outcome of the experiment is an opportunity to compare NARRE forecasts with parallel runs from the Environmental Modeling Center’s (EMC) operational regional ensemble forecast system (SREF), which will be provided by EMC. In addition, results from the experiment will be used to extend previous assessments of NARRE performance to wintertime regimes.

Contributed by Isidora Jankov and Ed Tollerud.

Version of the North American Rapid Refresh Ensemble system (Pre-NARRE) provided to the Hydrometeorological Testbed of the Weather Prediction Center (HMT/WPC)

Innovation in HWRF 2013 Baseline

Spring 2013

One of the regional numerical weather prediction models used operationally by the National Weather Service is the Hurricane WRF (HWRF), a coupled model with atmospheric and ocean components that exchange fluxes of short- and long-wave radiation, momentum, moisture, and heat. The momentum flux is particularly important because the strong winds in tropical cyclones cause turbulence and upwelling in the ocean, which can lead to transport of cold water from deep in the ocean towards the surface, reducing the storm’s energy source and causing it to weaken.


A comparison of the ocean cooling in HWRF against observational buoy data, performed by the Hurricane Research Division of NOAA’s Atlantic Oceanographic and Meteorological Laboratory, showed that the ocean surface cooling in HWRF is too small. The DTC worked with the NOAA Environmental Modeling Center and oceanographers from the University of Rhode Island to formulate a test in which the momentum flux in the ocean model was altered to be more physically consistent. The figure below shows the mean intensity error as a function of lead time for 2012, with 95% confidence intervals. The black curve is the control and the red curve is the forecast with modified fluxes.


Results aggregated over all 2012 Atlantic storms showed that the more physically consistent flux reduced the 5-kt positive intensity bias of the operational model to near zero. This change has been incorporated by EMC into the 2013 HWRF baseline, and is expected to be adopted operationally for the 2013 hurricane season.

Support for Operational DA at AFWA

Autumn 2013

Unlike some other forecast model components, a data assimilation (DA) system is usually built to be flexible in order to be run by different forecast systems at varying scales.

Its testing and evaluation must therefore be performed in the context of a specific application; in other words, it must be adaptable to different operational requirements as well as to research advances. Established in 2009, the DTC DA team started providing data assimilation support and testing and evaluation for Air Force Weather Agency (AFWA) mesoscale applications throughout its global theaters. This task has become one important component of the DTC’s effort to accelerate transitions from research to operations (R2O). Between 2009 and 2011, the focus of extensive DA testing for AFWA at the DTC was to provide a rational basis for the choice of the next generation DA system. Various analysis techniques and systems were selected by AFWA for testing, including WRF Data Assimilation (WRFDA), Gridpoint Statistical Interpolation (GSI), and the NCAR Ensemble Adjustment Kalman Filter. During this testing, the impacts of different data types, background error generation, and observation formats were also investigated.

“The developmental experiment outperformed the baseline”

Testing activity by the DTC DA team took a sharp turn in August 2012. To assist AFWA in setting up an appropriate configuration for their 2013 implementation of GSI, the DTC adapted their DA testbed to complement AFWA’s pre-implementation parallel tests in real-time. In support of providing new code and configurations, the team now performs two types of tests for AFWA:

The baseline experiment is usually generated by running the current operational or parallel system at AFWA. Whenever an AFWA baseline is updated, the DTC checks its reproducibility (or similarity) using the DTC functionally-similar testing environment to ensure that any following tests are comparable, and that there is no code divergence between research and operations. One such test conducted during the summer of 2013 (see figure next page) revealed that wind analysis fits to observations in AFWA forecasts were not reproduced by the DTC, due to an inadvertent AFWA code change affecting how their own conventional data files are read. Other data assimilation components and applications (new configurations, techniques, observations, etc.) can also be tested in the DTC end-to-end DA testbed (see figure to the left).


During DTC real-time tests of the AFWA 2013 implementation, the AFWA GO index (a multivariate combined statistical score) dropped when the (then) AFWA parallel run configuration was used. When the GO index exceeded 1 (i.e., before November), the developmental experiment (which used the DTC-suggested configuration) outperformed the baseline (here, GFS-initialized). For wind variables in particular, the DTC configuration significantly improved the wind analyses. Further retrospective tests narrowed down the contributing factors, and the DTC suggested that the North American Mesoscale (NAM) static background errors generated by NCEP be used. AFWA adopted this configuration for its first GSI implementation in its global coverage domains in July 2013.

Use of Model Evaluation Tools in NWS QPF Verification


The National Weather Service (NWS) Meteorological Development Laboratory (MDL) is developing an automated, nationally consistent and centralized service that verifies Quantitative Precipitation Forecasts (QPF). This QPF Verification Service (QPFVS) will provide objective assessments of the predictive skill of numerical model guidance and official NWS forecasts to help increase the accuracy of quantitative precipitation forecasts. QPFVS will be implemented as a component of a larger gridded verification system with custom front-ends to serve various user communities, such as aviation weather, public weather, and water management/hydrology.

MDL uses the Model Evaluation Tools (MET) software package from the Developmental Testbed Center (DTC) to generate verification results for QPFVS. The MET software has a robust set of verification techniques (station, grid, ensemble, object-oriented) and metrics that meet the NWS requirements for QPFVS and is well supported by extensive documentation and a responsive help desk. The NWS Weather Prediction Center (WPC) and Environmental Modeling Center (EMC) also use MET software, which allows MDL to ensure consistency in techniques and verification scores across the NWS.

QPFVS is accessed via a web-based Graphical User Interface (GUI) and includes datasets from the National Digital Forecast Database (NDFD), National Blend of Models (NBM), High-Resolution Rapid Refresh (HRRR), and Global Forecast System (GFS). To verify, QPFVS uses UnRestricted Mesoscale Analysis (URMA) QPE06 (Quantitative Precipitation Estimation) gridded analysis as the truth. The forecasts and analysis are displayed on a flexible zoom-and-roam interface.

The QPFVS Statistics page can be used to query a database to generate plots and tables of verification scores. The plots are interactive, allowing users to interrogate and save graphics for reports and presentations and to download tabular verification data in CSV (Comma Separated Values) format. The current version, QPFVS v1.0, contains gridded verification scores, with plans to add station-based verification and more sources in QPFVS v2.0.

Figure 1. QPFVS Viewer allows users to view forecasts and verifying analysis within the same map panel with the ability to zoom and roam through the entire grid. The images preload for quick manipulation and viewing.

QPFVS leverages the MET Docker Container to produce gridded statistics in real-time. To generate gridded verification, QPFVS first uses MET to convert NDFD forecasts and guidance to the common URMA grid definition. The forecasts, guidance, and analysis are then processed through additional MET programs to generate gridded verification statistics at various geographic scales (i.e., national, regional, and local) and are stored in a database. The QPFVS GUI allows users to easily build a custom query of the database with choices such as location(s) of interest, data source(s), and date range.

MET output of forecasts, guidance, and analysis data on the common URMA grid are also converted into Georeferenced Tagged Image File Format (GeoTIFF) images. Additional features include the ability to view time series of QPF data at individual grid points.

The MET software and team have been very helpful in establishing QPFVS v1.0. MDL anticipates MET will continue to be useful in meeting additional QPFVS requirements, including station-based verification, probabilistic verification, and object-oriented verification.

Contributed by Tabitha Huntemann and Dana Strom.

Figure 2. QPFVS can display the verification metrics in a multitude of ways. Pictured above is a performance diagram for the month of October 2017 for all grid points where a forecast or an observation was >= 0.25”.
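For readers unfamiliar with performance diagrams, they are conventionally built from threshold-based contingency counts: probability of detection is plotted against success ratio, with critical success index and frequency bias shown as reference lines. The sketch below shows one way those four scores could be computed at the 0.25-inch threshold mentioned in the caption. It is not the MET or QPFVS implementation; the function name and the sample precipitation values are hypothetical.

```python
def contingency_scores(forecast, observed, threshold):
    """Compute performance-diagram scores from paired forecast/observed values.

    An event is any value at or above the threshold. Correct negatives are
    not counted because none of the four scores below depends on them.
    """
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        forecast_yes = f >= threshold
        observed_yes = o >= threshold
        if forecast_yes and observed_yes:
            hits += 1
        elif observed_yes:
            misses += 1
        elif forecast_yes:
            false_alarms += 1

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a score is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "pod": ratio(hits, hits + misses),
        "success_ratio": ratio(hits, hits + false_alarms),
        "csi": ratio(hits, hits + misses + false_alarms),
        "frequency_bias": ratio(hits + false_alarms, hits + misses),
    }


# Hypothetical 6-hour precipitation amounts (inches) at six grid points.
fcst = [0.30, 0.10, 0.40, 0.00, 0.26, 0.50]
obs = [0.25, 0.30, 0.10, 0.00, 0.40, 0.05]
scores = contingency_scores(fcst, obs, threshold=0.25)
```

Because correct negatives never enter these formulas, restricting the sample to grid points where either the forecast or the observation reaches the threshold, as in the caption, leaves all four scores unchanged.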