Results from the 2016 Community Leveraged Unified Ensemble
During the 2016 Hazardous Weather Testbed Spring Experiment in the United States, which ran through much of May and early June, a large ensemble known as the Community Leveraged Unified Ensemble (CLUE) was run to provide guidance to severe convective weather forecasters and to allow examination of the impact of ensemble design on the forecasts. The present study compares two nine-member ensembles, both using mixed lateral boundary and initial conditions, but with one using mixed physics and the other a single physics suite. The Model Evaluation Tools (MET) package was used to verify the ensemble forecasts, applying both traditional and object-based verification approaches to 1-hour and 3-hour precipitation forecasts and to reflectivity forecasts. In addition, convective initiation was verified in the ensembles for a small sample of convective events.
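To illustrate the object-based approach mentioned above, the sketch below labels contiguous grid cells exceeding a precipitation threshold and reports simple object attributes. This is a crude, hypothetical stand-in for MODE-style object identification, not the actual MODE algorithm (which also applies convolution smoothing and fuzzy-logic matching); the toy field and threshold are illustrative only.

```python
from collections import deque

def find_objects(field, threshold):
    """Label 4-connected regions of a 2-D grid at or above a threshold,
    a crude stand-in for MODE-style object identification."""
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if field[r][c] >= threshold and not seen[r][c]:
                # Flood-fill one object, collecting its grid cells.
                cells, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    i, j = queue.popleft()
                    cells.append((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and not seen[ni][nj]
                                and field[ni][nj] >= threshold):
                            seen[ni][nj] = True
                            queue.append((ni, nj))
                objects.append(cells)
    return objects

# Toy 1-hour precipitation field (mm) with two distinct rain areas.
field = [
    [0.0, 0.5, 0.6, 0.0, 0.0],
    [0.0, 0.7, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.3],
    [0.0, 0.0, 0.0, 0.4, 0.5],
]
objs = find_objects(field, 0.254)
print(sorted(len(o) for o in objs))  # object areas in grid cells
```

Once objects are identified in forecast and observed fields, attributes such as area, centroid location, and intensity can be compared, which is the basis of the member-attribute comparisons reported below.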
Preliminary results suggest that spread is much larger when mixed physics are added to the ensemble, so the observed precipitation or reflectivity values are more likely to fall within the envelope of member solutions. However, the increased spread is primarily due to a strong high bias in areal coverage and intensity in the two members using the Milbrandt-Yau microphysics scheme and a low bias in the three members using the P3 scheme. The average of member attributes from the Method for Object-Based Diagnostic Evaluation (MODE) is often slightly better for the mixed-physics ensemble as well, but this again appears to result from a near-cancellation of large positive errors in the Milbrandt-Yau members and negative errors in members using other schemes, especially P3. The sum of errors over all members is larger for the mixed-physics ensemble. Behavior is generally similar whether reflectivity or precipitation is used for verification, although differences among ensemble members are slightly greater when reflectivity is used. Traditional ensemble verification measures such as the area under the ROC curve and the Brier Skill Score show no substantial differences between the two ensembles, with positive skill present only at the lightest precipitation and reflectivity thresholds: 0.254 mm for 1-hour precipitation, 0.254 and 2.54 mm for 3-hour precipitation, and 20 dBZ for reflectivity. These results illustrate the difficulty of accurately predicting the details of intense warm-season convection, and they suggest that further work on ensemble design is needed to provide the best possible guidance to forecasters. The impact of adding varied physics to the 2016 CLUE ensemble is best described as mixed, and its value likely depends on the specific goals of individual forecasters.
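For readers unfamiliar with the traditional verification measures cited above, the Brier Skill Score (BSS) compares the Brier score of probability forecasts against a climatological reference, with positive values indicating skill. The sketch below is a minimal illustration using hypothetical exceedance probabilities (e.g., the fraction of nine members exceeding a threshold); the numbers are made up and are not from the CLUE study.

```python
def brier_score(probs, obs):
    """Mean squared error of probability forecasts against binary outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)

def brier_skill_score(probs, obs):
    """BSS = 1 - BS / BS_ref, where the reference forecast is the
    observed base rate (climatology). Positive values indicate skill."""
    base_rate = sum(obs) / len(obs)
    bs_ref = brier_score([base_rate] * len(obs), obs)
    return 1.0 - brier_score(probs, obs) / bs_ref

# Hypothetical ensemble exceedance probabilities for a threshold such as
# 0.254 mm, with binary observations of whether the event occurred.
probs = [8/9, 6/9, 1/9, 0/9, 3/9, 9/9]
obs = [1, 1, 0, 0, 1, 1]
print(round(brier_skill_score(probs, obs), 3))  # → 0.565
```

The area under the ROC curve is computed analogously by sweeping a probability threshold and tracing hit rate against false-alarm rate; both scores reward sharp, well-calibrated probabilities, which is why skill vanishes at the heavier precipitation and reflectivity thresholds reported above.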