Point-Stat

Point-Stat cindyhg Thu, 04/25/2019 - 10:18

Point-Stat Functionality

The Point-Stat tool provides verification statistics for forecasts at observation points, as opposed to over gridded analyses. The Point-Stat tool matches gridded forecasts to point observation locations using several configurable interpolation methods. The tool then computes continuous as well as categorical verification statistics for the matched pairs falling inside the verification masking regions defined by the user. The categorical statistics generally are calculated by applying one or more thresholds to the forecast and observation values. Confidence intervals, which represent a measure of uncertainty, are computed for most of the verification statistics.

Point-Stat Usage

View the usage statement for Point-Stat by simply typing the following:

point_stat

At a minimum, the input gridded fcst_file, the point observation obs_file in NetCDF format, and the configuration config_file must be passed in on the command line.

You may use the -point_obs command line argument to specify additional NetCDF point observation files to be used. By default, a matching point observation time window, specified in the configuration file, is defined relative to the valid time of the forecast. However, the -obs_valid_beg and -obs_valid_end command line options override the default and explicitly define that time window.
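As a rough illustration of how a window defined relative to the forecast valid time works, the Python sketch below computes the begin and end of such a window. The offsets of -5400 and +5400 seconds and the YYYYMMDD_HHMMSS timestamp format are assumptions made for this example; check your configuration file and the usage statement for the actual values and accepted formats.

```python
from datetime import datetime, timedelta

def obs_window(valid_time, beg_offset_s, end_offset_s):
    """Return (begin, end) of a matching window defined relative to valid_time."""
    return (valid_time + timedelta(seconds=beg_offset_s),
            valid_time + timedelta(seconds=end_offset_s))

def fmt(t):
    """Format a time as YYYYMMDD_HHMMSS (assumed format for this sketch)."""
    return t.strftime("%Y%m%d_%H%M%S")

# Forecast valid time used in this tutorial: 12 UTC on 31 March 2007.
valid = datetime(2007, 3, 31, 12, 0, 0)

# Hypothetical offsets in seconds; not taken from the tutorial configuration.
beg, end = obs_window(valid, -5400, 5400)
print(fmt(beg), fmt(end))  # 20070331_103000 20070331_133000
```

Passing explicit begin and end times on the command line replaces a window computed this way with a fixed one.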

Configure


The behavior of Point-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Point-Stat configuration file may be found in the $MET_BASE/config/PointStatConfig_default file. The configuration used by the test script may be found in the $MET_BASE/PointStatConfig file. Prior to modifying the configuration file, users are advised to make a copy of the default:

cp $MET_BASE/config/PointStatConfig_default $MET_TUTORIAL_DATA/config/PointStatConfig_tutorial

Open up the $MET_TUTORIAL_DATA/config/PointStatConfig_tutorial file for editing with your preferred text editor.

The configurable items for Point-Stat are used to specify how the verification is to be performed. The configurable items include specifications for the following:

  • The verification grid.
  • The forecast fields to be verified at the specified vertical levels.
  • The threshold values to be applied.
  • The economic cost-loss value ratios to be evaluated.
  • The reference climatological dataset.
  • The matching time window for point observations.
  • The type of point observations to be matched to the forecasts.
  • The areas over which to aggregate statistics - as predefined grids, configurable lat/lon polylines, or individual stations.
  • The confidence interval methods to be used.
  • The interpolation methods to be used.
  • The types of verification methods to be used.

You may find a complete description of the configurable items in the MET Users Guide or in the $MET_BASE/config/README file. Please take some time to review them.

For this tutorial, we'll configure Point-Stat to verify the model temperature at two vertical levels and winds at the surface. However, Point-Stat may be configured to verify as many or as few model variables as you desire. The sample input forecast file is not on the NCEP Grid 212 domain. However, we'll use the NCEP Grid 212 domain to define a masking region for our data. Edit the $MET_TUTORIAL_DATA/config/PointStatConfig_tutorial file as follows:

  • Set wind_thresh = [ >0.0, >=1.0, >=5.0, >=8.0 ];
    To indicate that we'd like VL1L2 lines computed using these thresholds on the wind speeds.
  • In the fcst dictionary, set
      field = [
         {
           name       = "TMP";
           level      = "Z2";
           cat_thresh = [ >278, >283, >288 ];
         },

         {
           name       = "TMP";
           level      = "P750-850";
           cat_thresh = [ >278 ];
         },

         {
           name       = "UGRD";
           level      = "Z10";
           cat_thresh = [ >=5.0 ];
         },

         {
           name       = "VGRD";
           level      = "Z10";
           cat_thresh = [ >=5.0 ];
         }
       ];

    To verify 2-meter temperature, 10-meter winds, and temperature fields between 750hPa and 850hPa and apply the categorical thresholds listed. TMP is in Kelvin and the U and V components of wind are in m/s.

  • Retain the default settings of obs = fcst;
    To use the settings from the fcst dictionary above.
  • Set message_type = [ "ADPUPA", "ADPSFC" ];
    To verify using these 2 observation types.
  • In the mask dictionary, set grid = [ "G212" ];
    To accumulate statistics over NCEP Grid 212 domain.
  • In the mask dictionary, set poly = [ "${MET_BASE}/poly/EAST.poly", "${MET_BASE}/poly/WEST.poly" ];
    To accumulate statistics over the regions defined by the EAST and WEST polyline files.
  • In the interp dictionary, set
      type = [
         {
           method = NEAREST;
           width  = 1;
         },

         {
           method = UW_MEAN;
           width  = 5;
         }
       ];

    To indicate that the forecast values should be interpolated to the observation locations using the nearest neighbor method and by averaging the forecast values over the 5 by 5 box surrounding the observation location.

  • Set
    output_flag = {
       fho    = NONE;
       ctc    = BOTH;
       cts    = BOTH;
       mctc   = BOTH;
       mcts   = BOTH;
       cnt    = BOTH;
       sl1l2  = BOTH;
       sal1l2 = NONE;
       vl1l2  = BOTH;
       val1l2 = NONE;
       vcnt   = NONE;
       pct    = NONE;
       pstd   = NONE;
       pjc    = NONE;
       prc    = NONE;
       ecnt   = NONE; // Only for HiRA.
       eclv   = NONE;
       mpr    = NONE;
    }

    To indicate that the contingency table counts (CTC), contingency table statistics (CTS), multi-category contingency table counts (MCTC), multi-category contingency table statistics (MCTS), continuous statistics (CNT), scalar partial sum (SL1L2), and vector partial sum (VL1L2) line types should be output.

Save and close this file.
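The two interpolation settings chosen above can be pictured with a small Python sketch. It is not MET's implementation: it assumes the observation location has already been converted to fractional grid coordinates (x, y), that the box is centered on the nearest grid point, and that box points falling off the grid are skipped. The function names are hypothetical.

```python
import math

def nearest_index(v):
    """Round a fractional grid coordinate to the nearest integer index."""
    return int(math.floor(v + 0.5))

def nearest(grid, x, y):
    """Forecast value at the grid point closest to (x, y)."""
    return grid[nearest_index(y)][nearest_index(x)]

def uw_mean(grid, x, y, width):
    """Unweighted mean over a width-by-width box (odd width assumed)
    centered on the grid point nearest to (x, y)."""
    half = width // 2
    ci, cj = nearest_index(y), nearest_index(x)
    vals = [grid[i][j]
            for i in range(ci - half, ci + half + 1)
            for j in range(cj - half, cj + half + 1)
            if 0 <= i < len(grid) and 0 <= j < len(grid[0])]
    return sum(vals) / len(vals)

# Synthetic 7x7 field whose value at row i, column j is i*i + j.
grid = [[float(i * i + j) for j in range(7)] for i in range(7)]

print(nearest(grid, 3.2, 2.9))     # 12.0
print(uw_mean(grid, 3.2, 2.9, 5))  # 14.0
```

With width = 1 the box holds a single point, so the two functions return the same value.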

Run


Next, run Point-Stat on the command line using the following command:

point_stat \
$MET_TUTORIAL_DATA/input/sample_fcst/2007033000/nam.t00z.awip1236.tm00.20070330.grb \
$MET_TUTORIAL_DATA/output/pb2nc/tutorial_pb.nc \
$MET_TUTORIAL_DATA/config/PointStatConfig_tutorial \
-outdir $MET_TUTORIAL_DATA/output/point_stat \
-v 2

Point-Stat is now performing the verification tasks we requested in the configuration file. It should take a minute or two to run.

In this example, Point-Stat accumulates matched forecast/observation pairs into 48 groups based on our configuration file selections. The 48 groups are a result of: 4 fields (TMP at Z2, TMP at P750-850, UGRD at Z10, VGRD at Z10) * 2 observing message types * 3 masking regions * 2 interpolation methods. However, many of these combinations, such as verifying TMP at Z2 versus upper-air observations (ADPUPA), will result in zero matched pairs being found. As Point-Stat runs, you should see several status messages printed to the screen to indicate progress.
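The count of 48 can be checked by enumerating the combinations. The Python sketch below is only bookkeeping; the labels are shorthand for the configuration entries, and the mask and interpolation labels in particular are illustrative rather than the exact strings Point-Stat writes to its output.

```python
from itertools import product

# Shorthand labels for the selections made in the configuration file.
fields = ["TMP/Z2", "TMP/P750-850", "UGRD/Z10", "VGRD/Z10"]
message_types = ["ADPUPA", "ADPSFC"]
masks = ["G212", "EAST", "WEST"]
interp_methods = ["NEAREST(1)", "UW_MEAN(5)"]

# One group of matched pairs per combination.
groups = list(product(fields, message_types, masks, interp_methods))
print(len(groups))  # 48
```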

The configuration file language allows you greater control of which message types are used for each verification task. Instead of trying all 48 combinations listed above, you could specify which message type(s) should be used for verifying each field. Simply move the message_type setting inside each entry of the field array. Use ADPSFC to verify the surface fields (TMP at Z2, UGRD at Z10, and VGRD at Z10), and use ADPUPA to verify the upper-air field (TMP from 750 to 850mb).
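To see the effect of assigning message types per field, the following Python sketch counts the groups that would remain. The dictionary is a hypothetical stand-in for the edited field entries, not MET configuration syntax.

```python
# Hypothetical mapping of each field to the message types used to verify it.
message_types_by_field = {
    "TMP/Z2": ["ADPSFC"],
    "UGRD/Z10": ["ADPSFC"],
    "VGRD/Z10": ["ADPSFC"],
    "TMP/P750-850": ["ADPUPA"],
}

n_masks = 3    # G212 plus the EAST and WEST polylines
n_interp = 2   # nearest neighbor and the 5 by 5 unweighted mean

n_field_type_pairs = sum(len(types) for types in message_types_by_field.values())
n_groups = n_field_type_pairs * n_masks * n_interp
print(n_groups)  # 24
```

Under these assumptions the number of groups drops from 48 to 24, and the combinations that could never produce matched pairs are no longer attempted.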

While Point-Stat has the ability to compute bootstrap confidence intervals, doing so is rather slow. For this reason, bootstrapping is disabled by default. To turn it on, edit the configuration file $MET_TUTORIAL_DATA/config/PointStatConfig_tutorial, set n_rep equal to 1000 in the boot dictionary, and re-run the previous Point-Stat command.

While Point-Stat has the ability to compute rank correlation statistics, doing so is rather slow. For this reason, rank correlations are disabled by default. To turn them on, edit the configuration file $MET_TUTORIAL_DATA/config/PointStatConfig_tutorial, set the rank_corr_flag variable equal to TRUE, and re-run the previous Point-Stat command.

Next, we'll take a look at the Point-Stat output we just generated.

Output


The output of Point-Stat is one or more ASCII files containing statistics summarizing the verification performed. In this example, the output is written to the $MET_TUTORIAL_DATA/output/point_stat directory as we requested on the command line. That output directory should now contain 8 files: one each for the CNT, CTC, CTS, MCTC, MCTS, SL1L2, and VL1L2 line types (.txt), and an eighth one for the STAT file (.stat). The STAT file contains all of the output statistics while the other ASCII files contain the exact same data, but sorted by line type.
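The Python sketch below builds the 8 expected file names so they can be compared against a directory listing. The naming pattern is inferred from the file names used later in this tutorial, and the initialization time is read off the sample forecast path; both are assumptions of this example, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

def point_stat_filenames(init, lead_hours, line_types):
    """Build expected output names from the pattern seen in this tutorial:
    point_stat_<lead>L_<valid>V_<type>.txt plus one .stat file."""
    valid = init + timedelta(hours=lead_hours)
    prefix = "point_stat_{:02d}0000L_{}V".format(
        lead_hours, valid.strftime("%Y%m%d_%H%M%S"))
    names = [prefix + "_" + lt.lower() + ".txt" for lt in line_types]
    names.append(prefix + ".stat")
    return names

line_types = ["CNT", "CTC", "CTS", "MCTC", "MCTS", "SL1L2", "VL1L2"]
names = point_stat_filenames(datetime(2007, 3, 30, 0), 36, line_types)
for name in names:
    print(name)
```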

Since the lines of data in these ASCII files are so long, we strongly recommend configuring your text editor to NOT use dynamic word wrapping. The files will be much easier to read that way.

Open up the $MET_TUTORIAL_DATA/output/point_stat/point_stat_360000L_20070331_120000V_ctc.txt CTC file using the text editor of your choice and note the following:

  • This is a simple ASCII file consisting of several rows of data.
  • Each row contains data for a single verification task.
  • The first 22 header columns contain data applicable to all line types, such as timing information, variable and level information, verifying message type, masking region applied, interpolation method applied, and threshold values applied.
  • The twenty-second column, labeled LINE_TYPE, indicates the type of statistics contained in this line. In this file, the LINE_TYPE column contains CTC indicating that the columns to follow contain contingency table counts.
  • The remaining columns after LINE_TYPE are labeled TOTAL, FY_OY, FY_ON, FN_OY, and FN_ON and contain the contingency table counts.
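To make the meaning of these counts concrete, here is a minimal Python sketch that applies a single "greater than" threshold to matched forecast/observation pairs and tallies the four cells. The pairs are invented for illustration and the function name is hypothetical; this is not how Point-Stat is implemented.

```python
def ctc(pairs, thresh):
    """Tally contingency table cells for the event 'value > thresh'.
    FY/FN: forecast yes/no; OY/ON: observation yes/no."""
    counts = {"TOTAL": 0, "FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for fcst, obs in pairs:
        key = ("FY" if fcst > thresh else "FN") + "_" + ("OY" if obs > thresh else "ON")
        counts[key] += 1
        counts["TOTAL"] += 1
    return counts

# Invented (forecast, observation) temperature pairs in Kelvin.
pairs = [(280.1, 279.4), (276.0, 279.0), (281.5, 277.2),
         (275.3, 274.8), (279.9, 278.0)]

print(ctc(pairs, 278.0))
# {'TOTAL': 5, 'FY_OY': 1, 'FY_ON': 2, 'FN_OY': 1, 'FN_ON': 1}
```

Note that the last pair has an observation exactly equal to the threshold, so it does not satisfy ">278" and counts as an observed non-event.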

Close this file, open up the $MET_TUTORIAL_DATA/output/point_stat/point_stat_360000L_20070331_120000V_cts.txt CTS file, and note the following:

  • The first 22 columns contain the same type of header data as in the previous file.
  • The LINE_TYPE column is set to CTS, which indicates that the columns to follow contain contingency table statistics. Refer to the MET Users Guide for a thorough description of this output line type.
  • Confidence intervals are given for each of these statistics, computed using either one or two methods. The columns ending in _NCL and _NCU give lower and upper confidence limits computed using assumptions of normality. The columns ending in _BCL and _BCU give lower and upper confidence limits computed using bootstrapping. If you ran the Point-Stat example with bootstrapping turned off, which is the default, the _BCL and _BCU columns will contain the missing data value of NA.
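For readers unfamiliar with bootstrapping, the Python sketch below shows a generic percentile bootstrap for the mean of a small sample. It illustrates the general technique only and makes no claim about the specific bootstrap method or settings Point-Stat uses; the sample values are invented, and the n_rep parameter simply borrows the name of the configuration entry.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def percentile_bootstrap(values, stat, n_rep=1000, alpha=0.05, seed=1):
    """Generic percentile bootstrap: resample with replacement n_rep times,
    recompute the statistic, and take the alpha/2 and 1-alpha/2 percentiles."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(values)
    reps = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_rep)
    )
    lower = reps[int((alpha / 2) * (n_rep - 1))]
    upper = reps[int((1 - alpha / 2) * (n_rep - 1))]
    return lower, upper

# Invented forecast-minus-observation errors.
errors = [0.4, -1.2, 0.9, 2.1, -0.3, 0.7, 1.5, -0.8, 0.2, 1.1]

lower, upper = percentile_bootstrap(errors, mean)
print(lower, mean(errors), upper)
```

Each replicate recomputes the statistic on a resampled data set, so the cost grows with the number of replicates, which is consistent with bootstrapping being slow.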

Open up the $MET_TUTORIAL_DATA/output/point_stat/point_stat_360000L_20070331_120000V_mctc.txt MCTC file, and note the following:

  • This file contains 6 lines of multi-category contingency table counts.
    These 6 lines are a result of: 3 masking regions * 2 interpolation methods.
  • Since we only provided multiple thresholds for 2-meter temperature, this file only contains MCTC output for that field. Point-Stat used the 3 thresholds we provided to define 4x4 contingency tables. The corresponding statistics are written out in the MCTS file. This functionality is new for METv8.0.
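The way 3 thresholds produce a 4x4 table can be sketched in Python as follows. Each value is assigned a category equal to the number of ">" thresholds it exceeds, and each matched pair increments one cell. The pairs are invented, the function names are hypothetical, and the choice of forecast category as the row index is an assumption of this sketch.

```python
from bisect import bisect_left

def category(value, thresholds):
    """Number of thresholds t (sorted ascending) for which value > t."""
    return bisect_left(thresholds, value)

def mctc(pairs, thresholds):
    """Multi-category table: rows are forecast categories, columns are
    observation categories (row/column orientation is an assumption)."""
    n = len(thresholds) + 1
    table = [[0] * n for _ in range(n)]
    for fcst, obs in pairs:
        table[category(fcst, thresholds)][category(obs, thresholds)] += 1
    return table

thresholds = [278.0, 283.0, 288.0]

# Invented (forecast, observation) 2-meter temperature pairs in Kelvin.
pairs = [(276.5, 277.0), (280.0, 284.1), (285.2, 286.0),
         (290.3, 287.9), (283.0, 283.0)]

for row in mctc(pairs, thresholds):
    print(row)
# [1, 0, 0, 0]
# [0, 1, 1, 0]
# [0, 0, 1, 0]
# [0, 0, 1, 0]
```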

Open up the $MET_TUTORIAL_DATA/output/point_stat/point_stat_360000L_20070331_120000V_vl1l2.txt VL1L2 file, and note the following:

  • This file contains 24 lines of VL1L2 partial sums.
    These 24 lines are a result of: 3 masking regions * 2 interpolation methods * 4 wind speed thresholds.
  • For the VL1L2 line, the contents of the FCST_THRESH and OBS_THRESH header columns indicate the thresholds that were applied to the wind speed values to determine which U/V points would be included in the sum.
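The Python sketch below illustrates the idea of filtering U/V pairs by wind speed before accumulating vector partial sums. Two things here are assumptions: a pair is kept when either the forecast or the observed wind speed meets the threshold, and the dictionary keys are meant to mirror the VL1L2 column names. Consult the MET Users Guide for the authoritative definitions. The input values are invented.

```python
import math

def vl1l2(pairs, speed_thresh):
    """pairs: list of (uf, vf, uo, vo) in m/s.
    Assumption: keep a pair if the forecast OR observed speed >= speed_thresh."""
    kept = [p for p in pairs
            if math.hypot(p[0], p[1]) >= speed_thresh
            or math.hypot(p[2], p[3]) >= speed_thresh]
    n = len(kept)
    if n == 0:
        return {"TOTAL": 0}

    def avg(f):
        return sum(f(*p) for p in kept) / n

    return {
        "TOTAL": n,
        "UFBAR": avg(lambda uf, vf, uo, vo: uf),
        "VFBAR": avg(lambda uf, vf, uo, vo: vf),
        "UOBAR": avg(lambda uf, vf, uo, vo: uo),
        "VOBAR": avg(lambda uf, vf, uo, vo: vo),
        "UVFOBAR": avg(lambda uf, vf, uo, vo: uf * uo + vf * vo),
        "UVFFBAR": avg(lambda uf, vf, uo, vo: uf * uf + vf * vf),
        "UVOOBAR": avg(lambda uf, vf, uo, vo: uo * uo + vo * vo),
    }

# Invented wind components; the second pair is too weak to pass >=5.0.
pairs = [(3.0, 4.0, 3.0, 4.0), (0.5, 0.5, 0.2, 0.1), (6.0, 8.0, 5.0, 12.0)]
sums = vl1l2(pairs, 5.0)
print(sums["TOTAL"], sums["UFBAR"], sums["UVFOBAR"])  # 2 4.5 75.5
```

Raising the threshold shrinks TOTAL, which is why each of the 4 wind speed thresholds yields its own VL1L2 line.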

The other output text files contain data specific to their individual line types. Refer to tables 7.2 through 7.21 in the MET Users Guide for a description of their contents.

Lastly, the $MET_TUTORIAL_DATA/output/point_stat/point_stat_360000L_20070331_120000V.stat STAT file contains all of the same data we just viewed but in a single file. The Stat-Analysis tool, which we'll use later in this tutorial, only reads the STAT output of the Point-Stat, Grid-Stat, Wavelet-Stat, and Ensemble-Stat tools, not the ASCII (.txt) files.
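If you want to inspect the STAT file outside of MET, a few lines of Python are enough to group its rows by line type. This sketch assumes the file is whitespace-delimited with a single header row that names the LINE_TYPE column. The sample text is synthetic and heavily truncated (real files have all of the header columns described above), and the function name is hypothetical.

```python
def group_by_line_type(text):
    """Split whitespace-delimited STAT-style text into rows grouped by LINE_TYPE."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    lt_col = header.index("LINE_TYPE")
    grouped = {}
    for row in rows:
        grouped.setdefault(row[lt_col], []).append(row)
    return grouped

# Synthetic, truncated sample; values are placeholders, not tutorial output.
sample = """\
VERSION MODEL LINE_TYPE
V8.0 MODEL_A CTC 100 20 10 15 55
V8.0 MODEL_A CNT 100 1.5
V8.0 MODEL_A CTC 80 10 5 5 60
"""

grouped = group_by_line_type(sample)
for line_type, rows in sorted(grouped.items()):
    print(line_type, len(rows))
# CNT 1
# CTC 2
```

To read a real file, pass its contents to the function, for example with open(path).read(), where path points at the .stat file named above.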