MET Tool: Grid-Stat

IMPORTANT NOTE: If you are returning to the tutorial, you must source the tutorial setup script before running the following instructions. If you are unsure if you have done this step, please navigate to the Verify Environment is Set Correctly page.

Grid-Stat Functionality

The Grid-Stat tool provides verification statistics for a matched forecast and observation grid. If the forecast and observation grids do not match, the regrid section of the configuration file controls how the data can be interpolated to a common grid. All of the forecast gridpoints in each spatial verification region of interest are matched to observation gridpoints. The matched gridpoints within each verification region are used to compute the verification statistics.

The output statistics generated by Grid-Stat include continuous partial sums and statistics, vector partial sums and statistics, categorical tables and statistics, probabilistic tables and statistics, neighborhood statistics, and gradient statistics. The computation and output of these various statistics types is controlled by the output_flag in the configuration file.
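
To make the matched-pair idea concrete, here is a minimal Python sketch. It is not MET's implementation. It assumes the forecast and observation are already on a common grid, stores each as a small nested list of made-up values, and uses a hypothetical boolean mask to stand in for one verification region. The function names are illustrative only; the contingency table keys mirror the FY_OY, FY_ON, FN_OY, and FN_ON counts.

```python
import math

def matched_pairs(fcst, obs, mask):
    """Collect (forecast, observation) pairs at grid points inside the mask."""
    pairs = []
    for frow, orow, mrow in zip(fcst, obs, mask):
        for f, o, inside in zip(frow, orow, mrow):
            if inside:
                pairs.append((f, o))
    return pairs

def continuous_stats(pairs):
    """Return mean error and root-mean-square error of forecast minus observation."""
    n = len(pairs)
    me = sum(f - o for f, o in pairs) / n
    rmse = math.sqrt(sum((f - o) ** 2 for f, o in pairs) / n)
    return me, rmse

def contingency_counts(pairs, thresh):
    """2x2 contingency table counts for the event 'value >= thresh'."""
    counts = {"fy_oy": 0, "fy_on": 0, "fn_oy": 0, "fn_on": 0}
    for f, o in pairs:
        key = ("fy" if f >= thresh else "fn") + "_" + ("oy" if o >= thresh else "on")
        counts[key] += 1
    return counts

# Made-up 2x3 grids; the last point of the second row lies outside the region.
fcst = [[0.0, 6.0, 12.0], [2.0, 5.0, 0.0]]
obs = [[0.0, 4.0, 11.0], [7.0, 5.0, 1.0]]
mask = [[True, True, True], [True, True, False]]

pairs = matched_pairs(fcst, obs, mask)
me, rmse = continuous_stats(pairs)
counts = contingency_counts(pairs, 5.0)
print(len(pairs), round(me, 3), round(rmse, 3))  # 5 -0.4 2.449
print(counts)
```

Only the five points inside the mask contribute, which is why each masking region produces its own line of statistics.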

Grid-Stat Usage

View the usage statement for Grid-Stat by simply typing the following:

grid_stat
Usage: grid_stat  
  fcst_file Input gridded forecast file containing the field(s) to be verified.
  obs_file Input gridded observation file containing the verifying field(s).
  config_file GridStatConfig file containing the desired configuration settings.
  [-outdir path] Overrides the default output directory (optional).
  [-log file] Outputs log messages to the specified file (optional).
  [-v level] Level of logging (optional).
  [-compress level] NetCDF compression level (optional).

The forecast and observation fields must be on the same grid for verification. You can use copygb to regrid GRIB1 files, wgrib2 to regrid GRIB2 files, or the automated regridding within the regrid section of the MET config files.

At a minimum, the input gridded fcst_file, the input gridded obs_file, and the configuration config_file must be passed in on the command line.
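
If you later drive Grid-Stat from a script, the usage statement above maps naturally onto an argument list. The following Python sketch only assembles the command; it does not run it. The helper name build_grid_stat_cmd and the file names in the example are hypothetical. The option names are taken from the usage statement.

```python
import shlex

def build_grid_stat_cmd(fcst_file, obs_file, config_file,
                        outdir=None, log=None, verbosity=None, compress=None):
    """Assemble a grid_stat argument list: three required files, then optional flags."""
    cmd = ["grid_stat", fcst_file, obs_file, config_file]
    if outdir is not None:
        cmd += ["-outdir", outdir]
    if log is not None:
        cmd += ["-log", log]
    if verbosity is not None:
        cmd += ["-v", str(verbosity)]
    if compress is not None:
        cmd += ["-compress", str(compress)]
    return cmd

# Hypothetical file names, for illustration only.
cmd = build_grid_stat_cmd("fcst.nc", "obs.nc", "GridStatConfig_tutorial",
                          outdir="out", verbosity=2)
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run once the MET environment is set up.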

admin Mon, 06/24/2019 - 16:06

Configure


Start by making an output directory for Grid-Stat and changing directories:

mkdir -p ${METPLUS_TUTORIAL_DIR}/output/met_output/grid_stat
cd ${METPLUS_TUTORIAL_DIR}/output/met_output/grid_stat

The behavior of Grid-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Grid-Stat configuration file may be found in the share/met/config/GridStatConfig_default file of the MET installation. Prior to modifying the configuration file, users are advised to make a copy of the default:

cp ${MET_BUILD_BASE}/share/met/config/GridStatConfig_default GridStatConfig_tutorial

Open up the GridStatConfig_tutorial file for editing with your preferred text editor.

vi GridStatConfig_tutorial

The configurable items for Grid-Stat are used to specify how the verification is to be performed. The configurable items include specifications for the following:

  • The forecast fields to be verified at the specified vertical level or accumulation interval
  • The threshold values to be applied
  • The areas over which to aggregate statistics - as predefined grids, configurable lat/lon polylines, or gridded data fields
  • The confidence interval methods to be used
  • The smoothing methods to be applied (as opposed to interpolation methods)
  • The types of verification methods to be used

You may find a complete description of the configurable items in the grid_stat configuration file section of the MET User's Guide. Please take some time to review them.

For this tutorial, we'll configure Grid-Stat to verify the 12-hour accumulated precipitation output of PCP-Combine. We'll be using Grid-Stat to verify a single field using NetCDF input for both the forecast and observation files. However, Grid-Stat may in general be used to verify an arbitrary number of fields. Edit the GridStatConfig_tutorial file as follows:

  • Set:
    fcst = {
       field = [
          {
            name       = "APCP_12";
            level      = [ "(*,*)" ];
            cat_thresh = [ >0.0, >=5.0, >=10.0 ];
          }
       ];
    }
    obs = fcst;

    To verify the field of precipitation accumulated over 12 hours using the 3 thresholds specified.

  • Set:
    mask = {
       grid = [];
       poly = [ "../gen_vx_mask/CONUS_mask.nc",
                "MET_BASE/poly/NWC.poly",
                "MET_BASE/poly/SWC.poly",
                "MET_BASE/poly/GRB.poly",
                "MET_BASE/poly/SWD.poly",
                "MET_BASE/poly/NMT.poly",
                "MET_BASE/poly/SMT.poly",
                "MET_BASE/poly/NPL.poly",
                "MET_BASE/poly/SPL.poly",
                "MET_BASE/poly/MDW.poly",
                "MET_BASE/poly/LMV.poly",
                "MET_BASE/poly/GMC.poly",
                "MET_BASE/poly/APL.poly",
                "MET_BASE/poly/NEC.poly",
                "MET_BASE/poly/SEC.poly" ];
    }

    To accumulate statistics over the Continental United States (CONUS) and the 14 NCEP verification regions in the United States defined by the polylines specified. To see a plot of these regions, execute the following command:

    gv ${MET_BUILD_BASE}/share/met/poly/ncep_vx_regions.pdf &

  • In the boot dictionary, set:
    n_rep = 500;

    To turn on the computation of bootstrap confidence intervals using 500 replicates.

  • In the nbrhd dictionary, set:
    width      = [ 3, 5 ];
    cov_thresh = [ >=0.5, >=0.75 ];

    To define two neighborhood sizes and two fractional coverage field thresholds.

  • Set:
    output_flag = {
       fho    = NONE;
       ctc    = BOTH;
       cts    = BOTH;
       mctc   = NONE;
       mcts   = NONE;
       cnt    = BOTH;
       sl1l2  = BOTH;
       sal1l2 = NONE;
       vl1l2  = NONE;
       val1l2 = NONE;
       vcnt   = NONE;
       pct    = NONE;
       pstd   = NONE;
       pjc    = NONE;
       prc    = NONE;
       eclv   = NONE;
       nbrctc = BOTH;
       nbrcts = BOTH;
       nbrcnt = BOTH;
       grad   = BOTH;
       dmap   = NONE;
       seeps  = NONE;
    }

    To compute contingency table counts (CTC), contingency table statistics (CTS), continuous statistics (CNT), scalar partial sums (SL1L2), neighborhood contingency table counts (NBRCTC), neighborhood contingency table statistics (NBRCTS), neighborhood continuous statistics (NBRCNT), and gradient statistics (GRAD).

admin Mon, 06/24/2019 - 16:17

Run


Next, run Grid-Stat on the command line using the following command:

grid_stat \
${METPLUS_TUTORIAL_DIR}/output/met_output/pcp_combine/sample_fcst_12L_2005080712V_12A.nc \
${METPLUS_TUTORIAL_DIR}/output/met_output/pcp_combine/sample_obs_12L_2005080712V_12A.nc \
${METPLUS_TUTORIAL_DIR}/output/met_output/grid_stat/GridStatConfig_tutorial \
-outdir ${METPLUS_TUTORIAL_DIR}/output/met_output/grid_stat \
-v 2

Grid-Stat is now performing the verification tasks we requested in the configuration file. It should take a minute or two to run. The status messages written to the screen indicate progress.

In this example, Grid-Stat performs several verification tasks in evaluating the 12-hour accumulated precipitation field:

  • For continuous statistics and partial sums (CNT and SL1L2), 15 output lines each:
    (1 field * 15 masking regions)
  • For contingency table counts and statistics (CTC and CTS), 45 output lines each:
    (1 field * 3 raw thresholds * 15 masking regions)
  • For neighborhood methods (NBRCNT, NBRCTC, and NBRCTS), 90 output lines each:
    (1 field * 3 raw thresholds * 2 neighborhood sizes * 15 masking regions)
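
The counts in this list are simple products of the configuration choices made earlier. The short Python sketch below reproduces the arithmetic exactly as stated in the list; the variable names are illustrative and are not MET configuration entries.

```python
# Configuration choices from GridStatConfig_tutorial
n_fields = 1            # APCP_12
n_cat_thresh = 3        # >0.0, >=5.0, >=10.0
n_nbrhd_widths = 2      # widths 3 and 5
n_masks = 1 + 14        # CONUS mask plus the 14 NCEP polyline regions

cnt_lines = n_fields * n_masks
ctc_lines = n_fields * n_cat_thresh * n_masks
nbr_lines = n_fields * n_cat_thresh * n_nbrhd_widths * n_masks

print(cnt_lines, ctc_lines, nbr_lines)  # 15 45 90
```

Changing any one factor, such as adding a threshold or a masking region, scales the number of output lines accordingly.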

To greatly reduce the runtime of Grid-Stat, you can disable the computation of bootstrap confidence intervals in the configuration file. Edit the GridStatConfig_tutorial file as follows:

vi GridStatConfig_tutorial
  • In the boot dictionary, set:
    n_rep = 0;

    To disable the computation of bootstrap confidence intervals.

Now, try rerunning the Grid-Stat command listed above and notice how much faster it runs. While bootstrap confidence intervals are nice to have, they take a long time to compute, especially for gridded data.
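
To see why bootstrapping is expensive, consider this minimal Python sketch of a percentile bootstrap for the mean error. It is an illustration under stated assumptions, not MET's code: the function name, the made-up error values, and the fixed seed are all hypothetical, and MET's own interval method and random number generator are set in the boot dictionary. The key point is that every replicate resamples all matched pairs, so the work grows with n_rep times the number of grid points.

```python
import random

def bootstrap_ci(errors, n_rep=500, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the mean of errors.

    Returns None when n_rep is 0, mirroring a configuration with bootstrapping disabled.
    """
    if n_rep == 0:
        return None
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(errors)
    means = []
    for _ in range(n_rep):
        # Resample n values with replacement and recompute the statistic.
        sample = [errors[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_rep)]
    upper = means[min(n_rep - 1, int((1 - alpha / 2) * n_rep))]
    return lower, upper

# Made-up forecast-minus-observation errors at five matched grid points.
errors = [0.0, 2.0, 1.0, -5.0, 0.0]
ci = bootstrap_ci(errors, n_rep=500)
print(ci)
print(bootstrap_ci(errors, n_rep=0))  # None
```

With five points the loop is trivial; with a full grid, many masking regions, and several thresholds, the same loop is repeated for each statistic, which explains the longer runtime.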

admin Mon, 06/24/2019 - 16:17

Output


The output of Grid-Stat is one or more ASCII files containing statistics summarizing the verification performed and a NetCDF file containing difference fields. In this example, the output is written to the current directory, as we requested on the command line. It should now contain 10 Grid-Stat output files beginning with the grid_stat_ prefix, one each for the CTC, CTS, CNT, SL1L2, GRAD, NBRCTC, NBRCTS, and NBRCNT ASCII files, a STAT file, and a NetCDF matched pairs file.

The format of the CTC, CTS, CNT, and SL1L2 ASCII files will be covered for the Point-Stat tool. The neighborhood method and gradient output are unique to the Grid-Stat tool.

  • Rather than comparing forecast/observation values at individual grid points, the neighborhood method compares areas of forecast values to areas of observation values. At each grid box, a fractional coverage value is computed for each field as the fraction of grid points within the neighborhood (centered on the current grid point) that exceed the specified raw threshold value. The forecast/observation fractional coverage values are then compared rather than the raw values themselves.
  • Gradient statistics are computed on the forecast and observation gradients in the X and Y directions.
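
The fractional coverage calculation used by the neighborhood method can be sketched in a few lines of Python. This is a simplified illustration, not MET's implementation: it assumes a square neighborhood of odd width, treats the event as "value >= threshold", and truncates neighborhoods at the grid edge, computing the fraction over the points that remain. MET's handling of edges and missing data may differ. The function name and the grid values are made up.

```python
def fractional_coverage(field, width, thresh):
    """Fraction of points in each width x width neighborhood with value >= thresh."""
    half = width // 2
    ny, nx = len(field), len(field[0])
    cov = []
    for j in range(ny):
        row = []
        for i in range(nx):
            hits = 0
            total = 0
            # Neighborhood centered on (j, i), clipped to the grid bounds.
            for jj in range(max(0, j - half), min(ny, j + half + 1)):
                for ii in range(max(0, i - half), min(nx, i + half + 1)):
                    total += 1
                    if field[jj][ii] >= thresh:
                        hits += 1
            row.append(hits / total)
        cov.append(row)
    return cov

# Made-up 3x3 precipitation field.
field = [[0.0, 6.0, 12.0],
         [2.0, 5.0, 0.0],
         [0.0, 0.0, 7.0]]

cov = fractional_coverage(field, width=3, thresh=5.0)
print(cov[1][1])         # center point: 4 of 9 neighbors meet the threshold
print(cov[0][0])         # corner point: 2 of 4 neighbors -> 0.5
print(cov[0][0] >= 0.5)  # True: the corner counts as an event for a >=0.5 coverage threshold
```

Applying the same function to the forecast and observation fields, then thresholding both coverage fields, yields the yes/no events that feed the neighborhood contingency tables.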

Since the lines of data in these ASCII files are so long, we strongly recommend configuring your text editor to NOT use dynamic word wrapping. The files will be much easier to read that way.

Execute the following command to view the NetCDF output of Grid-Stat:

ncview grid_stat_120000L_20050807_120000V_pairs.nc &

Click through the 2d vars variable names in the ncview window to see plots of the forecast, observation, and difference fields for each masking region. If you see a warning message about the min/max values being zero, just click OK.

Now dump the NetCDF header:

ncdump -h grid_stat_120000L_20050807_120000V_pairs.nc

View the NetCDF header to see how the variable names are defined.

Notice how many variables there are, with separate output for each of the masking regions defined. Try editing the config file again by setting apply_mask = FALSE; and gradient = TRUE; in the nc_pairs_flag dictionary. Re-run Grid-Stat and inspect the output NetCDF file. What effect did these changes have?

admin Mon, 06/24/2019 - 16:18

METplus Motivation


We have now successfully run the PCP-Combine and Grid-Stat tools to verify 12-hourly accumulated precipitation for a single output time. We did the following steps:

  • Identified our forecast and observation datasets.
  • Constructed PCP-Combine commands to put them into a common accumulation interval.
  • Configured and ran Grid-Stat to compute our desired verification statistics.

Now that we've defined the logic for a single run, the next step would be writing a script to automate these steps for many model initializations and forecast lead times. Rather than every MET user rewriting the same type of scripts, use METplus to automate these steps in a use case!

admin Mon, 06/24/2019 - 16:19