MODE

MODE griggs Wed, 04/24/2019 - 15:29

MODE Functionality

MODE, the Method for Object-Based Diagnostic Evaluation, provides an object-based verification for comparing gridded forecasts to gridded observations. MODE may be used in a generalized way to compare any two fields containing data from which objects may be well defined. It has most commonly been applied to precipitation fields and radar reflectivity. The steps performed in MODE consist of:

  • Define objects in the forecast and observation fields based on user-defined parameters.
  • Compute attributes for each of those objects, such as area, centroid, axis angle, and intensity.
  • For each forecast/observation object pair, compute differences between their attributes, such as area ratio, centroid distance, angle difference, and intensity ratio.
  • Use fuzzy logic to compute a total interest value for each forecast/observation object pair based on user-defined weights.
  • Based on the computed interest values, match objects across fields and merge objects within the same field.
  • Write output statistics summarizing the characteristics of the single objects, the pairs of objects, and the matched/merged objects.
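The fuzzy logic step can be pictured as a weighted average. The sketch below is a simplification and not MODE's actual implementation: it assumes each attribute comparison has already been mapped to an interest value between 0 and 1, it leaves out any confidence terms, and the weights and values are made up for illustration. The function name total_interest is hypothetical.

```shell
# total_interest WEIGHT:INTEREST ...
# Print the weighted mean of the interest values (simplified illustration).
total_interest() {
  printf '%s\n' "$@" | awk -F: '
    { num += $1 * $2; den += $1 }
    END { if (den > 0) printf "%.3f\n", num / den; else print "NA" }'
}

# Hypothetical pair: centroid distance (weight 2), area ratio and
# angle difference (weight 1 each).
total_interest 2:0.9 1:0.6 1:0.3
```

A pair would then be treated as a match when this value meets the total interest threshold set in the configuration file.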

MODE may be configured to use a few different sets of logic with which to perform matching and merging. In this tutorial, we'll use the simplest approach, but users are encouraged to read the MET Users Guide for a more thorough description of MODE's capabilities.

MODE Usage

View the usage statement for MODE by simply typing the following:

mode

At a minimum, the input gridded fcst_file, the input gridded obs_file, and the configuration config_file must be passed in on the command line.

As with the other MET statistics tools, all gridded forecast and observation data must be interpolated to a common grid prior to processing. This may be done using the automated regrid feature in the MODE configuration file or by running copygb and/or wgrib2 first.

Configure

Configure griggs Wed, 04/24/2019 - 15:31

The behavior of MODE is controlled by the contents of the configuration file passed to it on the command line. The default MODE configuration file may be found in the met-8.0/share/met/config/MODEConfig_default file. The configurations used by the test scripts may be found in the $MET_BASE/config/MODEConfig* files. Prior to modifying the configuration file, users are advised to make a copy of the default:

cp $MET_BASE/config/MODEConfig_default $MET_TUTORIAL_DATA/config/MODEConfig_tutorial

The configuration items for MODE are used to specify how the object-based verification approach is to be performed. Whereas in Point-Stat and Grid-Stat you may only compare the same type of forecast and observation fields, in MODE you may compare any two fields. When necessary, the items in the configuration file are specified separately for the forecast and observation fields. In most cases though, users will be comparing the same forecast and observation fields. The configurable items include specifications for the following:

  • The verification domain.
  • The forecast and observation fields and vertical levels or accumulation intervals to be compared.
  • Options to censor a portion of or threshold the raw fields.
  • The forecast and observation object definition parameters.
  • Options to filter out objects that don't meet a size or intensity criterion.
  • Flags to control the logic for matching/merging.
  • Weights to be applied for the fuzzy engine matching/merging algorithm.
  • Interest functions to be used for the fuzzy engine matching/merging algorithm.
  • Total interest threshold for matching/merging.
  • Various plotting options.

While the MODE configuration file contains many options, beginning users will typically only need to modify a few of them. You may find a complete description of the configurable items in the MET Users Guide or in the $MET_BASE/config/README file. Please take some time to review them.

For this tutorial, we'll configure MODE to verify the same 12-hour accumulated precipitation output of PCP-Combine that we used for Grid-Stat. Whereas Grid-Stat and Point-Stat may be used to compare multiple fields in one run, MODE compares a single forecast field to a single observation field.

Open up the $MET_TUTORIAL_DATA/config/MODEConfig_tutorial file for editing with the text editor of your choice and edit it as follows:

  • Set grid_res = 40;
    To set the nominal grid spacing to 40km for this grid. The grid_res parameter is used further down in the config file in defining interest functions.
  • In the fcst dictionary, set
       field = {
           name  = "APCP_12";
           level = "(*,*)";
         };

    To select the forecast field of 12-hour rainfall total accumulation from the input NetCDF file.

  • In the fcst dictionary, set conv_radius = 5;
    To specify a convolution smoothing radius of 5 grid units. This parameter may be set explicitly, as we're doing now, or relative to another value, like the grid_res parameter which was set above.
  • In the fcst dictionary, set conv_thresh = >=5.0;
    To threshold the convolved field and define objects.
  • In the fcst dictionary, set merge_flag = NONE;
    To disable the additional forecast and observation merging methods.
  • Set obs = fcst;
    To use the settings from the fcst dictionary above.
  • Set match_flag = MERGE_BOTH;
    To use the one-step matching/merging method.
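Before running MODE, it can help to confirm that the edits above actually landed in the file. The following is a minimal sketch, assuming each setting sits on its own line in the form "key = value;". The helper name check_setting is hypothetical, and the pattern is a loose regular expression match rather than a real parser of the MET configuration syntax.

```shell
# check_setting FILE KEY VALUE: succeed if FILE has a line "KEY = VALUE;"
check_setting() {
  grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*$3[[:space:]]*;" "$1"
}

config="${MET_TUTORIAL_DATA:-.}/config/MODEConfig_tutorial"
if [ -f "$config" ]; then
  # One "key value" pair per line, taken from the edits listed above.
  while read -r key value; do
    if check_setting "$config" "$key" "$value"; then
      echo "ok:      $key = $value;"
    else
      echo "missing: $key = $value;"
    fi
  done <<'EOF'
grid_res 40
conv_radius 5
conv_thresh >=5.0
merge_flag NONE
obs fcst
match_flag MERGE_BOTH
EOF
fi
```

The field entry spans several lines, so it is not covered by this check and is best verified by eye.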

Save and close this file.

Run

Run griggs Wed, 04/24/2019 - 15:32

Next, run MODE on the command line using the following command:

mode \
$MET_TUTORIAL_DATA/output/pcp_combine/sample_fcst_24L_2005080800V_12A.nc \
$MET_TUTORIAL_DATA/output/pcp_combine/sample_obs_2005080800V_12A.nc \
$MET_TUTORIAL_DATA/config/MODEConfig_tutorial \
-outdir $MET_TUTORIAL_DATA/output/mode \
-v 2

MODE is now performing the verification task we requested in the configuration file. It should take a minute or two to run. MODE's runtime is greatly influenced by the number of grid points in the domain, the convolution radius chosen, and the number of objects resolved. The more grid points in the domain, the larger the convolution radius, and the greater the number of objects, the more computations are required.

When MODE is finished, it will have created four files: two ASCII statistics files, a NetCDF object file, and a PostScript summary plot. Open up the PostScript summary plot using the PostScript viewer of your choice, such as gv or Ghostview:

gv $MET_TUTORIAL_DATA/output/mode/mode_240000L_20050808_000000V_120000A.ps

This PostScript summary plot contains 5 pages. The first page summarizes the application of MODE to this dataset. The second and third pages contain enlargements of the forecast and observation raw and object fields. The fourth page shows the forecast and observation object fields overlaid on top of each other. And the fifth page contains pair-wise differences for the matched clusters of objects. The PostScript summary plot will contain additional pages when additional merging methods are selected. Looking at the first page, note the following:

  • The valid data in the forecast field extends much further than in the observation field leading to objects in the forecast field with no match (royal blue = unmatched) in the observation field.
  • The forecast field contains 5 objects while the observation field contains 6.
  • Two pairs of objects (colored red and green) are matched across these fields. Forecast object 4 matches observed object 5 (red). Forecast object 3 matches observed object 2 (green).

Now, let's modify the configuration file and rerun this case. Again, open up the $MET_TUTORIAL_DATA/config/MODEConfig_tutorial file and edit it as follows:

  • In the fcst dictionary, set conv_radius = 2;
    To apply less smoothing before defining objects.
  • Set mask_missing_flag = BOTH;
    To mask out the bad data in both fields with each other.

Now, rerun the MODE command listed above, and when it's finished, reload the PostScript plot. Reducing the convolution radius (amount of smoothing) while keeping the convolution threshold fixed should result in a greater number of smaller objects. On the first page of the PostScript plot, note the following:

  • The valid data in the raw forecast and observation fields now match up nicely.
  • The forecast field contains six objects and the observation field contains 17. These are called simple objects.
  • Five sets of objects are now matched across the fields. They are colored red, green, magenta, blue, and light blue.
  • Objects that are colored the same color within the same field are called merged. Objects that have the same color across fields are called matched.
  • Each set of colored objects is referred to as a cluster object. A cluster object consists of one or more simple objects. For example, in the observation field, simple object numbers 11 and 13 (both colored green) are merged together and are members of the same cluster object. They match forecast object number 4 which is its own cluster object.

After completing the next page on MODE Output, users are welcome to return to this page, play around with settings in the configuration file, and rerun this case several times. Listed below are some configuration parameters you may want to try modifying:

  • total_interest_thresh
  • fcst_conv_radius and obs_conv_radius
  • fcst_conv_thresh and obs_conv_thresh
  • fcst_area_thresh and obs_area_thresh
  • fcst_inten_thresh and obs_inten_thresh
  • fcst_merge_thresh and obs_merge_thresh with fcst_merge_flag and obs_merge_flag both set to 1

Output

Output griggs Wed, 04/24/2019 - 15:35

As mentioned on the previous page, the output of MODE typically consists of four files: two ASCII statistics files, one NetCDF object file, and one PostScript summary plot. The output of any of these files may be disabled using the appropriate configuration file entry. In this example, the output is written to the $MET_TUTORIAL_DATA/output/mode directory as we requested on the command line.

The MODE output file naming convention is designed to contain the lead times, valid times, and accumulation times. If you rerun MODE on the same fields but with a slightly different configuration, the new output will overwrite the old output, unless you redirect it to a different directory using the -outdir command line argument or specify an output_prefix in the configuration file. The four MODE output files are described briefly below:

  • The PostScript file ends in .ps and was described on the previous page.
  • The NetCDF object file ends in _obj.nc and contains the raw and cluster object indices and boundary polylines for the simple objects.
  • The ASCII contingency table statistics file ends in _cts.txt.
  • The ASCII object statistics file ends in _obj.txt and contains all of the object and object comparison data.
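One way to keep the output of different configurations from colliding is to give each configuration file its own output directory. The sketch below assumes a directory layout of our own choosing; the function names outdir_for and run_variant and the file name MODEConfig_radius2 are hypothetical. The mode arguments are the same ones used earlier in this tutorial.

```shell
# outdir_for CONFIG: print an output directory named after the config file
outdir_for() {
  printf '%s/output/mode_%s\n' "${MET_TUTORIAL_DATA:-.}" "$(basename "$1")"
}

# run_variant CONFIG: run the tutorial case with CONFIG into its own directory
run_variant() {
  dir=$(outdir_for "$1")
  mkdir -p "$dir"
  mode \
    "$MET_TUTORIAL_DATA/output/pcp_combine/sample_fcst_24L_2005080800V_12A.nc" \
    "$MET_TUTORIAL_DATA/output/pcp_combine/sample_obs_2005080800V_12A.nc" \
    "$1" \
    -outdir "$dir" \
    -v 2
}

# Show where the output of a hypothetical variant would be written.
outdir_for "$MET_TUTORIAL_DATA/config/MODEConfig_radius2"
```

Calling run_variant with a copy of the tutorial configuration then leaves the earlier output in $MET_TUTORIAL_DATA/output/mode untouched.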

Since we've already seen the PostScript summary plot, we'll skip that one here. Use the ncview utility (if available on your machine) to view the NetCDF object output of MODE:

ncview $MET_TUTORIAL_DATA/output/mode/mode_240000L_20050808_000000V_120000A_obj.nc &

Click through the variable names in the ncview window to see plots of the four object fields in the file. The fcst_obj_id and obs_obj_id contain the indices for the forecast and observation objects defined by MODE. The fcst_clus_id and obs_clus_id contain indices for the matched cluster objects. Now dump the header:

ncdump -h $MET_TUTORIAL_DATA/output/mode/mode_240000L_20050808_000000V_120000A_obj.nc

View the NetCDF header to see how the file is structured.

The object colors plotted by ncview will generally not correspond to those in MODE's PostScript output.

Next, open up the $MET_TUTORIAL_DATA/output/mode/mode_240000L_20050808_000000V_120000A_cts.txt contingency table statistics ASCII file using the text editor of your choice. This file is similar to the CTS output of Grid-Stat but much less complete. It contains three lines, a header row followed by contingency table statistics computed two ways:

  • The first row contains RAW in the FIELD column. The scores listed in this row are computed from the RAW forecast and observation fields. The raw fields are thresholded using the fcst_conv_thresh and obs_conv_thresh values specified to create 0/1 mask fields. Those mask fields are compared point by point to compute a contingency table. The scores listed in this row are derived from that contingency table.
  • The second row contains OBJECT in the FIELD column. The scores listed in this row are computed from the forecast and observation OBJECT fields. In MODE, after objects have been defined, the field may be thought of as a 0/1 mask field, 1 at grid points contained inside an object and 0 everywhere else. The object mask fields are compared in this way point by point, a contingency table is computed, and the corresponding statistics are listed in this row.
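The point-by-point comparison of two 0/1 mask fields can be sketched as follows. This is only an illustration of how a 2x2 contingency table is counted, not MODE's code: the masks are written as made-up strings of 0s and 1s with one character per grid point, and the function name contingency is hypothetical.

```shell
# contingency FCST_MASK OBS_MASK
# Print "hits false_alarms misses correct_negatives" for two equal-length
# strings of 0s and 1s (one character per grid point).
contingency() {
  awk -v f="$1" -v o="$2" 'BEGIN {
    n = length(f)
    for (i = 1; i <= n; i++) {
      a = substr(f, i, 1); b = substr(o, i, 1)
      if (a == "1" && b == "1") hits++          # event forecast and observed
      else if (a == "1" && b == "0") fa++       # forecast but not observed
      else if (a == "0" && b == "1") miss++     # observed but not forecast
      else cn++                                 # neither
    }
    printf "%d %d %d %d\n", hits, fa, miss, cn
  }'
}

contingency 110100 100110
```

Statistics in the contingency table family are all derived from these four counts.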

This file is not meant to replicate or replace the functionality of the Grid-Stat tool which includes many more features and options. It is simply meant to provide a convenient way of seeing how the output of MODE compares to the traditional contingency table statistics that are often computed.

Close this file, and open up the $MET_TUTORIAL_DATA/output/mode/mode_240000L_20050808_000000V_120000A_obj.txt object statistics ASCII file using the text editor of your choice. This file contains all of the object statistics in which most users will be interested. It contains four different line types which may be distinguished by the contents of the OBJECT_ID column:

  • The rows containing F### and O### in that column give information about the simple forecast and observation objects, respectively. ### refers to the simple object number (e.g. "F001" for the first simple forecast object or "O010" for the tenth simple observation object).
  • The rows containing F###_O### in that column give information about pairs of simple objects (e.g. "F001_O010" compares the first forecast object to the tenth observation object).
  • The rows containing CF### and CO### in that column give information about the cluster forecast and observation objects, respectively. ### refers to the cluster object number.
  • The rows containing CF###_CO### in that column give information about pairs of cluster objects.

In the ASCII MODE statistics file, the value of 000 for ### in the OBJECT_ID column indicates that that object was not matched.
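When scripting against this file, the four line types can be told apart from the OBJECT_ID string alone. The sketch below assumes exactly three digits wherever ### appears, as in the examples above; the function name classify_object_id and the labels it prints are hypothetical.

```shell
# classify_object_id ID: print the line type implied by an OBJECT_ID value
classify_object_id() {
  case "$1" in
    CF[0-9][0-9][0-9]_CO[0-9][0-9][0-9]) echo cluster_pair ;;
    F[0-9][0-9][0-9]_O[0-9][0-9][0-9])   echo simple_pair ;;
    CF[0-9][0-9][0-9])                   echo cluster_fcst ;;
    CO[0-9][0-9][0-9])                   echo cluster_obs ;;
    F[0-9][0-9][0-9])                    echo simple_fcst ;;
    O[0-9][0-9][0-9])                    echo simple_obs ;;
    *)                                   echo unknown ;;
  esac
}

for id in F001 O010 F001_O010 CF002 CF002_CO002; do
  echo "$id $(classify_object_id "$id")"
done
```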

Each line in this file contains the same number of columns. However, only certain columns are applicable to certain line types. For example, the CENTROID_X and CENTROID_Y columns contain valid data for simple object lines, but not for pairs of simple object lines. The opposite is true for the CENTROID_DIST column which gives the distance between the centroids of two objects. Columns which are not applicable to a given line type are filled with a value of NA.

Quilt Option

The MODE tool can process multiple convolution radii and thresholds in a single run. In earlier versions of MODE, each configuration had to be run separately. Open up the $MET_TUTORIAL_DATA/config/MODEConfig_tutorial file for editing again and make the following changes:

  • Set the quilt option to TRUE to run all 9 possible combinations of the radii and thresholds listed below. If FALSE, only 3 configurations would be run:
quilt = TRUE;
  • In the fcst dictionary, set:
     
conv_radius = [ 2, 4, 6 ];
conv_thresh = [ >=4.0, >=5.0, >=6.0 ];

Save and close this file and rerun the previous MODE command.

Notice that the output files are now appended with R#_T# where # indicates which radius and threshold were applied. Inspect the PostScript output and notice that as radius increases the objects get smoother, and as the thresholds increase, the objects get smaller.
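The difference between the two quilt settings can be enumerated with a short loop. This is a sketch under two assumptions that are not stated above: that the # in R#_T# is a 1-based position in each list, and that with quilt set to FALSE the radii and thresholds are paired by position. The function name list_runs is hypothetical.

```shell
# list_runs QUILT "RADII" "THRESHOLDS"
# Print one line per MODE configuration implied by the two lists.
list_runs() {
  quilt=$1
  r=0
  for radius in $2; do
    r=$((r + 1))
    t=0
    for thresh in $3; do
      t=$((t + 1))
      # TRUE: every radius with every threshold; otherwise matching positions only.
      if [ "$quilt" = TRUE ] || [ "$r" -eq "$t" ]; then
        echo "R${r}_T${t} conv_radius=${radius} conv_thresh=${thresh}"
      fi
    done
  done
}

list_runs TRUE "2 4 6" ">=4.0 >=5.0 >=6.0"
```

With TRUE the loop prints nine lines; with FALSE it prints three.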

Please refer to the MET Users Guide for a more thorough description of the MODE output. At this point, feel free to return to the previous page and play around with the MODE configuration settings.