Wavelet-Stat

Wavelet-Stat cindyhg Thu, 04/25/2019 - 10:28

Wavelet-Stat Functionality

The Wavelet-Stat tool provides a wavelet-based intensity-scale decomposition for comparing gridded forecasts to gridded observations. Wavelet-Stat may be used in a generalized way to compare any two fields but has been most commonly applied to precipitation. The steps performed in Wavelet-Stat consist of:

  • Preprocess the input forecast and observation fields to select one or more tiles of dimension 2^n by 2^n. The wavelet decomposition may only be performed on such fields, known as dyadic.
  • Threshold the forecast and observation fields to create a 0/1 binary field.
  • Use a wavelet decomposition approach to decompose the thresholded forecast and observation tiles into separate scales.
  • Compare the forecast and observation tiles at each scale and compute statistics, such as the mean-squared error and intensity skill score.
  • If multiple tiles were used, aggregate the results across all of the tiles and write out the aggregated statistics as well as the statistics for each tile.
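The steps above can be sketched in a few lines of Python. This is a minimal illustration and not MET's implementation: it assumes a single square dyadic tile, it uses block averaging (the smoothing that a 2-D Haar decomposition performs) instead of the GNU Scientific Library, and all function names and field values are hypothetical.

```python
def threshold(field, thresh):
    """Convert a field to 0/1: 1 where the value exceeds thresh."""
    return [[1 if value > thresh else 0 for value in row] for row in field]


def block_mean(field, size):
    """Replace every size-by-size block with its mean value."""
    n = len(field)
    out = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, size):
        for j0 in range(0, n, size):
            cells = [(i, j) for i in range(i0, i0 + size)
                     for j in range(j0, j0 + size)]
            mean = sum(field[i][j] for i, j in cells) / len(cells)
            for i, j in cells:
                out[i][j] = mean
    return out


def mean_square(field):
    """Mean of the squared values over the whole tile."""
    n = len(field)
    return sum(v * v for row in field for v in row) / (n * n)


def mse_by_scale(fcst, obs, thresh):
    """Return (total MSE, list of per-scale MSE) for one dyadic tile."""
    f_bin = threshold(fcst, thresh)
    o_bin = threshold(obs, thresh)
    n = len(f_bin)
    diff = [[float(f_bin[i][j] - o_bin[i][j]) for j in range(n)]
            for i in range(n)]

    per_scale = []
    previous = diff  # block size 1: the binary difference field itself
    size = 2
    while size <= n:
        smooth = block_mean(diff, size)
        # Detail at this scale: what the coarser averaging removed.
        detail = [[previous[i][j] - smooth[i][j] for j in range(n)]
                  for i in range(n)]
        per_scale.append(mean_square(detail))
        previous = smooth
        size *= 2
    per_scale.append(mean_square(previous))  # largest scale: the tile mean
    return mean_square(diff), per_scale


# Hypothetical 4 x 4 tile (2^2 by 2^2); values invented for illustration.
fcst = [[0.0, 0.0, 1.2, 0.0],
        [0.0, 0.0, 0.4, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [2.0, 0.0, 0.0, 0.0]]
obs = [[0.0, 0.0, 0.0, 0.0],
       [0.0, 0.3, 0.1, 0.0],
       [0.0, 0.0, 0.0, 0.0],
       [1.0, 0.0, 0.0, 0.0]]

total, per_scale = mse_by_scale(fcst, obs, 0.0)
print("total MSE:", total)
print("MSE by scale:", per_scale)
```

Because the scale components are orthogonal, the per-scale values sum to the total, which is what makes a scale-by-scale comparison meaningful.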

Wavelet-Stat may be configured to use any of the wavelet decompositions supported by the GNU Scientific Library. Here we'll use the Haar wavelet, which is employed in the Intensity-Scale method by Casati et al. See the MET Users Guide for a more thorough description of how to configure the Wavelet-Stat tool.

Wavelet-Stat Usage

View the usage statement for Wavelet-Stat by simply typing the following:

wavelet_stat

At a minimum, the input gridded fcst_file, the input gridded obs_file, and the configuration config_file must be passed in on the command line.

Just as with Grid-Stat, the forecast and observation fields must be interpolated to a common grid before Wavelet-Stat can compute statistics. As of version 5.1, the MET tools are able to regrid data on the fly using the regrid section in the configuration files. Alternatively, users may choose to regrid their entire GRIB1 or GRIB2 files using the copygb and/or wgrib2 utilities.

Configure

The behavior of Wavelet-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Wavelet-Stat configuration file may be found in the $MET_BASE/config/WaveletStatConfig_default file. The configuration used by the test script may be found in the met-8.0/scripts/config/WaveletStatConfig_APCP_12 file. Prior to modifying the configuration file, users are advised to make a copy of the default:

cp $MET_BASE/config/WaveletStatConfig_default $MET_TUTORIAL_DATA/config/WaveletStatConfig_tutorial

Open up the $MET_TUTORIAL_DATA/config/WaveletStatConfig_tutorial file for editing with the text editor of your choice.

The configuration items for Wavelet-Stat are used to specify how the intensity-scale verification approach is to be performed. In previous versions of MET, Wavelet-Stat was restricted to comparing variables of the same type. In METv8.0, this has been generalized for comparing two different types of variables, if desired. The configurable items include specifications for the following:

  • The verification domain.
  • The fields and vertical level or accumulation interval to be compared.
  • An option to mask out a portion of the raw fields.
  • How one or more tiles of size 2^n by 2^n are extracted from the domain.
  • Which wavelet family and type are used.
  • Various plotting options.

While the Wavelet-Stat configuration file contains many options, beginning users will typically only need to modify a few of them. You may find a complete description of the configurable items in the MET Users Guide or in the comments of the configuration file itself. Please take some time to review them.

For this tutorial, we'll configure Wavelet-Stat to verify the same 12-hour accumulated precipitation output of PCP-Combine that we used for Grid-Stat. Edit the $MET_TUTORIAL_DATA/config/WaveletStatConfig_tutorial file as follows:

In the fcst dictionary, set

  field = [
     {
       name       = "APCP_12";
       level      = "(*,*)";
       cat_thresh = [ >0 ];
     }
   ];

This verifies the NetCDF variable of that name and thresholds any non-zero precipitation.

  • Set grid_decomp_flag = TILE;
    This uses the tile we'll manually define below.

Set the tile dictionary to

tile = {
   width = 64;
   location = [
      {
         x_ll = 80;
         y_ll = 25;
      }
   ];
};

This defines a single tile with a lower-left corner of (80, 25) in the grid and a dyadic width of 64 (2^6) grid boxes.
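Before running the tool, it can help to confirm that a manually defined tile is dyadic and lies inside the domain. The sketch below is only an illustration: the helper names are hypothetical, zero-based grid indices are assumed, and the 185 by 129 domain size is an assumed example rather than a value taken from the tutorial data.

```python
def is_dyadic(width):
    """True when width is a positive power of two (2^n)."""
    return width > 0 and width & (width - 1) == 0


def tile_extent(x_ll, y_ll, width):
    """Inclusive (x, y) index ranges covered by a square tile."""
    return (x_ll, x_ll + width - 1), (y_ll, y_ll + width - 1)


def tile_fits(x_ll, y_ll, width, nx, ny):
    """True when the tile lies entirely inside an nx by ny domain."""
    return (x_ll >= 0 and y_ll >= 0
            and x_ll + width <= nx and y_ll + width <= ny)


# Tile settings from the configuration above.
x_ll, y_ll, width = 80, 25, 64

# Assumed domain size, for illustration only.
nx, ny = 185, 129

print("dyadic:", is_dyadic(width))
print("extent:", tile_extent(x_ll, y_ll, width))
print("fits in domain:", tile_fits(x_ll, y_ll, width, nx, ny))
```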

In the output_flag dictionary, set

  output_flag = {
      isc  = BOTH;
   }

This writes the ISC line type to both ASCII statistics files. Together with the NetCDF scale decomposition file and the PostScript summary plot, the run produces four output files.

Save and close this file.

Run

Next, run Wavelet-Stat on the command line using the following command:

wavelet_stat \
$MET_TUTORIAL_DATA/output/pcp_combine/sample_fcst_24L_2005080800V_12A.nc \
$MET_TUTORIAL_DATA/output/pcp_combine/sample_obs_2005080800V_12A.nc \
$MET_TUTORIAL_DATA/config/WaveletStatConfig_tutorial \
-outdir $MET_TUTORIAL_DATA/output/wavelet_stat \
-v 2

Wavelet-Stat is now performing the verification task we requested in the configuration file. It should take only several seconds to run.

When Wavelet-Stat is finished, it will have created four files: two ASCII statistics files, a NetCDF scale decomposition file, and a PostScript summary plot. Open up the PostScript summary plot using the PostScript viewer of your choice, such as gv or Ghostview:

gv $MET_TUTORIAL_DATA/output/wavelet_stat/wavelet_stat_240000L_20050808_000000V.ps

This PostScript summary plot contains five pages. The first page summarizes the definition of the tile(s) in the domain. The remaining pages show the difference field (f-o) for each decomposed scale and the statistics for each scale.

Now, let's modify the configuration file and rerun this case. Again, open up the $MET_TUTORIAL_DATA/config/WaveletStatConfig_tutorial file and edit it as follows:

  • Set grid_decomp_flag = AUTO;
    This lets the Wavelet-Stat tool automatically define the largest 2^n by 2^n tile that fits in the center of the domain.
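The idea behind the AUTO setting can be sketched as follows. This is an assumption-laden illustration, not MET's code: the helper names are hypothetical, the centering rule (integer division of the leftover grid points) is a guess at a reasonable convention, and the 185 by 129 domain is an assumed example.

```python
def largest_dyadic_width(nx, ny):
    """Largest power of two that fits in the smaller domain dimension."""
    smallest = min(nx, ny)
    if smallest < 1:
        raise ValueError("domain dimensions must be positive")
    return 1 << (smallest.bit_length() - 1)


def centered_tile(nx, ny):
    """Return (x_ll, y_ll, width) for a dyadic tile centered in the domain."""
    width = largest_dyadic_width(nx, ny)
    return (nx - width) // 2, (ny - width) // 2, width


def dyadic_exponent(width):
    """The n in width = 2^n."""
    return width.bit_length() - 1


# Assumed domain size, for illustration only.
nx, ny = 185, 129
x_ll, y_ll, width = centered_tile(nx, ny)
print("tile:", x_ll, y_ll, width, "= 2^%d" % dyadic_exponent(width))
```

Each doubling of the tile width raises the exponent n by one, which is why a larger tile yields one more scale in the decomposition.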

Now, rerun the Wavelet-Stat command listed above, and when it is finished, reload the PostScript plot. On the first page of the PostScript plot, note the following:

  • A tile of dimension 128 by 128 was chosen in the center of the domain. Since the dimension increased from 64 (= 2^6) to 128 (= 2^7) the number of scales has increased by 1.
  • The tile chosen includes a large amount of missing data in the observation field. You should try to avoid including missing data when running the Wavelet-Stat tool, as it will cause misleading results. Missing data is replaced with a value of 0 for precipitation fields or the mean value of the valid data for non-precipitation fields.
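The missing-data replacement rule described above can be illustrated with a short sketch. The function name is hypothetical, the missing-value flag of -9999 is an assumption, and the fallback of 0 when a non-precipitation field has no valid data at all is a choice made only for this example.

```python
MISSING = -9999.0  # assumed missing-data flag


def fill_missing(values, is_precip, missing=MISSING):
    """Replace missing values: 0 for precipitation, else the valid mean."""
    valid = [v for v in values if v != missing]
    if is_precip or not valid:
        fill = 0.0  # the "not valid" fallback is an assumption of this sketch
    else:
        fill = sum(valid) / len(valid)
    return [fill if v == missing else v for v in values]


# Invented values, for illustration only.
row = [1.0, MISSING, 3.0]
print(fill_missing(row, is_precip=True))
print(fill_missing(row, is_precip=False))
```

Either way the filled points are treated as real data by the decomposition, which is why tiles with large missing areas give misleading scores.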

Close that PostScript file.

Output

As mentioned on the previous page, the output of Wavelet-Stat typically consists of four files: two ASCII statistics files, one NetCDF scale decomposition file, and one PostScript summary plot. The output of these files may be disabled in the configuration file or using the appropriate command line argument. In this example, the output is written to the $MET_TUTORIAL_DATA/output/wavelet_stat directory as we requested on the command line.

The Wavelet-Stat output file naming convention is similar to that of the Point-Stat and Grid-Stat tools. The four Wavelet-Stat output files are described briefly below:

  • The PostScript file was described on the previous page.
  • The NetCDF scale decomposition file contains the raw, thresholded, and decomposed fields for each variable, tile, and threshold used. Note that while the PostScript plot only shows the difference (f-o) fields, the NetCDF file contains the actual forecast, observation, and difference fields decomposed for each scale.
  • The ASCII ISC file contains the ISC line type with a header row for the column names.
  • The ASCII STAT file contains only the ISC line type. Currently, the Wavelet-Stat tool creates only one output line type, so the ISC file and the STAT file are almost identical. In future versions of MET, the Wavelet-Stat tool may be enhanced to produce additional line types.
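A quick way to confirm that a run produced all four files is to build the expected names from the common prefix and check which ones exist. The sketch below uses hypothetical helper names. The _isc.txt, .nc, and .ps suffixes match the file names used in this tutorial, while the .stat suffix for the STAT file is an assumption.

```python
import os

# Suffixes of the four output files; ".stat" is assumed.
SUFFIXES = [".stat", "_isc.txt", ".nc", ".ps"]


def expected_outputs(outdir, prefix):
    """Full paths of the four files a run is expected to write."""
    return [os.path.join(outdir, prefix + suffix) for suffix in SUFFIXES]


def missing_outputs(outdir, prefix):
    """Expected output files that are not present on disk."""
    return [path for path in expected_outputs(outdir, prefix)
            if not os.path.isfile(path)]


outdir = os.path.join(os.environ.get("MET_TUTORIAL_DATA", "."),
                      "output", "wavelet_stat")
for path in missing_outputs(outdir, "wavelet_stat_240000L_20050808_000000V"):
    print("missing:", path)
```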

Open up the $MET_TUTORIAL_DATA/output/wavelet_stat/wavelet_stat_240000L_20050808_000000V_isc.txt ISC file using the text editor of your choice, and note the following:

  • The header columns are identical to the other ASCII output files from Point-Stat and Grid-Stat.
  • The LINE_TYPE column is set to ISC, indicating that the columns to follow contain information about the intensity-scale method.
  • This file contains eight rows of data. The ISCALE column indicates the scale for that row. The row with ISCALE equal to 0 contains scores for the thresholded binary fields. The rows with ISCALE greater than 0 contain scores for the thresholded binary fields decomposed into separate scales.
  • Looking carefully, you'll see that the columns for MSE, FENERGY2, and OENERGY2 are additive across the scales. The sum of these values in the lines where ISCALE is greater than 0 equals the values in the line where ISCALE equals 0.
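The additive property noted above can be checked with a small script instead of by eye. This sketch makes several assumptions: the helper names are hypothetical, the ISC file is treated as whitespace-delimited with one header row and no spaces inside any field, and the file is assumed to hold a single tile and threshold (one row with ISCALE equal to 0).

```python
def read_isc_rows(lines):
    """Parse header plus data lines into a list of column-name dicts."""
    header = lines[0].split()
    return [dict(zip(header, line.split()))
            for line in lines[1:] if line.strip()]


def additivity_gaps(rows, columns=("MSE", "FENERGY2", "OENERGY2")):
    """For each column: (sum over ISCALE > 0) minus (value at ISCALE = 0)."""
    gaps = {}
    for column in columns:
        binary = sum(float(r[column]) for r in rows if int(r["ISCALE"]) == 0)
        scales = sum(float(r[column]) for r in rows if int(r["ISCALE"]) > 0)
        gaps[column] = scales - binary
    return gaps


# Usage against the tutorial output (uncomment once the file exists):
# import os
# path = os.path.join(os.environ["MET_TUTORIAL_DATA"], "output", "wavelet_stat",
#                     "wavelet_stat_240000L_20050808_000000V_isc.txt")
# with open(path) as handle:
#     print(additivity_gaps(read_isc_rows(handle.readlines())))
```

Gaps close to zero, allowing for the rounding of the printed values, confirm that the columns are additive.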

Close this file and use the ncdump and ncview utilities (if available on your machine) to view the NetCDF output of Wavelet-Stat:

ncdump -h $MET_TUTORIAL_DATA/output/wavelet_stat/wavelet_stat_240000L_20050808_000000V.nc
ncview $MET_TUTORIAL_DATA/output/wavelet_stat/wavelet_stat_240000L_20050808_000000V.nc

When clicking through and displaying each variable, note that some have a dimension for scale. Click through the different scales to see the decompositions of those fields.