MET Tool: Gen-Ens-Prod

IMPORTANT NOTE: If you are returning to the tutorial, you must source the tutorial setup script before running the following instructions. If you are unsure if you have done this step, please navigate to the Verify Environment is Set Correctly page.

Gen-Ens-Prod Tool: General

Gen-Ens-Prod Functionality

The Gen-Ens-Prod tool may be used to generate simple ensemble products from the provided ensemble forecast members. If climatological mean and standard deviation data are provided, they can be used to set thresholds for the ensemble product generation at each grid point. This tool does not generate statistical output from the ensemble members, nor does it compare them against observations. If verification is the desired outcome, Gen-Ens-Prod output can be passed to additional MET tools for further verification steps.

Gen-Ens-Prod Usage

View the usage statement for Gen-Ens-Prod by simply typing the following:

gen_ens_prod
Usage: gen_ens_prod
  -ens file_1 ... file_n | ens_file_list   Input gridded ensemble files to be used, or ASCII list of ensemble member file names.
  -out file                                netCDF output file.
  -config file                             GenEnsProd file containing the desired configuration settings.
  [-ctrl file]                             Contains the ensemble control data (optional).
  [-log file]                              Outputs log messages to the specified file (optional).
  [-v level]                               Level of logging (optional).

Gen-Ens-Prod has three required arguments. You can specify the list of ensemble files to be used either by entering the file name for each ensemble member file (-ens ens_file_1 ... ens_file_n) or as an ASCII file containing the names of the ensemble files to be used (ens_file_list). Choose whichever way is most convenient for you. The output netCDF file containing the requested ensemble products is set with -out file. Finally, -config file requires a configuration file for processing.
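
As a minimal sketch of the ASCII list approach, the commands below write one member path per line into a list file and pass that file to -ens. The names ens_file_list.txt and GenEnsProd_from_list.nc are hypothetical, the data path is the sample data used later in this tutorial, and GenEnsProdConfig_tutorial is the configuration file created in the Configure section. The gen_ens_prod call is skipped if the tool, the list, or the configuration file is not available.

```shell
# Hypothetical list file name; any name works.
list_file=ens_file_list.txt
: > "$list_file"

# Sample ensemble data used later in this tutorial.
# METPLUS_DATA is empty if the tutorial setup script has not been sourced.
fcst_dir="${METPLUS_DATA:-}/met_test/data/sample_fcst/2009123112"

# Write one ensemble member path per line.
for f in "$fcst_dir"/*gep*/d01_2009123112_02400.grib; do
  # Skip the unexpanded pattern when no files match.
  if [ -e "$f" ]; then
    echo "$f" >> "$list_file"
  fi
done

echo "Members listed: $(wc -l < "$list_file")"

# Pass the list file in place of the individual member files.
if command -v gen_ens_prod >/dev/null 2>&1 \
   && [ -s "$list_file" ] && [ -f GenEnsProdConfig_tutorial ]; then
  gen_ens_prod \
    -ens "$list_file" \
    -out ./GenEnsProd_from_list.nc \
    -config GenEnsProdConfig_tutorial \
    -v 2
fi
```

A list file is convenient when the member files do not share a pattern that a single wildcard can match.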

Gen-Ens-Prod has additional optional settings. These include the -ctrl file, allowing users to set an ensemble control member file. This file's data will be included for ensemble mean computations, but excluded for ensemble spread. Log messages created during runtime can be saved using -log file. And as seen in other MET tools, -v level will control the verbosity of the tool's log messages.

jopatz Tue, 01/17/2023 - 17:22

Configure

Start by making an output directory for Gen-Ens-Prod and changing directories:
mkdir -p ${METPLUS_TUTORIAL_DIR}/output/met_output/gen_ens_prod
cd ${METPLUS_TUTORIAL_DIR}/output/met_output/gen_ens_prod

Similar to other MET tools, the behavior of Gen-Ens-Prod is controlled by the contents of the configuration file passed to it on the command line. The default Gen-Ens-Prod configuration file may be found in the data/config/GenEnsProdConfig_default file. The configurations used by the test script may be found in the scripts/config/GenEnsProdConfig* files.

Prior to modifying the configuration file, users are advised to make a copy of the default:
cp ${MET_BUILD_BASE}/share/met/config/GenEnsProdConfig_default GenEnsProdConfig_tutorial
Open up the GenEnsProdConfig_tutorial file for editing with your preferred text editor.
vi GenEnsProdConfig_tutorial

The configurable items for Gen-Ens-Prod are a subset of those found in tools such as Grid-Stat and Point-Stat. The reason for this is tied to the purpose of the tool: Gen-Ens-Prod is for creating simple ensemble products and not for statistical verification with observation datasets. This is made clear by the configuration file's use of a single ens (ensemble) dictionary, rather than separate forecast and observation dictionaries.

ens = {
   ens_thresh = 1.0;
   vld_thresh = 1.0;

   field = [
      {
         name       = "APCP";
         level      = "A03";
         cat_thresh = [ >0.0, >=5.0 ];
      }
   ];
}

The configuration file also does not have the ability to request statistical line type output, instead allowing users to create ensemble products with the ensemble_flag dictionary.

ensemble_flag = {
   latlon    = TRUE;
   mean      = TRUE;
   stdev     = TRUE;
   minus     = TRUE;
   plus      = TRUE;
   min       = TRUE;
   max       = TRUE;
   range     = TRUE;
   vld_count = TRUE;
   frequency = TRUE;
   nep       = FALSE;
   nmep      = FALSE;
   climo     = FALSE;
   climo_cdp = FALSE;
}

You may find a complete description of the configurable items in the gen_ens_prod configuration file section of the MET User's Guide. Please take some time to review them.

For this tutorial, we'll configure Gen-Ens-Prod to generate ensemble products for 24-hour accumulated precipitation. While we'll run Gen-Ens-Prod on a single field, please note that it may be configured to operate on multiple fields. The ensemble we're processing consists of 6 members defined over the west coast of the United States.

In the ens dictionary, set
ens = {
   ens_thresh = 0.5;
   vld_thresh = 0.5;

   field = [
      {
         name       = "APCP";
         level      = "A24";
         cat_thresh = [ >0.0, >=13.0, >=25.0, >=101.0 ];
      }
   ];
}

Setting ens_thresh to 0.5 allows up to half of the ensemble member files passed at runtime to contain invalid data before Gen-Ens-Prod quits with an error. Similarly, setting vld_thresh to 0.5 allows each grid point to have invalid data in up to half of the ensemble member files passed at runtime and still have the requested products computed for that grid point.
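
The arithmetic behind these two settings can be sketched for the 6-member ensemble used here. This assumes that each threshold is compared against the ratio of valid members to total members and that a ratio equal to the threshold is accepted; the variable and function names are illustrative only.

```shell
n_members=6     # ensemble members passed with -ens
ens_thresh=0.5  # required ratio of valid member files
vld_thresh=0.5  # required ratio of valid members at a grid point

# Smallest whole number of members that satisfies a ratio threshold: ceil(n * thresh).
# Assumption: a ratio exactly equal to the threshold is accepted.
min_count() {
  awk -v n="$1" -v t="$2" 'BEGIN { m = n * t; c = int(m); if (c < m) c++; print c }'
}

min_files=$(min_count "$n_members" "$ens_thresh")
min_points=$(min_count "$n_members" "$vld_thresh")

echo "Valid member files needed to avoid an error: $min_files of $n_members"
echo "Valid members needed at a grid point: $min_points of $n_members"
```

With these settings, both lines report 3 of 6.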

In the ensemble_flag dictionary, set
ensemble_flag = {
   latlon    = TRUE;
   mean      = TRUE;
   stdev     = TRUE;
   minus     = FALSE;
   plus      = FALSE;
   min       = FALSE;
   max       = FALSE;
   range     = FALSE;
   vld_count = TRUE;
   frequency = TRUE;
   nep       = FALSE;
   nmep      = FALSE;
   climo     = FALSE;
   climo_cdp = FALSE;
}

These products will be written to the netCDF file designated at runtime with the -out flag. They show a small sample of the options available in Gen-Ens-Prod and highlight how the output changes when one of the files is entered as the control ensemble member, which will be done later in this session.

Save and close this file.
jopatz Wed, 01/18/2023 - 14:07

Run

Let's run Gen-Ens-Prod on the command line using the following command:
gen_ens_prod \
-ens ${METPLUS_DATA}/met_test/data/sample_fcst/2009123112/*gep*/d01_2009123112_02400.grib \
-out ./GenEnsProd_APCP24.nc \
-config GenEnsProdConfig_tutorial \
-v 2

Gen-Ens-Prod creates the products we requested in the configuration file, using the categorical thresholds specified. Note that we've passed the input ensemble data directly on the command line by specifying the ensemble member names using wildcards.

When Gen-Ens-Prod has completed running, there will be 1 netCDF output file, GenEnsProd_APCP24.nc, in the current directory.

jopatz Wed, 01/18/2023 - 15:19


Output


As mentioned previously, Gen-Ens-Prod produces only 1 netCDF output file. This file contains all of the products that were requested in the configuration file.

Let's take a look at the contents of the file:
ncdump -h GenEnsProd_APCP24.nc

Note that the file contains 9 variables: the latitude and longitude, the ensemble mean and standard deviation, the ensemble valid data count, and four ensemble relative frequency fields, one for each categorical threshold that was set, each with the prefix APCP_24_A24_ENS_FREQ_. These frequency fields act as uncalibrated probability forecasts and can be verified against observational datasets with other MET tools.
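
One way to confirm the variable count is to pull the variable names out of the ncdump header. The helper list_nc_vars below is a hypothetical convenience function, not part of MET or the netCDF utilities; it assumes that variable declarations in the header start with a netCDF type name followed by the variable name and an opening parenthesis.

```shell
# Print the names of variables declared in an ncdump header read from stdin.
# Hypothetical helper: matches lines such as "float lat(lat, lon) ;".
list_nc_vars() {
  sed -n -E 's/^[[:space:]]+(byte|char|short|int|int64|float|double)[[:space:]]+([^[:space:](]+)\(.*$/\2/p'
}

# Only run against the real file when ncdump and the output file are available.
if command -v ncdump >/dev/null 2>&1 && [ -f GenEnsProd_APCP24.nc ]; then
  ncdump -h GenEnsProd_APCP24.nc | list_nc_vars
  echo "Variable count: $(ncdump -h GenEnsProd_APCP24.nc | list_nc_vars | wc -l)"
fi
```

Attribute lines such as lat:units are skipped because they do not begin with a type name.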

Now let's visually inspect the file contents with a graphical viewer:
ncview GenEnsProd_APCP24.nc

Click through the variable names in the ncview window to see plots of the content we saw in the ncdump command.

Now that we've seen a successful run of the Gen-Ens-Prod tool, let's change the run command slightly to show how the -ctrl setting works.

jopatz Wed, 01/18/2023 - 15:42

Rerun


Now that we've seen the output for Gen-Ens-Prod without any control ensemble members, let's change the run slightly by selecting one of the previous ensemble members as the control. Because the required changes will be performed in the run command, no edits will be made to the configuration file.

Run the following command. Note the change to the wildcards, additional flag, and change in output file name:
gen_ens_prod \
-ens ${METPLUS_DATA}/met_test/data/sample_fcst/2009123112/*gep[23567]/d01_2009123112_02400.grib \
-out ./GenEnsProd_APCP24_run2_ctrl.nc \
-config GenEnsProdConfig_tutorial \
-ctrl ${METPLUS_DATA}/met_test/data/sample_fcst/2009123112/arw-fer-gep1/d01_2009123112_02400.grib \
-v 2
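
To see why the new wildcard leaves out the control member, the sketch below recreates the pattern in a scratch directory. Apart from arw-fer-gep1, which appears in the command above, the directory names are hypothetical stand-ins; only the trailing member number matters to the pattern.

```shell
# Scratch directory with stand-in member directories.
# Only arw-fer-gep1 comes from the tutorial command; the other names are hypothetical.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/arw-fer-gep1"
for n in 2 3 5 6 7; do
  mkdir "$demo_dir/member-gep$n"
done

# The bracket expression [23567] matches exactly one of the listed digits,
# so the directory ending in gep1 is not expanded into the -ens list.
matched=0
for d in "$demo_dir"/*gep[23567]; do
  echo "passed to -ens: ${d##*/}"
  matched=$((matched + 1))
done
echo "members matched by the wildcard: $matched"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The pattern matches 5 directories, and the remaining gep1 member is the one supplied separately with -ctrl.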

When Gen-Ens-Prod has successfully finished running, 1 new netCDF file will be in the current directory, containing the same ensemble products that were available in the first run. However, the output now differs because one of the original ensemble members was used as the control member.

Take a look at the contents of the new netCDF file with ncview:
ncview GenEnsProd_APCP24_run2_ctrl.nc

You'll see as you click through the variables that the only difference between the two runs of Gen-Ens-Prod is APCP_24_A24_ENS_STDEV. The mean (APCP_24_A24_ENS_MEAN) and all of the categorical threshold frequencies still use all 6 ensemble members. This is quickly confirmed by viewing APCP_24_A24_ENS_VLD for both files, which shows that 6 ensemble members were used in both runs across the domain.
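
The effect of the control member on the two products can be illustrated at a single grid point. The six precipitation values below are made up for illustration, with one of them treated as the control. The calculation uses a sample standard deviation (n-1 denominator), which is an assumption and may not match the exact formula MET uses; the mean_stdev helper is hypothetical.

```shell
# Hypothetical 24-hour precipitation values (mm) at one grid point.
ctrl=12.0
members="8.0 10.0 9.0 11.0 10.0"

# Print the mean and sample standard deviation of the values given as arguments.
# Assumption: n-1 denominator; MET's own computation may differ.
mean_stdev() {
  printf '%s\n' "$@" | awk '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
      mean = sum / n
      var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
      if (var < 0) var = 0
      printf "mean=%.3f stdev=%.3f n=%d\n", mean, sqrt(var), n
    }'
}

# All 6 values, as in the first run and as used for the mean in the -ctrl run.
# $members is left unquoted on purpose so that it splits into separate values.
echo "all members:     $(mean_stdev $ctrl $members)"

# Control excluded, as described for the spread in the -ctrl run.
echo "without control: $(mean_stdev $members)"
```

Because the control value in this made-up sample sits furthest from the others, leaving it out reduces the spread, which is the kind of change the STDEV field shows between the two runs.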

To better illustrate the difference in the APCP_24_A24_ENS_STDEV products, let's calculate the difference of the two fields using PCP-Combine, and plot the results using Plot-Data-Plane.

Use the following command to subtract the control standard deviation field from the full ensemble standard deviation field, saving it to the netCDF file and variable name provided:
pcp_combine -subtract \
GenEnsProd_APCP24.nc 'name="APCP_24_A24_ENS_STDEV"; level="(*,*)";' \
GenEnsProd_APCP24_run2_ctrl.nc 'name="APCP_24_A24_ENS_STDEV"; level="(*,*)";' \
Full_Ensemble_STDDEV_minus_ctrl_STDDEV.nc -name STDDEV_DIFF
Now, take the PCP-Combine output and create a plot of the results:
plot_data_plane Full_Ensemble_STDDEV_minus_ctrl_STDDEV.nc STDDEV_diff.ps \
'name="STDDEV_DIFF"; level="(*,*)";'
Finally, view the contents of the plot:
display STDDEV_diff.ps

If the differences weren't apparent before, this plot makes it very clear how using a control ensemble member can drastically change a product. When that behavior is desired, the -ctrl option is a quick way to have Gen-Ens-Prod include the control member in some products, such as the ensemble mean, while excluding it from others, such as the ensemble spread.

jopatz Thu, 01/19/2023 - 15:22