MET Tool: Point-Stat
Point-Stat Tool: General
Point-Stat Functionality
The Point-Stat tool provides verification statistics for comparing gridded forecasts to point observations, unlike Grid-Stat, which compares them to gridded analyses. The Point-Stat tool matches gridded forecasts to point observation locations using one or more configurable interpolation methods. The tool then computes a configurable set of verification statistics for these matched pairs. Continuous statistics are computed over the raw matched pair values. Categorical statistics are generally calculated by applying a threshold to the forecast and observation values. Confidence intervals, which represent a measure of uncertainty, are computed for all of the verification statistics.
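To make the matching step concrete, here is a minimal Python sketch of two ways a gridded value might be interpolated to an observation location. It is an illustration only, not MET's implementation: the helper names, the inverse-distance-squared weighting, the restriction to odd box widths, and the sample grid are all assumptions made for this example.

```python
import math

def nearest_index(coord):
    """Round a fractional grid coordinate to the nearest integer index."""
    return math.floor(coord + 0.5)

def nearest(grid, x, y):
    """Forecast value at the grid point closest to the location (x, y)."""
    return grid[nearest_index(y)][nearest_index(x)]

def dw_mean(grid, x, y, width):
    """Distance-weighted mean over a width x width box (odd width assumed)
    centred on the grid point nearest to (x, y).
    Weights are 1 / distance**2, which is an assumption of this sketch."""
    cx, cy = nearest_index(x), nearest_index(y)
    half = width // 2
    num = den = 0.0
    for j in range(cy - half, cy + half + 1):
        for i in range(cx - half, cx + half + 1):
            if not (0 <= j < len(grid) and 0 <= i < len(grid[0])):
                continue  # skip box points that fall outside the grid
            d2 = (i - x) ** 2 + (j - y) ** 2
            if d2 == 0.0:
                return float(grid[j][i])  # observation lies on a grid point
            num += grid[j][i] / d2
            den += 1.0 / d2
    return num / den

# Made-up 5x5 temperature field in kelvin; the observation sits at x=2.2, y=1.9.
grid = [[270.0 + i + j for i in range(5)] for j in range(5)]
print(nearest(grid, 2.2, 1.9), dw_mean(grid, 2.2, 1.9, 5))
```

Each interpolation method yields its own forecast value for the same observation, so each method produces its own set of matched pairs and statistics.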
Point-Stat Usage
Usage: point_stat
fcst_file | Input gridded file path/name
obs_file | Input NetCDF observation file path/name
config_file | Configuration file
[-point_obs file] | Additional NetCDF observation files to be used (optional)
[-obs_valid_beg time] | Sets the beginning of the matching time window in YYYYMMDD[_HH[MMSS]] format (optional)
[-obs_valid_end time] | Sets the end of the matching time window in YYYYMMDD[_HH[MMSS]] format (optional)
[-outdir path] | Overrides the default output directory (optional)
[-log file] | Outputs log messages to the specified file (optional)
[-v level] | Level of logging (optional)
At a minimum, the input gridded fcst_file, the input NetCDF obs_file (the output of PB2NC, ASCII2NC, MADIS2NC, or LIDAR2NC; the last two are not covered in these exercises), and the configuration config_file must be passed in on the command line. You may use the -point_obs command line argument to specify additional NetCDF observation files to be used.
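If you drive Point-Stat from a script, the usage statement above maps naturally onto an argument list. The following Python sketch assembles such a list from the three required arguments and the optional flags listed in the usage. The helper name point_stat_command and the file names are hypothetical placeholders, and the sketch only builds and prints the command rather than running it.

```python
import shlex

def point_stat_command(fcst_file, obs_file, config_file, point_obs=(),
                       obs_valid_beg=None, obs_valid_end=None,
                       outdir=None, log=None, verbosity=None):
    """Build a point_stat argument list (hypothetical helper, not part of MET)."""
    cmd = ["point_stat", fcst_file, obs_file, config_file]
    for extra in point_obs:
        cmd += ["-point_obs", extra]      # one flag per additional obs file
    if obs_valid_beg is not None:
        cmd += ["-obs_valid_beg", obs_valid_beg]
    if obs_valid_end is not None:
        cmd += ["-obs_valid_end", obs_valid_end]
    if outdir is not None:
        cmd += ["-outdir", outdir]
    if log is not None:
        cmd += ["-log", log]
    if verbosity is not None:
        cmd += ["-v", str(verbosity)]
    return cmd

# Placeholder file names, chosen only for illustration.
cmd = point_stat_command("fcst.grb", "obs.nc", "PointStatConfig",
                         outdir=".", verbosity=2)
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems if it is later passed to subprocess.run.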
Configure
Point-Stat Tool: Configure
cd ${METPLUS_TUTORIAL_DIR}/output/met_output/point_stat
The behavior of Point-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Point-Stat configuration file may be found in the data/config/PointStatConfig_default file.
The configurable items for Point-Stat are used to specify how the verification is to be performed. The configurable items include specifications for the following:
- The forecast fields to be verified at the specified vertical levels.
- The type of point observations to be matched to the forecasts.
- The threshold values to be applied.
- The areas over which to aggregate statistics - as predefined grids, lat/lon polylines, or individual stations.
- The confidence interval methods to be used.
- The interpolation methods to be used.
- The types of verification methods to be used.
- Set:
fcst = {
   message_type = [ "ADPUPA" ];
   field = [
      {
         name       = "TMP";
         level      = [ "P850-1050", "P500-850" ];
         cat_thresh = [ <=273, >273 ];
      }
   ];
}
obs = fcst;
to verify temperature over two different pressure ranges against ADPUPA observations using the thresholds specified.
- Set:
ci_alpha = [ 0.05, 0.10 ];
to compute confidence intervals for alpha values of 0.05 and 0.10, which correspond to 95% and 90% confidence levels.
- Set:
output_flag = {
   fho    = BOTH;
   ctc    = BOTH;
   cts    = STAT;
   mctc   = NONE;
   mcts   = NONE;
   cnt    = BOTH;
   sl1l2  = STAT;
   sal1l2 = NONE;
   vl1l2  = NONE;
   val1l2 = NONE;
   pct    = NONE;
   pstd   = NONE;
   pjc    = NONE;
   prc    = NONE;
   ecnt   = NONE;
   eclv   = BOTH;
   mpr    = BOTH;
}
to indicate that the forecast, hit, and observation rates (FHO), contingency table counts (CTC), contingency table statistics (CTS), continuous statistics (CNT), partial sums (SL1L2), economic cost/loss value (ECLV), and matched pair data (MPR) line types should be output. Setting SL1L2 and CTS to STAT causes those lines to be written only to the output .stat file, while setting the others to BOTH causes them to be written to both the .stat file and the optional LINE_TYPE.txt file.
- Set:
output_prefix = "run1";
to customize the output file names for this run.
Note that in the mask dictionary, the grid entry is set to FULL. This instructs Point-Stat to compute statistics over the entire input model domain. Setting grid to FULL has this special meaning.
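To see what the two ci_alpha values mean in practice, the Python sketch below computes a two-sided confidence interval for a mean error at alpha 0.05 and 0.10. It uses a plain normal approximation and made-up forecast-minus-observation values, both of which are assumptions of this example; the formulas MET applies to each statistic are described in the MET User's Guide.

```python
import math
from statistics import NormalDist, mean, stdev

def normal_ci(values, alpha):
    """Two-sided (1 - alpha) normal-approximation interval for the mean."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # critical value for this alpha
    centre = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width

# Made-up forecast-minus-observation errors, for illustration only.
errors = [0.8, -1.2, 0.3, 1.5, -0.4, 0.9, -0.7, 0.2]

for alpha in (0.05, 0.10):
    lower, upper = normal_ci(errors, alpha)
    print(f"alpha={alpha:.2f}  lower={lower:.3f}  upper={upper:.3f}")
```

A smaller alpha demands more confidence, so the 0.05 interval is wider than the 0.10 interval for the same data.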
Run
Point-Stat Tool: Run
Next, run Point-Stat to compare a GRIB forecast to the NetCDF point observation output of the ASCII2NC tool.
point_stat \
${METPLUS_DATA}/met_test/data/sample_fcst/2007033000/nam.t00z.awip1236.tm00.20070330.grb \
../ascii2nc/tutorial_ascii.nc \
PointStatConfig_tutorial_run1 \
-outdir . \
-v 2
Point-Stat is now performing the verification tasks we requested in the configuration file. It should take less than a minute to run. You should see several status messages printed to the screen to indicate progress.
If you receive a syntax error such as the one listed below, review PointStatConfig_tutorial_run1 for an extra comma after the "}" on line number 59:
cat_thresh = [ >273.0 ];
}, <---Remove the comma
DEBUG 1: Default Config File: /usr/local/met-9.0/share/met/config/PointStatConfig_default
DEBUG 1: User Config File: PointStatConfig_tutorial_run1
ERROR :
ERROR : yyerror() -> syntax error in file "/tmp/met_config_26760_0"
ERROR :
ERROR : line = 59
ERROR :
ERROR : column = 0
ERROR :
ERROR : text = "]"
ERROR :
ERROR :
ERROR : ];
ERROR : _____
ERROR :
Notice the more detailed information about which observations were used for each verification task. If you run Point-Stat and get fewer matched pairs than you expected, try using the -v 3 option to see why the observations were rejected.
Output
Point-Stat Tool: Output
The output of Point-Stat is one or more ASCII files containing statistics summarizing the verification performed. Since we wrote output to the current directory, it should now contain 6 ASCII files that begin with the point_stat_ prefix, one each for the FHO, CTC, CNT, ECLV, and MPR types, and a sixth for the STAT file. The STAT file contains all of the output statistics while the other ASCII files contain the exact same data organized by line type.
- In the kwrite editor, select Settings->Configure Editor, de-select Dynamic Word Wrap and click OK.
- In the vi editor, type the command :set nowrap. To set this as the default behavior, run the following command:
echo "set nowrap" >> ~/.exrc
- This is a simple ASCII file consisting of several rows of data.
- Each row contains data for a single verification task.
- The FCST_LEAD, FCST_VALID_BEG, and FCST_VALID_END columns indicate the timing information of the forecast field.
- The OBS_LEAD, OBS_VALID_BEG, and OBS_VALID_END columns indicate the timing information of the observation field.
- The FCST_VAR, FCST_UNITS, FCST_LEV, OBS_VAR, OBS_UNITS, and OBS_LEV columns indicate the variable name, units, and vertical level of the forecast and observation fields set in the configuration file.
- The OBTYPE column indicates the PrepBUFR message type used for this verification task.
- The VX_MASK column indicates the masking region over which the statistics were accumulated.
- The INTERP_MTHD and INTERP_PNTS columns indicate the method used to interpolate the forecast data to the observation location.
- The FCST_THRESH and OBS_THRESH columns indicate the thresholds applied to FCST_VAR and OBS_VAR.
- The COV_THRESH column is not applicable here and will always have NA when using point_stat.
- The ALPHA column indicates the alpha used for confidence intervals.
- The LINE_TYPE column indicates that these are CTC contingency table count lines.
- The TOTAL column indicates the total number of matched pairs.
- The remaining columns contain the counts for the contingency table computed by applying the threshold to the forecast/observation matched pairs. The FY_OY (forecast: yes, observation: yes), FY_ON (forecast: yes, observation: no), FN_OY (forecast: no, observation: yes), and FN_ON (forecast: no, observation: no) columns indicate those counts.
- What do you notice about the structure of the contingency table counts with respect to the two thresholds used? Does this make sense?
- Does the model appear to resolve relatively cold surface temperatures?
- Based on these observations, are temperatures >273 relatively rare or common in the P850-500 range? How can this affect the ability to get a good score using contingency table statistics? What about temperatures <=273 at the surface?
- The columns prior to LINE_TYPE contain the same data as the previous file we viewed.
- The LINE_TYPE column indicates that these are CNT continuous lines.
- The remaining columns contain continuous statistics derived from the raw forecast/observation pairs. See the CNT OUTPUT FORMAT section in the Point-Stat section of the MET User's Guide for a thorough description of the output.
- Again, confidence intervals are given for each of these statistics as described above.
- What conclusions can you draw about the model's performance at each level using continuous statistics? Justify your answer. Did you use a single metric in your evaluation? Why or why not?
- Comparing the first line with an alpha value of 0.05 to the second line with an alpha value of 0.10, how does the level of confidence change the upper and lower bounds of the confidence intervals (CIs)?
- Similarly, comparing the first line, which has fewer matched pairs in the TOTAL column, to the third line, which has more, how does the sample size affect how you interpret your results?
- The columns prior to LINE_TYPE contain the same data as the previous file we viewed.
- The LINE_TYPE column indicates that these are FHO forecast-hit-observation rate lines.
- The remaining columns are similar to the contingency table output and contain the total number of matched pairs, the forecast rate, the hit rate, and observation rate.
- The forecast, hit, and observation rates should back up your answer to the third question about the contingency table output.
- The columns prior to LINE_TYPE contain the same data as the previous file we viewed.
- The LINE_TYPE column indicates that these are MPR matched pair lines.
- The remaining columns are similar to the contingency table output and contain the total number of matched pairs, the matched pair index, the latitude, longitude, and elevation of the observation, the forecasted value, the observed value, and the climatological value (if applicable).
- There is a lot of data here and it is recommended that the MPR line_type is used only to verify the tool is working properly.
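The contingency table counts and FHO rates described in the lists above can be reproduced from a handful of matched pairs. The Python sketch below applies a single threshold to both the forecast and the observed value of each pair. The pair values are made up, and the rate definitions (in particular the hit rate taken as FY_OY / TOTAL) are assumptions of this example that should be checked against the MET User's Guide.

```python
import operator

def ctc(pairs, op, thresh):
    """Count the 2x2 contingency table for (forecast, observation) pairs,
    treating op(value, thresh) as the 'yes' event."""
    counts = {"TOTAL": 0, "FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for fcst, obs in pairs:
        f_yes, o_yes = op(fcst, thresh), op(obs, thresh)
        key = ("FY" if f_yes else "FN") + "_" + ("OY" if o_yes else "ON")
        counts[key] += 1
        counts["TOTAL"] += 1
    return counts

def fho(c):
    """Forecast, hit, and observation rates derived from the counts
    (definitions assumed for this sketch)."""
    total = c["TOTAL"]
    return {
        "F_RATE": (c["FY_OY"] + c["FY_ON"]) / total,
        "H_RATE": c["FY_OY"] / total,
        "O_RATE": (c["FY_OY"] + c["FN_OY"]) / total,
    }

# Made-up (forecast, observation) temperature pairs in kelvin.
pairs = [(275.0, 274.0), (275.0, 272.0), (270.0, 274.0),
         (270.0, 271.0), (280.0, 279.0)]

counts = ctc(pairs, operator.gt, 273.0)   # threshold >273
print(counts)
print(fho(counts))
```

Swapping operator.gt for operator.le gives the table for the complementary <=273 threshold, which is a useful way to explore the first question about the contingency table output.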
Reconfigure
Point-Stat Tool: Reconfigure
Now we'll reconfigure and rerun Point-Stat.
This time, we'll use two dictionary entries to specify the forecast field in order to set different thresholds for each vertical level. Point-Stat may be configured to verify as many or as few model variables and vertical levels as you desire.
- Set:
fcst = {
   field = [
      {
         name       = "TMP";
         level      = [ "Z2" ];
         cat_thresh = [ >273, >278, >283, >288 ];
      },
      {
         name       = "TMP";
         level      = [ "P750-850" ];
         cat_thresh = [ >278 ];
      }
   ];
}
obs = fcst;
to verify 2-meter temperature and temperature fields between 750hPa and 850hPa, using the thresholds specified.
- Set:
message_type = ["ADPUPA","ADPSFC"];
sid_inc = [];
sid_exc = [];
obs_quality = [];
duplicate_flag = NONE;
obs_summary = NONE;
obs_perc_value = 50;
to include the Upper Air (UPA) and Surface (SFC) observations in the evaluation.
- Set:
mask = {
   grid = [ "G212" ];
   poly = [ "MET_BASE/poly/EAST.poly",
            "MET_BASE/poly/WEST.poly" ];
   sid = [];
   llpnt = [];
}
to compute statistics over the NCEP Grid 212 region and over the Eastern and Western United States, as defined by the polylines specified.
- Set:
interp = {
   vld_thresh = 1.0;
   shape = SQUARE;
   type = [
      {
         method = NEAREST;
         width = 1;
      },
      {
         method = DW_MEAN;
         width = 5;
      }
   ];
}
to indicate that the forecast values should be interpolated to the observation locations using the nearest neighbor method and by computing a distance-weighted average of the forecast values over the 5 by 5 box surrounding the observation location.
- Set:
output_flag = {
   fho    = BOTH;
   ctc    = BOTH;
   cts    = BOTH;
   mctc   = NONE;
   mcts   = NONE;
   cnt    = BOTH;
   sl1l2  = BOTH;
   sal1l2 = NONE;
   vl1l2  = NONE;
   val1l2 = NONE;
   pct    = NONE;
   pstd   = NONE;
   pjc    = NONE;
   prc    = NONE;
   ecnt   = NONE;
   eclv   = BOTH;
   mpr    = BOTH;
}
to switch the SL1L2 and CTS output to BOTH and generate the optional ASCII output files for them.
- Set:
output_prefix = "run2";
to customize the output file names for this run.
- 2 fields: TMP/Z2 and TMP/P750-850
- 2 observing message types: ADPUPA and ADPSFC
- 3 masking regions: G212, EAST.poly, and WEST.poly
- 2 interpolations: NEAREST width 1 (nearest-neighbor) and DW_MEAN width 5
Multiplying 2 * 2 * 3 * 2 = 24. So in this example, Point-Stat will accumulate matched forecast/observation pairs into 24 groups. However, some of these groups will result in 0 matched pairs being found. To each non-zero group, the specified threshold(s) will be applied to compute contingency tables.
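The count of 24 is simply the size of the Cartesian product of the four lists. A short Python sketch makes the grouping explicit; the labels are taken from the configuration shown in this section, and the tuple layout is just one convenient way to represent a group.

```python
from itertools import product

fields = ["TMP/Z2", "TMP/P750-850"]
message_types = ["ADPUPA", "ADPSFC"]
masks = ["G212", "EAST.poly", "WEST.poly"]
# Interpolation (method, width) pairs as set in the interp dictionary.
interps = [("NEAREST", 1), ("DW_MEAN", 5)]

# One group of matched pairs per combination of the four settings.
groups = list(product(fields, message_types, masks, interps))

print(len(groups))
for field, msg_type, mask, (method, width) in groups[:3]:
    print(field, msg_type, mask, method, width)
```

Adding one more entry to any of the four lists multiplies the number of groups, which is worth keeping in mind when run time or output volume grows.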
Rerun
Point-Stat Tool: Rerun
Next, run Point-Stat to compare a GRIB forecast to the NetCDF point observation output of the PB2NC tool, as opposed to the much smaller ASCII2NC output we used in the first run.
point_stat \
${METPLUS_DATA}/met_test/data/sample_fcst/2007033000/nam.t00z.awip1236.tm00.20070330.grb \
../pb2nc/tutorial_pb_run1.nc \
PointStatConfig_tutorial_run2 \
-outdir . \
-v 2
Point-Stat is now performing the verification tasks we requested in the configuration file. It should take a minute or two to run. You should see several status messages printed to the screen to indicate progress. Note the number of matched pairs found for each verification task, some of which are 0.
Plot-Data-Plane Tool
In this step, we have verified 2-meter temperature. The Plot-Data-Plane tool within MET provides a way to visualize the gridded data fields that MET can read.
plot_data_plane \
${METPLUS_DATA}/met_test/data/sample_fcst/2007033000/nam.t00z.awip1236.tm00.20070330.grb \
nam.t00z.awip1236.tm00.20070330_TMPZ2.ps \
'name="TMP"; level="Z2";'
Plot-Data-Plane requires an input gridded data file, an output postscript image file name, and a configuration string defining which 2-D field is to be plotted.
- Set the title to 2-m Temperature.
- Set the plotting range as 250 to 305.
- Use the color table named ${MET_BUILD_BASE}/share/met/colortables/NCL_colortables/wgne15.ctable
Next, we'll take a look at the Point-Stat output we just generated.
To see the usage statement for any MET tool, run it with the --help command line option or with no options at all.
Output
Point-Stat Tool: Output
The format of the CTC, CTS, and CNT line types is the same as in the first run. However, the numbers will be different as we used a different set of observations for the verification.
- The columns prior to LINE_TYPE contain header information.
- The LINE_TYPE column indicates that these are CTS lines.
- The remaining columns contain statistics derived from the threshold contingency table counts. See the point_stat output section of the MET User's Guide for a thorough description of the output.
- Confidence intervals are given for each of these statistics, computed using either one or two methods. The columns ending in _NCL (normal confidence lower) and _NCU (normal confidence upper) give lower and upper confidence limits computed using assumptions of normality. The columns ending in _BCL (bootstrap confidence lower) and _BCU (bootstrap confidence upper) give lower and upper confidence limits computed using bootstrapping.
- The columns prior to LINE_TYPE contain header information.
- The LINE_TYPE column indicates these are SL1L2 partial sums lines.
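Partial sums are useful because they can be combined across runs and still yield the same statistics as the pooled matched pairs. The Python sketch below illustrates the idea with the mean error and RMSE. The dictionary keys follow the SL1L2 column names as an assumption of this example, the helper functions are hypothetical, and the pair values are made up.

```python
import math

def sl1l2(pairs):
    """Partial sums (means of f, o, f*o, f*f, o*o) for (forecast, obs) pairs."""
    n = len(pairs)
    return {
        "TOTAL": n,
        "FBAR": sum(f for f, _ in pairs) / n,
        "OBAR": sum(o for _, o in pairs) / n,
        "FOBAR": sum(f * o for f, o in pairs) / n,
        "FFBAR": sum(f * f for f, _ in pairs) / n,
        "OOBAR": sum(o * o for _, o in pairs) / n,
    }

def combine(a, b):
    """Merge two sets of partial sums, weighting each by its pair count."""
    n = a["TOTAL"] + b["TOTAL"]
    out = {"TOTAL": n}
    for key in ("FBAR", "OBAR", "FOBAR", "FFBAR", "OOBAR"):
        out[key] = (a[key] * a["TOTAL"] + b[key] * b["TOTAL"]) / n
    return out

def me_and_rmse(s):
    """Mean error and RMSE recovered from the partial sums alone."""
    me = s["FBAR"] - s["OBAR"]
    mse = s["FFBAR"] - 2.0 * s["FOBAR"] + s["OOBAR"]
    return me, math.sqrt(max(mse, 0.0))  # guard against tiny negative round-off

# Made-up matched pairs split across two hypothetical runs.
run_a = [(1.0, 2.0), (3.0, 3.0)]
run_b = [(5.0, 4.0), (2.0, 4.0)]

print(me_and_rmse(combine(sl1l2(run_a), sl1l2(run_b))))
print(me_and_rmse(sl1l2(run_a + run_b)))
```

Both print statements report the same mean error and RMSE, which is why aggregating partial sums is a compact alternative to storing every matched pair.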
Lastly, the point_stat_run2_360000L_20070331_120000V.stat file contains all of the same data we just viewed but in a single file. The Stat-Analysis tool, which we'll use later in this tutorial, searches for the .stat output files by default but can also read the .txt output files.