Configure

johnhg, Tue, 06/25/2019 - 09:08

The behavior of Stat-Analysis is controlled by the contents of the configuration file or the job command passed to it on the command line. The default Stat-Analysis configuration may be found in the $MET_BASE/config/STATAnalysisConfig_default file. The configuration used by the test script may be found in the met-8.0/scripts/config/STATAnalysisConfig file. Prior to modifying the configuration file, users are advised to make a copy of the default:

cp $MET_BASE/config/STATAnalysisConfig_default $MET_TUTORIAL_DATA/config/STATAnalysisConfig_tutorial

Open up the $MET_TUTORIAL_DATA/config/STATAnalysisConfig_tutorial file for editing with your preferred text editor.

For this tutorial, we'll set up a configuration file to run a few jobs. Then, we'll show an example of running a single analysis job on the command line.

The Stat-Analysis configuration file has two main sections. The items in the first section are used to filter the STAT data being processed. Only those lines which meet the filtering requirements specified are retained and passed down to the second section. The second section defines the analysis jobs to be performed on the filtered data. When defining analysis jobs, additional filtering parameters may be defined to further refine the STAT data with which to perform that particular job.
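
The two-stage filtering described above can be pictured with a small Python sketch. This is only a conceptual model and not how Stat-Analysis is implemented: the dictionaries standing in for STAT lines and the helper names `matches` and `run_job` are hypothetical, and treating an empty list as "no restriction" is a modeling assumption made for the sketch.

```python
# Conceptual model of Stat-Analysis filtering (illustrative only, not MET code).
# Each STAT line is represented as a dict of header column -> value.

def matches(line, filters):
    """Return True if the line satisfies every filter.

    In this sketch an empty list of allowed values means "no restriction".
    """
    return all(not allowed or line[col] in allowed
               for col, allowed in filters.items())

def run_job(lines, global_filters, job_filters):
    """Apply the config-wide filters first, then the job's own filters."""
    retained = [ln for ln in lines if matches(ln, global_filters)]
    return [ln for ln in retained if matches(ln, job_filters)]

# Hypothetical STAT lines, reduced to a few header columns.
stat_lines = [
    {"FCST_VAR": "TMP", "VX_MASK": "EAST", "INTERP_PNTS": 1},
    {"FCST_VAR": "TMP", "VX_MASK": "WEST", "INTERP_PNTS": 25},
    {"FCST_VAR": "UGRD", "VX_MASK": "EAST", "INTERP_PNTS": 1},
]

# First section of the config: filters shared by every job.
global_filters = {"FCST_VAR": ["TMP"], "VX_MASK": ["EAST", "WEST"]}
# Second section: extra filtering attached to one particular job.
job_filters = {"INTERP_PNTS": [25]}

selected = run_job(stat_lines, global_filters, job_filters)
print(selected)
```

A job can only narrow the data it receives. A line rejected by the first section never reaches any job, whatever that job's own filters say.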

As a word of caution, the Stat-Analysis tool is designed to be extremely flexible. However, with that flexibility comes the potential to specify job requests incorrectly, leading to unintended results. It is the user's responsibility to ensure that each analysis job is performed over the intended subset of STAT data. The -dump_row job command option is useful for verifying this, because it writes out the STAT lines that a job actually used.

We'll configure the Stat-Analysis tool to analyze the output of the Point-Stat tool and aggregate scores for the EAST and WEST verification regions. Edit the $MET_TUTORIAL_DATA/config/STATAnalysisConfig_tutorial file as follows:

Set fcst_var = [ "TMP" ]; to use only STAT lines for temperature (TMP).
Set fcst_lev = [ "P850-750" ]; to use only STAT lines for the forecast level specified.
Set obtype = [ "ADPUPA" ]; to use only STAT lines verified with the ADPUPA message type.
Set vx_mask = [ "EAST", "WEST" ]; to use only STAT lines computed over the EAST or WEST verification polyline regions.
Set line_type = [ "CTC" ]; to use only the CTC lines.
Set jobs as follows:

jobs = [

"-job filter -dump_row ${MET_TUTORIAL_DATA}/output/stat_analysis/job1_filter.stat",

"-job aggregate -interp_pnts 1 -dump_row ${MET_TUTORIAL_DATA}/output/stat_analysis/job2_aggr_ctc_1.stat",

"-job aggregate -interp_pnts 25 -dump_row ${MET_TUTORIAL_DATA}/output/stat_analysis/job3_aggr_ctc_25.stat",

"-job aggregate_stat -out_line_type CTS -interp_pnts 25 -dump_row ${MET_TUTORIAL_DATA}/output/stat_analysis/job4_aggr_stat_cts.stat"

];

Save and close this file. The four jobs listed above achieve the following:

Select the STAT lines that meet the filtering criteria and write them to an output STAT file.
Aggregate the CTC lines that have the INTERP_PNTS column set to 1, and write the input lines that meet the filtering criteria to an output STAT file.
Aggregate the CTC lines that have the INTERP_PNTS column set to 25, and write the input lines that meet the filtering criteria to an output STAT file.
Do the same as the third job, but write out the aggregated contingency table statistics (CTS) rather than the aggregated contingency table counts (CTC).
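
To make the difference between aggregated counts (CTC) and aggregated statistics (CTS) concrete, here is a minimal Python sketch. It is not MET code and covers only a handful of statistics, using the standard 2x2 contingency table definitions. The cell names follow the CTC columns FY_OY, FY_ON, FN_OY and FN_ON; the counts are made up, and the helper names `aggregate_ctc` and `derive_cts` are hypothetical.

```python
# Illustrative sketch (not MET code): summing contingency table counts and
# deriving a few categorical statistics from the totals.

def aggregate_ctc(tables):
    """Sum the four contingency table cells across several CTC lines."""
    keys = ("FY_OY", "FY_ON", "FN_OY", "FN_ON")
    return {k: sum(t[k] for t in tables) for k in keys}

def derive_cts(ctc):
    """Compute a small subset of categorical statistics from aggregated counts.

    No guard against empty categories is included, so a zero denominator
    raises ZeroDivisionError in this sketch.
    """
    hits, false_alarms = ctc["FY_OY"], ctc["FY_ON"]
    misses, correct_negs = ctc["FN_OY"], ctc["FN_ON"]
    total = hits + false_alarms + misses + correct_negs
    return {
        "TOTAL": total,
        "ACC": (hits + correct_negs) / total,              # accuracy
        "FBIAS": (hits + false_alarms) / (hits + misses),  # frequency bias
        "PODY": hits / (hits + misses),                    # probability of detection
        "FAR": false_alarms / (hits + false_alarms),       # false alarm ratio
        "CSI": hits / (hits + false_alarms + misses),      # critical success index
    }

# Made-up counts for two regions, for illustration only.
east = {"FY_OY": 30, "FY_ON": 10, "FN_OY": 5, "FN_ON": 55}
west = {"FY_OY": 20, "FY_ON": 5, "FN_OY": 15, "FN_ON": 60}

combined = aggregate_ctc([east, west])
stats = derive_cts(combined)
print(combined)
print(stats)
```

The statistics are computed from the summed counts rather than by averaging per-region statistics, which is why the choice of input lines matters so much for the result.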

Note that all four jobs use the -dump_row job command option, which writes the lines of STAT data used by that job to the specified file name. We'll look at these files to ensure that the jobs ran over the intended subsets of STAT data.
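
One way to inspect a -dump_row file is to list the distinct values found in the columns you filtered on. The Python sketch below assumes the file is whitespace-delimited with a header row of column names as its first line. The sample text is a made-up stand-in reduced to five columns, and the helper name `unique_values` is hypothetical.

```python
# Illustrative check of -dump_row content (assumes whitespace-delimited text
# whose first line is a header row of column names).
from collections import defaultdict

def unique_values(text, columns):
    """Return the set of distinct values seen in each requested column."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    # Look columns up by name so the check does not depend on column order.
    index = {name: header.index(name) for name in columns}
    seen = defaultdict(set)
    for row in data:
        for name, i in index.items():
            seen[name].add(row[i])
    return dict(seen)

# A tiny made-up stand-in for dump_row content, reduced to five columns.
sample = """\
FCST_VAR FCST_LEV OBTYPE VX_MASK INTERP_PNTS
TMP P850-750 ADPUPA EAST 25
TMP P850-750 ADPUPA WEST 25
"""

found = unique_values(sample, ["FCST_VAR", "VX_MASK", "INTERP_PNTS"])
print(found)
```

To run the same check on a real file, read its contents into a string and pass that in place of `sample`. An unexpected value in any column, such as an INTERP_PNTS of 1 in a file meant for the 25-point job, suggests the job filters need another look.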