Next, run Stat-Analysis on the command line using the following command:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/point_stat \
-out $MET_TUTORIAL_DATA/output/stat_analysis/stat_analysis.out \
-config $MET_TUTORIAL_DATA/config/STATAnalysisConfig_tutorial \
-v 2
Stat-Analysis is now performing the analysis jobs we requested in the configuration file. It writes the output of our four jobs to the file we specified using the -out command line argument. It should run in only a couple of seconds since we're analyzing such a small sample of STAT data. In general, though, Stat-Analysis can process very large amounts of data, such as a whole season's worth, in a relatively short amount of time.
By Case and STAT Output
Notice that jobs 2 and 3 do basically the same thing but for different -interp_pnts values. We can actually perform those jobs in a much simpler way. The -by job command option specifies one or more columns which define case information. Those case columns are concatenated and the job is performed for each unique case found in the data.
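The case logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Stat-Analysis code: the records, the function name aggregate_by, and the ":" separator used to join case columns are all assumptions made for this example, and the counts are made-up numbers.

```python
from collections import defaultdict

# Made-up CTC records holding only the columns this sketch needs.
lines = [
    {"INTERP_PNTS": "1", "VX_MASK": "EAST", "FY_OY": 10, "FY_ON": 2, "FN_OY": 3, "FN_ON": 40},
    {"INTERP_PNTS": "1", "VX_MASK": "WEST", "FY_OY": 5, "FY_ON": 1, "FN_OY": 4, "FN_ON": 30},
    {"INTERP_PNTS": "25", "VX_MASK": "EAST", "FY_OY": 8, "FY_ON": 3, "FN_OY": 2, "FN_ON": 42},
    {"INTERP_PNTS": "25", "VX_MASK": "WEST", "FY_OY": 6, "FY_ON": 2, "FN_OY": 3, "FN_ON": 29},
]

COUNT_COLUMNS = ("FY_OY", "FY_ON", "FN_OY", "FN_ON")


def aggregate_by(records, by_columns):
    """Sum the contingency table counts separately for each unique case."""
    totals = defaultdict(lambda: dict.fromkeys(COUNT_COLUMNS, 0))
    for record in records:
        # Concatenate the case columns into a single key (separator is assumed).
        case = ":".join(record[column] for column in by_columns)
        for column in COUNT_COLUMNS:
            totals[case][column] += record[column]
    return dict(totals)


for case, counts in aggregate_by(lines, ["INTERP_PNTS"]).items():
    print(case, counts)
```

Passing ["INTERP_PNTS", "VX_MASK"] instead yields one aggregated case per combination of the two columns.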
Also notice that while the output file generated above contains aggregated statistics, it is missing the 22 STAT header columns. The -out_stat job command option specifies the name of an output STAT file which does contain those header columns.
Let's rerun jobs 2 and 3 but on the command line using both the -by and -out_stat options:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/point_stat \
-out_stat $MET_TUTORIAL_DATA/output/stat_analysis/ctc_by_interp_pnts.stat \
-job aggregate -line_type CTC -fcst_var TMP -fcst_lev P850-750 -obtype ADPUPA -vx_mask EAST,WEST \
-by INTERP_PNTS \
-v 2
Note that this single job was run on the command line with no configuration file. The multiple values for -vx_mask are specified as a comma-separated list (specifying -vx_mask multiple times works too). Open up the output STAT file ($MET_TUTORIAL_DATA/output/stat_analysis/ctc_by_interp_pnts.stat) and notice the content of the VX_MASK column (EAST,WEST). For string header columns (e.g. VX_MASK, OBTYPE, FCST_VAR) multiple values are simply concatenated, while for date/time columns (e.g. FCST_VALID_BEG, FCST_VALID_END, FCST_LEAD) the min/max values are reported.
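As a rough illustration of how aggregated header values might be built, here is a small Python sketch. It is not how Stat-Analysis is implemented: the function names are hypothetical, the timestamps are made up, and keeping string values in order of first appearance is an assumption. It also assumes timestamps in the YYYYMMDD_HHMMSS form, which sort correctly as plain strings.

```python
def combine_strings(values):
    """Concatenate the unique values of a string header column with commas."""
    unique = []
    for value in values:
        if value not in unique:
            unique.append(value)
    return ",".join(unique)


def combine_time_range(begin_values, end_values):
    """Report the earliest begin time and the latest end time."""
    return min(begin_values), max(end_values)


# Made-up header values from two input lines.
print(combine_strings(["EAST", "WEST"]))
print(combine_time_range(
    ["20070331_103000", "20070331_120000"],
    ["20070331_133000", "20070331_150000"],
))
```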
Next, rerun that same job but use the -set_hdr option to explicitly specify the contents of the output VX_MASK column:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/point_stat \
-out_stat $MET_TUTORIAL_DATA/output/stat_analysis/ctc_by_interp_pnts_set_hdr.stat \
-job aggregate -line_type CTC -fcst_var TMP -fcst_lev P850-750 -obtype ADPUPA -vx_mask EAST,WEST \
-by INTERP_PNTS -set_hdr VX_MASK CONUS \
-v 2
Open up the output file and check the VX_MASK column.
You can use the -by option an arbitrary number of times to define case information. Try rerunning with -by INTERP_PNTS,VX_MASK.
Process Probabilistic Output
Next, we'll run an analysis job on the probabilistic output from the Grid-Stat tool on the command line:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/grid_stat \
-job aggregate_stat \
-dump_row $MET_TUTORIAL_DATA/output/stat_analysis/aggr_stat_pstd.stat \
-vx_mask EAST,WEST \
-line_type PCT \
-out_line_type PSTD \
-v 2
The output of this Stat-Analysis job is printed to the screen since we didn't redirect the output to a file using the -out command line option. This job has aggregated two probability contingency table count (PCT) STAT lines, one for the EAST and one for the WEST, and it has written out the corresponding statistics (PSTD) STAT line. The output for this job includes the following 3 lines:
- The JOB_LIST line lists the job command options that were used to perform this job.
- The COL_NAME line consists of the column names for the statistics listed in the next line.
- The PSTD line consists of the output for the probabilistic statistics (PSTD) STAT line. However, only the columns that appear after the LINE_TYPE column are shown.
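If you want to inspect such output programmatically, the three lines can be paired up with a short Python helper. This is a sketch under assumptions: the function parse_job_output is hypothetical, the sample text only mimics the "LABEL: values" layout, it shows just three columns, and its numbers are made up rather than taken from a real run.

```python
def parse_job_output(text, line_type="PSTD"):
    """Pair the names on the COL_NAME line with the values on the data line."""
    rows = {}
    for raw_line in text.splitlines():
        if ":" not in raw_line:
            continue
        # Split on the first colon only, so colons in job options are kept.
        label, _, rest = raw_line.partition(":")
        rows[label.strip()] = rest.split()
    return dict(zip(rows["COL_NAME"], rows[line_type]))


# Made-up sample in the assumed layout (truncated to three columns).
sample_output = """\
JOB_LIST:      -job aggregate_stat -line_type PCT -out_line_type PSTD
COL_NAME:      TOTAL N_THRESH BASER
    PSTD:      100 5 0.25
"""

print(parse_job_output(sample_output))
```

The values are returned as strings; convert them with int() or float() as needed.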
As we did above, let's switch to using the -by vx_mask option:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/grid_stat \
-job aggregate_stat \
-dump_row $MET_TUTORIAL_DATA/output/stat_analysis/job5_aggr_stat_pstd.stat \
-by vx_mask \
-line_type PCT \
-out_line_type PSTD \
-v 2
This job has aggregated the probability contingency table count (PCT) STAT lines by vx_mask and has written out the corresponding statistics (PSTD) STAT line. There is one output line per vx_mask (CONUS, EAST, G212, WEST) for a total of four lines in this case.
Next, we'll run a job to look at verification of wind direction in the output of the Point-Stat tool. Just as we did above, we'll use the EAST and WEST verification regions to aggregate the vector partial sums (VL1L2) for winds and look at errors in the wind direction. Run the following job on the command line:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/point_stat \
-job aggregate_stat \
-dump_row $MET_TUTORIAL_DATA/output/stat_analysis/job6_aggr_stat_wdir.stat \
-vx_mask EAST -vx_mask WEST \
-interp_pnts 25 \
-line_type VL1L2 \
-fcst_thresh ge1.0 \
-out_line_type WDIR \
-v 2
The output of this Stat-Analysis job includes the following 4 lines: the JOB_LIST, COL_NAME, ROW_MEAN_WDIR, and AGGR_WDIR. See the MET Users Guide for details on these output lines.
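For intuition about what a wind direction error is, the following Python sketch converts u/v wind components to a direction and computes a wrapped direction difference. It uses the standard meteorological convention (the direction the wind blows from, in degrees clockwise from north). It is not taken from the MET source, both function names are hypothetical, and it does not reproduce how the ROW_MEAN_WDIR and AGGR_WDIR lines are derived.

```python
import math


def wind_direction(u, v):
    """Direction the wind blows FROM, in degrees clockwise from north."""
    return math.degrees(math.atan2(-u, -v)) % 360.0


def direction_error(fcst_dir, obs_dir):
    """Signed forecast-minus-observed difference, wrapped to [-180, 180)."""
    return (fcst_dir - obs_dir + 180.0) % 360.0 - 180.0


# A wind blowing toward the south (v < 0) comes from the north.
print(wind_direction(0.0, -10.0))
# Directions of 350 and 10 degrees are 20 degrees apart, not 340.
print(direction_error(350.0, 10.0))
```

Wrapping the difference matters because directions near north straddle the 0/360 boundary.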
Process Ensemble Output
Lastly, we'll aggregate together two ranked histograms from the Ensemble-Stat output, the ones for the NWC and SWC verification areas. Execute the following command:
stat_analysis \
-lookin $MET_TUTORIAL_DATA/output/ensemble_stat \
-job aggregate \
-line_type RHIST \
-vx_mask SWC -vx_mask NWC \
-obtype ADPSFC \
-v 2
This job aggregates together the 358 ranks from the NWC region with the 382 ranks from the SWC region and writes out the aggregated counts. This aggregation is only possible because the number of ranks in each input line (7) remains constant.
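The aggregation itself amounts to adding the histograms bin by bin, which is why the number of ranks must match. A minimal Python sketch follows. The function name aggregate_rhist is hypothetical, and the per-bin counts are made-up numbers chosen only so that they sum to the 358 and 382 totals mentioned above.

```python
def aggregate_rhist(histograms):
    """Add ranked histograms bin by bin; all must have the same number of ranks."""
    n_rank = len(histograms[0])
    if any(len(histogram) != n_rank for histogram in histograms):
        raise ValueError("all ranked histograms must have the same number of ranks")
    return [sum(bin_counts) for bin_counts in zip(*histograms)]


# Made-up counts for 7 ranks; only the totals (358 and 382) come from the text.
nwc = [60, 48, 45, 50, 47, 52, 56]
swc = [70, 55, 50, 48, 51, 53, 55]

combined = aggregate_rhist([nwc, swc])
print(combined, sum(combined))
```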