TC-Pairs

TC-Pairs griggs Wed, 04/24/2019 - 16:30

TC-Pairs Functionality

The TC-Pairs tool parses ADeck (forecast) and BDeck (BEST track analyses) ATCF files, filters the data, and matches them up. It writes an output ASCII Tropical-Cyclone Statistics (TCST) file containing track pair information.

Users should consider the frequency with which they run the TC-Pairs tool. Passing it too much data to process (e.g. all operational forecast and analysis tracks for the past 3 years) will consume too much memory and make it run slowly. Passing it a single initialization for each model will create many tiny files. If TC-Pairs is running slowly, try reducing the amount of track data you pass to it in any one call. For example, passing it all forecast models for each initialization of a storm works well.
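
One way to follow this advice is to call TC-Pairs once per storm rather than once for a whole archive. The sketch below is a dry run: it only prints the commands it would execute. The a<storm>.dat and b<storm>.dat naming follows the Hurricane Sandy files used later in this tutorial; the plan_tc_pairs function name, the directory arguments, and the per-storm output names are assumptions made for illustration.

```shell
# Dry-run sketch: print one tc_pairs command per storm found in a directory.
# Assumes each ADeck file a<storm>.dat has a matching BDeck file b<storm>.dat
# in the same directory (hypothetical layout; adjust to your own data).
plan_tc_pairs() {
    deck_dir=$1    # directory holding the a*.dat and b*.dat files
    config=$2      # TC-Pairs configuration file
    out_dir=$3     # directory for the .tcst output

    for adeck in "$deck_dir"/a*.dat; do
        [ -e "$adeck" ] || continue          # no ADeck files at all
        storm=$(basename "$adeck" .dat)      # e.g. aal182012
        storm=${storm#a}                     # e.g. al182012
        bdeck="$deck_dir/b$storm.dat"
        [ -e "$bdeck" ] || continue          # skip storms with no BDeck file
        echo "tc_pairs -adeck $adeck -bdeck $bdeck -config $config -out $out_dir/tc_pairs_$storm"
    done
}
```

Review the printed commands first; piping the output of plan_tc_pairs to sh would then run them one storm at a time.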

TC-Pairs Usage

View the usage statement for TC-Pairs by simply typing the following:

tc_pairs

At a minimum, the -bdeck option and either the -adeck or -edeck option must be used to specify the track data locations. Each may be set to specific file names or to a top-level directory containing files ending with .dat to be processed. The configuration file must also be specified using the -config option. The -out option may be used to override the default output file name.

While users will typically run TC-Pairs to compare forecast and analysis tracks, it can be run to compare two sets of forecast tracks or two sets of analysis tracks.

Configure

The behavior of TC-Pairs is controlled by the contents of the configuration file passed to it on the command line. The default TC-Pairs configuration file may be found in the $MET_BASE/config/TCPairsConfig_default file. Prior to modifying the configuration file, users are advised to make a copy of the default:

cp $MET_BASE/config/TCPairsConfig_default $MET_TUTORIAL_DATA/config/TCPairsConfig

The configurable items for TC-Pairs are used to filter the desired track data, create derived tracks, and specify supplemental track information:

  • Filter track data by model, storm ID, basin, cyclone number, storm name, initialization time, valid time, and geographic area. Only those tracks meeting all of the criteria specified are retained.
  • Derive new tracks by applying interp12 logic, defining a consensus (i.e. average) track, specifying lagged forecast tracks, or enabling the derivation of CLIPER/SHIFOR tracks from BEST and operational tracks.
  • Specify the distance to land file.
  • Specify watch/warning information to be incorporated into the paired track data.

You may find a complete description of the configurable items in the MET Users Guide or in the $MET_BASE/config/README_TC file. Please take some time to review them.

For this tutorial, we'll run TC-Pairs to verify multiple initializations for Hurricane Sandy, a large storm which affected the East Coast of the United States in 2012. Operational and analysis tracks for Hurricane Sandy are included in the tutorial test data.

Open up the $MET_TUTORIAL_DATA/config/TCPairsConfig file for editing with your preferred text editor and edit it as follows:

  • Set the model entry to list the forecast tracks of interest:
    model = [ "OFCL", "GFDL", "AVNO", "HWRF", "UKM", "OCD5" ];
  • Set the storm_id entry to only retain the 18th storm of 2012 in the Atlantic basin:
    storm_id = [ "AL182012" ];

    Since we will only pass TC-Pairs data for this storm, this step is not absolutely necessary.

  • Point to the TC-Dland output for the northwest hemisphere, which includes the Atlantic basin:
    dland_file = "${MET_BASE}/tc_data/dland_nw_hem_tenth_degree.nc";

Save and close this file.
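
The storm_id value packs three pieces of information into one string: a two-letter basin, a two-digit cyclone number, and a four-digit year. As a small illustration, the sketch below splits such an ID apart. The parse_storm_id function is a hypothetical helper written for this tutorial, not part of MET.

```shell
# Split a storm ID such as AL182012 into basin, cyclone number, and year.
parse_storm_id() {
    id=$1
    basin=$(printf '%s' "$id" | cut -c1-2)      # two-letter basin, e.g. AL
    cyclone=$(printf '%s' "$id" | cut -c3-4)    # two-digit cyclone number
    year=$(printf '%s' "$id" | cut -c5-8)       # four-digit year
    echo "basin=$basin cyclone=$cyclone year=$year"
}

parse_storm_id AL182012    # prints: basin=AL cyclone=18 year=2012
```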

Run

Like the STAT-Analysis tool, TC-Stat may be run with or without a configuration file. When running multiple analysis jobs over the same subset of data, using a configuration file is most efficient. However, when running simple jobs to quickly explore your data, using the command line is more convenient. For this tutorial, we'll run command line jobs.

Filter Job

The TC-Stat filter job subsets your data and writes that subset to an output file. TC-Stat supports the following types of filtering:

  • Header string columns using -amodel, -bmodel, -storm_id, -basin, -cyclone, and -storm_name. Multiple values may be specified as a comma-separated list or using multiple switches. When multiple values are specified, the output will contain their union.
  • Timing information using -init_beg, -init_end, -init_inc, -init_exc, -init_hour, similar switches for valid times, and -lead for the lead time.
  • The -init_mask, -valid_mask, and -track_watch_warn options filter by the corresponding data columns.
  • The -column_thresh option specifies the name of the column followed by a threshold to apply (e.g. -column_thresh TK_ERR gt10).
  • The -column_str option specifies the name of the column followed by a list of one or more strings to match (e.g. -column_str LEVEL HU,TS).
  • The -init_thresh and -init_str options work the same way but are only applied to the initial forecast track point (i.e. LEAD equals 0).
  • The -water_only option excludes any points where the distance to land is <= 0.
  • The -rirw and -landfall options subset the tracks down to the RIRW or landfall points, respectively.
  • The -event_equal and -event_equal_lead options control the logic for subsetting down to a homogeneous sample.
  • The -out_init_mask and -out_valid_mask options define lat/lon polyline regions. The initial forecast track point (i.e. LEAD equals 0) must fall within the -out_init_mask while the entire track must fall within -out_valid_mask.
  • If your TC-Pairs output includes all track points, the -match_points option subsets tracks down to common times.

Next, run the following jobs:

  • Select data for the official forecast only (e.g. AMODEL equals OFCL). After you run the job, inspect the output file:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job filter -dump_row $MET_TUTORIAL_DATA/output/tc_stat/OFCL_sandy.tcst \
-amodel OFCL
  • The input TCST file includes tracks for 00, 06, 12, and 18Z initializations. Select only initialization hour 00 and inspect the output file:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job filter -dump_row $MET_TUTORIAL_DATA/output/tc_stat/INIT_sandy.tcst \
-init_hour 00
  • Select hurricane strength lines where the track error exceeds 150 nm and inspect the output file:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job filter -dump_row $MET_TUTORIAL_DATA/output/tc_stat/TKERR_sandy.tcst \
-column_str LEVEL HU \
-column_thresh TK_ERR gt150
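
To see what the last filter job is doing, the sketch below mimics its -column_str LEVEL HU and -column_thresh TK_ERR gt150 logic with awk on a tiny table. This is only an illustration of the selection rule, not how TC-Stat is implemented, and the rows are invented rather than taken from the Sandy data. The filter_hu_large_err function name is hypothetical.

```shell
# Toy stand-in for a .tcst file: a header row plus whitespace-separated rows.
# The values are invented for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
AMODEL LEAD   LEVEL TK_ERR
OFCL   000000 HU    12.5
OFCL   720000 HU    180.2
HWRF   720000 TS    210.0
HWRF   960000 HU    95.3
EOF

# Keep the header plus rows where LEVEL is HU and TK_ERR exceeds 150,
# locating both columns by header name rather than by position.
filter_hu_large_err() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; print; next }
        $col["LEVEL"] == "HU" && $col["TK_ERR"] + 0 > 150
    ' "$1"
}

filter_hu_large_err "$sample"
```

Only the 72-hour OFCL row survives: the HWRF row at the same lead time has a larger error but is not hurricane strength, and the remaining hurricane-strength rows fall below the threshold.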

Summary Job

Next, we'll run some summary jobs, applying additional filtering criteria as well. Just like the STAT-Analysis tool, TC-Stat supports the -by job command option which is a very convenient way of running the same job over multiple subsets of data:

  • Summarize all of the track (TK_ERR) and intensity (AMAX_WIND-BMAX_WIND) error values:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job summary \
-column TK_ERR -column AMAX_WIND-BMAX_WIND
  • Now use the -by option to run the same job for each unique combination of model name (AMODEL) and lead time (LEAD):
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job summary -by AMODEL,LEAD \
-column TK_ERR -column AMAX_WIND-BMAX_WIND
  • That's a lot of output, but we could filter it down using the -lead option to select particular lead times of interest:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job summary -by AMODEL,LEAD -lead 00,24,48,72 \
-column TK_ERR -column AMAX_WIND-BMAX_WIND
  • Run that same job one more time but use event equalization to compare three specific models (OFCL, OCD5, and HWRF) over a homogeneous set of cases:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job summary -by AMODEL,LEAD -lead 00,24,48,72 \
-amodel OFCL,OCD5,HWRF -event_equal TRUE \
-column TK_ERR -column AMAX_WIND-BMAX_WIND

Notice that the counts (TOTAL column) are now constant across all models for each lead time.
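
The idea behind event equalization can be sketched in a few lines of awk: keep a case only if every model being compared has it. The sketch below assumes a case is identified by its initialization time and lead time; TC-Stat's own definition of a case may involve more columns. The rows are invented for illustration and the equalize function name is hypothetical.

```shell
# Toy table with invented values. OFCL has one case (the 06Z initialization)
# that HWRF lacks.
cases=$(mktemp)
cat > "$cases" <<'EOF'
AMODEL INIT            LEAD   TK_ERR
OFCL   20121025_000000 240000 30.0
HWRF   20121025_000000 240000 42.0
OFCL   20121025_060000 240000 25.0
OFCL   20121025_000000 480000 60.0
HWRF   20121025_000000 480000 75.0
EOF

# Two passes over the same file. Pass 1 counts how many distinct models
# report each (INIT, LEAD) case. Pass 2 keeps only the cases reported by
# every model. Columns are addressed by position to keep the sketch short.
equalize() {
    awk '
        NR == FNR {
            if (FNR == 1) next                       # skip header in pass 1
            key = $2 SUBSEP $3
            if (!((key, $1) in seen)) { seen[key, $1] = 1; n[key]++ }
            if (!($1 in models)) { models[$1] = 1; nmodels++ }
            next
        }
        FNR == 1 { print; next }                     # keep header in pass 2
        n[$2 SUBSEP $3] == nmodels
    ' "$1" "$1"
}

equalize "$cases"
```

The 06Z OFCL row is dropped, leaving one case per model at each lead time, which is why the TOTAL counts line up after equalization.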

By default, TC-Stat writes its job output to the screen but it can easily be redirected to a file using the -out option.

RIRW Job

Next, we'll run a sample Rapid Intensification job:

  • Run the job for each unique model using all the default settings:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job rirw -by AMODEL
  • By default, TC-Stat dumps the contingency table counts (RIRW_CTC) and contingency table statistics (RIRW_CTS). Notice that there are differing counts in the TOTAL column. Let's rerun but turn off the RIRW_CTS output, and event equalize 3 models:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job rirw -by AMODEL -amodel OFCL,OCD5,HWRF -event_equal TRUE \
-out_line_type CTC
  • Notice that the TOTAL column remains constant, meaning that event equalization worked as expected. By default, rapid intensification is defined as an increase of 30 kts in exactly 24 hours, which is a rather rare event. Let's try changing that to a maximum increase of at least 20 kts within 24 hours:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job rirw -by AMODEL -amodel OFCL,OCD5,HWRF -event_equal TRUE \
-out_line_type CTC -rirw_exact FALSE -rirw_thresh ge20
  • When populating the contingency table, we only get a hit when the rapid intensification occurs at exactly the same time in both tracks. But how do the scores change if we only require that the events be within 12 hours of each other for a hit?
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job rirw -by AMODEL -amodel OFCL,OCD5,HWRF -event_equal TRUE \
-out_line_type CTC -rirw_exact FALSE -rirw_thresh ge20 -rirw_window 12
  • Lastly, rerun but write all possible line types (CTC, CTS, and MPR) to an output file:
     
tc_stat \
-lookin $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst \
-job rirw -by AMODEL -amodel OFCL,OCD5,HWRF -event_equal TRUE \
-out_line_type CTC,CTS,MPR -rirw_exact FALSE -rirw_thresh ge20 -rirw_window 12 \
-out $MET_TUTORIAL_DATA/output/tc_stat/RIRW_sandy.txt

Open the output $MET_TUTORIAL_DATA/output/tc_stat/RIRW_sandy.txt file and inspect the results.
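
The difference between an exact change and a maximum change over the time window is easier to see with numbers. The sketch below is one plausible reading of the two definitions, not code taken from MET: the exact change compares the current wind speed with the value exactly one window earlier, while the maximum change compares it with the lowest value seen anywhere in that window. The wind speeds are invented and the rirw_changes function name is hypothetical.

```shell
# stdin: "hour wind" pairs in time order; $1: window length in hours.
# For each time with a point exactly one window earlier, print the exact
# change and the largest increase relative to any earlier point in the window.
rirw_changes() {
    awk -v win="$1" '
        { hr[NR] = $1 + 0; w[NR] = $2 + 0 }
        END {
            for (i = 1; i <= NR; i++) {
                start = 0
                for (j = 1; j < i; j++) if (hr[j] == hr[i] - win) start = j
                if (!start) continue                 # no point one window back
                lo = w[start]
                for (j = start; j < i; j++) if (w[j] < lo) lo = w[j]
                printf "%d exact=%d max=%d\n", hr[i], w[i] - w[start], w[i] - lo
            }
        }
    '
}

# Invented 6-hourly wind speeds in knots.
printf '%s\n' "0 60" "6 55" "12 50" "18 65" "24 75" "30 80" | rirw_changes 24
# prints:
# 24 exact=15 max=25
# 30 exact=25 max=30
```

With a threshold of ge20, hour 24 counts as an event only under the maximum-change definition, which is the kind of case that relaxing -rirw_exact picks up.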

Output

The output of the TC-Pairs tool is simply an ASCII file ending with the .tcst suffix. Open up the $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst output file using the text editor of your choice and note the following:

  • The file contains a table which summarizes the ADeck and BDeck pairs.
  • Browse the header row and note the data types. For a complete description of the contents, see section 19.2.3 of the MET Users Guide.
  • The header columns contain the model name, storm identification, and timing information.
  • There is one output line for each track point but the points are grouped together sequentially into tracks.
  • Scroll over to the LINE_TYPE column which is set to TCMPR for Tropical-Cyclone Matched Pairs. Additional line types will be added in future releases when support for new data types is added.
  • Scroll over to the TOTAL column which lists the total number of track points and the INDEX column which counts from 1 up to TOTAL.
  • Scroll over to the LEVEL and WATCH_WARN columns which indicate the storm intensity and hurricane watches or warnings in effect at that time.
  • Scroll over to the ALAT, ALON, BLAT, and BLON columns which indicate the ADeck and BDeck storm locations. Notice that some values contain NA. By default, TC-Pairs writes all ADeck and BDeck track points, regardless of whether the same time exists in the other track. This behavior is configurable, as we'll see below.
  • Scroll over to the TK_ERR column. This is the distance in nautical miles between the ADeck and BDeck locations. The X_ERR and Y_ERR columns decompose the difference into East-West and North-South components. The ALTK_ERR and CRTK_ERR columns decompose the difference into track-relative components.
  • The AMAX_WIND and BMAX_WIND columns indicate the ADeck and BDeck maximum wind speeds in knots and are used to determine storm intensity.
  • There are many additional columns for the 34, 50, and 64 knot radius winds broken down by quadrant and several other summary columns. These columns are only populated when the input ATCF files contain that data.
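
To get a feel for the TK_ERR values, the sketch below computes a great-circle distance in nautical miles between two latitude/longitude points. It uses the haversine formula with a mean Earth radius of 3440.065 nm, which is an assumption: the exact formula and Earth radius used by TC-Pairs may differ slightly. The track_error_nm function name is hypothetical.

```shell
# Great-circle distance in nautical miles between two points given in
# degrees, using the haversine formula (assumed, not taken from MET).
track_error_nm() {
    awk -v alat="$1" -v alon="$2" -v blat="$3" -v blon="$4" 'BEGIN {
        rad  = atan2(0, -1) / 180                    # degrees to radians
        dlat = (blat - alat) * rad
        dlon = (blon - alon) * rad
        h = sin(dlat / 2) ^ 2 + cos(alat * rad) * cos(blat * rad) * sin(dlon / 2) ^ 2
        printf "%.1f\n", 2 * 3440.065 * atan2(sqrt(h), sqrt(1 - h))
    }'
}

track_error_nm 0 0 0 1    # one degree of longitude on the equator: 60.0
```

Pass the ALAT, ALON, BLAT, and BLON values from any output line to compare the result against that line's TK_ERR.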

Since the lines of data in these ASCII files are so long, we strongly recommend configuring your text editor to NOT use dynamic word wrapping. The files will be much easier to read that way.
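
If you would rather stay on the command line, a small awk helper can pull out just the columns of interest by header name. This is a minimal sketch; tcst_cols is a hypothetical name, and it assumes the file is whitespace-separated with a single header row.

```shell
# Print selected columns of a .tcst file, chosen by header name.
# Usage: tcst_cols FILE COLUMN [COLUMN ...]
tcst_cols() {
    file=$1; shift
    awk -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i    # map header name to position
            for (k = 1; k <= n; k++)
                if (!(names[k] in col)) {
                    print "no such column: " names[k] > "/dev/stderr"
                    exit 1
                }
        }
        {
            line = ""
            for (k = 1; k <= n; k++) line = line (k > 1 ? " " : "") $(col[names[k]])
            print line
        }
    ' "$file"
}
```

For example, tcst_cols $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst AMODEL LEAD ALAT ALON BLAT BLON TK_ERR | less -S shows the location and track error columns, with less -S chopping long lines instead of wrapping them.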

Match Points

As mentioned above, TC-Pairs by default writes every track point regardless of whether that time is present in the other track. This is the behavior when the match_points configuration entry is set to FALSE. Let's try switching that option and rerunning TC-Pairs:

  • Open up the $MET_TUTORIAL_DATA/config/TCPairsConfig configuration file for editing and set:

match_points = TRUE;
  • Save and close that file and rerun TC-Pairs:

tc_pairs \
-adeck input/tc_data/aal182012.dat \
-bdeck input/tc_data/bal182012.dat \
-config $MET_TUTORIAL_DATA/config/TCPairsConfig \
-out $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy
  • Open up the output $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst file and inspect the ALAT, ALON, BLAT, and BLON columns. The NA values should be gone. Also note that the TOTAL number of pairs for each track has decreased.
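
A quick way to confirm the change without scrolling through the file is to count the lines where either track location is missing. The sketch below assumes missing values appear as the literal string NA in the ALAT or BLAT column; count_unmatched is a hypothetical helper name.

```shell
# Count data lines whose ADeck or BDeck latitude is NA.
count_unmatched() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }   # read header
        $col["ALAT"] == "NA" || $col["BLAT"] == "NA" { n++ }
        END { print n + 0 }
    ' "$1"
}
```

Running count_unmatched $MET_TUTORIAL_DATA/output/tc_pairs/tc_pairs_sandy.tcst before and after setting match_points = TRUE should show the count dropping to 0.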