METplus Practical Session Guide (Version 5.0) | Session 8: METplus Analysis Tools > METdataio

METreformat Overview:

The METreformat Python package was developed to assist in the generation of METplotpy line plots using MET Point-Stat .stat files. METreformat is located in the METdataio repository and utilizes the METdbLoad package for reading Point-Stat files.

The initial versions of the METviewer tool read MET verification statistics output from a database and created plots using R scripts for statistics and plotting. The METplotpy and METcalcpy Python packages were developed to remove the dependency on R for plotting and statistics. METviewer now utilizes METplotpy Python scripts. The METplotpy plots that are available in METviewer are located in the /metplotpy/plots directory of the METplotpy repository. The METplotpy repository also hosts contributed plots that are not available through the METviewer tool. These plots are located in a different directory in the METplotpy repository.

METviewer generates plots based on the database "view" of the input data. Reformatting the MET Point-Stat .stat files into the same format used by METviewer bypasses the need for METviewer and its database. For additional information about METviewer, please refer to the METviewer User's Guide.

The MET Point-Stat data consists of numerous columns with each statistic represented by a column. The various normal and bootstrap confidence limits are each represented by corresponding columns. Upon inspection of a Point-Stat .stat file, there are numerous columns, many of which may be lacking a column label. For additional information about the line types in MET Point-Stat, please refer to all of the tables in the MET User's Guide for Point-Stat.

The METviewer database "view" consists of individual statistics that are collected into the stat_name and the stat_value columns. The normal and bootstrap confidence limits are each collected into one of four columns:

  • stat_ncl
  • stat_ncu
  • stat_bcl
  • stat_bcu

The METreformat package replicates this format.
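
The wide-to-long reshaping described above can be sketched in plain Python. This is only an illustration of the layout, not the METreformat implementation: the helper to_long, the sample values, and the use of None for a missing confidence limit are all assumptions made for this example. The statistic names (ME, ME_NCL, ME_NCU, ME_BCL, ME_BCU, MAE, MAE_BCL, MAE_BCU) follow the CNT line type.

```python
# Hypothetical "wide" record: one column per statistic and per confidence limit.
wide_row = {
    "model": "FV3_GFS_v15p2_CONUS_25km",
    "fcst_var": "WIND",
    "ME": 0.42, "ME_NCL": 0.30, "ME_NCU": 0.54, "ME_BCL": 0.31, "ME_BCU": 0.55,
    "MAE": 1.10, "MAE_BCL": 1.00, "MAE_BCU": 1.20,  # MAE has bootstrap limits only
}

HEADER = ("model", "fcst_var")
# Map each confidence-limit suffix to the column used in the "long" layout.
SUFFIXES = {"_NCL": "stat_ncl", "_NCU": "stat_ncu",
            "_BCL": "stat_bcl", "_BCU": "stat_bcu"}


def to_long(row, stats):
    """Return one record per statistic, with its limits in four fixed columns."""
    records = []
    for stat in stats:
        record = {key: row[key] for key in HEADER}
        record["stat_name"] = stat
        record["stat_value"] = row[stat]
        for suffix, column in SUFFIXES.items():
            # Assumption: a limit that is absent in the input is stored as None.
            record[column] = row.get(stat + suffix)
        records.append(record)
    return records


long_rows = to_long(wide_row, ["ME", "MAE"])
for record in long_rows:
    print(record)
```

Each statistic becomes its own row, so a plotting script can select data by filtering on stat_name instead of knowing the column position of every statistic.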

Generic instructions for reformatting a sample MET Point-Stat .stat file are available on Read the Docs: METdataio User's Guide.

The Point-Stat line types that are currently supported in METreformat are as follows:

  • FHO
  • CNT
  • CTS
  • CTC
  • SL1L2

Support for additional line types will be added in the future.

METreformat Components:

The METreformat package utilizes the METdataio METdbLoad package to read .stat files, label columns, and create an intermediate data structure (pandas dataframe). The METreformat package uses this data structure to reformat the data into a single file.

The reformatting requires a yaml configuration file and an xml specification file (from METdbLoad). The xml specification file is used to define the paths to the input data and the type of MET output (i.e. Point-Stat, Grid-Stat, MODE, Stat-Analysis, and Wavelet-Stat). The yaml configuration file indicates the location of the xml specification file and the name and location of the output file.

Example of running METreformat

Set up the prerequisites:

Set up the environment by following the instructions for METplus initial set up. Make sure to follow the relevant instructions for either the Pre-configured Environments or User Configured Environments, based on your host computer. Finish by following the Verify Environment Is Set Correctly instructions.

The Python requirements for METdataio are:

  • Python 3.8.6 or above
  • pymysql (not needed for this tutorial)
  • pandas (1.5 or above)
  • numpy (1.22 or above)
  • pyyaml
  • lxml

 

Make sure you have activated the conda environment for METdataio if you are working on your own host or on host 'seneca':

https://dtcenter.org/metplus-practical-session-guide-version-5-0/session-1-metplus-setupgrid-grid/metplus-setup/metplus-initial-setup/setting-tutorial-environment-seneca

 

Please refer to the METdataio User's Guide for more information on METdataio.

Clone the METdataio repository to a local directory:
mkdir -p ${METPLUS_TUTORIAL_DIR}/metdataio
cd ${METPLUS_TUTORIAL_DIR}/metdataio
git clone https://github.com/dtcenter/METdataio

Reformat MET Point-Stat .stat files

An overview of the steps for running this example is:

  • Set the PYTHONPATH and METDATAIO_BASE environment variables
  • Copy the yaml configuration and xml specification files to the user_config directory
  • Modify the yaml configuration and xml specification files in the user_config directory
  • Run the Python script to reformat the sample data

Set the METDATAIO_BASE and PYTHONPATH environment variables, using only the commands appropriate to your environment.

Work in the bash shell. At the command line, enter the following:
bash

Then set the environment variables:
export METDATAIO_BASE=${METPLUS_TUTORIAL_DIR}/metdataio/METdataio
export PYTHONPATH=$METDATAIO_BASE:$METDATAIO_BASE/METdbLoad:$METDATAIO_BASE/METdbLoad/ush
Copy the yaml configuration and xml specification files to the new directory:
mkdir -p ${METPLUS_TUTORIAL_DIR}/metdataio/user_config
cp $METDATAIO_BASE/METreformat/point_stat.yaml ${METPLUS_TUTORIAL_DIR}/metdataio/user_config
cp $METDATAIO_BASE/METreformat/point_stat.xml ${METPLUS_TUTORIAL_DIR}/metdataio/user_config

Next, we need to modify the point_stat.xml file to include the correct directory path to the input data. In order to do this, we need to obtain a full path to where the input data is, copy it, and paste that path inside the .xml file.

NOTE: You cannot use environment variables to pass in the input data path; you must use the full path.
To get the full path to the METPLUS_DATA environment variable, enter the following on the command line:
echo $METPLUS_DATA

That command should return a path similar to the following example:

/d1/projects/METplus/METplus_Data.v5.0
Using the appropriate keyboard shortcuts and/or mouse click options for your operating system, copy this path to the clipboard (for example, Ctrl+C on Windows or Cmd+C on a Mac).

Alternatively, you can open a text file and copy and paste the directory path into the text file for easy use.
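
As another option, the full path can be assembled programmatically and then pasted into the file. The sketch below is a minimal illustration; the function name point_stat_dir is hypothetical, and the example dictionary stands in for the shell environment (pass os.environ instead to use the value set in your own shell).

```python
import posixpath


def point_stat_dir(env):
    """Return the full path to the sample Point-Stat data under METPLUS_DATA."""
    return posixpath.join(env["METPLUS_DATA"], "met_reformat", "point_stat")


# Example value taken from this tutorial; your own path will differ.
example_env = {"METPLUS_DATA": "/d1/projects/METplus/METplus_Data.v5.0"}
print(point_stat_dir(example_env))
```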

Now, modify point_stat.xml, using the output from the echo command to fill the path to the Point-Stat data. Be sure to add met_reformat/point_stat path at the end:
vi ${METPLUS_TUTORIAL_DIR}/metdataio/user_config/point_stat.xml
<load_spec>
        <folder_tmpl>/d1/projects/METplus/METplus_Data.v5.0/met_reformat/point_stat</folder_tmpl>
        <verbose>true</verbose>
        <load_val>
                <field name="met_tool">
                        <val>point_stat</val>
                </field>
        </load_val>
        <description>MET output </description>
</load_spec>

All we have done in the file is replace the content between the <folder_tmpl> </folder_tmpl> with the full path to the data, complete with the met_reformat/point_stat subdirectories.

Save your changes and close the file.
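
Because the folder_tmpl entry must hold a full path, a quick check of the edited file can catch a leftover environment variable before METreformat is run. The following is a minimal sketch using only the Python standard library; check_load_spec is a hypothetical helper, not part of METdataio, and the embedded text mirrors the example above (read your own point_stat.xml to check a real file).

```python
import xml.etree.ElementTree as ET

# Copy of the example specification shown above.
SAMPLE_SPEC = """<load_spec>
    <folder_tmpl>/d1/projects/METplus/METplus_Data.v5.0/met_reformat/point_stat</folder_tmpl>
    <verbose>true</verbose>
    <load_val>
        <field name="met_tool">
            <val>point_stat</val>
        </field>
    </load_val>
    <description>MET output </description>
</load_spec>"""


def check_load_spec(xml_text):
    """Return the folder_tmpl value and a list of problems found with it."""
    root = ET.fromstring(xml_text)
    folder = (root.findtext("folder_tmpl") or "").strip()
    problems = []
    if "$" in folder:
        problems.append("folder_tmpl contains an environment variable")
    if not folder.startswith("/"):
        problems.append("folder_tmpl is not a full path")
    return folder, problems


folder, problems = check_load_spec(SAMPLE_SPEC)
print(folder)
print(problems or "no problems found")
```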

Now we need to modify the point_stat.yaml file in a similar way to the point_stat.xml file. In order to do this, we need to obtain a full path to the $METPLUS_TUTORIAL_DIR, copy it, and paste that path inside the .yaml file in multiple areas.

NOTE: You cannot use environment variables to pass directory paths; you must use the full path.
To get the full path to the METPLUS_TUTORIAL_DIR environment variable, enter the following on the command line:
echo $METPLUS_TUTORIAL_DIR

That command should return a path similar to the following example:

/d1/personal/user/METplus-5.0.0_Tutorial
Using the appropriate keyboard shortcuts and/or mouse click options for your operating system, copy this path to the clipboard (for example, Ctrl+C on Windows or Cmd+C on a Mac).
Modify the yaml configuration file in the $METPLUS_TUTORIAL_DIR/metdataio/user_config directory with the following updates:
vi ${METPLUS_TUTORIAL_DIR}/metdataio/user_config/point_stat.yaml
    output_dir: /d1/personal/user/METplus-5.0.0_Tutorial/metdataio
    output_filename: point_stat_reformatted.txt
    xml_spec_file: /d1/personal/user/METplus-5.0.0_Tutorial/metdataio/user_config/point_stat.xml

In the above example, the output_dir entry of /path/to/output_dir was replaced with the full path of the METPLUS_TUTORIAL_DIR/metdataio. The xml_spec_file entry replaced /path/to/xml_file with the full path of the METPLUS_TUTORIAL_DIR/metdataio/user_config along with the point_stat file name. NOTE: make sure you have one space between the colon (:) and your settings.

Save your changes and close the file.
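
The three settings edited above all derive from the tutorial directory, which the short sketch below makes explicit. The function reformat_settings is a hypothetical helper written for this illustration; it only prints the lines to place in point_stat.yaml and does not modify any file.

```python
def reformat_settings(tutorial_dir):
    """Build the three point_stat.yaml entries from the full tutorial path."""
    base = tutorial_dir.rstrip("/") + "/metdataio"
    return {
        "output_dir": base,
        "output_filename": "point_stat_reformatted.txt",
        "xml_spec_file": base + "/user_config/point_stat.xml",
    }


# Example path from this tutorial; substitute the output of
# `echo $METPLUS_TUTORIAL_DIR` on your own host.
settings = reformat_settings("/d1/personal/user/METplus-5.0.0_Tutorial")
# One space follows each colon, as the NOTE above requires.
lines = [f"{key}: {value}" for key, value in settings.items()]
print("\n".join(lines))
```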

We are now ready to generate the reformatted file from the sample Point-Stat files in the METPLUS_DATA/met_reformat/point_stat directory. To understand how the file changes, we will look at a .stat file that has not been reformatted.

Open one of the .stat files
vi $METPLUS_DATA/met_reformat/point_stat/point_stat_FV3_GFS_v15p2_CONUS_25km_NDAS_ADPSFC_010000L_20190615_010000V.stat

Observe that there are numerous columns of data, and that only the leading header columns carry descriptive names (e.g. 'MODEL', 'FCST_LEV'); the remaining columns are unlabeled. This can be a confusing file to read for users who are unfamiliar with .stat file layouts.

Close the file by entering the following:
:q!
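
The gap between labeled and unlabeled columns can also be counted rather than judged by eye. The sketch below uses a made-up, heavily abridged two-line sample (a real .stat file has many more header columns); both the sample text and the helper unlabeled_count are assumptions for illustration only.

```python
# Hypothetical, abridged .stat content: four header labels, nine data values.
SAMPLE = """\
VERSION MODEL FCST_LEV LINE_TYPE
V11.0.0 FV3_GFS_v15p2_CONUS_25km Z10 CTC 120 40 10 15 55
"""


def unlabeled_count(header_line, data_line):
    """Return how many whitespace-separated values have no header label."""
    labels = header_line.split()
    values = data_line.split()
    return max(len(values) - len(labels), 0)


header, first_record = SAMPLE.splitlines()[:2]
print(unlabeled_count(header, first_record))
```

To try this on a real file, read its first two lines in place of SAMPLE.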
From any directory, run the following from the command line:
python $METDATAIO_BASE/METreformat/write_stat_ascii.py ${METPLUS_TUTORIAL_DIR}/metdataio/user_config/point_stat.yaml

The following output, generated by METdbLoad, will be sent to the terminal (the warning message can be safely ignored):

WARNING:root:!!! ALPHA line_type has ALPHA value of NA:
 51      VCNT
150     VCNT
244     VCNT
518     VCNT
525     VCNT
532     VCNT
539     VCNT
546     VCNT
553     VCNT
560     VCNT
567     VCNT
574     VCNT
581     VCNT
588     VCNT
595     VCNT
602     VCNT
609     VCNT
1238    VCNT
1245    VCNT
1252    VCNT
1259    VCNT
1266    VCNT
1273    VCNT
1280    VCNT
1287    VCNT
1294    VCNT
1301    VCNT
1308    VCNT
1723    VCNT
1822    VCNT
1921    VCNT
2015    VCNT
Name: line_type, dtype: object

A text file will be created in the $METPLUS_TUTORIAL_DIR/metdataio directory named point_stat_reformatted.txt (as specified in the point_stat.yaml config file).

Open the point_stat_reformatted.txt file.
vi ${METPLUS_TUTORIAL_DIR}/metdataio/point_stat_reformatted.txt

You will notice that the data has been reformatted: the statistics are now listed under the stat_name and stat_value columns, and the stat_bcl, stat_bcu, stat_ncl, and stat_ncu columns are present. All columns have names.

Close the file.
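
Because every column in the reformatted file is named, rows for a single statistic can be selected by name. The sketch below assumes the file is tab-delimited with a header row (confirm this against your own output); the sample rows and their values are invented, and select_stat is a hypothetical helper.

```python
import csv
import io

# Invented sample in the reformatted layout. Assumption: tab-delimited.
SAMPLE = (
    "model\tfcst_lead\tfcst_var\tfcst_lev\tstat_name\tstat_value\t"
    "stat_ncl\tstat_ncu\tstat_bcl\tstat_bcu\n"
    "FV3_GFS_v15p2_CONUS_25km\t0\tWIND\tZ10\tF_RATE\t0.21\tNA\tNA\tNA\tNA\n"
    "FV3_GFS_v15p2_CONUS_25km\t10000\tWIND\tZ10\tF_RATE\t0.25\tNA\tNA\tNA\tNA\n"
    "FV3_GFS_v15p2_CONUS_25km\t10000\tWIND\tZ10\tH_RATE\t0.18\tNA\tNA\tNA\tNA\n"
)


def select_stat(handle, stat_name, delimiter="\t"):
    """Return the rows whose stat_name column matches the requested statistic."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [row for row in reader if row["stat_name"] == stat_name]


rows = select_stat(io.StringIO(SAMPLE), "F_RATE")
for row in rows:
    print(row["fcst_lead"], row["stat_value"])
```

For the real file, pass an open file handle (for example, open(path, newline="")) instead of the io.StringIO object.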

Generate a METplotpy line plot

METplotpy Set Up Prerequisites:

The Python requirements for METplotpy are found in the User's Guide, under the Installation section.

If you are working on a host that does not have the necessary Python packages installed, create your own conda environment using the instructions provided by the Seneca host instructions as guidance.

Continue working in the bash shell.

METcalcpy is a requirement for METplotpy. There are numerous ways to set up the working environment so that METplotpy can use METcalcpy; select one of the following two methods.

Method 1: Clone the METcalcpy repository to a local directory. This method is suitable for users who are not working within a conda environment:
mkdir -p ${METPLUS_TUTORIAL_DIR}/metcalcpy
cd ${METPLUS_TUTORIAL_DIR}/metcalcpy
git clone https://github.com/dtcenter/METcalcpy
Method 2: If you have access to a conda environment, install METcalcpy from PyPI using pip:
pip install metcalcpy==2.0.1
Clone the METplotpy repository to a local directory:
mkdir -p ${METPLUS_TUTORIAL_DIR}/metplotpy
cd ${METPLUS_TUTORIAL_DIR}/metplotpy
git clone https://github.com/dtcenter/METplotpy

An overview of the steps for creating this plot is:

  • Set the PYTHONPATH and METPLOTPY_BASE environment variables
  • Copy the yaml configuration file to the user_config directory
  • Modify the yaml configuration file in the user_config directory
  • Run the Python script to create a line plot

Set the PYTHONPATH and METPLOTPY_BASE environment variables

For this step, you'll need to choose the appropriate commands depending on how you installed METcalcpy. Read both sets of instructions, and proceed with the one relevant to you.

If you used Method 1 above:
export METPLOTPY_BASE=${METPLUS_TUTORIAL_DIR}/metplotpy/METplotpy
export METCALCPY_BASE=${METPLUS_TUTORIAL_DIR}/metcalcpy/METcalcpy
export PYTHONPATH=$METCALCPY_BASE:$METCALCPY_BASE/metcalcpy:$METPLOTPY_BASE:$METPLOTPY_BASE/metplotpy:$METPLOTPY_BASE/metplotpy/plots
If you installed METcalcpy using Method 2 above (i.e. PyPI and pip), then set the PYTHONPATH without the METcalcpy paths:
export METPLOTPY_BASE=${METPLUS_TUTORIAL_DIR}/metplotpy/METplotpy
export PYTHONPATH=$METPLOTPY_BASE:$METPLOTPY_BASE/metplotpy:$METPLOTPY_BASE/metplotpy/plots

Copy custom config file to user_config

To get this started, we need to create a user_config directory.

Create a new user_config directory by entering the following command:
mkdir -p ${METPLUS_TUTORIAL_DIR}/metplotpy/user_config
Now copy the .yaml file to the user_config directory.
cp $METPLOTPY_BASE/test/line/custom_line.yaml ${METPLUS_TUTORIAL_DIR}/metplotpy/user_config

Modify the custom configuration file

Change directory to the user_config directory.
cd ${METPLUS_TUTORIAL_DIR}/metplotpy/user_config

We will need to modify the output directory in the file to use the full path of the current directory, ${METPLUS_TUTORIAL_DIR}/metplotpy/user_config. In order to do this, we need to copy the full path and paste it inside the .yaml file.

NOTE: You cannot use environment variables to pass directory paths; you must use the full path.
To get the full path to the current location, enter the following on the command line:
pwd

That command should return a path similar to the following example:

/d1/personal/user/METplus-5.0.0_Tutorial/metplotpy/user_config
Using the appropriate keyboard shortcuts and/or mouse click options for your operating system, copy this path to the clipboard (for example, Ctrl+C on Windows or Cmd+C on a Mac).
The sample data only contains one model, therefore only one line can be created on the line plot. Many of the following file edits will reflect this.
Open the custom_line.yaml file.
vi ${METPLUS_TUTORIAL_DIR}/metplotpy/user_config/custom_line.yaml
Use the first color in the list and delete the rest.
colors:
- '#ff0000'

con_series:

The above change will produce a red line on the final graph.

Set up the series ordering (con_series) by removing four settings, leaving only one setting:
con_series:
- 1

create_html: 'False'

Comment out the derived_series_1 and derived_series_2 settings by placing a # at the beginning of each line.
#derived_series_1:
#- - CONTROL RH MAE
# - GTS RH MAE
# - DIFF
#derived_series_2: []
Set event_equal to 'False'.
event_equal: 'False'

There is only one model in the sample data and event equalization is not needed.

Edit the fcst variable fcst_var_val_1 from RH to WIND, and statistic MAE to F_RATE.
fcst_var_val_1:
  WIND:
  - F_RATE
Comment out the fcst_var_val_2 and its settings by placing a # at the beginning of the line.
#fcst_var_val_2:
# TMP:
# - ME
Edit the forecast level (fcst_lev_0) under the fixed_vars_vals_input setting from Z02 to Z10.
fixed_vars_vals_input:
  fcst_lev:
    fcst_lev_0:
    - Z10

Change the indy_label values to '0', '1', and '5':
indy_label:
- '0'
- '1'
- '5'
Change the indy_vals values to '0', '10000', and '50000' to plot the 0, 10000, and 50000 fcst lead points.
indy_vals:
- '0'
- '10000'
- '50000'
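
The indy_label entries pair up with the indy_vals entries: assuming the forecast lead values are in HHMMSS form (so that 10000 means 01:00:00), each label is simply the lead expressed in hours. The hypothetical helper below sketches that conversion; it is not part of METplotpy.

```python
def lead_to_label(lead):
    """Convert an HHMMSS forecast lead such as '10000' into an hour label."""
    padded = str(lead).zfill(6)  # '10000' -> '010000'
    hours = int(padded[:-4])
    minutes = int(padded[-4:-2])
    seconds = int(padded[-2:])
    if minutes or seconds:
        raise ValueError(f"lead {lead!r} is not a whole number of hours")
    return str(hours)


indy_vals = ["0", "10000", "50000"]
indy_label = [lead_to_label(value) for value in indy_vals]
print(indy_label)
```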

 

Change the list_stat_1 statistics value from MAE to F_RATE and remove the - ME entry from the list_stat_2 setting.
list_stat_1:
- F_RATE

list_stat_2:

list_static_val:

 

Under the plot_ci setting, keep only one of the - std values and delete the remaining four.
plot_ci:
- std

plot_disp:

 

Remove four of the - 'True' entries under the plot_disp setting so that only one - 'True' value remains.
plot_disp:
- 'True'

plot_filename: ./line.png

 

Set the path to the output plot file, replacing the ./ with the full path that was copied prior to editing this file:
plot_filename: /d1/personal/user/METplus-5.0.0_Tutorial/metplotpy/user_config/line.png

 

Remove four of the values under each of the following entries, keeping only one:
series_line_style:
- '-'

series_line_width:
- 1

series_order:
- 1

series_symbols:
- .

series_type:
- b

series_val_1:

Change the model setting under series_val_1 to FV3_GFS_v15p2_CONUS_25km and comment out the settings for series_val_2 (or remove them).
series_val_1:
  model:
  - FV3_GFS_v15p2_CONUS_25km
#series_val_2:
#  model:
#  - CONTROL
#  - GTS
Remove four of the settings under show_signif, keeping one setting:
show_signif:
- 'False'

stat_input: ./line.data

Modify the stat_input setting. This is the full path to the reformatted file. Replace path-to with the full path leading to the METplus-5.0.0_Tutorial (working) directory. Do NOT use environment variables; use actual paths.
stat_input: path-to/METplus-5.0.0_Tutorial/metdataio/point_stat_reformatted.txt

 

Comment out or remove the lines settings at the end of the config file.
#lines:
#- color: '#8000ff'
#  line_width: '2'
#  position: '11'
#  type: horiz_line
#  line_style: '--'
#- line_style: '-'
#  color: '#000000'
#  line_width: 1
#  position: "18"
#  type: vert_line

 

Save and close the custom_line.yaml file. 
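
Most of the edits above trim a per-series list down to a single entry, because only one line is being plotted. A quick consistency check along the following lines can confirm that no list was missed. This is a sketch under the assumption that each listed setting needs exactly one entry per plotted series; the dictionary simply restates the edited values, and mismatched is a hypothetical helper.

```python
# Settings that were reduced to one entry each in the steps above.
PER_SERIES_KEYS = (
    "colors", "con_series", "plot_ci", "plot_disp", "series_line_style",
    "series_line_width", "series_order", "series_symbols", "series_type",
    "show_signif",
)

edited = {
    "colors": ["#ff0000"],
    "con_series": [1],
    "plot_ci": ["std"],
    "plot_disp": ["True"],
    "series_line_style": ["-"],
    "series_line_width": [1],
    "series_order": [1],
    "series_symbols": ["."],
    "series_type": ["b"],
    "show_signif": ["False"],
}


def mismatched(settings, n_series):
    """Return the per-series settings whose list length differs from n_series."""
    return sorted(
        key for key in PER_SERIES_KEYS
        if len(settings.get(key, [])) != n_series
    )


print(mismatched(edited, 1) or "all per-series settings have one entry")
```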

Generate the line plot:

From any directory, enter the following from the command line:
python $METPLOTPY_BASE/metplotpy/plots/line/line.py $METPLUS_TUTORIAL_DIR/metplotpy/user_config/custom_line.yaml

 

In your $METPLUS_TUTORIAL_DIR/metplotpy/user_config directory, you will now see a line.png file. You can view the png file using an appropriate viewer, such as 'display' if on a Linux/Unix host:
cd $METPLUS_TUTORIAL_DIR/metplotpy/user_config
display line.png

 

Your line plot will look like the following: