Python Embedding Overview

General MET Python Embedding Elements

The MET tools support Python Embedding for both 2D planes of gridded data and point data. Both gridded data and point data have some specific requirements which are covered below. More generally, Python Embedding can be broken down into three key elements that are required by MET tools.

The first is a Python installation that is used by the MET tools when Python Embedding is requested. When MET is installed, the --enable-python compile flag must be used, which requires a local Python installation. After the MET tools are installed, the version of Python provided with the --enable-python compile flag is the version of Python that will be used by default when a user requests Python Embedding. Any Python packages installed in the version of Python that MET was compiled against are available to a user when using Python Embedding.

The second element of Python Embedding is a Python Embedding keyword. These keywords tell the MET tools that Python Embedding is being requested:

PYTHON_NUMPY is used when passing 2D planes of data in a NumPy N-dimensional array object, or when using point data
PYTHON_XARRAY is used when passing 2D planes of data in an Xarray DataArray object only

The third element of Python Embedding is the absolute path to your Python script, followed by any command line arguments that the script requires. Together, these three elements enable the user to invoke Python Embedding within the MET tools.
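To illustrate how the keyword and the script path fit together, here is a minimal sketch that assembles a command line in Python. The helper build_embedding_args and all paths are hypothetical. The argument layout (keyword in place of the input file, then the output file, then a field string whose name entry holds the script command) is an assumption modeled on plot_data_plane; the practice sections give the exact commands for each tool.

```python
import os
import shlex

ALLOWED_KEYWORDS = {"PYTHON_NUMPY", "PYTHON_XARRAY"}


def build_embedding_args(tool, keyword, script_path, script_args, output_file):
    """Assemble an argument list for a MET tool using Python Embedding.

    Hypothetical helper: the argument order is an assumption, not taken
    from the MET documentation quoted in this section.
    """
    if keyword not in ALLOWED_KEYWORDS:
        raise ValueError(f"unknown Python Embedding keyword: {keyword}")
    if not os.path.isabs(script_path):
        raise ValueError("the script path must be absolute")
    # The script and its own arguments travel together as one string.
    script_command = " ".join([script_path, *script_args])
    field_string = f'name="{script_command}";'
    return [tool, keyword, output_file, field_string]


if __name__ == "__main__":
    args = build_embedding_args(
        "plot_data_plane",
        "PYTHON_NUMPY",
        "/path/to/my_gridded_pyembed.py",  # hypothetical script location
        ["/path/to/fcst.txt", "FCST"],     # hypothetical script arguments
        "fcst.ps",
    )
    # shlex.join quotes the field string so it is safe to paste into a shell.
    print(shlex.join(args))
```

Building the list first and quoting it with shlex.join avoids hand-escaping the nested double quotes inside the field string.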

Details for Python Embedding Scripts with 2D Gridded Data

In your Python Embedding script, be sure to adhere to the following requirements:

  1. Your variable containing the 2D dataplane of gridded data must be named met_data. This applies to both NumPy N-dimensional array objects (for PYTHON_NUMPY), and Xarray DataArray objects (for PYTHON_XARRAY).
  2. For PYTHON_NUMPY, your script must define a Python dictionary named attrs that contains the following keys and their respective values:
    1. valid
    2. init
    3. lead
    4. accum
    5. name
    6. long_name
    7. level
    8. units
    9. grid
  3. For PYTHON_XARRAY, your Xarray DataArray must have a dictionary of attributes attached to it (accessible via the .attrs property of the DataArray object), and the keys must match the keys listed above for PYTHON_NUMPY.
In a later section, the demonstration of writing your own Python Embedding script will go into more detail on constructing the required attributes.
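The requirements above can be sketched for the PYTHON_NUMPY case as follows. Only the variable names met_data and attrs and the nine keys come from the list above; every value is an illustrative placeholder, and the value formats (timestamp strings, a named-grid string for grid, and an array shape matching that grid) are assumptions rather than a definitive specification.

```python
import numpy as np

# The 2D plane of gridded data must be named met_data.
# Shape (129, 185) is an assumption chosen to match the placeholder grid below.
met_data = np.zeros((129, 185), dtype=float)

# For PYTHON_NUMPY, the attributes live in a dictionary named attrs.
# All values are placeholders; their formats are assumptions.
attrs = {
    "valid": "20050807_120000",      # valid time
    "init": "20050807_000000",       # initialization time
    "lead": "120000",                # forecast lead
    "accum": "120000",               # accumulation interval
    "name": "FCST",                  # short variable name
    "long_name": "FCST_placeholder",
    "level": "Surface",
    "units": "None",
    "grid": "G212",                  # assumed named-grid string
}

# Quick self-check that no required key was forgotten.
REQUIRED_KEYS = {
    "valid", "init", "lead", "accum", "name",
    "long_name", "level", "units", "grid",
}
missing = REQUIRED_KEYS - attrs.keys()
if missing:
    raise KeyError(f"attrs is missing required keys: {sorted(missing)}")
```

For PYTHON_XARRAY, the same keys would instead be attached to the DataArray's attributes rather than stored in a separate dictionary.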

Details for Python Embedding Scripts with Point Data

In your Python Embedding script, be sure to adhere to the following requirements:

  1. If you are using Python Embedding with ascii2nc, your data must be in a nested list (i.e. "list of lists") representation of the MET 11-column point data format, where each inner list is one observation containing the 11 column values. A convenient way to achieve this is to read the data into a Pandas DataFrame and convert it with .values.tolist() (a DataFrame itself has no to_list() method). Additionally, the nested list variable must be named point_data.
  2. If you are using Python Embedding with other MET tools for point data, such as plot_point_obs, point_stat, ensemble_stat, or point2grid, you must provide the point data in a special format that can be created from the MET 11-column format. To do this, create a nested list (i.e. "list of lists") in the MET 11-column point data format and pass it to the helper Python function convert_point_data(), which can be found in the met_point_obs class in ${MET_BUILD_BASE}/scripts/python/met_point_obs.py. Additionally, the variable returned from convert_point_data() must be named met_point_data, which differs from the point_data name used with ascii2nc.
In the example for Python Embedding with point data later in this session, plot_point_obs is used. By inspecting the Python Embedding script used in that example, the details described above may become clearer.
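As a rough illustration of the nested list described above, the sketch below parses two made-up observations into a list of lists named point_data. It uses plain Python instead of Pandas to stay dependency-free. The short column labels, the sample values, the helper read_point_rows, and the assumption that each inner list holds one observation are illustrative and not taken from the MET example script.

```python
# Short labels for the 11 columns (illustrative): message type, station ID,
# valid time, latitude, longitude, elevation, variable, level, height,
# QC string, observation value.
COLUMNS = ["typ", "sid", "vld", "lat", "lon", "elv",
           "var", "lvl", "hgt", "qc", "obs"]
NUMERIC = {"lat", "lon", "elv", "lvl", "hgt", "obs"}

# Made-up sample observations, whitespace-delimited.
SAMPLE = """\
ADPSFC KDEN 20230120_120000 39.85 -104.66 1640.0 TMP 837.0 2.0 NA 271.5
ADPSFC KBOU 20230120_120000 40.04 -105.23 1612.0 TMP 840.0 2.0 NA 272.1
"""


def read_point_rows(text):
    """Parse whitespace-delimited text into a list of 11-element lists."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns: {line!r}")
        # Keep identifiers as strings and convert numeric columns to float.
        rows.append([float(value) if name in NUMERIC else value
                     for name, value in zip(COLUMNS, fields)])
    return rows


# For ascii2nc, the nested list must be named point_data.
point_data = read_point_rows(SAMPLE)
```

For the other point tools, this nested list would then be passed through convert_point_data() and the result stored in met_point_data; that step is not shown here because it depends on the MET helper module.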

Advanced Python Requirements

In some cases, a user may require one or more Python packages that are not installed in the version of Python that was used when installing the MET tools. In this case, the user can set a special environment variable called MET_PYTHON_EXE to the full path of the Python executable (typically found in a "bin" directory) belonging to the Python installation that contains the required packages.

NOTE: Using MET_PYTHON_EXE forces MET to write data files to a temporary area and then read them back in, instead of receiving the data directly from memory. This may negatively affect (increase) workflow run time. In some cases this cannot be avoided (e.g. multiple users sharing a single MET installation), and it gives users maximum access to the Python ecosystem, but users should be aware of the potential cost in run time.
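One way to find a value for MET_PYTHON_EXE is to ask the desired Python installation where its own executable lives. The sketch below is a minimal example, assuming the variable expects the absolute path of the interpreter; the function name python_exe_path is hypothetical. Run it with the Python that has the packages you need.

```python
import os
import sys


def python_exe_path():
    """Return the absolute path of the interpreter running this script."""
    return os.path.abspath(sys.executable)


if __name__ == "__main__":
    # Print the path so it can be copied into the MET_PYTHON_EXE setting.
    print(python_exe_path())
```

The printed path can then be assigned to MET_PYTHON_EXE using the same setenv or export syntax your shell uses for other environment variables.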

Setup for Python Embedding Practice

In the next two sections, you will practice using Python Embedding for both gridded and point data using MET tools directly and also via METplus Wrappers. To prepare for those sections, please follow the setup instructions below:

In your tutorial directory, make a new directory to hold Python scripts that you will use to demonstrate Python Embedding:
mkdir ${METPLUS_TUTORIAL_DIR}/python_embed
Change to the directory you just created:
cd ${METPLUS_TUTORIAL_DIR}/python_embed
Copy and rename the read_ascii_numpy.py example script included with the MET software to your directory:
cp ${MET_BUILD_BASE}/share/met/python/read_ascii_numpy.py ./my_gridded_pyembed.py
Copy and rename the read_ascii_point.py example script included with the MET software to your directory:
cp ${MET_BUILD_BASE}/share/met/python/read_ascii_point.py ./my_point_pyembed.py
Set your PYTHONPATH variable to include a required Python module to demonstrate Python Embedding with point data:

csh:

setenv PYTHONPATH ${MET_BUILD_BASE}/share/met/python:${PYTHONPATH}

bash:

export PYTHONPATH=${MET_BUILD_BASE}/share/met/python:${PYTHONPATH}
You are now prepared to practice Python Embedding! Proceed to the next section to get started.
dadriaan Fri, 01/20/2023 - 14:52