CCPP AMS2020 Short Course

NOTE: These instructions were valid/used in January 2020.  THESE INSTRUCTIONS WILL LIKELY NOT WORK AS IS FOR THE V4 RELEASE.

The goal of this course is to familiarize participants with new tools for experimentation and development of physical parameterizations for Numerical Weather Prediction (NWP). Students will be exposed to the physics suites available through the Common Community Physics Package (CCPP), a library of physical parameterizations that is in use with NOAA’s Unified Forecast System. Supported suites include the operational GFS, the suite under development for the next operational GFS implementation, the suite used by the Rapid Refresh and High-Resolution Rapid Refresh (RAP/HRRR) models, and a suite developed under the auspices of a NOAA Climate Process Team.

In this course, the CCPP will be taught in conjunction with the CCPP single-column model, a simplified framework that enables experimentation in a controlled setting. Various research cases will be provided as forcing datasets for the single-column model, all originating from experimental field campaigns focused on specific meteorological phenomena, such as a DOE-ARM LASSO case focused on shallow convection and a TWP-ICE case focused on maritime deep convection. In addition to its conceptual simplicity, the single-column model is not computationally demanding and can be executed on computers readily available to graduate students or in the cloud. The CCPP and the CCPP single-column model are publicly released and supported community codes (https://dtcenter.org/ccpp/). 

Getting Started

An AMI (Amazon Machine Image) is provided to you with:

 

  • All prerequisites installed (compiler, netCDF, NCEPlibs, Python packages)

  • A sandpit directory with the following subdirectories:

    • A data directory with 

      • Lookup tables for the Thompson microphysics (qr_acr_qg.dat, qr_acr_qs.dat, and freezeH2O.dat)
      • File source_mods.tar with modified code for one of the exercises

    • A source directory with 

      • Single Column Model code (public_release branch of the gmtb-scm repository) and its CCPP submodules (ufs_public_release branches of the ccpp-framework and ccpp-physics repositories)

      • Single Column Model executable gmtb_scm, previously created in the bin directory

Set up and configure the AWS connection:

 

To set up access to the AMI, download and untar the following file on your local machine, then configure the connection to AWS.

Download the following file to your local machine:

https://dtcenter.org/GMTB/AWS/aws.tar

The file will be saved to a local directory that depends on your system and browser (e.g., Downloads, Documents, or Desktop).

Open a terminal window on your system (PuTTY, iTerm, Terminal, etc.)

cd <your local download dir>
tar -xvf aws.tar

Source one of the scripts:

for csh/tcsh users: source aws_init.csh
for bash users: . aws_init.sh

This script will prompt you to enter a Group Number. This has been provided to you today.

Enter Group number: <enter>

Next, connect to the AWS instance, using the following commands:

ssh-aws-ams2020
cd sandpit
ls

At this point, you will be logged into a Linux AWS instance, with the SCM pre-installed and configured for this short course.

 

Thanks to the Joint Center for Satellite Data Assimilation (JCSDA) for providing the infrastructure to run the instances in the Amazon Web Services (AWS) cloud.

SCM and CCPP suites


Review the physics suites

We’d like to run several experiments with the SCM, showing differences among three physics suites for two different meteorological regimes. Let’s first take a look at the composition of the physics suites and then at how the two cases are set up.

Navigate to the directory where available suite definition files are stored and see what is available:

cd code/gmtb-scm/ccpp/suites
ls

We’re interested in the following suites: suite_SCM_GFS_v15p2.xml, suite_SCM_GSD_v1.xml, and suite_SCM_csawmg.xml

Notice the XML format, paying attention to the key elements: suite (with name attribute), group, subcycle, scheme.

Supplemental Slides 2-6 review the syntax of suite definition files.

You will find both primary and interstitial schemes. Because these suites were constructed by modifying the original GFS operational suite, they share many elements, particularly the interstitial schemes required to prepare data for the primary schemes and to calculate the diagnostics expected in a GFS run.

Supplemental Slides 7-8 show the differences between the operational GFS suite and the experimental GSD_v1 and csawmg suites.

Navigate to the directory where case setup namelists are stored to see what we’re going to run.

cd ../../scm/etc/case_config
ls

Look at twpice.nml to see what kinds of information are required to set up a case. The variables most likely to differ among cases are: case_name, dt, runtime, [thermo,mom]_forcing_type, sfc_flux_spec, year, month, day, hour.

See supplemental slide 9.
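To make the structure concrete, a case configuration namelist uses standard Fortran namelist syntax and might look roughly like the sketch below. The group name and all values here are illustrative assumptions, not the actual contents of twpice.nml:

! Hypothetical case configuration namelist -- group name and values are
! illustrative only, not the actual twpice.nml
&case_config
  case_name           = 'twpice'   ! selects the matching netCDF input file
  dt                  = 600.0      ! physics time step (s)
  runtime             = 86400.0    ! total integration time (s)
  thermo_forcing_type = 2          ! how thermodynamic forcing is applied
  mom_forcing_type    = 3          ! how momentum forcing is applied
  sfc_flux_spec       = .false.    ! whether surface fluxes are specified
  year                = 2006
  month               = 1
  day                 = 1
  hour                = 0
/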

Each case also has a netCDF file associated with it (determined by the case_name variable) that contains the initial conditions and forcing data for the case. Take a look at one of the files to see what kind of information is expected:

ncdump -h ../../data/processed_case_input/twpice.nc | more

There are groups for scalars (single values relevant for the simulation), initial conditions (function of vertical level only), and forcing (function of time and/or vertical level). Note that not all forcing variables are required to be non-zero.

See supplemental slides 10-12.

Run SCM cases


Run the SCM

Now that we have an idea of what the physics suites contain and how cases are set up, let’s use the run script multi_run_gmtb_scm.py, which runs several instances of the SCM serially.

See supplemental slide 13 for a description of the run scripts’ interfaces.

If invoked by itself, this script will run through all permutations of supported suites and cases. If you provide only a case, it will run all supported suites for that case. If you provide only a suite, it will run all supported cases for that suite. Alternatively, you can provide the script with a file that defines which cases, suites, and associated namelists to use. We will use this final option.

Navigate to the run directory, and look at the multi_run setup file:

cd ../../bin
less ../src/short_course_runs.py

Exit the less command with q.

See supplemental slide 14 for the contents of short_course_runs.py.

This file just defines a few Python lists. We’ll be running the TWP-ICE maritime deep convection case and the LASSO continental shallow convection case from May 18, 2016 with the suites discussed previously. The namelists variable is optional if the suites have dictionary entries in ../src/default_namelists.py. This would also be the place to specify non-default namelists to change parameters or physics options.

Before running the SCM, copy the lookup tables used by the Thompson microphysics so they are available at runtime.

stage_thompson_tables.sh

Run:

./multi_run_gmtb_scm.py -f ../src/short_course_runs.py

Running should take a couple of minutes, with progress displayed on the console.

You should see something like the output on supplemental slide 15.

Analyze SCM results


Let’s analyze the results

 

The output directory for each case is controlled by its case configuration namelist. For this exercise, a directory containing the output for each integration is placed in the bin directory. You can verify that each of these output directories (e.g., bin/output_$CASE_$SUITENAME) now contains a populated output.nc file.

At this point, you could use whatever plotting tools you wish to examine the output. For this course, we will use the basic Python plotting script included in the repository, gmtb_scm_analysis.py, which expects a configuration file as an argument. The configuration file specifies which output files to read, which variables to plot, and how to plot them.

Run the following to generate plots for the LASSO case:

./gmtb_scm_analysis.py lasso_short_course.ini

 

The script creates a new directory within the bin directory called plots_LASSO_2016051812/comp/full. 

Open plot files for variables, such as mean profiles of water vapor specific humidity, cloud water mixing ratio, temperature, and temperature tendencies due to forcing and physics processes.

cd plots_LASSO_2016051812/comp/full
eog profiles_mean_qv.png

Supplemental slide 16 provides some context for how the simulation changes through time, and slides 17-20 show the plots mentioned above. You may ignore any warnings about libGL.

Although the physics suites are very different, there is relatively little difference in the results.

Q: Why do you think that is?

 

A: This particular continental shallow convection case is very weakly forced. Two of the suites (csawmg and GSD_v1) produce a very small amount of cloud water. Moist processes tend to produce the largest differences among physics suites, so the fact that none of the suites produce much in the way of cloud cover means that there is less of a chance for the solutions to diverge.

TWP-ICE case


TWP-ICE maritime deep convection case

 

Let’s move on to the TWP-ICE maritime deep convection case. Run the following to generate plots for this case:

cd ../../..

./gmtb_scm_analysis.py twpice_short_course.ini

cd plots_twpice_short_course/comp/active/

eog profiles_bias_T.png

While this case spans periods of clear skies, shallow convection, and deep convection, the plots represent time means over the active deep convection period of the simulation. Let’s focus on the mean temperature profile, its bias (difference from observations), and the temperature tendencies due to forcing and physics processes.

profiles_mean_T.png
profiles_bias_T.png
profiles_mean_multi_T_forcing.png

Supplemental slide 21 provides some context for how the simulation changes through time and slides 22-24 show the plots mentioned above.

The mean temperature profile by itself doesn’t provide much information, since the differences among the suites are much smaller than the magnitude of the quantity being plotted, but the profiles begin to hint at differences. The bias profile is more instructive in this case. One suite clearly has a cool bias throughout most of the atmosphere, while the remaining two suites have cool biases of similar magnitude but come closer to zero bias at different heights in the column.

 

Q: Using the temperature tendencies plot, summarize the main contributors to the temperature biases. 

 

eog profiles_mean_multi_T_forcing.png

 

A: The main contributor to the bias for the csawmg suite is reduced heating from the deep convection scheme (is this related to the scale awareness of the Chikira-Sugiyama Arakawa-Wu scheme?). Aloft, the increased bias in the GFS suite appears to come from the microphysics scheme, although the longwave radiation scheme contributes too. The GSD suite has the least overall bias, although it has a much different balance between the deep convection and microphysics tendencies, which leads to a somewhat larger bias between 900 and 500 hPa.

Modify a scheme


Let’s modify a scheme and re-run an experiment to see if we can reduce the temperature bias. For this exercise, we will concentrate on the suite with the largest temperature bias (csawmg) and the most obvious culprit, the deep convection scheme. Further, assume that we are the laziest physics developers in the world and will “solve” the temperature bias problem through brute force.

See supplemental slide 25 for an overview of what modifications we will implement.

NOTE: the following steps have been provided for you in the modified source code instructions at the end of this page, but are described here for your information.

Looking at the suite definition file for the SCM_csawmg suite (file gmtb-scm/ccpp/suites/suite_SCM_csawmg.xml), we see that the deep convection scheme is called “cs_conv” (Chikira-Sugiyama convection). Its top-level CCPP entry-point subroutine (cs_conv_run) is found in the file gmtb-scm/ccpp/physics/physics/cs_conv.F90. The call to CS_CUMLUS produces tendencies of the state variables, including temperature, and they are applied in the code shortly thereafter.

We will define a new variable used to multiply the temperature tendency before it is applied to the temperature state. Rather than defining a local variable, we’ll turn it into a “tuning knob” by making it a namelist variable and passing it to the scheme through the CCPP framework. Hint: We can assume that this variable does not already exist in any supported CCPP scheme or in the list of variables supplied by the host SCM for this course, although in practice one should check whether an appropriate variable already exists.

The implementation begins with straight Fortran code. The variable is added to the argument list, declared as an INTENT(IN) scalar real, and multiplied by the temperature tendency variable at the appropriate place in the code.

See supplemental slide 26 for these code changes.
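In sketch form, the pattern of the Fortran change looks like the following. This is illustrative only and is not the actual cs_conv.F90 code; the array names, dimensions, and the way the tendency is applied are assumptions:

! Illustrative sketch of the pattern only -- not the actual cs_conv.F90 code.
! Array names, dimensions, and how the tendency is applied are assumptions.
module cs_conv_sketch
   implicit none
   integer, parameter :: kind_phys = selected_real_kind(15)
contains
   ! In the real code, global_ttend_mult is added to the cs_conv_run argument
   ! list and used where the convective temperature tendency is applied.
   subroutine apply_scaled_t_tendency(t, dtdt, global_ttend_mult, dtp)
      real(kind=kind_phys), intent(inout) :: t(:,:)    ! temperature state (column, level)
      real(kind=kind_phys), intent(in)    :: dtdt(:,:) ! convective temperature tendency (K/s)
      real(kind=kind_phys), intent(in)    :: global_ttend_mult ! new INTENT(IN) scalar "tuning knob"
      real(kind=kind_phys), intent(in)    :: dtp       ! physics time step (s)

      ! Scale the temperature tendency before it is applied to the state.
      t = t + global_ttend_mult * dtdt * dtp
   end subroutine apply_scaled_t_tendency
end module cs_conv_sketch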

Since we modified the interface by adding an argument, we will need to add CCPP metadata and “advertise” that the scheme requires the new variable. Notice that the new variable has been added (in the right order) in the metadata for the subroutine cs_conv_run in cs_conv.meta.

See supplemental slide 27 for these code changes.

At this point, IF the new variable already existed in the SCM’s list of CCPP-available variables, we would be done! After rerunning cmake to regenerate the software caps and make to compile, the modified code could be run. In this case, we will need to add the variable to the SCM and “advertise” that it is available for CCPP schemes to use. For the SCM, model control variables are contained within the GFS_control_type derived datatype found in gmtb-scm/scm/src/GFS_typedefs.F90, so we will add the new variable (global_ttend_mult) there. Note that the local variable names do not need to match between the host and the physics scheme; however, the CCPP standard name and other metadata do need to match. The new variable is initialized with a default value (1.0) in the control_initialize subroutine in the same file and is read in from the same physics namelist as other physics variables, potentially overriding the default.

See supplemental slide 28 for these code changes.
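A minimal sketch of the host-side pattern is shown below. It is not the actual GFS_typedefs.F90 code (the real GFS_control_type has many more members, and the namelist group name here is hypothetical), but it illustrates adding the variable to the derived type, setting its default, and letting the namelist override it:

! Illustrative host-side sketch only -- not the actual GFS_typedefs.F90 code.
! The derived type is greatly simplified and the namelist group name is hypothetical.
module host_control_sketch
   implicit none
   integer, parameter :: kind_phys = selected_real_kind(15)

   type :: control_type
      real(kind=kind_phys) :: global_ttend_mult  ! new "tuning knob" for the convective T tendency
   end type control_type

contains

   subroutine control_initialize(ctl, nml_unit)
      type(control_type), intent(inout) :: ctl
      integer, intent(in) :: nml_unit              ! unit of the already-opened physics namelist file
      integer :: ios
      real(kind=kind_phys) :: global_ttend_mult
      namelist /physics_nml/ global_ttend_mult     ! hypothetical group name

      global_ttend_mult = 1.0_kind_phys            ! default value
      read(nml_unit, nml=physics_nml, iostat=ios)  ! the namelist may override the default
      ctl%global_ttend_mult = global_ttend_mult
   end subroutine control_initialize

end module host_control_sketch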

Once the Fortran has been added, we will need to edit the SCM’s metadata by adding the new variable’s information in the appropriate place in gmtb-scm/scm/src/GFS_typedefs.meta (note that each derived datatype has its own section beginning with [ccpp-arg-table] in this file).

See supplemental slide 29 for these code changes.

This completes the addition of a new variable on the physics side and the SCM host side. 

Using the default namelist for the SCM_csawmg suite would produce identical results at this point, so we’ll replicate the existing namelist for this suite with a non-default value for the new variable (global_ttend_mult = 1.15). See gmtb-scm/ccpp/physics_namelists/input_csawmg_short_course.nml.

See supplemental slide 30 for these changes.
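In sketch form, the non-default namelist differs from the default one only by a line like the following (the group name is hypothetical, and the real file contains all of the suite’s other physics options):

&physics_nml
  global_ttend_mult = 1.15   ! non-default value for the new tuning knob
/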

Rather than having you implement these changes yourself, they have been provided to you in a tar file. Extracting the tar file using the instructions below will put the modified files in the correct directories and update their timestamps so that cmake/make will know to recompile them and their dependencies. Note: This can be undone by running git checkout . in the top-level gmtb-scm directory and in the ccpp/physics subdirectory.

cd $HOME/sandpit/code/gmtb-scm
tar -xvmf $HOME/sandpit/data/source_mods.tar
cd scm/bin
cmake -DSTATIC=ON ../src
make

Re-run the updated code and case

Finally, re-run the TWP-ICE case with the modified code and namelist (NOTE: this command is all one line; your browser may wrap it, so be sure to copy/paste it as a single command):

./run_gmtb_scm.py -c twpice -s SCM_csawmg -n input_csawmg_short_course.nml

 

Let’s see how we did!

 

Re-run the plotting script using a new plot configuration file to plot all four runs of the TWP-ICE case.

./gmtb_scm_analysis.py twpice_short_course_mod.ini

 

Look in plots_twpice_short_course/comp/active. The temperature bias profile plot (profiles_bias_T.png; also supplemental slide 32) shows the three original suites alongside the modified csawmg suite (labeled csawmg+). For this case, we have definitely improved the performance of this suite for this metric by brute force!

Open the temperature tendencies plot (profiles_mean_multi_T_forcing.png; also supplemental slide 33).

 

Q: Are the tendencies produced by the run with the modified code different than before?
A: Yes.

 

Q: We only modified the deep convection scheme. Why are there differences in the microphysics, PBL, and radiation tendencies?
A: Because the model is a nonlinear system; once the atmospheric state is altered by the modified convection scheme, all of the parameterizations respond differently.

Another Case


Run another case

Physics suites need to be general enough to work everywhere. Let’s try a different deep convection case to see whether our modification shows promise. The arm_sgp_summer_1997_A case features continental deep convection over the central US plains. We’ll run the same three suites through this case, plus the modified suite, using the same method as above (from the bin directory):

./multi_run_gmtb_scm.py -f ../src/short_course_runs2.py

See supplemental slide 34 for the contents of this file and expected console output.

Once completed, run the plotting script on the new output:

./gmtb_scm_analysis.py arm_short_course_mod.ini

The output is placed in bin/plots_arm_short_course.

 

Open the temperature bias plot (profiles_bias_T.png; also supplemental slide 35).

Q: Does the brute force modification help the csawmg+ suite perform better in the temperature bias metric for this case?
A: No. This highlights an inherent limitation of physics development using a single-column model. One needs to be extremely careful not to “overfit” or “tune” a physics suite to a limited number of cases. Although this problem can be mitigated somewhat by using a large number of experiments based on a set of cases that more completely spans the meteorological parameter space, robust verification using the global modeling system will remain the gold standard. Nevertheless, a single-column model’s simplicity, low computational cost, and suitability for exploratory studies should cement its place as a valuable part of the physics development testing hierarchy.
 

Additional Topic: compilation


Source code and compilation

 

The source code for the Single Column Model and the CCPP Physics is hosted on GitHub in public repositories. For this Short Course, the code has been pre-installed for you. You can download the code yourself on your own machine as follows:

 

git clone --recursive -b dtc/develop https://github.com/NCAR/gmtb-scm

 

The User's Guide for the SCM describes the required prerequisite compilers and libraries. These MUST be installed (or already available) on your system BEFORE attempting to compile the SCM. Install any necessary prerequisite compilers, MPI libraries, etc., as described in the User's Guide, then compile the SCM as follows:

 

cd gmtb-scm/scm
mkdir bin
cd bin
cmake -DSTATIC=ON ../src
make

This will create an executable, gmtb_scm, in the current working directory. 

Additional Topic: References


CCPP website

SCM v3+ User's Guide

CCPP v3 Technical Documentation

CCPP v3 Scientific Documentation

CCPP user forum

Selected AMS 2020 presentations involving CCPP and the Single-Column Model