3. Running GSI¶
This chapter discusses the issues of running GSI. It starts with introductions to the input data required to run GSI, then proceeds with a detailed explanation of an example GSI run script and introductions to files produced by a successful GSI run. It concludes with some frequently used options from the GSI namelist.
Input Data Required to Run GSI¶
In most cases, three types of input data (background, observations, and fixed files) must be available before running GSI. In some special idealized cases, such as a pseudo single observation test, GSI can be run without any observations. If running GSI with the 3D EnVar hybrid option, global or regional ensemble forecasts are also needed.
Background or First Guess Field¶
As with other data analysis systems, the background or first guess fields may come from a model forecast conducted separately or from a previous data assimilation cycle. The following is a list of the types of background files that can be used by this release version of GSI:
- (a) WRF-NMM input fields in binary format
- (b) WRF-NMM input fields in NetCDF format
- (c) WRF-ARW input fields in binary format
- (d) WRF-ARW input fields in NetCDF format
- (e) GFS input fields in binary format or through NEMS I/O
- (f) NEMS-NMMB input fields
- (g) RTMA input files (2-dimensional binary format)
- (h) WRF-Chem GOCART input fields in NetCDF format
- (i) CMAQ binary file
The Weather Research and Forecasting (WRF) community modeling system includes two dynamical cores: the Advanced Research WRF (ARW) and the Nonhydrostatic Mesoscale Model (NMM). The GFS (Global Forecast System), NEMS (National Environmental Modeling System)-NMMB (Nonhydrostatic Mesoscale Model B-Grid), and RTMA (Real-Time Mesoscale Analysis) are operational systems at the National Centers for Environmental Prediction (NCEP). The DTC mainly supports GSI for regional WRF applications; therefore, most of the multiple-platform tests were conducted using WRF NetCDF background files (d). The DTC also supports the GSI in global and chemical applications with limited resources. The following backgrounds have been tested for this release:
- ARW NetCDF (d) was tested with multiple cases
- GFS (e) was tested with multiple NCEP cases
- WRF-Chem NetCDF (h) was tested with a single case
- NEMS-NMMB (f) was tested with a single case
Observations¶
GSI can analyze many types of observational data, including conventional data, satellite radiance observations, GPS Radio Occultations, and radar data, among others. The default observation file names are given in the released GSI namelist, with corresponding observations included in each file. Sample BUFR files available for download from the NCEP website are listed in table [t31].
Observations are complex, and many need format conversion and quality control before GSI can use them. GSI ingests observations saved in BUFR format (with NCEP-specified features). The NCEP-processed PrepBUFR and BUFR files can be used directly. Users who need to introduce their own data into GSI should check the following website for the Users Guide and examples of BUFR/PrepBUFR processing:
http://www.dtcenter.org/com-GSI/BUFR/index.php
DTC supports BUFR/PrepBUFR data processing and quality control as part of the GSI community tasks.
GSI can analyze all of the data types in table [t31], but each GSI run (for both operation and case study purposes) only uses a subset of the data. Some data may be outdated and not available, some are in monitoring mode, and some may have quality issues during certain periods. Users are encouraged to check data quality prior to running an analysis. The following NCEP links provide resources that include data quality history:
Because the current regional models do not have ozone as a prognostic variable, ozone data are not assimilated on the regional scale.
GSI can be run without any observations to see how the moisture constraint modifies the first guess (background) field. GSI can also be run in a pseudo single observation mode, which does not require any BUFR observation files. In this mode, users should specify observation information in the namelist section SINGLEOB_TEST (see Section [sec4.2] for details). As more data files are used, additional information will be added through the GSI analysis.
GSI Name | Content | Example file names |
---|---|---|
prepbufr | Conventional observations, including ps, t, q, pw, uv, spd, dw, sst | gdas1.t12z.prepbufr.nr |
satwndbufr | satellite winds observations | gdas1.t12z.satwnd.tm00.bufr_d |
amsuabufr | AMSU-A 1b radiance (brightness temperatures) from satellites NOAA-15, 16, 17, 18, 19 and METOP-A/B | gdas1.t12z.1bamua.tm00.bufr_d |
amsubbufr | AMSU-B 1b radiance (brightness temperatures) from satellites NOAA-15, 16, 17 | gdas1.t12z.1bamub.tm00.bufr_d |
radarbufr | Radar radial velocity Level 2.5 data | ndas.t12z.radwnd.tm12.bufr_d |
gpsrobufr | GPS radio occultation and bending angle observation | gdas1.t12z.gpsro.tm00.bufr_d |
ssmirrbufr | Precipitation rate observations from SSM/I | gdas1.t12z.spssmi.tm00.bufr_d |
tmirrbufr | Precipitation rate observations from TMI | gdas1.t12z.sptrmm.tm00.bufr_d |
sbuvbufr | SBUV/2 ozone observations from satellite NOAA-16, 17, 18, 19 | gdas1.t12z.osbuv8.tm00.bufr_d |
hirs2bufr | HIRS2 1b radiance from satellite NOAA-14 | gdas1.t12z.1bhrs2.tm00.bufr_d |
hirs3bufr | HIRS3 1b radiance observations from satellite NOAA-16, 17 | gdas1.t12z.1bhrs3.tm00.bufr_d |
hirs4bufr | HIRS4 1b radiance observation from satellite NOAA-18, 19 and METOP-A/B | gdas1.t12z.1bhrs4.tm00.bufr_d |
msubufr | MSU observation from satellite NOAA 14 | gdas1.t12z.1bmsu.tm00.bufr_d |
airsbufr | AMSU-A and AIRS radiances from satellite AQUA | gdas1.t12z.airsev.tm00.bufr_d |
mhsbufr | Microwave Humidity Sounder observation from NOAA-18, 19 and METOP-A/B | gdas1.t12z.1bmhs.tm00.bufr_d |
ssmitbufr | SSMI observation from satellite f13, f14, f15 | gdas1.t12z.ssmit.tm00.bufr_d |
amsrebufr | AMSR-E radiance from satellite AQUA | gdas1.t12z.amsre.tm00.bufr_d |
ssmisbufr | SSMIS radiances from satellite f16 | gdas1.t12z.ssmis.tm00.bufr_d |
gsnd1bufr | GOES sounder radiance (sndrd1, sndrd2, sndrd3 sndrd4) from GOES-11, 12, 13, 14, 15. | gdas1.t12z.goesfv.tm00.bufr_d |
l2rwbufr | NEXRAD Level 2 radial velocity | ndas.t12z.nexrad.tm12.bufr_d |
gsndrbufr | GOES sounder radiance from GOES-11, 12 | gdas1.t12z.goesnd.tm00.bufr_d |
gimgrbufr | GOES imager radiance from GOES-11, 12 | |
omibufr | Ozone Monitoring Instrument (OMI) observation NASA Aura | gdas1.t12z.omi.tm00.bufr_d |
iasibufr | Infrared Atmospheric Sounding Interferometer sounder observations from METOP-A/B | gdas1.t12z.mtiasi.tm00.bufr_d |
gomebufr | The Global Ozone Monitoring Experiment (GOME) ozone observation from METOP-A/B | gdas1.t12z.gome.tm00.bufr_d |
mlsbufr | Aura MLS stratospheric ozone data from Aura | gdas1.t12z.mlsbufr.tm00.bufr_d |
tcvitl | Synthetic Tropical Cyclone-MSLP observation | gdas1.t12z.syndata.tcvitals.tm00 |
seviribufr | SEVIRI radiance from MET-08, 09, 10 | gdas1.t12z.sevcsr.tm00.bufr_d |
atmsbufr | ATMS radiance from Suomi NPP | gdas1.t12z.atms.tm00.bufr_d |
crisbufr | CRIS radiance from Suomi NPP | gdas1.t12z.cris.tm00.bufr_d |
modisbufr | MODIS aerosol total column AOD observations from AQUA and TERRA | |
[t31]
Fixed Files (Statistics and Control Files)¶
A GSI analysis also needs to read specific information from statistic files, configuration files, bias correction files, and CRTM coefficient files. We refer to these files as fixed files and they are located in a directory called fix/ in the release package, except for CRTM coefficients.
Table [t32] lists fixed files required for a GSI run, the content of the files, and corresponding example files from the regional and global applications:
Because most of those fixed files have hardwired names inside the GSI, a GSI run script needs to copy or link those files (right column in table [t32]) from the ./fix directory to the GSI run directory with the file name required in GSI (left column in table [t32]). For example, if GSI runs with an ARW background, the following line should be in the run script:
cp ${FIX_ROOT}/anavinfo_arw_netcdf anavinfo   # FIX_ROOT = path of the fix directory
Note that in this release, there is a strict rule that the number of vertical levels in the file anavinfo must match the background file (for example, wrfinput_d01) for the 3-dimensional variables. Otherwise GSI will fail. To identify the correct number of vertical levels, users can dump the dimensions from the NetCDF background file (using ncdump -h) and find the values of bottom_top and bottom_top_stag. For example, if the dimensions of the background file are:
bottom_top = 50 ;
bottom_top_stag = 51 ;
Then the corresponding anavinfo file should have 51 levels for prse (3-dimensional pressure field) and 50 levels for the other three-dimensional variables such as u, v, tv, q, oz, cw, etc. For details, users can dump the global attributes of the background file and find the number of vertical levels for each variable. The following shows part of the anavinfo file for the above background:
state_derivatives::
!var level src
ps 1 met_guess
u 50 met_guess
v 50 met_guess
tv 50 met_guess
q 50 met_guess
oz 50 met_guess
cw 50 met_guess
prse 51 met_guess
::
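The level check described above can be scripted. The following is a minimal sketch, assuming the header text has the same layout as the two dimension lines shown earlier; `get_dim` is a hypothetical helper and not part of the GSI release.

```shell
#!/bin/sh
# Sketch: read a dimension size from `ncdump -h` header text on stdin.
# get_dim is a hypothetical helper, not part of the GSI scripts.
get_dim() {
    # $1 = dimension name, for example bottom_top
    awk -v name="$1" '$1 == name && $2 == "=" { print $3; exit }'
}

# Sample header lines standing in for the output of: ncdump -h wrfinput_d01
header='    bottom_top = 50 ;
    bottom_top_stag = 51 ;'

nlev=$(printf '%s\n' "$header" | get_dim bottom_top)
nlev_stag=$(printf '%s\n' "$header" | get_dim bottom_top_stag)

echo "3D variables (u, v, tv, q, oz, cw): ${nlev} levels"
echo "prse: ${nlev_stag} levels"
```

With a real background file, the sample text would be replaced by a pipe such as `ncdump -h wrfinput_d01 | get_dim bottom_top`, and the two numbers compared by eye or by script against the level column of anavinfo.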
GSI Name | Content | Example file names |
---|---|---|
anavinfo | Information file to set control and analysis variables | anavinfo_arw_netcdf anavinfo_ndas_netcdf global_anavinfo.l64.txt |
berror_stats | background error covariance | nam_nmmstat_na.gcv nam_glb_berror.f77.gcv global_berror.l64y386.f77 |
errtable | Observation error table | nam_errtable.r3dv prepobs_errtable.global |
convinfo | Conventional observation information file | global_convinfo.txt nam_regional_convinfo.txt |
satinfo | satellite channel information file | global_satinfo.txt |
pcpinfo | precipitation rate observation information file | global_pcpinfo.txt |
ozinfo | ozone observation information file | global_ozinfo.txt |
satbias_angle | satellite scan angle dependent bias correction file | global_satangbias.txt |
 | satellite mass bias correction coefficient file | sample.satbias |
 | combined satellite angle dependent and mass bias correction coefficient file | gdas1.t00z.abias.new |
t_rejectlist, w_rejectlist,.. | Rejection list for T, wind, etc. in RTMA | new_rtma_t_rejectlist new_rtma_w_rejectlist |
[t32]
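The copy step for fixed files can be written as a small loop over "example name:GSI name" pairs. This is only a sketch under stated assumptions: `stage_fixed_files` is a hypothetical helper, the pairs in the commented usage are taken from table [t32], and which example file suits a given application is for the user to decide.

```shell
#!/bin/sh
# Sketch: copy fixed files into the run directory under the names GSI expects.
# stage_fixed_files is a hypothetical helper, not part of the GSI scripts.
stage_fixed_files() {
    fix_dir=$1    # directory holding the released fixed files
    run_dir=$2    # GSI run directory
    shift 2
    for pair in "$@"; do
        src=${pair%%:*}    # example file name (right column of the table)
        dst=${pair##*:}    # name required by GSI (left column of the table)
        if [ ! -r "${fix_dir}/${src}" ]; then
            echo "ERROR: ${fix_dir}/${src} is missing or unreadable" >&2
            return 1
        fi
        cp "${fix_dir}/${src}" "${run_dir}/${dst}" || return 1
    done
}

# Possible use for a regional ARW case (adjust the pairs as needed):
# stage_fixed_files "${FIX_ROOT}" "${WORK_ROOT}" \
#     anavinfo_arw_netcdf:anavinfo \
#     nam_errtable.r3dv:errtable \
#     nam_regional_convinfo.txt:convinfo \
#     global_satinfo.txt:satinfo
```

Stopping at the first missing file keeps a misconfigured run from reaching the GSI executable with an incomplete set of fixed files.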
Each operational system, such as GFS, NAM, RAP, and RTMA, has its own set of fixed files. For your specific GSI runs, you need to get the correct set of fixed files. Fixed files for regional applications are included in this GSI/EnKF release under the fix/ directory. Fixed files for global applications are not included in this release in order to save space. Please download comGSIv3.7_EnKFv1.3_fix_global.tar.gz if you need to run global cases. Note that little endian background error covariance files are no longer supported.
Each release version of the GSI calls a certain version of the CRTM library and needs corresponding CRTM coefficients to do radiance data assimilation. This version of GSI uses CRTM 2.2.3. The coefficient files are listed in table [t34].
File name used in GSI | Content | Example Files |
---|---|---|
Nalli.IRwater.EmisCoeff.bin | IR surface emissivity coefficients | Nalli.IRwater.EmisCoeff.bin |
NPOESS.IRice.EmisCoeff.bin | IR surface emissivity coefficients | NPOESS.IRice.EmisCoeff.bin |
NPOESS.IRsnow.EmisCoeff.bin | IR surface emissivity coefficients | NPOESS.IRsnow.EmisCoeff.bin |
NPOESS.IRland.EmisCoeff.bin | IR surface emissivity coefficients | NPOESS.IRland.EmisCoeff.bin |
NPOESS.VISice.EmisCoeff.bin | | NPOESS.VISice.EmisCoeff.bin |
NPOESS.VISland.EmisCoeff.bin | | NPOESS.VISland.EmisCoeff.bin |
NPOESS.VISsnow.EmisCoeff.bin | | NPOESS.VISsnow.EmisCoeff.bin |
NPOESS.VISwater.EmisCoeff.bin | | NPOESS.VISwater.EmisCoeff.bin |
FASTEM6.MWwater.EmisCoeff.bin | | FASTEM6.MWwater.EmisCoeff.bin |
AerosolCoeff.bin | Aerosol coefficients | AerosolCoeff.bin |
CloudCoeff.bin | Cloud scattering and emission coefficients | CloudCoeff.bin |
${satsen}.SpcCoeff.bin | Sensor spectral response characteristics | ${satsen}.SpcCoeff.bin |
${satsen}.TauCoeff.bin | Transmittance coefficients | ${satsen}.TauCoeff.bin |
[t34]
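The `${satsen}` rows in the table stand for one SpcCoeff and one TauCoeff file per sensor. A minimal sketch of linking them is shown below; `link_crtm_coeffs` is a hypothetical helper, it assumes all coefficient files sit directly in one directory, and the sensor names in the usage note are illustrative only.

```shell
#!/bin/sh
# Sketch: link the per-sensor CRTM coefficient files into the run directory.
# link_crtm_coeffs is a hypothetical helper, not part of the GSI scripts.
# Assumption: every coefficient file sits directly in one directory.
link_crtm_coeffs() {
    crtm_dir=$1    # directory holding the CRTM coefficient files
    run_dir=$2     # GSI run directory
    shift 2
    for satsen in "$@"; do
        for kind in SpcCoeff TauCoeff; do
            file=${satsen}.${kind}.bin
            if [ -r "${crtm_dir}/${file}" ]; then
                ln -sf "${crtm_dir}/${file}" "${run_dir}/${file}"
            else
                # Report missing files but keep going with the other sensors
                echo "WARNING: ${file} not found in ${crtm_dir}" >&2
            fi
        done
    done
}
```

A call might look like `link_crtm_coeffs "${CRTM_ROOT}" "${WORK_ROOT}" amsua_n18 hirs4_n19`, where the sensor list would normally follow the instruments actually being assimilated.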
GSI Run Script¶
In this release version, three sample run scripts are available for different GSI applications:
- ush/comgsi_run_regional.ksh for regional GSI
- ush/comgsi_run_global.ksh for global GSI (GFS)
- ush/comgsi_run_chem.ksh for chemical analysis
These scripts will be called to generate GSI namelists:
- ush/comgsi_namelist.sh for regional GSI
- ush/comgsi_namelist_gfs.sh for global GSI (GFS)
- ush/comgsi_namelist_chem.sh for GSI chemical analysis
We will introduce the regional run script (comgsi_run_regional.ksh) in detail in the following sections and introduce the global run script when we discuss the GSI global application in the Advanced GSI Users Guide.
Note there is also a run script for regional EnKF (comenkf_run_regional.ksh), a run script for global EnKF (comenkf_run_global.ksh), and the EnKF namelist script (comenkf_namelist.sh) in the same directory, which will be introduced in the EnKF Users Guide.
Steps in the GSI Run Script¶
The GSI run script creates a run time environment necessary to run the GSI executable. A typical GSI run script includes the following steps:
1. Request computer resources to run GSI.
2. Set environmental variables for the machine architecture.
3. Set experimental variables (such as experiment name, analysis time, background, and observation).
4. Set the script that generates the GSI namelist.
5. Check the definitions of required variables.
6. Generate a run directory for GSI (sometimes called a working or temporary directory).
7. Copy the GSI executable to the run directory.
8. Copy the background file to the run directory and create an index file listing the location and name of ensemble members if running with a hybrid set up.
9. Link observations to the run directory.
10. Link fixed files (statistic, control, and coefficient files) to the run directory.
11. Generate namelist for GSI.
12. Run the GSI executable.
13. Post-process: save analysis results, generate diagnostic files, and clean the run directory.
14. Run GSI as observation operator for EnKF, only for if_observer=Yes.
Typically, users only need to modify specific parts of the run script (steps 1, 2, and 3) to fit their specific computer environment and point to the correct input/output files and directories. Users may also need to modify step 4 if the namelist script has been changed and is stored under a different name or at a different location. The next section (3.2.2) covers each of these modifications for steps 1 to 3. Section 3.2.3 dissects a sample regional GSI run script and introduces each piece of it. Users should start with the run script provided in the same release package as the GSI executable and modify it for their own run environment and case configuration.
Customization of the GSI Run Script¶
This section focuses on step 1 of the run script: modifying the machine specific entries. Specifically, this consists of setting Unix/Linux environment variables and selecting the correct parallel run time environment (batch system with options).
GSI can be run with the same parallel environments as other MPI programs, for example:
- IBM supercomputer using LSF (Load Sharing Facility)
- IBM supercomputer using LoadLeveler
- Linux clusters using PBS (Portable Batch System)
- Linux clusters using LSF
- Linux workstation (no batch system)
- Intel Mac Darwin workstation with PGI compiler (no batch system)
Two queuing systems are listed below as examples:
Machine & queue system | Linux Cluster with LSF | Linux Cluster with PBS | Workstation |
---|---|---|---|
example | #BSUB -P ????????
#BSUB -W 00:10
#BSUB -n 4
#BSUB -R "span[ptile=16]"
#BSUB -J gsi
#BSUB -o gsi.%J.out
#BSUB -e gsi.%J.err
#BSUB -q small
|
#PBS -l procs=4
#PBS -n
#PBS -o gsi.out
#PBS -e gsi.err
#PBS -N GSI
#PBS -l walltime=00:20
#PBS -A ??????
|
No batch system, skip this step |
[t35]
In both of the examples above, environment variables are set specifying system resource management, such as the number of processors, the name/type of queue, maximum wall clock time allocated for the job, options for standard out and standard error, etc. Some platforms need additional definitions to specify Unix environment variables that further define the run environment.
These variable settings can significantly impact the GSI run efficiency and accuracy of the GSI results. Please check with your system administrator for optimal settings for your computer system. Note that while the GSI can be run with any number of processors, it will not scale well with the increase of processor numbers after a certain threshold based on the case configuration and GSI application types.
There are only two options to define in this block.
# GSIPROC = processor number used for GSI analysis
#------------------------------------------------
GSIPROC=4
ARCH='LINUX_LSF'
# Supported configurations:
# IBM_LSF,
# LINUX, LINUX_LSF, LINUX_PBS,
# DARWIN_PGI
The option ARCH selects the machine architecture. It is a function of platform type and batch queuing system. The option GSIPROC sets the number of cores used in the run. This option also decides if the job is run as a multiple core job or as a single core run. Several choices of the option ARCH are listed in the sample run script. Please check with your system administrator about running parallel MPI jobs on your system.
Option ARCH | Platform | Compiler | batch queuing system |
---|---|---|---|
IBM_LSF | IBM AIX | xlf, xlc | LSF |
LINUX | Linux workstation | Intel/PGI/GNU | mpirun if GSIPROC > 1 |
LINUX_LSF | Linux cluster | Intel/PGI/GNU | LSF |
LINUX_PBS | Linux cluster | Intel/PGI/GNU | PBS |
DARWIN_PGI | MAC DARWIN | PGI | mpirun if GSIPROC > 1 |
[t36]
This section discusses setting up variables specific to a given case, such as analysis time, working directory, background and observation files, location of fixed files and CRTM coefficients, the GSI executable file, and the script generating GSI namelist.
#####################################################
# case set up (users should change this part)
#####################################################
#
# ANAL_TIME= analysis time (YYYYMMDDHH)
# WORK_ROOT= working directory, where GSI runs
# PREPBUFR = path of PrepBUFR conventional obs
# BK_FILE = path and name of background file
# OBS_ROOT = path of observations files
# FIX_ROOT = path of fix files
# GSI_EXE = path and name of the gsi executable
# ENS_ROOT = path where ensemble background files exist
ANAL_TIME=2017051318
JOB_DIR=the_job_directory
#normally you put run scripts here and submit jobs from here;
#this requires a copy of gsi.x in this directory
RUN_NAME=a_descriptive_run_name_such_as_case05_3denvar_etc
OBS_ROOT=the_directory_where_observation_files_are_located
BK_ROOT=the_directory_where_background_files_are_located
GSI_ROOT=the_comgsi_main_directory_where_src_ush_fix_etc_are_located
CRTM_ROOT=the_CRTM_directory
ENS_ROOT=the_directory_where_ensemble_backgrounds_are_located
#ENS_ROOT is not required if not running hybrid EnVAR
HH=`echo $ANAL_TIME | cut -c9-10`
GSI_EXE=${JOB_DIR}/gsi.x #assume you have a copy of gsi.x here
WORK_ROOT=${JOB_DIR}/${RUN_NAME}
FIX_ROOT=${GSI_ROOT}/fix
GSI_NAMELIST=${GSI_ROOT}/ush/comgsi_namelist.sh
PREPBUFR=${OBS_ROOT}/nam.t${HH}z.prepbufr.tm00
BK_FILE=${BK_ROOT}/wrfinput_d01.${ANAL_TIME}
When picking the observation BUFR files, please be aware of the following:
- A GSI run will stop if the time in the background file does not match the cycle time in the observation BUFR file used for the run (there is a namelist option to turn this verification step off).
- Even if their contents are identical, PrepBUFR/BUFR files will differ if they were created on platforms with different endian byte order (Linux vs. IBM). Appendix A.1 discusses the conversion tool SSRC used to byte-swap observation files. Since release version 3.2, GSI compiled with PGI or Intel can automatically handle byte order issues in PrepBUFR and BUFR files, so users working with Intel or PGI compilers can directly link BUFR files of either byte order.
The next part of this block focuses on additional options that specify important aspects of the GSI configuration.
#------------------------------------------------
# bk_core= which WRF core is used as background (NMM or ARW or NMMB)
# bkcv_option= which background error covariance and parameter will be used
# (GLOBAL or NAM)
# if_clean = clean : delete temporary files in working directory (default)
#            no    : leave running directory as is (this is for debug only)
# if_observer = Yes : only used as observation operator for enkf
# if_hybrid = Yes : Run GSI as 3D/4D EnVar
# if_4DEnVar = Yes : Run GSI as 4D EnVar
# if_nemsio = Yes : The GFS background files are in NEMSIO format
# if_oneob = Yes : Do single observation test
if_hybrid=No # Yes, or, No -- case sensitive !
if_4DEnVar=No # Yes, or, No -- case sensitive (set if_hybrid=Yes first)!
if_observer=No # Yes, or, No -- case sensitive !
if_nemsio=No # Yes, or, No -- case sensitive !
if_oneob=No # Yes, or, No -- case sensitive !
bk_core=ARW
bkcv_option=NAM
if_clean=clean
#
# setup whether to do single obs test
if [ ${if_oneob} = Yes ]; then
  if_oneobtest='.true.'
else
  if_oneobtest='.false.'
fi
#
# setup for GSI 3D/4D EnVar hybrid
if [ ${if_hybrid} = Yes ] ; then
  PDYa=`echo $ANAL_TIME | cut -c1-8`
  cyca=`echo $ANAL_TIME | cut -c9-10`
  gdate=`date -u -d "$PDYa $cyca -6 hour" +%Y%m%d%H` #guess date is 6hr ago
  gHH=`echo $gdate |cut -c9-10`
  datem1=`date -u -d "$PDYa $cyca -1 hour" +%Y-%m-%d_%H:%M:%S` #1hr ago
  datep1=`date -u -d "$PDYa $cyca 1 hour" +%Y-%m-%d_%H:%M:%S` #1hr later
  if [ ${if_nemsio} = Yes ]; then
    if_gfs_nemsio='.true.'
    ENSEMBLE_FILE_mem=${ENS_ROOT}/gdas.t${gHH}z.atmf006s.mem
  else
    if_gfs_nemsio='.false.'
    ENSEMBLE_FILE_mem=${ENS_ROOT}/sfg_${gdate}_fhr06s_mem
  fi
  if [ ${if_4DEnVar} = Yes ] ; then
    BK_FILE_P1=${BK_ROOT}/wrfout_d01_${datep1}
    BK_FILE_M1=${BK_ROOT}/wrfout_d01_${datem1}
    if [ ${if_nemsio} = Yes ]; then
      ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/gdas.t${gHH}z.atmf009s.mem
      ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/gdas.t${gHH}z.atmf003s.mem
    else
      ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/sfg_${gdate}_fhr09s_mem
      ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/sfg_${gdate}_fhr03s_mem
    fi
  fi
fi
# The following two only apply when if_observer = Yes, i.e. run observation operator for EnKF
# no_member number of ensemble members
# BK_FILE_mem path and base for ensemble members
no_member=20
BK_FILE_mem=${BK_ROOT}/wrfarw.mem
#
Option if_hybrid controls whether to run a hybrid ensemble/variational data analysis. If if_hybrid=Yes, option if_4DEnVar=Yes indicates that a hybrid 4D EnVar analysis will be run, while if_4DEnVar=No indicates that a hybrid 3D EnVar analysis will be run. Option if_observer determines whether GSI is run as an observation operator for EnKF.
Option bk_core indicates the specific dynamic core used to create the background files and specifies the core in the namelist. Option bk_core can be ARW or NMMB. Option bkcv_option specifies the background error covariance to be used in the case. Two regional background error covariance matrices are provided with the release, one from NCEP global data assimilation (GDAS), and one from the NAM data assimilation system (NDAS). Please check Section [sec4.8] for more details about GSI background error covariance. Option if_clean tells the script if it needs to delete temporary intermediate files in the working directory after a GSI run is completed.
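The hybrid block in the listing above derives the guess date by stepping ANAL_TIME back 6 hours. As an illustration of that arithmetic only, the sketch below does the same shift through epoch seconds. It assumes GNU date (the `-d` option), and `shift_hours` is a hypothetical helper that does not appear in the GSI scripts.

```shell
#!/bin/sh
# Sketch: shift a YYYYMMDDHH time by a whole number of hours (GNU date only).
# shift_hours is a hypothetical helper, not part of the GSI scripts.
shift_hours() {
    t=$1        # time as YYYYMMDDHH
    hours=$2    # signed offset in hours, for example -6
    ymd=$(echo "$t" | cut -c1-8)
    hh=$(echo "$t" | cut -c9-10)
    # Convert to seconds since the epoch (UTC), add the offset, convert back
    epoch=$(date -u -d "${ymd} ${hh}:00:00" +%s)
    date -u -d "@$((epoch + hours * 3600))" +%Y%m%d%H
}

ANAL_TIME=2017051318
gdate=$(shift_hours "$ANAL_TIME" -6)
echo "guess date: ${gdate}"    # prints: guess date: 2017051312
```

Going through epoch seconds avoids any ambiguity in how a relative offset is parsed, and it handles day, month, and year boundaries, for example `shift_hours 2017051300 -6` gives 2017051218.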
In most cases, users should only make minor changes after the following:
#####################################################
# Users should NOT change script after this point
#####################################################
#
BYTE_ORDER=Big_Endian
# BYTE_ORDER=Little_Endian
Description of the Sample Regional Run Script to Run GSI¶
Listed below is an annotated regional run script with explanations on each function block.
For further details on the first three blocks of the script that users need to change, see sections 3.2.2.1, 3.2.2.2, and 3.2.2.3:
#!/bin/ksh
#####################################################
# machine set up (users should change this part)
#####################################################
set -x
#
# GSIPROC = processor number used for GSI analysis
#------------------------------------------------
GSIPROC=1
ARCH='LINUX_LSF'
# Supported configurations:
# IBM_LSF,
# LINUX, LINUX_LSF, LINUX_PBS,
# DARWIN_PGI
#
#####################################################
# case set up (users should change this part)
#####################################################
#
# ANAL_TIME= analysis time (YYYYMMDDHH)
# WORK_ROOT= working directory, where GSI runs
# PREPBUFR = path of PrepBUFR conventional obs
# BK_FILE = path and name of background file
# OBS_ROOT = path of observations files
# FIX_ROOT = path of fix files
# GSI_EXE = path and name of the gsi executable
# ENS_ROOT = path where ensemble background files exist
ANAL_TIME=2017051318
JOB_DIR=the_job_directory
#normally you put run scripts here and submit jobs from here; this requires a copy of gsi.x in this directory
RUN_NAME=a_descriptive_run_name_such_as_case05_3denvar_etc
OBS_ROOT=the_directory_where_observation_files_are_located
BK_ROOT=the_directory_where_background_files_are_located
GSI_ROOT=the_comgsi_main_directory_where_src_ush_fix_etc_are_located
CRTM_ROOT=the_CRTM_directory
ENS_ROOT=the_directory_where_ensemble_backgrounds_are_located
#ENS_ROOT is not required if not running hybrid EnVAR
HH=`echo $ANAL_TIME | cut -c9-10`
GSI_EXE=${JOB_DIR}/gsi.x #assume you have a copy of gsi.x here
WORK_ROOT=${JOB_DIR}/${RUN_NAME}
FIX_ROOT=${GSI_ROOT}/fix
GSI_NAMELIST=${GSI_ROOT}/ush/comgsi_namelist.sh
PREPBUFR=${OBS_ROOT}/nam.t${HH}z.prepbufr.tm00
BK_FILE=${BK_ROOT}/wrfinput_d01.${ANAL_TIME}
#
#------------------------------------------------
# bk_core= which WRF core is used as background (NMM or ARW or NMMB)
# bkcv_option= which background error covariance and parameter will be used
# (GLOBAL or NAM)
# if_clean = clean : delete temporary files in working directory (default)
#            no    : leave running directory as is (this is for debug only)
# if_observer = Yes : only used as observation operator for enkf
# if_hybrid = Yes : Run GSI as 3D/4D EnVar
# if_4DEnVar = Yes : Run GSI as 4D EnVar
# if_nemsio = Yes : The GFS background files are in NEMSIO format
# if_oneob = Yes : Do single observation test
if_hybrid=No # Yes, or, No -- case sensitive !
if_4DEnVar=No # Yes, or, No -- case sensitive (set if_hybrid=Yes first)!
if_observer=No # Yes, or, No -- case sensitive !
if_nemsio=No # Yes, or, No -- case sensitive !
if_oneob=No # Yes, or, No -- case sensitive !
bk_core=ARW
bkcv_option=NAM
if_clean=clean
#
# setup whether to do single obs test
if [ ${if_oneob} = Yes ]; then
  if_oneobtest='.true.'
else
  if_oneobtest='.false.'
fi
#
# setup for GSI 3D/4D EnVar hybrid
if [ ${if_hybrid} = Yes ] ; then
  PDYa=`echo $ANAL_TIME | cut -c1-8`
  cyca=`echo $ANAL_TIME | cut -c9-10`
  gdate=`date -u -d "$PDYa $cyca -6 hour" +%Y%m%d%H` #guess date is 6hr ago
  gHH=`echo $gdate |cut -c9-10`
  datem1=`date -u -d "$PDYa $cyca -1 hour" +%Y-%m-%d_%H:%M:%S` #1hr ago
  datep1=`date -u -d "$PDYa $cyca 1 hour" +%Y-%m-%d_%H:%M:%S` #1hr later
  if [ ${if_nemsio} = Yes ]; then
    if_gfs_nemsio='.true.'
    ENSEMBLE_FILE_mem=${ENS_ROOT}/gdas.t${gHH}z.atmf006s.mem
  else
    if_gfs_nemsio='.false.'
    ENSEMBLE_FILE_mem=${ENS_ROOT}/sfg_${gdate}_fhr06s_mem
  fi
  if [ ${if_4DEnVar} = Yes ] ; then
    BK_FILE_P1=${BK_ROOT}/wrfout_d01_${datep1}
    BK_FILE_M1=${BK_ROOT}/wrfout_d01_${datem1}
    if [ ${if_nemsio} = Yes ]; then
      ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/gdas.t${gHH}z.atmf009s.mem
      ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/gdas.t${gHH}z.atmf003s.mem
    else
      ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/sfg_${gdate}_fhr09s_mem
      ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/sfg_${gdate}_fhr03s_mem
    fi
  fi
fi
# The following two only apply when if_observer = Yes, i.e. run observation operator for EnKF
# no_member number of ensemble members
# BK_FILE_mem path and base for ensemble members
no_member=20
BK_FILE_mem=${BK_ROOT}/wrfarw.mem
#
#
At this point, users should be able to run the GSI for simple cases without changing the scripts. However, some advanced users may need to change some of the following blocks for special applications, such as use of radiance data, cycled runs, specifying certain namelist variables, or running GSI on a platform not tested by the DTC.
#####################################################
# Users should NOT change script after this point
#####################################################
The next block sets the run command for GSI on multiple platforms. The ARCH variable is set at the beginning of the script. Option BYTE_ORDER has been set as Big_Endian because GSI compiled with Intel and PGI can read a Big_Endian background error file, BUFR files, and CRTM coefficient files.
#####################################################
# Users should NOT make changes after this point
#####################################################
#
BYTE_ORDER=Big_Endian
# BYTE_ORDER=Little_Endian
case $ARCH in
  'IBM_LSF')
    ###### IBM LSF (Load Sharing Facility)
    RUN_COMMAND="mpirun.lsf " ;;
  'LINUX')
    if [ $GSIPROC = 1 ]; then
      #### Linux workstation - single processor
      RUN_COMMAND=""
    else
      ###### Linux workstation - mpi run
      RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach "
    fi ;;
  'LINUX_LSF')
    ###### LINUX LSF (Load Sharing Facility)
    RUN_COMMAND="mpirun.lsf " ;;
  'LINUX_PBS')
    #### Linux cluster PBS (Portable Batch System)
    RUN_COMMAND="mpirun -np ${GSIPROC} " ;;
  'DARWIN_PGI')
    ### Mac - mpi run
    if [ $GSIPROC = 1 ]; then
      #### Mac workstation - single processor
      RUN_COMMAND=""
    else
      ###### Mac workstation - mpi run
      RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach "
    fi ;;
  * )
    print "error: $ARCH is not a supported platform configuration."
    exit 1 ;;
esac
The next block checks if all the variables needed for a GSI run are properly defined. These variables should have been defined in the first three parts of this script.
##################################################################################
# Check GSI needed environment variables are defined and exist
#
# Make sure ANAL_TIME is defined
if [ ! "${ANAL_TIME}" ]; then
  echo "ERROR: \$ANAL_TIME is not defined!"
  exit 1
fi
# Make sure WORK_ROOT is defined
if [ ! "${WORK_ROOT}" ]; then
  echo "ERROR: \$WORK_ROOT is not defined!"
  exit 1
fi
# Make sure the background file exists and is readable
if [ ! -r "${BK_FILE}" ]; then
  echo "ERROR: ${BK_FILE} does not exist!"
  exit 1
fi
# Make sure OBS_ROOT is defined and exists
if [ ! "${OBS_ROOT}" ]; then
  echo "ERROR: \$OBS_ROOT is not defined!"
  exit 1
fi
if [ ! -d "${OBS_ROOT}" ]; then
  echo "ERROR: OBS_ROOT directory '${OBS_ROOT}' does not exist!"
  exit 1
fi
# Make sure FIX_ROOT (path to the GSI static files) is defined and exists
if [ ! "${FIX_ROOT}" ]; then
  echo "ERROR: \$FIX_ROOT is not defined!"
  exit 1
fi
if [ ! -d "${FIX_ROOT}" ]; then
  echo "ERROR: fix directory '${FIX_ROOT}' does not exist!"
  exit 1
fi
# Make sure CRTM_ROOT (path to the CRTM coefficients) is defined and exists
if [ ! "${CRTM_ROOT}" ]; then
  echo "ERROR: \$CRTM_ROOT is not defined!"
  exit 1
fi
if [ ! -d "${CRTM_ROOT}" ]; then
  echo "ERROR: CRTM directory '${CRTM_ROOT}' does not exist!"
  exit 1
fi
# Make sure the GSI executable exists and is executable
if [ ! -x "${GSI_EXE}" ]; then
  echo "ERROR: ${GSI_EXE} does not exist!"
  exit 1
fi
# Check to make sure the number of processors for running GSI was specified
if [ -z "${GSIPROC}" ]; then
  echo "ERROR: The variable \$GSIPROC must be set to contain the number of processors to run GSI"
  exit 1
fi
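The checks above test only that ANAL_TIME is set, not that it has the YYYYMMDDHH form. A user who wants a stricter check could add something like the following sketch. `valid_anal_time` is a hypothetical helper, not part of the released script, and it checks digit count and field ranges only, not whether the day exists in the given month.

```shell
#!/bin/sh
# Sketch: check that a string looks like YYYYMMDDHH.
# valid_anal_time is a hypothetical helper, not part of the GSI scripts.
valid_anal_time() {
    # Must be exactly ten digits
    case $1 in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    mm=$(echo "$1" | cut -c5-6)
    dd=$(echo "$1" | cut -c7-8)
    hh=$(echo "$1" | cut -c9-10)
    # Month 01-12, day 01-31, hour 00-23 (pattern matching avoids
    # any trouble with leading zeros in numeric comparisons)
    case $mm in 0[1-9]|1[0-2]) ;; *) return 1 ;; esac
    case $dd in 0[1-9]|[12][0-9]|3[01]) ;; *) return 1 ;; esac
    case $hh in [01][0-9]|2[0-3]) ;; *) return 1 ;; esac
    return 0
}

if valid_anal_time 2017051318; then
    echo "ANAL_TIME format looks valid"
fi
```

In a run script, the function would be called with "${ANAL_TIME}" and followed by the same `echo "ERROR: ..."` and `exit 1` pattern used by the other checks.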
The next block creates a working directory (workdir) in which GSI will run. The directory should have enough disk space to hold all the files needed for this run. Because the script deletes and recreates this directory before each run, save any files you need from a previous run before rerunning GSI.
##################################################################################
# Create the work directory and cd into it
workdir=${WORK_ROOT}
echo " Create working directory:" ${workdir}
if [ -d "${workdir}" ]; then
  rm -rf ${workdir}
fi
mkdir -p ${workdir}
cd ${workdir}
#
##################################################################################
echo " Copy GSI executable, background file, and link observation bufr to working directory"
# Save a copy of the GSI executable in the workdir
cp ${GSI_EXE} gsi.exe
# Bring over background field (it's modified by GSI so we can't link to it)
cp ${BK_FILE} ./wrf_inout
if [ ${if_4DEnVar} = Yes ] ; then
cp ${BK_FILE_P1} ./wrf_inou3
cp ${BK_FILE_M1} ./wrf_inou1
fi
Note: You can link observation files to the working directory because GSI will not overwrite these files. The observations that can be analyzed in GSI are listed in the column “dfile” of the GSI namelist section OBS_INPUT, as specified in run/comgsi_namelist.sh. Most of the conventional observations are in one single file named prepbufr, while different radiance data are in separate files based on satellite instruments, such as AMSU-A or HIRS. All these observation files must be linked as GSI recognized file names in “dfile.” Please check table [t31] for a detailed explanation of links and the meanings of each file name listed below.
# Link to the prepbufr data
ln -s ${PREPBUFR} ./prepbufr
# ln -s ${OBS_ROOT}/gdas1.t${HH}z.sptrmm.tm00.bufr_d tmirrbufr
# Link to the radiance data
srcobsfile[1]=${OBS_ROOT}/gdas1.t${HH}z.satwnd.tm00.bufr_d
gsiobsfile[1]=satwnd
srcobsfile[2]=${OBS_ROOT}/gdas1.t${HH}z.1bamua.tm00.bufr_d
gsiobsfile[2]=amsuabufr
srcobsfile[3]=${OBS_ROOT}/gdas1.t${HH}z.1bhrs4.tm00.bufr_d
gsiobsfile[3]=hirs4bufr
srcobsfile[4]=${OBS_ROOT}/gdas1.t${HH}z.1bmhs.tm00.bufr_d
gsiobsfile[4]=mhsbufr
srcobsfile[5]=${OBS_ROOT}/gdas1.t${HH}z.1bamub.tm00.bufr_d
gsiobsfile[5]=amsubbufr
srcobsfile[6]=${OBS_ROOT}/gdas1.t${HH}z.ssmisu.tm00.bufr_d
gsiobsfile[6]=ssmirrbufr
# srcobsfile[7]=${OBS_ROOT}/gdas1.t${HH}z.airsev.tm00.bufr_d
gsiobsfile[7]=airsbufr
srcobsfile[8]=${OBS_ROOT}/gdas1.t${HH}z.sevcsr.tm00.bufr_d
gsiobsfile[8]=seviribufr
srcobsfile[9]=${OBS_ROOT}/gdas1.t${HH}z.iasidb.tm00.bufr_d
gsiobsfile[9]=iasibufr
srcobsfile[10]=${OBS_ROOT}/gdas1.t${HH}z.gpsro.tm00.bufr_d
gsiobsfile[10]=gpsrobufr
srcobsfile[11]=${OBS_ROOT}/gdas1.t${HH}z.amsr2.tm00.bufr_d
gsiobsfile[11]=amsrebufr
srcobsfile[12]=${OBS_ROOT}/gdas1.t${HH}z.atms.tm00.bufr_d
gsiobsfile[12]=atmsbufr
srcobsfile[13]=${OBS_ROOT}/gdas1.t${HH}z.geoimr.tm00.bufr_d
gsiobsfile[13]=gimgrbufr
srcobsfile[14]=${OBS_ROOT}/gdas1.t${HH}z.gome.tm00.bufr_d
gsiobsfile[14]=gomebufr
srcobsfile[15]=${OBS_ROOT}/gdas1.t${HH}z.omi.tm00.bufr_d
gsiobsfile[15]=omibufr
srcobsfile[16]=${OBS_ROOT}/gdas1.t${HH}z.osbuv8.tm00.bufr_d
gsiobsfile[16]=sbuvbufr
srcobsfile[17]=${OBS_ROOT}/gdas1.t${HH}z.eshrs3.tm00.bufr_d
gsiobsfile[17]=hirs3bufrears
srcobsfile[18]=${OBS_ROOT}/gdas1.t${HH}z.esamua.tm00.bufr_d
gsiobsfile[18]=amsuabufrears
srcobsfile[19]=${OBS_ROOT}/gdas1.t${HH}z.esmhs.tm00.bufr_d
gsiobsfile[19]=mhsbufrears
srcobsfile[20]=${OBS_ROOT}/rap.t${HH}z.nexrad.tm00.bufr_d
gsiobsfile[20]=l2rwbufr
srcobsfile[21]=${OBS_ROOT}/rap.t${HH}z.lgycld.tm00.bufr_d
gsiobsfile[21]=larcglb
ii=1
while [[ $ii -le 21 ]]; do
if [ -r "${srcobsfile[$ii]}" ]; then
ln -s ${srcobsfile[$ii]} ${gsiobsfile[$ii]}
echo "link source obs file ${srcobsfile[$ii]}"
fi
(( ii = $ii + 1 ))
done
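To see the effect of the readability test in the loop above, the following sketch runs the same logic in a scratch directory with two source files, only one of which exists. The source file names and the temporary directories are illustrative assumptions, not GSI inputs.

```shell
#!/bin/bash
# Scratch directories standing in for OBS_ROOT and the GSI working directory
obs_root=$(mktemp -d)
work_dir=$(mktemp -d)
touch "${obs_root}/demo.1bamua.bufr_d"      # this source file is present
# demo.1bmhs.bufr_d is deliberately absent

srcobsfile[1]=${obs_root}/demo.1bamua.bufr_d
gsiobsfile[1]=amsuabufr
srcobsfile[2]=${obs_root}/demo.1bmhs.bufr_d
gsiobsfile[2]=mhsbufr

cd "${work_dir}" || exit 1
ii=1
while [[ $ii -le 2 ]]; do
  # Link only the source files that exist and are readable
  if [ -r "${srcobsfile[$ii]}" ]; then
    ln -s "${srcobsfile[$ii]}" "${gsiobsfile[$ii]}"
    echo "link source obs file ${srcobsfile[$ii]}"
  fi
  (( ii = ii + 1 ))
done
ls -1
```

Missing source files are skipped silently, so the run log is the place to confirm which observation files were actually linked.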
The following block copies constant fixed files from the fix/ directory and links CRTM coefficients. Please check Section 3.1 for the meanings of each fixed file.
##################################################################################
echo " Copy fixed files and link CRTM coefficient files to working directory"
# Set fixed files
# berror = forecast model background error statistics
# specoef = CRTM spectral coefficients
# trncoef = CRTM transmittance coefficients
# emiscoef = CRTM coefficients for IR sea surface emissivity model
# aerocoef = CRTM coefficients for aerosol effects
# cldcoef = CRTM coefficients for cloud effects
# satinfo = text file with information about assimilation of brightness temperatures
# satangl = angle dependent bias correction file (fixed in time)
# pcpinfo = text file with information about assimilation of precipitation rates
# ozinfo = text file with information about assimilation of ozone data
# errtable = text file with obs error for conventional data (regional only)
# convinfo = text file with information about assimilation of conventional data
# bufrtable= text file ONLY needed for single obs test (oneobtest=.true.)
# bftab_sst= bufr table for sst ONLY needed for sst retrieval (retrieval=.true.)
Note: For background error covariances, observation errors, and analysis variable information, we provide two sets of fixed files. One set is based on GFS statistics and another is based on NAM statistics. For this release there is an additional setting in the ANAVINFO file for “bk_core” for both GFS and NAM statistics.
if [ ${bkcv_option} = GLOBAL ] ; then
echo ' Use global background error covariance'
BERROR=${FIX_ROOT}/${BYTE_ORDER}/nam_glb_berror.f77.gcv
OBERROR=${FIX_ROOT}/prepobs_errtable.global
if [ ${bk_core} = NMM ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_ndas_netcdf_glbe
fi
if [ ${bk_core} = ARW ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_arw_netcdf_glbe
fi
if [ ${bk_core} = NMMB ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_nems_nmmb_glb
fi
else
echo ' Use NAM background error covariance'
BERROR=${FIX_ROOT}/${BYTE_ORDER}/nam_nmmstat_na.gcv
OBERROR=${FIX_ROOT}/nam_errtable.r3dv
if [ ${bk_core} = NMM ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_ndas_netcdf
fi
if [ ${bk_core} = ARW ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_arw_netcdf
fi
if [ ${bk_core} = NMMB ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_nems_nmmb
fi
fi
SATINFO=${FIX_ROOT}/global_satinfo.txt
CONVINFO=${FIX_ROOT}/global_convinfo.txt
OZINFO=${FIX_ROOT}/global_ozinfo.txt
PCPINFO=${FIX_ROOT}/global_pcpinfo.txt
# copy Fixed fields to working directory
cp $ANAVINFO anavinfo
cp $BERROR berror_stats
cp $SATINFO satinfo
cp $CONVINFO convinfo
cp $OZINFO ozinfo
cp $PCPINFO pcpinfo
cp $OBERROR errtable
#
# # CRTM Spectral and Transmittance coefficients
CRTM_ROOT_ORDER=${CRTM_ROOT}/${BYTE_ORDER}
emiscoef_IRwater=${CRTM_ROOT_ORDER}/Nalli.IRwater.EmisCoeff.bin
emiscoef_IRice=${CRTM_ROOT_ORDER}/NPOESS.IRice.EmisCoeff.bin
emiscoef_IRland=${CRTM_ROOT_ORDER}/NPOESS.IRland.EmisCoeff.bin
emiscoef_IRsnow=${CRTM_ROOT_ORDER}/NPOESS.IRsnow.EmisCoeff.bin
emiscoef_VISice=${CRTM_ROOT_ORDER}/NPOESS.VISice.EmisCoeff.bin
emiscoef_VISland=${CRTM_ROOT_ORDER}/NPOESS.VISland.EmisCoeff.bin
emiscoef_VISsnow=${CRTM_ROOT_ORDER}/NPOESS.VISsnow.EmisCoeff.bin
emiscoef_VISwater=${CRTM_ROOT_ORDER}/NPOESS.VISwater.EmisCoeff.bin
emiscoef_MWwater=${CRTM_ROOT_ORDER}/FASTEM6.MWwater.EmisCoeff.bin
aercoef=${CRTM_ROOT_ORDER}/AerosolCoeff.bin
cldcoef=${CRTM_ROOT_ORDER}/CloudCoeff.bin
ln -s $emiscoef_IRwater ./Nalli.IRwater.EmisCoeff.bin
ln -s $emiscoef_IRice ./NPOESS.IRice.EmisCoeff.bin
ln -s $emiscoef_IRsnow ./NPOESS.IRsnow.EmisCoeff.bin
ln -s $emiscoef_IRland ./NPOESS.IRland.EmisCoeff.bin
ln -s $emiscoef_VISice ./NPOESS.VISice.EmisCoeff.bin
ln -s $emiscoef_VISland ./NPOESS.VISland.EmisCoeff.bin
ln -s $emiscoef_VISsnow ./NPOESS.VISsnow.EmisCoeff.bin
ln -s $emiscoef_VISwater ./NPOESS.VISwater.EmisCoeff.bin
ln -s $emiscoef_MWwater ./FASTEM6.MWwater.EmisCoeff.bin
ln -s $aercoef ./AerosolCoeff.bin
ln -s $cldcoef ./CloudCoeff.bin
# Copy CRTM coefficient files based on entries in satinfo file
for file in `awk '{if($1!~"!"){print $1}}' ./satinfo | sort | uniq` ;do
ln -s ${CRTM_ROOT_ORDER}/${file}.SpcCoeff.bin ./
ln -s ${CRTM_ROOT_ORDER}/${file}.TauCoeff.bin ./
done
# Only need this file for single obs test
bufrtable=${FIX_ROOT}/prepobs_prep.bufrtable
cp $bufrtable ./prepobs_prep.bufrtable
# for satellite bias correction
# Users may need to use their own satbias files for correct bias correction
cp ${GSI_ROOT}/fix/comgsi_satbias_in ./satbias_in
cp ${GSI_ROOT}/fix/comgsi_satbias_pc_in ./satbias_pc_in
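As a stand-alone illustration of the CRTM linking loop in the script: the `awk` filter keeps the first column of every satinfo line whose first field does not contain `!`, and `sort | uniq` reduces the result to one entry per sensor. The sketch below applies the same filter to a small stand-in file. The file contents are an illustrative assumption, not a real satinfo.

```shell
#!/bin/bash
# Build a tiny stand-in for satinfo: one comment line and three channel
# lines covering two sensors (contents are illustrative only)
demo_satinfo=$(mktemp)
cat > "${demo_satinfo}" << 'EOF'
!sensor/instr/sat   chan iuse
 amsua_n15            1    1
 amsua_n15            2    1
 hirs4_metop-a        1    1
EOF

# Same filter as the run script: skip lines whose first field contains "!"
sensors=$(awk '{if($1!~"!"){print $1}}' "${demo_satinfo}" | sort | uniq)
for file in ${sensors}; do
  echo "would link ${file}.SpcCoeff.bin and ${file}.TauCoeff.bin"
done
```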
Please note that in the above sample script, two files related to radiance bias correction are copied to the work directory:
cp ${GSI_ROOT}/fix/comgsi_satbias_in ./satbias_in
cp ${GSI_ROOT}/fix/comgsi_satbias_pc_in ./satbias_pc_in
There are two ways to perform the radiance bias correction. The first method does the angle dependent bias correction offline and the mass bias correction inside the GSI analysis, and therefore requires two input files: satbias_angle, the angle dependent bias correction file, and satbias_in, the input file for mass bias correction. The second method combines the angle dependent and mass bias correction and does both within the GSI analysis, requiring one combined input file: satbias_in. Note that the input bias correction coefficients file, satbias_in, is different for the two options, so it is important to use the appropriate input file for each method. The sample input files for the first method are provided with this release: global_satangbias.txt and sample.satbias. To use the second option (combined angle dependent and mass bias correction), a sample file, gdas1.t00z.abias_pc.20150617, is also provided. As a starting point, users may also download a GDAS satbias coefficient file from the NOMADS ftp site as the input file (starting in spring 2015, the GDAS satbias files have adopted the following format):
ftp://nomads.ncdc.noaa.gov/GDAS/YYYYMM/YYYYMMDD/gdas1.tHHz.abias
In order to use the combined angle dependent and mass bias correction, users also need to set adp_anglebc=.true. in the &SETUP section of the GSI namelist (comgsi_namelist.sh). For more details about the namelist, please see Appendix C in this document.
The next block sets up some constants used in the GSI namelist. Please note that bkcv_option selects the background error tuning, and the constants should be set based on the specific application. Here we provide three sample sets of constants for different background error covariance options: one used in NAM operations, one in GFS operations, and one in NMMB operations. This release includes the NMMB capability, so namelist settings for NMMB are provided in addition to those for the NMM and ARW applications.
##################################################################################
# Set some parameters for use by the GSI executable and to build the namelist
echo " Build the namelist "
# default is NAM
# as_op='1.0,1.0,0.5 ,0.7,0.7,0.5,1.0,1.0,'
vs_op='1.0,'
hzscl_op='0.373,0.746,1.50,'
if [ ${bkcv_option} = GLOBAL ] ; then
# as_op='0.6,0.6,0.75,0.75,0.75,0.75,1.0,1.0'
vs_op='0.7,'
hzscl_op='1.7,0.8,0.5,'
fi
if [ ${bk_core} = NMMB ] ; then
vs_op='0.6,'
fi
# default is NMM
bk_core_arw='.false.'
bk_core_nmm='.true.'
bk_core_nmmb='.false.'
bk_if_netcdf='.true.'
if [ ${bk_core} = ARW ] ; then
bk_core_arw='.true.'
bk_core_nmm='.false.'
bk_core_nmmb='.false.'
bk_if_netcdf='.true.'
fi
if [ ${bk_core} = NMMB ] ; then
bk_core_arw='.false.'
bk_core_nmm='.false.'
bk_core_nmmb='.true.'
bk_if_netcdf='.false.'
fi
The following section specifies the number of outer loops and whether to save GSI read observations based on the setting of “if_observer”.
if [ ${if_observer} = Yes ] ; then
nummiter=0
if_read_obs_save='.true.'
if_read_obs_skip='.false.'
else
nummiter=2
if_read_obs_save='.false.'
if_read_obs_skip='.false.'
fi
The following section of the script is used to generate the GSI namelist called gsiparm.anl in the working directory. A detailed explanation of each variable can be found in Section 3.4 and Appendix C.
# Build the GSI namelist on-the-fly
. $GSI_NAMELIST
The following block modifies the anavinfo file so that its vertical levels are consistent with the wrf_inout file for WRF ARW or NMM. Users no longer need to manually modify the anavinfo file.
# modify the anavinfo vertical levels based on wrf_inout for WRF ARW and NMM
if [ ${bk_core} = ARW ] || [ ${bk_core} = NMM ] ; then
bklevels=`ncdump -h wrf_inout | grep "bottom_top =" | awk '{print $3}' `
bklevels_stag=`ncdump -h wrf_inout | grep "bottom_top_stag =" | awk '{print $3}' `
anavlevels=`cat anavinfo | grep ' sf ' | tail -1 | awk '{print $2}' ` # levels of sf, vp, u, v, t, etc
anavlevels_stag=`cat anavinfo | grep ' prse ' | tail -1 | awk '{print $2}' ` # levels of prse
sed -i 's/ '$anavlevels'/ '$bklevels'/g' anavinfo
sed -i 's/ '$anavlevels_stag'/ '$bklevels_stag'/g' anavinfo
fi
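The level lookup above relies on the `ncdump -h` header format, in which each dimension appears as `name = size ;`. The sketch below applies the same `grep`, `awk`, and `sed` steps to a canned header instead of calling `ncdump`, and to a two-line stand-in for anavinfo. The dimension sizes (50 and 51), the original level counts (30 and 31), and the column layout of the stand-in file are assumptions chosen for illustration. As in the run script, `sed -i` assumes GNU sed.

```shell
#!/bin/bash
# Canned excerpt of "ncdump -h wrf_inout" output (sizes are illustrative)
header='	bottom_top = 50 ;
	bottom_top_stag = 51 ;'

bklevels=$(echo "${header}" | grep "bottom_top =" | awk '{print $3}')
bklevels_stag=$(echo "${header}" | grep "bottom_top_stag =" | awk '{print $3}')

# Two-line stand-in for anavinfo (column layout is illustrative)
demo_anavinfo=$(mktemp)
cat > "${demo_anavinfo}" << 'EOF'
 sf       30      0       met_guess    sf
 prse     31      0       met_guess    prse
EOF

# Current level counts in the stand-in file
anavlevels=$(grep ' sf ' "${demo_anavinfo}" | tail -1 | awk '{print $2}')
anavlevels_stag=$(grep ' prse ' "${demo_anavinfo}" | tail -1 | awk '{print $2}')

# Replace them with the level counts found in the background header
sed -i 's/ '$anavlevels'/ '$bklevels'/g' "${demo_anavinfo}"
sed -i 's/ '$anavlevels_stag'/ '$bklevels_stag'/g' "${demo_anavinfo}"
cat "${demo_anavinfo}"
```

Because the substitution matches on the number alone, it is worth checking the edited anavinfo when other columns could contain the same value.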
The following block runs GSI and checks if GSI has successfully completed.
###################################################
# run GSI
###################################################
echo ' Run GSI with' ${bk_core} 'background'
case $ARCH in
'IBM_LSF')
${RUN_COMMAND} ./gsi.exe < gsiparm.anl > stdout 2>&1 ;;
* )
${RUN_COMMAND} ./gsi.exe > stdout 2>&1 ;;
esac
##################################################################
# run time error check
##################################################################
error=$?
if [ ${error} -ne 0 ]; then
echo "ERROR: ${GSI_EXE} crashed Exit status=${error}"
exit ${error}
fi
The following block saves the analysis results with an understandable name and adds the analysis time to some output file names. Among them, “stdout” contains runtime output of GSI and wrf_inout is the resulting analysis file.
##################################################################
#
# GSI updating satbias_in
#
# GSI updating satbias_in (only for cycling assimilation)
# Copy the output to more understandable names
ln -s stdout stdout.anl.${ANAL_TIME}
ln -s wrf_inout wrfanl.${ANAL_TIME}
ln -s fort.201 fit_p1.${ANAL_TIME}
ln -s fort.202 fit_w1.${ANAL_TIME}
ln -s fort.203 fit_t1.${ANAL_TIME}
ln -s fort.204 fit_q1.${ANAL_TIME}
ln -s fort.207 fit_rad1.${ANAL_TIME}
The following block collects the diagnostic files. The diagnostic files are merged and categorized based on outer loop and data type. Setting “write_diag” to true in the namelist directs GSI to write out diagnostic information for each observation. This information is very useful to check analysis details. Please check Appendix A.2 for the tool to read and analyze these diagnostic files.
# Loop over first and last outer loops to generate innovation
# diagnostic files for indicated observation types (groups)
#
# NOTE: Since we set miter=2 in GSI namelist SETUP, outer
# loop 03 will contain innovations with respect to
# the analysis. Creation of o-a innovation files
# is triggered by write_diag(3)=.true. The setting
# write_diag(1)=.true. turns on creation of o-g
# innovation files.
#
loops="01 03"
for loop in $loops; do
case $loop in
01) string=ges;;
03) string=anl;;
*) string=$loop;;
esac
# Collect diagnostic files for obs types (groups) below
# listall="conv amsua_metop-a mhs_metop-a hirs4_metop-a hirs2_n14 msu_n14 \
# sndr_g08 sndr_g10 sndr_g12 sndr_g08_prep sndr_g10_prep sndr_g12_prep \
# sndrd1_g08 sndrd2_g08 sndrd3_g08 sndrd4_g08 sndrd1_g10 sndrd2_g10 \
# sndrd3_g10 sndrd4_g10 sndrd1_g12 sndrd2_g12 sndrd3_g12 sndrd4_g12 \
# hirs3_n15 hirs3_n16 hirs3_n17 amsua_n15 amsua_n16 amsua_n17 \
# amsub_n15 amsub_n16 amsub_n17 hsb_aqua airs_aqua amsua_aqua \
# goes_img_g08 goes_img_g10 goes_img_g11 goes_img_g12 \
# pcp_ssmi_dmsp pcp_tmi_trmm sbuv2_n16 sbuv2_n17 sbuv2_n18 \
# omi_aura ssmi_f13 ssmi_f14 ssmi_f15 hirs4_n18 amsua_n18 mhs_n18 \
# amsre_low_aqua amsre_mid_aqua amsre_hig_aqua ssmis_las_f16 \
# ssmis_uas_f16 ssmis_img_f16 ssmis_env_f16 mhs_metop_b \
# hirs4_metop_b hirs4_n19 amsua_n19 mhs_n19"
listall=`ls pe* | cut -f2 -d"." | awk '{print substr($0, 1, length($0)-3)}' | sort | uniq`
for type in $listall; do
count=`ls pe*${type}_${loop}* | wc -l`
if [[ $count -gt 0 ]]; then
cat pe*${type}_${loop}* > diag_${type}_${string}.${ANAL_TIME}
fi
done
done
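As a sketch of what the merge loop above produces, the following example creates dummy per-processor files in a scratch directory and concatenates them by observation type and outer loop. The `pe*` file names follow the `pe????.(type)_(outer loop)` pattern described in this chapter. The file contents, the analysis time, and the use of `sed` to strip the loop suffix (in place of the `awk` call) are assumptions made for the illustration.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)
cd "${demo_dir}" || exit 1
ANAL_TIME=2017051312            # illustrative analysis time

# Dummy diagnostic pieces from two processors, two types, two outer loops
for pe in pe0000 pe0001; do
  for type in conv amsua_n15; do
    for loop in 01 03; do
      echo "${pe} ${type} ${loop}" > "${pe}.${type}_${loop}"
    done
  done
done

# Derive the list of observation types from the file names
listall=$(ls pe* | cut -f2 -d"." | sed 's/_[0-9][0-9]$//' | sort | uniq)

# Merge the per-processor pieces for each type and outer loop
for loop in 01 03; do
  case $loop in
    01) string=ges;;
    03) string=anl;;
  esac
  for type in $listall; do
    cat pe*${type}_${loop}* > diag_${type}_${string}.${ANAL_TIME}
  done
done
ls -1 diag_*
```

Each merged file holds the pieces from all processors for one observation type, which is why the per-processor `pe*` files can be removed afterwards.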
The following lines clean the temporary intermediate files:
# Clean working directory to save only important files
ls -l * > list_run_directory
if [[ ${if_clean} = clean && ${if_observer} != Yes ]]; then
echo ' Clean working directory after GSI run'
rm -f *Coeff.bin # all CRTM coefficient files
rm -f pe0* # diag files on each processor
rm -f obs_input.* # observation middle files
rm -f siganl sigf03 # background middle files
rm -f fsize_* # delete temporary files for bufr size
fi
The following block of the script runs only for if_observer=Yes, which runs GSI as an observation operator for EnKF without doing minimization. The script first renames the previous diagnostic files by appending .ensmean to their names, and renames the GSI analysis file to wrf_inout_ensmean, so that these files are not overwritten by the new GSI run.
#################################################
# start to calculate diag files for each member
#################################################
#
if [ ${if_observer} = Yes ] ; then
string=ges
for type in $listall; do
count=0
if [[ -f diag_${type}_${string}.${ANAL_TIME} ]]; then
mv diag_${type}_${string}.${ANAL_TIME} diag_${type}_${string}.ensmean
fi
done
mv wrf_inout wrf_inout_ensmean
Next, the script generates the namelist for each ensemble member.
# Build the GSI namelist on-the-fly for each member
nummiter=0
if_read_obs_save='.false.'
if_read_obs_skip='.true.'
. $GSI_NAMELIST
The rest of the script loops through the ensemble members to get the background ready, run GSI, and check the run status:
# Loop through each member
loop="01"
ensmem=1
while [[ $ensmem -le $no_member ]];do
rm pe0*
print "\$ensmem is $ensmem"
ensmemid=`printf %3.3i $ensmem`
# get new background for each member
if [[ -f wrf_inout ]]; then
rm wrf_inout
fi
BK_FILE=${BK_FILE_mem}${ensmemid}
echo $BK_FILE
ln -s $BK_FILE wrf_inout
# run GSI
echo ' Run GSI with' ${bk_core} 'for member ', ${ensmemid}
case $ARCH in
'IBM_LSF')
${RUN_COMMAND} ./gsi.exe < gsiparm.anl > stdout_mem${ensmemid} 2>&1 ;;
* )
${RUN_COMMAND} ./gsi.exe > stdout_mem${ensmemid} 2>&1 ;;
esac
# run time error check and save run time file status
error=$?
if [ ${error} -ne 0 ]; then
echo "ERROR: ${GSI_EXE} crashed for member ${ensmemid} Exit status=${error}"
exit ${error}
fi
ls -l * > list_run_directory_mem${ensmemid}
The following lines generate the diagnostics files for each member.
# generate diag files
for type in $listall; do
count=`ls pe*${type}_${loop}* | wc -l`
if [[ $count -gt 0 ]]; then
cat pe*${type}_${loop}* > diag_${type}_${string}.mem${ensmemid}
fi
done
The following lines advance to the next ensemble member and close the loop.
# next member
(( ensmem += 1 ))
done
fi
If this point is reached, GSI has finished successfully and the script exits with status “0”:
exit 0
GSI Analysis Result Files in Run Directory¶
Once the GSI run script is set up, it is ready to be submitted like any other batch job. When completed, GSI will create a number of files in the run directory. Below is an example of the files generated in the run directory from one of the GSI test case runs. This case was run to perform a regional GSI analysis with a WRF-ARW NetCDF background using conventional (prepbufr), radiance (AMSU-A, HIRS4, and MHS), and GPSRO data. The analysis time is 1200Z on 13 May 2017. Four processors were used. To make the run directory more readable, we turned on the clean option in the run script, which deleted all temporary intermediate files.
amsuabufr fort.206 hirs3bufrears
amsuabufrears fort.207 hirs4bufr
anavinfo fort.208 l2rwbufr
atmsbufr fort.209 larcglb
berror_stats fort.210 list_run_directory
convinfo fort.211 mhsbufr
diag_amsua_n15_anl.2017051312 fort.212 mhsbufrears
diag_amsua_n15_ges.2017051312 fort.213 omibufr
diag_amsua_n18_anl.2017051312 fort.214 ozinfo
diag_amsua_n18_ges.2017051312 fort.215 pcpbias_out
diag_amsua_n19_anl.2017051312 fort.217 pcpinfo
diag_amsua_n19_ges.2017051312 fort.218 prepbufr
diag_conv_anl.2017051312 fort.219 prepobs_prep.bufrtable
diag_conv_ges.2017051312 fort.220 radar_supobs_from_level2
diag_hirs4_n19_anl.2017051312 fort.221 satbias_angle
diag_hirs4_n19_ges.2017051312 fort.223 satbias_ang.out
diag_mhs_n18_anl.2017051312 fort.224 satbias_in
diag_mhs_n18_ges.2017051312 fort.225 satbias_out
diag_mhs_n19_anl.2017051312 fort.226 satbias_out.int
diag_mhs_n19_ges.2017051312 fort.227 satbias_pc_in
errtable fort.228 satbias_pc.out
fit_p1.2017051312 fort.229 satinfo
fit_q1.2017051312 fort.230 satwnd
fit_rad1.2017051312 fort.232 sbuvbufr
fit_t1.2017051312 fort.233 seviribufr
fit_w1.2017051312 fort.234 ssmirrbufr
fort.201 gimgrbufr stdout
fort.202 gomebufr stdout.anl.2017051312
fort.203 gpsrobufr wrfanl.2017051312
fort.204 gsi.exe wrf_inout
fort.205 gsiparm.anl
It is important to know which files hold the GSI analysis results, standard output, and diagnostic information. We will introduce these files and their contents in detail in the following chapter. The following is a brief list of what these files contain:
- stdout or stdout.anl.(time): standard text output file. stdout.anl.(time) is a link to stdout with the analysis time appended. This is the most commonly used file to check the GSI analysis processes and contains basic and important information about the analyses. We will explain the contents of the stdout file in Section 4.1 and users are encouraged to read this file in detail to become familiar with the order of GSI analysis processing.
- wrf_inout or wrfanl.(time): analysis results if GSI completes successfully. It exists only if using WRF for the background. The wrfanl.(time) file is a link to wrf_inout with the analysis time appended. The format is the same as the background file.
- diag_conv_anl.(time): binary diagnostic files for conventional and GPS RO observations at the final analysis step (analysis departure for each observation).
- diag_conv_ges.(time): binary diagnostic files for conventional and GPS RO observations before the initial analysis step (background departure for each observation).
- diag_(instrument_satellite)_anl: diagnostic files for satellite radiance observations at the final analysis step.
- diag_(instrument_satellite)_ges: diagnostic files for satellite radiance observations before the initial analysis step.
- gsiparm.anl: GSI namelist, generated by the run script.
- fit_(variable).(time): links to fort.2?? with meaningful names (variable name plus analysis time). They are statistic results of observation departures from background and analysis results according to observation variables. Please see Section 4.5 for more details.
- fort.220: output from the inner loop minimization (in pcgsoi.f90). Please see Section 4.6 for details.
- anavinfo: info file to set up control, state, and background variables. Please see the Advanced GSI Users Guide for details.
- *info (convinfo, satinfo, …): info files that control data usage. Please see Section 4.3 for details.
- berror_stats and errtable: background error file (binary) and observation error file (text).
- *bufr: observation BUFR files linked to the run directory. Please see Section 3.1 for details.
- satbias_in: the input coefficients of bias correction for satellite radiance observations.
- satbias_out: the output coefficients of bias correction for satellite radiance observations after the GSI run.
- satbias_pc: the input coefficients of bias correction for passive satellite radiance observations.
- list_run_directory : the complete list of files in the run directory before cleaning takes place. This is generated by the GSI run script.
The diag files, such as diag_(instrument_satellite)_anl.(time) and diag_conv_anl.(time), contain important information about the data used in the GSI, including observation departure from analysis results for each observation (O-A). Similarly, diag_conv_ges.(time) and diag_(instrument_satellite)_ges.(time) include the observation innovation for each observation (O-B). These files can be very helpful in understanding the detailed impact of data on the analysis. A tool is provided to process these files, which is introduced in Appendix A.2.
There are many intermediate files in this directory while GSI is running or if the run crashes. The complete list of files in the directory (prior to cleaning) is saved in file list_run_directory. Some knowledge about the content of these files is very helpful for debugging if the GSI run crashes. Please check table [t37] for the meaning of these files. (Note: you may not see all the files in the list because different observational data are used. Also, the fixed files prepared for a GSI run, such as CRTM coefficient files, are not included.)
File name | Content |
---|---|
sigf03 | This is a temporary file, holding binary format background files (typically sigf03, sigf06 and sigf09 if FGAT used). When you see this file, at the minimum, a background file was successfully read in. |
siganl | Analysis results in binary format. When this file exists, the analysis has finished. |
pe????.(conv or instrument_satellite)_(outer loop) | Diagnostic files for conventional and satellite radiance observations at each outer loop and each sub-domain (????=subdomain id). |
obs_input.???? | Observation scratch files (each file contains observations for one observation type within the whole analysis domain and time window. ????=observation type id in namelist). |
pcpbias_out | Output precipitation bias correction file. |
[t37]
Introduction to Frequently Used GSI Namelist Options¶
The complete namelist options and their explanations are listed in Appendix A of the Advanced GSI Users Guide. For most GSI analysis applications, only a few namelist variables need to be changed. Here we introduce frequently used variables for regional analyses:
Set Up the Number of Outer and Inner Loops¶
To change the number of outer loops and the number of inner iterations in each outer loop, the following three variables in the namelist need to be modified:
- miter: number of outer analysis loops.
- niter(1): maximum number of inner loop iterations for the 1st outer loop. The inner loop will stop when it reaches this maximum number, when it reaches the convergence threshold, or when it fails to converge.
- niter(2): maximum number of inner loop iterations for the 2nd outer loop.
- If miter is larger than two, repeat niter with larger index.
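A minimal sketch of how these settings might appear when a script writes the namelist with a here-document, in the manner of comgsi_namelist.sh. Only the three variables discussed here are shown. The iteration counts (50), the temporary output file, and the assumption that the run script variable nummiter feeds miter are illustrative, and a real &SETUP section contains many more entries.

```shell
#!/bin/bash
nummiter=2                       # number of outer loops, passed to miter
demo_namelist=$(mktemp)

# Write a fragment of the &SETUP section (illustrative values)
cat > "${demo_namelist}" << EOF
 &SETUP
   miter=${nummiter},niter(1)=50,niter(2)=50,
 /
EOF
cat "${demo_namelist}"
```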
Set Up the Analysis Variable for Moisture¶
There are two moisture analysis variable options, selected by the namelist variable qoption (1 or 2):
- If qoption=1, the moisture analysis variable is pseudo-relative humidity. The saturation specific humidity, qsatg, is computed from the guess and held constant during the inner loop. Thus, the relative humidity control variable can only change via changes in specific humidity, q.
- If qoption=2, the moisture analysis variable is normalized relative humidity. This formulation allows relative humidity to change in the inner loop via changes to surface pressure, temperature, or specific humidity.
Set Up the Background File¶
The following five variables define which background field will be used in the GSI analyses:
- regional: if true, perform a regional GSI run using either ARW or NMM inputs as the background. If false, perform a global GSI analysis. If either wrf_nmm_regional or wrf_mass_regional is true, it will be set to true.
- wrf_nmm_regional: if true, the background comes from WRF-NMM. When using other background fields, set it to false.
- wrf_mass_regional: if true, the background comes from WRF-ARW. When using other background fields, set it to false.
- nems_nmmb_regional: if true, the background comes from NMMB. When using other background fields, set it to false.
- netcdf: if true, WRF files are in NetCDF format, otherwise WRF files are in binary format. This option only works for a regional GSI analysis.
Set Up the Output of Diagnostic Files¶
The following variables tell the GSI to write out diagnostic results in certain loops:
- write_diag(1): if true, write out diagnostic data at the beginning of the analysis, so that we can have information on observation minus background (O-B) differences.
- write_diag(2): if true, write out diagnostic data at the end of the 1st outer loop (before the 2nd outer loop starts).
- write_diag(3): if true, write out diagnostic data at the end of the 2nd outer loop (after the analysis finishes if the outer loop number is two), so that we can have information on observation minus analysis (O-A) differences.
Please check Appendix A.2 for the tools to read the diagnostic files.
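The run script comments note that with miter=2 the O-A diagnostics are written as outer loop 03 and are triggered by write_diag(3). Assuming that pattern generalizes to index miter+1 for other numbers of outer loops, the loop labels can be computed as in the short sketch below. The variable names `ges_loop` and `anl_loop` are hypothetical.

```shell
#!/bin/bash
miter=2                                      # number of outer loops
ges_loop=$(printf "%02d" 1)                  # O-B: before the first outer loop
anl_loop=$(printf "%02d" $(( miter + 1 )))   # O-A: after the last outer loop
echo "O-B files end in _${ges_loop}, O-A files end in _${anl_loop}"
echo "O-A output requires write_diag($(( miter + 1 )))=.true."
```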
Set Up the GSI Recognized Observation Files¶
The following sets up the GSI recognized observation files for GSI observation ingest:
OBS_INPUT::
! dfile dtype dplat dsis dval dthin dsfcalc
prepbufr ps null ps 1.0 0 0
prepbufr t null t 1.0 0 0
prepbufr q null q 1.0 0 0
prepbufr pw null pw 1.0 0 0
satwndbufr uv null uv 1.0 0 0
prepbufr uv null uv 1.0 0 0
prepbufr spd null spd 1.0 0 0
prepbufr dw null dw 1.0 0 0
radarbufr rw null rw 1.0 0 0
prepbufr sst null sst 1.0 0 0
gpsrobufr gps_ref null gps 1.0 0 0
ssmirrbufr pcp_ssmi dmsp pcp_ssmi 1.0 -1 0
- dfile: GSI recognized observation file name. The observation file contains observations used for a GSI analysis. This file can include several observation variables from different observation types. The file name listed by this parameter will be read in by GSI. This name can be changed as long as the name in the link from the BUFR/PrepBUFR file in the run scripts also changes correspondingly.
- dtype: analysis variable name that GSI can read in. Please note this name should be consistent with that used in the GSI code.
- dplat: sets up the observation platform for a certain observation, which will be read in from the file dfile.
- dsis: sets up the data name (including both data type and platform name) used inside GSI.
Please see Section 4.3 for examples and explanations of these variables.
Set Up Observation Time Window¶
In the namelist section OBS_INPUT, use time_window_max to set the maximum half time window (hours) for all data types. In the convinfo file, you can use the column “twindow” to set the half time window for a certain data type (hours). For conventional observations, only observations within the smaller window of these two will be kept for further processing. For others, observations within time_window_max will be kept for further processing.
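For a conventional data type, the effective half window is therefore the smaller of the two settings. A minimal sketch of that rule follows, using `awk` for the floating-point comparison. The function name `effective_window` and the sample values are hypothetical.

```shell
#!/bin/bash
# Print the smaller of time_window_max ($1) and twindow ($2), both in hours
effective_window() {
  awk -v tmax="$1" -v twin="$2" 'BEGIN {
    w = (twin < tmax) ? twin : tmax
    print w
  }'
}

time_window_max=1.5
effective_window "${time_window_max}" 3.0   # twindow is larger: prints 1.5
effective_window "${time_window_max}" 0.5   # twindow is smaller: prints 0.5
```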
Set Up Data Thinning¶
- Radiance data thinning
Radiance data thinning is controlled through two GSI namelist variables in the section &OBS_INPUT. Below is an example:
&OBS_INPUT
dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=30,time_window_max=1.5,ext_sonde=.true.,
/
OBS_INPUT::
! dfile dtype dplat dsis dval dthin dsfcalc
prepbufr ps null ps 1.0 0 0
gpsrobufr gps_ref null gps 1.0 0 0
ssmirrbufr pcp_ssmi dmsp pcp_ssmi 1.0 -1 0
tmirrbufr pcp_tmi trmm pcp_tmi 1.0 -1 0
hirs3bufr hirs3 n17 hirs3_n17 6.0 1 0
hirs4bufr hirs4 metop-a hirs4_metop-a 6.0 2 0
The two namelist variables that control the radiance data thinning are the real array “dmesh” in the 1st line and the “dthin” values in the 6th column. The “dmesh” array sets mesh sizes for radiance thinning grids in kilometers, while “dthin” defines if the data type it represents needs to be thinned and which thinning grid (mesh size) to use. If the value of dthin is:
- an integer less than or equal to zero, no thinning is needed
- an integer larger than zero, this kind of radiance data will be thinned using the mesh size defined as dmesh(dthin).
The following section provides several thinning examples defined by the above sample &OBS_INPUT section:
- Data type ps from prepbufr: no thinning because dthin=0
- Data type gps_ref from gpsrobufr: no thinning because dthin=0
- Data type pcp_ssmi from dmsp: no thinning because dthin=-1
- Data type hirs3 from NOAA-17: thinning in a 120 km grid because dthin=1 and dmesh(1)=120
- Data type hirs4 from metop-a: thinning in a 60 km grid because dthin=2 and dmesh(2)=60
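The lookup rule can be sketched in a few lines of bash: dthin is treated as an index into dmesh, and values of zero or less disable thinning. The function name `thinning_mesh` is hypothetical. The dmesh values are those of the sample &OBS_INPUT section.

```shell
#!/bin/bash
# Mesh sizes (km) from the sample namelist: dmesh(1), dmesh(2), dmesh(3)
dmesh[1]=120.0
dmesh[2]=60.0
dmesh[3]=30

# Print the thinning mesh selected by a given dthin value
thinning_mesh() {
  local dthin=$1
  if [ "${dthin}" -le 0 ]; then
    echo "no thinning"
  else
    echo "${dmesh[${dthin}]} km"
  fi
}

thinning_mesh 0     # ps from prepbufr
thinning_mesh -1    # pcp_ssmi from dmsp
thinning_mesh 1     # hirs3 from NOAA-17
thinning_mesh 2     # hirs4 from metop-a
```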
- Conventional data thinning
The conventional data can also be thinned. However, the setup of thinning is not in the namelist. To give users a complete picture of data thinning, conventional data thinning is briefly introduced here. There are three columns, ithin, rmesh, pmesh, in the convinfo file (more details on this file are in Section 4.3) to configure conventional data thinning:
- ithin: 0 = no thinning; 1 = thinning with grid mesh decided by rmesh and pmesh
- rmesh: horizontal thinning grid size in km
- pmesh: vertical thinning grid size in mb; if 0, then use background vertical grid.
Set Up Background Error Factor¶
In the namelist section BKGERR, vs sets the scale factor for the vertical correlation length and hzscl sets the scale factors for horizontal smoothing. The scale factors for the variance of each analysis variable are set in the anavinfo file. Typical values used in operations for regional and global background error covariance are provided in the run scripts and sample anavinfo files, and are selected according to the choice of background error covariance.
Single Observation Test¶
To do a single observation test, the following namelist option has to be set to true:
oneobtest=.true.
Then go to the namelist section SINGLEOB_TEST to set up the single observation location and variable to be tested. Please see Section 4.2 for an example and details on the single observation test.