5. Applications for Regional and Global EnKF¶
In this chaper, the elements from the previous chapters will be applied to demonstrate how to run a regional and global case using the GSI observer and EnKF. These examples are intended to give users a clear idea of how to set up the GSI observer and EnKF for a particular application and properly check the run status and analysis results in order to determine if the run was successful. Note that the regional example focuses on WRF ARW, however WRF NMM and NMMB runs are similar, but require different background ensemble and namelist options. Similarly, the global example features a single global configuration (T254), however users may wish to use a different configuration, again requiring different background ensemble and namelist options.
It is assumed that the reader has successfully compiled GSI and EnKF on a local machine. For regional case studies, users should have the following data available:
Ensemble for background
- Ensemble files from WRF-ARW, WRF-NMM, NMM-B forecast files may be
used. For this case, WRF-ARW ensemble members will be used, which
are generated from the GFS ensemble, following the naming
convention:
wrfarw.mem00nn
. Wherenn
corresponds to each ensemble member (“ensmean” appended for ensemble mean).
- Ensemble files from WRF-ARW, WRF-NMM, NMM-B forecast files may be
used. For this case, WRF-ARW ensemble members will be used, which
are generated from the GFS ensemble, following the naming
convention:
Conventional, GPSRO and radiance data
- Real time GDAS and NAM PrepBUFR data can be obtained from the server: ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/
Fixed files
- Fixed files are located in the
comGSIv3.5_EnKFv1.1/fix
directory
- Fixed files are located in the
For global case studies, users should have the following data available:
Ensemble for background
- Ensemble files from GFS forecast files may be used. GFS ensemble
members corresponding to various spectral resolutions may be used,
following the naming convention:
sfg_yyyymmddhh_fhrff_mem0nn
. The “yyyymmddhh” corresponds to the initialization date, “ff” corresponds to the forecast hour, and “nn” corresponds to each ensemble member (ensmean
appended for ensemble mean).
- Ensemble files from GFS forecast files may be used. GFS ensemble
members corresponding to various spectral resolutions may be used,
following the naming convention:
Conventional data
- Real time GDAS PrepBUFR (and BUFR) data can be obtained from the server:
Fixed files
- Fixed files are located in the
comGSIv3.5_EnKFv1.1/fix/global
directory
- Fixed files are located in the
The following cases will give users an example of how to run the EnKF for a regional case with various data sources, as well as a simple global application case. Users are welcome to download these examples from the EnKF Users webpage (online case for release v1.1) or create a new background ensemble and obtain observation data from the above server.
The background ensemble files and observations used in the regional case studies are as follows:
Background files: wrfarw.mem0001 - wrfarw.mem0020, wrfarw.ensmean
Conventional, GPSRO and radiance data from 00 UTC 13 February 2014
- Conventional file:
gdas1.t00z.prepbufr.nr
- GPSRO file:
gdas1.t00z.gpsro.tm00.bufr_d
- Radiance files:
gdas1.t00z.1bamua.tm00.bufr_d
gdas1.t00z.1bhrs4.tm00.bufr_d
gdas1.t00z.1bmhs.tm00.bufr_d
- Conventional file:
Fixed files:
- Fixed files are located in the GSI/EnKF community code package,
under
/comGSIv3.5_EnKFv1.1/fix
- For observation control:
convinfo
: conventional data (PrepBUFR, GPSRO) info fileozinfo
: ozone retrival info filesatinfo
: satellite radiance info file
- Adaptive radiance bias correction (although example files are
provided in the
/fix
directory, it is better to obtain bias correction files valid for the proper date. For this case, GFS bias correction coefficients are provided with the case data):satbias_in
: satellite bias correction coefficient filesatbias_pc
: satellite bias correction coefficient file for passive channels.
- Fixed files are located in the GSI/EnKF community code package,
under
This case study was run on a linux cluster. GSI no longer (from GSI v3.2) requires byte- swapping to little endian format. BUFRLIB can automatically handle byte order issues.
For the regional ARW case, assume the following locations:
Path to the background ensemble files:
/scratch/casedata/arw_2014021300/bk/
the path to the observations:
/scratch/casedata/arw_2014021300/obs
and the GSI release version 3.5/EnKF version 1.1 is located at:
/scratch/comGSIv3.5_EnKFv1.1
For the global GFS case, assume locations as follows: Path to background ensemble files:
/scratch/casedata/enkf_glb_t254/bk
the path to the observations:
/scratch/casedata/enkf_glb_t254/obs
and the GSI release version 3.5/EnKF version 1.1 is located at:
/scratch/comGSIv3.5_EnKFv1.1
Running GSI Observer for Regional Applications¶
Run Script¶
With both GSI and EnKF compiled and the background ensemble files and
observations acquired, the next step is to work with the gsi observer
run script, run_gsi_regional.ksh
. The location of this script is
under comGSIv3.5_EnKFv1.1/run
. This run script is the same as the
one used for a GSI analysis run, with a few specific options selected in
order to loop through all the ensemble members and generate the ensemble
observation priors for each member and the ensemble mean. In addition to
the GSI observer specific options, other user-specific modifications
need to be made:
- Set up batch queueing systemTo run GSI with multiple processors, a job queuing head must be added to the
run_gsi_regional.ksh
script. The set up of the job queue is dependent on the machine and the job control system. Refer to the GSI Users Guide, section 3.2.2, for more examples of the setup section of this script. The following example is setup to run on a Linux cluster with LSF:#BSUB -P ?????? # project code #BSUB -W 00:30 # wall-clock time (hrs:mins) #BSUB -n 4 # number of tasks in job #BSUB -R "span[ptile=16]" # run 16 MPI tasks per node #BSUB -J gsi # job name #BSUB -o gsi.%J.out # output file name in which %J is replaced by the job ID #BSUB -e gsi.%J.err # error file name in which %J is replaced by the job ID #BSUB -q small # queue
Set up number of processors and the job queue system used For this example,
LINUX_LSF
and 4 processors are used:GSIPROC=4 ARCH='LINUX_LSF'
Set up the case data, analysis time, fix files, GSI executable, and CRTM coefficients:
Set up analysis time:
ANAL_TIME=2014021300
Set up working directory, which will hold the analysis results (all ensemble members will be run in this directory). This directory must have the proper write permissions as well as enough space to hold the output.
WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/gsidiag_2014021300
Set path to the observation directory and the PrepBUFR file within the observation directory. All observations to be assimilated should reside in this directory.
OBS_ROOT=/scratch/${user}/casedata/arw_2014021300/obs PREPBUFR=${OBS_ROOT}/gdas1.t00z.prepbufr.nr
Set path to background ensemble files:
BK_ROOT=/scratch/${user}/casedata/arw_2014021300/bk
Set file for background ensemble mean:
BK_FILE=${BK_ROOT}/wrfarw.ensmean
Set the GSI system used for this case, including the paths of the fix files and the CRTM coefficents as well as the location of the GSI executable:
CRTM_ROOT=/scratch/${user}/CRTM_REL-2.2.3 GSI_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/ FIX_ROOT=${GSI_ROOT}/fix GSI_EXE=${GSI_ROOT}/run/gsi.exe GSI_NAMELIST=${GSI_ROOT}/run/comgsi_namelist.sh
Set which background and background error file to use
bk_core=ARW bkcv_option=NAM if_clean=clean
This example uses the ARW NetCDF background; thereforebk_core
is set toARW
. The regional background error covariance file is used in this case, as set bybkcv_option=NAM
. Finally, the run scripts are set to clean the run directory to delete all temporary intermediate files.Choose to run GSI observer, set up background ensemble information
if_observer=Yes no_member=20 BK_FILE_mem=${BK_ROOT}/wrfarw.mem
The optionif_observer=Yes
is the switch that enablesrun_gsi_regional.ksh
to run the GSI observer (rather than GSI analysis). In this example, 20 ensemble members are selected with the naming convention:wrfarw.memnnnn
. Note that memnnnn which is associated with each ensemble member (nnnn), is not included and will be appended later in the script.Link observations
# Link to the prepbufr data ln -s ${PREPBUFR} ./prepbufr # Link to the radiance data ln -s ${OBS_ROOT}/gdas1.t00z.1bamua.tm00.bufr_d amsuabufr ln -s ${OBS_ROOT}/gdas1.t00z.1bhrs4.tm00.bufr_d hirs4bufr ln -s ${OBS_ROOT}/gdas1.t00z.1bmhs.tm00.bufr_d mhsbufr ln -s ${OBS_ROOT}/gdas1.t00z.gpsro.tm00.bufr_d gpsrobufr
Past the arch selection, environment variable checks, and creation of working directory, users will find the location where the observations are linked. For this case, we can see that the conventional PrepBUFR observations have been linked, as well as three different satellite radiance BUFR files (AMSU-A, HIRS4, and MHS) and a GPS RO BUFR file. These files will be linked to the working directory and separate observation innovation (diag) files will be generated for each observation.
In the run script, the proper anavinfo file is selected based on the core and background error covariance used for the case:
echo ' Use NAM background error covariance'
BERROR=${FIX_ROOT}/${BYTE_ORDER}/nam_nmmstat_na.gcv
OBERROR=${FIX_ROOT}/nam_errtable.r3dv
if [ ${bk_core} = NMM ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_ndas_netcdf
fi
if [ ${bk_core} = ARW ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_arw_netcdf
fi
if [ ${bk_core} = NMMB ] ; then
ANAVINFO=${FIX_ROOT}/anavinfo_nems_nmmb
fi
The anavinfo file used for this case is anavinfo_arw_netcdf
, because
the background is NetCDF ARW and the regional background error
covariance has been selected. It is important to check this file,
located in the ./fix
directory. The number of vertical levels in
anavinfo_arw_netcdf
must match those for the background. In this
example, the case has 50 vertical levels with the default value in
anavinfo_arw_netcdf
is 40. Change these values (level column
)
accordingly:
met_guess::
!var level crtm_use desc orig_name
ps 1 -1 surface_pressure ps
z 1 -1 geopotential_height phis
u 50 2 zonal_wind u
v 50 2 meridional_wind v
div 50 -1 zonal_wind div
vor 50 -1 meridional_wind vor
tv 50 2 virtual_temperature tv
q 50 2 specific_humidity sphu
oz 50 2 ozone ozone
cw 50 10 cloud_condensate cw
... ...
control_vector::
!var level itracer as/tsfc_sdv an_amp0 source funcof
sf 50 0 1.00 -1.0 state u,v
vp 50 0 1.00 -1.0 state u,v
ps 1 0 0.50 -1.0 state prse
t 50 0 0.70 -1.0 state tv
q 50 1 0.70 -1.0 state q
oz 50 1 0.50 -1.0 state oz
sst 1 0 1.00 -1.0 state sst
cw 50 1 1.00 -1.0 state cw
stl 1 0 1.00 -1.0 motley sst
sti 1 0 1.00 -1.0 motley sst
::
Run GSI Observer and Check Run Status¶
Once the run script is set up properly for the case and the machine, GSI can be run through the run script. The following command will submit the job:
$ bsub < run_gsi_regional.ksh
While the job is running, move to the working directory and check the details. Given the following working directory setup:
WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/gsidiag_2014021300
Go to directory WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/
and
check the run directory. A directory named gsidiag_2014021300
should
have been created. This directory is the run directory for this GSI
observer case study. The directory will be populated with many files:
amsua_n18.TauCoeff.bin
ssmi_f15.SpcCoeff.bin
amsua_n18.SpcCoeff.bin
ssmi_f15.TauCoeff.bin
imgr_g11.SpcCoeff.bin
imgr_g11.TauCoeff.bin
These files are CRTM coefficients that have been linked to this run directory through the GSI run script. Similar to running the GSI analysis, many other files are linked or copied to this run directory, such as:
gsiparm.anl: | GSI namelist |
prepbufr: | PrepBUFR file for conventional observation |
convinfo: | data usage control for conventional data |
satbias_in: | satellite bias correction coefficient file |
satinfo: | data usage and channel control for satellite radiance data |
berror_stats: | background error file |
errtable: | observation error file |
Additionally, for the GSI observer many files are generated as a result
of the lread_obs_save
and lread_obs_skip
options in the namelist
for writing/reading collective observation selection information:
obs_input.nnnn: | During ensemble mean (save=True, skip=False), save all observation preprocessing to this file. Each file (nnnn) is for a different variable type |
pennnn.obs_setup: | When looping through ensemble members (save=False, skip=True), all observation preprocessing is skipped, obs_input.nnnn files are read and pennnn.obs_setup is written for each processor (nnnn) for all observations. |
As the GSI observer is running for the ensemble mean as well as looping through each member, files are are generated for each ensemble member, as well as the ensemble mean.
pennnn.conv_01: |
|
list_run_directory: |
|
list_run_directory_memnnn: | As as above, for each ensemble member (nnn). |
stdout: | Standard output file for ensemble mean. |
stdout_memnnn: | Standard output file for each ensemble member (nnn). |
The presence of the standard output files in the directory suggest the GSI observer run scripts have successfully set up a run environment for the GSI observer, properly looping through the ensemble members, and the GSI executable is running on each ensemble member.
Once GSI has finished running, diag files should be generated for each observation type for each member as well as the ensemble mean:
diag_conv_ges.ensmean diag_conv_ges.mem001
diag_conv_ges.mem002 diag_conv_ges.mem003
diag_conv_ges.mem004 diag_conv_ges.mem005
... ... diag_conv_ges.mem020
diag_amsua_metop-a_ges.ensmean diag_amsua_metop-a_ges.mem001
diag_amsua_metop-a_ges.mem002 diag_amsua_metop-a_ges.mem003
diag_amsua_metop-a_ges.mem004 diag_amsua_metop-a_ges.mem005
... ... diag_amsua_metop-a_ges.mem020
diag_amsua_n18_ges.ensmean diag_amsua_n18_ges.mem001
diag_amsua_n18_ges.mem002 diag_amsua_n18_ges.mem003
diag_amsua_n18_ges.mem004 diag_amsua_n18_ges.mem005
......
diag_hirs4_n19_ges.ensmean diag_amsua_n15_ges.mem001
diag_hirs4_n19_ges.mem002 diag_amsua_n15_ges.mem003
diag_hirs4_n19_ges.mem004 diag_amsua_n15_ges.mem005
... ... diag_amsua_n15_ges.mem020
Check for Successful GSI Completion¶
The presence of the diag files and standard out files for the ensemble mean and each member indicates that the GSI observer has run without crashing, but does not necessary indicate a successful analysis. It is important to check the stdout files in the run directory to make sure the GSI observer completed each step without any obvious problems. The following are several important areas of the standard out file to check:
Read in the anavinfo and namelist
gsi_metguess_mod*init_: 2D-MET STATE VARIABLES: ps z gsi_metguess_mod*init_: 3D-MET STATE VARIABLES: u v div vor tv q oz cw ..... &SETUP GENCODE = 78.0000000000000 , FACTQMIN = 0.000000000000000E+000, FACTQMAX = 0.000000000000000E+000, .....
Read in the background field
The following lines in stdout immediately following the namelist section, indicate that GSI is reading the background fields. Checking that the range of the max and min values will indicate if particular background fields are normal.
dh1 = 3 iy,m,d,h,m,s= 2014 2 13 0 0 0 dh1 = 3 rmse_var = SMOIS ndim1 = 3 ordering = XYZ staggering = N/A start_index = 1 1 1 0 end_index = 129 70 4 0 WrfType = 104 ierr = 0 ... ... rmse_var = T ndim1= 3 WrfType = 104 WRF_REAL= 104 ierr = 0 ordering = XYZ staggering = N/A start_index = 1 1 1 0 end_index = 129 70 50 0 k,max,min,mid T= 1 313.3391 237.9947 273.2565 k,max,min,mid T= 2 313.4587 238.4884 273.5590 k,max,min,mid T= 3 313.6850 239.3906 274.1599 k,max,min,mid T= 4 314.0594 241.0577 275.0899 k,max,min,mid T= 5 314.6113 243.6186 276.3394 k,max,min,mid T= 6 315.3213 247.0504 277.7711 ... ...
Read in observational data
Thestdout
(ensemble mean) andstdout_memnnn
(ensemble members) contains distinct differences for the reading in the observational data controlled by thelread_obs_save
andlread_obs_skip
options in the namelista. Ensemble meanread_obs_check: bufr file date is 2014021300 prepbufr q read_obs_check: bufr file date is 2014021300 prepbufr ps read_obs_check: bufr file uv not available satwndbufr read_obs_check: bufr file rw not available radarbufr read_obs_check: bufr file pcp_tmi trmm not available tmirrbufr read_obs_check: bufr file hirs3 n17 not available hirs3bufr ... ... number of extra processors 1 READ_OBS: read 33 mhs mhs_n18 using ntasks= 2 0 1795 0 READ_OBS: read 34 mhs mhs_n19 using ntasks= 2 2 584 0 READ_OBS: read 35 mhs mhs_metop-a using ntasks= 2 0 174 0 READ_OBS: read 36 mhs mhs_metop-b using ntasks= 2 2 2 0 ... ... READ_BUFRTOVS : file=mhsbufr type=mhs sis=mhs_n19 nread= 0 ithin= 2 rmesh= 60.000000 isfcalc= 0 ndata= 0 ntask= 2 ... ... READ_PREPBUFR : file=prepbufr type=uv sis=uv nread= 52122 ithin= 0 rmesh= 120.000000 isfcalc= 0 ndata= 50772 ntask= 1 ... ... READ_OBS: write collective obs selection info to obs_input.common
For the ensemble mean, the same procedure for observation processing as the GSI analysis is followed (read_obs_check
,READ_OBS
,READ_BUFRTOVS
,READ_GPS
,READ_PREPBUFR
…). Finally, it is indicated that the collective observations selection information is written out for use with the ensemble members.b. Ensemble membersFor the
stdout_memnnn
, the observation processing steps are skipped and observation information is read in from the ensemble mean:OBSERVER_SET: read collective obs selection info from obs_input.common
Finally, for both the ensemble mean and members, the following lines will appear:
OBS_PARA: ps 2038 2538 5499 6600 OBS_PARA: t 3271 4431 7459 8753 OBS_PARA: q 2823 3917 6110 7187 OBS_PARA: pw 261 217 226 180 OBS_PARA: uv 3333 5207 9038 8902 OBS_PARA: gps_ref 597 1001 210 1085 OBS_PARA: hirs4 metop-a 0 0 0 174 OBS_PARA: amsua n15 542 540 0 0 OBS_PARA: amsua n18 606 672 0 0 OBS_PARA: amsua metop-a 0 0 0 151 OBS_PARA: mhs n18 730 828 0 0 OBS_PARA: mhs metop-a 0 0 0 201
This table is important to check if the observations have been read in, which types of observations have been read in, and the distribution of observations in each subdomain.Indication that the GSI observer has successfully run:
observer_final: successfully finalized glbsoi: complete [000]gsisub(): : complete. ENDING DATE-TIME JUL 14,2016 23:29:19.825 196 THU 2457584 PROGRAM GSI_ANL HAS ENDED. * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
After looking over each section of the standard output files, it can be
concluded that the GSI observer ran without issues. Note that because
the outer loop was set to 0 (miter=0
) for the GSI observer, no
minimization occurred and therefore no analysis results were produced.
Running EnKF for Regional Applications¶
Run Script¶
Once the GSI observer has been run successfully, the next step is to
setup the EnKF run script, run_enkf_wrf.ksh
. This script is located
under comGSIv3.5_EnKFv1.1/run
. This run script uses the diag files
generated by the GSI observer script as observation input, and generates
the EnKF analysis. Similar to the GSI observer script, several
user-specific modifications need to be made:
Set up batch queueing system
#BSUB -P ???? # project code #BSUB -W 00:30 # wall-clock time (hrs:mins) #BSUB -n 32 # number of tasks in job #BSUB -R "span[ptile=16]" # run 16 MPI tasks per node #BSUB -J gsi # job name #BSUB -o gsi.%J.out # output file name in which %J is replaced by the job ID #BSUB -e gsi.%J.err # error file name in which %J is replaced by the job ID #BSUB -q small # queue
Set up number of processors and the job queue system used:
GSIPROC=32 ARCH='LINUX_LSF'
One difference from the GSI observer script is that the number of processors used should be greater than at least the number of ensemble members for running the EnKF. In this case we have 20 ensemble members and have requested 32 cores.Set up the analysis time, fixed files, EnKF executable
ANAL_TIME=2014021300 WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/enkf_2014021300 diag_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/gsidiag_2014021300 BK_ROOT=/scratch/${user}/casedata/arw_2014021300/bk BK_FILE=${BK_ROOT}/wrfarw.ensmean GSI_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1 FIX_ROOT=${GSI_ROOT}/fix ENKF_EXE=${GSI_ROOT}/src/main/enkf/wrf_enkf CRTM_ROOT=/scratch/${user}/CRTM_2.2.3/ ENKF_NAMELIST=${GSI_ROOT}/run/enkf_wrf_namelist.sh
Set background file and location of the diag files (from GSI observer)
diag_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/gsidiag_2014021300 BK_FILE=${BK_ROOT}/wrfarw.ensmean
Two modifications to note for the EnKF are thediag_ROOT
andBK_FILE
. Thediag_ROOT
points to the working directory where the GSI observer was run. This directory contains diag* files which will be linked to the EnKF working directory.BK_FILE
points to the background ensemble mean.Select ensemble parameters
NMEM_ENKF=20 BK_FILE_mem=${BK_ROOT}/wrfarw NLONS=129 NLATS=70 NLEVS=50 IF_ARW=.true. IF_NMM=.false.
The next section sets information about the ensemble, including the number of members, background ensemble members, domain specifications, dynamical core, and list of observations to be used in the EnKF. Note that the ensemble members (BK_FILE_mem
) do not include the 3-digit member number. This will be appended to the name specified for this field later in the script.Select observations
list="conv amsua_n18 amsua_metop-a"
The previous line is contained with the ensemble parameter section. For the observations to be assimilated, be aware that only observations/platforms that have been run through the GSI observer (and therefore a diag* file exists) are valid.
Run EnKF and Check Run Status¶
Once the run script is set up properly for the case and the machine, EnKF can be run through the run script. The following command will submit the job:
$ bsub < run_enkf_wrf.ksh
While the job is running, move to the working directory and check the details. Given the following working directory setup:
WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/enkf_2014021300
Go to directory /scratch/${user}/comGSIv3.5_EnKFv1.1/run
and check
the run directory. A directory named enkf_2014021300
should have
been created. This directory is the run directory for this EnKF case
study. The directory will be populated with many links:
diag_conv_ges.ensmean diag_conv_ges.mem007 diag_conv_ges.mem014
diag_conv_ges.mem001 diag_conv_ges.mem008 diag_conv_ges.mem015
diag_conv_ges.mem002 diag_conv_ges.mem009 diag_conv_ges.mem016
diag_conv_ges.mem003 diag_conv_ges.mem010 diag_conv_ges.mem017
diag_conv_ges.mem004 diag_conv_ges.mem011 diag_conv_ges.mem018
diag_conv_ges.mem005 diag_conv_ges.mem012 diag_conv_ges.mem019
diag_conv_ges.mem006 diag_conv_ges.mem013 diag_conv_ges.mem020
diag_amsua_n18_ges.ensmean diag_amsua_n18_ges.mem007 diag_amsua_n18_ges.mem014
diag_amsua_n18_ges.mem001 diag_amsua_n18_ges.mem008 diag_amsua_n18_ges.mem015
diag_amsua_n18_ges.mem002 diag_amsua_n18_ges.mem009 diag_amsua_n18_ges.mem016
diag_amsua_n18_ges.mem003 diag_amsua_n18_ges.mem010 diag_amsua_n18_ges.mem017
diag_amsua_n18_ges.mem004 diag_amsua_n18_ges.mem011 diag_amsua_n18_ges.mem018
diag_amsua_n18_ges.mem005 diag_amsua_n18_ges.mem012 diag_amsua_n18_ges.mem019
diag_amsua_n18_ges.mem006 diag_amsua_n18_ges.mem013 diag_amsua_n18_ges.mem020
diag_hirs4_metop-a_ges.ensmean diag_hirs4_metop-a_ges.mem007 diag_hirs4_metop-a_ges.mem014
diag_hirs4_metop-a_ges.mem001 diag_hirs4_metop-a_ges.mem008 diag_hirs4_metop-a_ges.mem015
diag_hirs4_metop-a_ges.mem002 diag_hirs4_metop-a_ges.mem009 diag_hirs4_metop-a_ges.mem016
diag_hirs4_metop-a_ges.mem003 diag_hirs4_metop-a_ges.mem010 diag_hirs4_metop-a_ges.mem017
diag_hirs4_metop-a_ges.mem004 diag_hirs4_metop-a_ges.mem011 diag_hirs4_metop-a_ges.mem018
diag_hirs4_metop-a_ges.mem005 diag_hirs4_metop-a_ges.mem012 diag_hirs4_metop-a_ges.mem019
diag_hirs4_metop-a_ges.mem006 diag_hirs4_metop-a_ges.mem013 diag_hirs4_metop-a_ges.mem020
The diag files are linked from the GSI observer working directory. These links are specified in the run script. Note that ozone diag files (gome, omi, sbuv) will appear in the directory without links (zero length files) if ozone diags are not present from the GSI observer.
The first guess files are also linked into the directory from the run script, pointing to the path of the ensemble mean and ensemble members designated in the setup section.
firstguess.ensmean firstguess.mem007 firstguess.mem014
firstguess.mem001 firstguess.mem008 firstguess.mem015
firstguess.mem002 firstguess.mem009 firstguess.mem016
firstguess.mem003 firstguess.mem010 firstguess.mem017
firstguess.mem004 firstguess.mem011 firstguess.mem018
firstguess.mem005 firstguess.mem012 firstguess.mem019
firstguess.mem006 firstguess.mem013 firstguess.mem020
Similar to running the GSI observer, many other static files are linked or copied to this run directory, such as:
enkf.nml: | EnKF namelist |
convinfo: | data usage control for conventional data |
satbias_in: | satellite bias correction coefficient file |
satbias_pc: | satellite bias correction file for passive channels |
satinfo: | data usage and channel control for satellite radiance data |
ozinfo: | data usage control from ozone data |
covinflate.dat: | Three-dimensional multiplicative inflation factor fields |
stdout: | EnKF standard output file |
The presence of the standard output file in the directory suggest the EnKF run script has successfully set up a run environment for the EnKF, properly linking the first guess and diag files from the GSI observer, and the EnKF executable is running. Once EnKF has finished running, analysis files should be generated for each member as well as the ensemble mean:
analysis.ensmean analysis.mem006 analysis.mem012 analysis.mem018
analysis.mem001 analysis.mem007 analysis.mem013 analysis.mem019
analysis.mem002 analysis.mem008 analysis.mem014 analysis.mem020
analysis.mem003 analysis.mem009 analysis.mem015
analysis.mem004 analysis.mem010 analysis.mem016
analysis.mem005 analysis.mem011 analysis.mem017
Check for Successful EnKF Completion¶
The presence of the analysis files for the ensemble mean and each member as well as the standard out file indicates that the EnKF has run without crashing, but does not necessary indicate a successful analysis. It is important to check the stdout file in the run directory to make sure the EnKF completed each step without any obvious problems. The following are several important areas of the standard out file to check:
Check namelist has been properly read in and configuration is correct:
namelist parameters: -------------------- &NAM_ENKF DATEIN = 2014021300, DATAPATH = ./ IASSIM_ORDER = 0, COVINFLATEMAX = 100.0000 , COVINFLATEMIN = 1.000000 , DETERMINISTIC = T, ... ...
Check number of ensemble members, as well as the actual analysis variables and the background type. The maximum and minimum values for surface pressure are printed for a sanity check:
20 members number of background forecast times to be updated = 1 first-guess forecast hours for analysis = 06 5 3d vars to update total of 251 2d grids will be updated (including ps) using multiplicative inflation based on Pa/Pb Vars in Rad-Jacobian (dims) -------------------------- sst 0 enkf_main: number of threads 1 Updating U, V, T, QVAPOR, PH, and MU for WRF-ARW... Surface pressure (spressmn) min/max range: 678.673339843750 1032.22473144531
- Statistics of the ensemble priorFollowing many lines describing the bias correction coefficients, the inventory of the observation number, observation types from the input diag* files, and the time spent reading in each observation type, statistics of the ensemble priors fit to all observations are provided for each region (NH, SH, TR):
innovation statistics for prior: conventional obs region, obtype, nobs, bias, innov stdev, sqrt(S+R), sqrt(S), sqrt(R): NH all ps 14083 -0.591E-02 0.860E+00 0.814E+00 0.601E+00 0.550E+00 TR all ps 50 -0.629E-01 0.422E+00 0.661E+00 0.367E+00 0.550E+00 NH all t 7766 -0.228E+00 0.163E+01 0.128E+01 0.480E+00 0.119E+01 TR all t 138 -0.130E+00 0.118E+01 0.119E+01 0.234E+00 0.117E+01 NH all uv 21680 0.136E-01 0.295E+01 0.276E+01 0.140E+01 0.237E+01 TR all uv 302 -0.379E+00 0.251E+01 0.268E+01 0.124E+01 0.238E+01 NH all q 2553 -0.323E-01 0.146E+00 0.157E+00 0.868E-01 0.131E+00 TR all q 68 -0.721E-01 0.202E+00 0.164E+00 0.582E-01 0.154E+00 satellite brightness temp instrument, channel #, nobs, bias, innov stdev, sqrt(S+R), sqrt(S), sqrt(R): amsua_n18 1 241 -0.199E+00 0.194E+01 0.306E+01 0.176E+01 0.250E+01 amsua_n18 2 248 -0.194E+00 0.145E+01 0.234E+01 0.806E+00 0.220E+01 amsua_n18 3 248 -0.620E+00 0.119E+01 0.206E+01 0.497E+00 0.200E+01 amsua_n18 4 248 -0.189E+00 0.398E+00 0.568E+00 0.140E+00 0.550E+00 amsua_n18 5 248 0.188E-01 0.337E+00 0.315E+00 0.966E-01 0.300E+00 amsua_n18 6 1203 -0.396E-01 0.198E+00 0.238E+00 0.610E-01 0.230E+00 amsua_n18 7 1225 -0.215E+00 0.324E+00 0.235E+00 0.480E-01 0.230E+00 amsua_n18 8 1155 -0.235E+00 0.369E+00 0.259E+00 0.675E-01 0.250E+00 amsua_n18 10 1211 0.408E+00 0.491E+00 0.353E+00 0.429E-01 0.350E+00 amsua_n18 11 12 0.720E+00 0.723E+00 0.404E+00 0.567E-01 0.400E+00 amsua_n18 15 245 -0.311E+00 0.198E+01 0.380E+01 0.148E+01 0.350E+01 hirs4_metop-a 3 1 0.115E+01 0.115E+01 0.534E+00 0.631E-01 0.530E+00 hirs4_metop-a 4 42 0.260E+00 0.521E+00 0.403E+00 0.488E-01 0.400E+00 hirs4_metop-a 5 7 -0.534E+00 0.558E+00 0.367E+00 0.696E-01 0.360E+00 hirs4_metop-a 6 7 -0.748E+00 0.782E+00 0.473E+00 0.109E+00 0.460E+00 hirs4_metop-a 7 9 -0.557E+00 0.815E+00 0.580E+00 0.108E+00 0.570E+00 hirs4_metop-a 8 9 0.374E+00 0.675E+00 0.100E+01 0.208E-01 0.100E+01 hirs4_metop-a 10 9 0.199E+00 0.581E+00 0.612E+00 0.120E+00 0.600E+00 hirs4_metop-a 11 9 -0.380E-01 0.718E+00 0.143E+01 0.770E+00 0.120E+01 hirs4_metop-a 12 34 -0.988E+00 0.179E+01 0.213E+01 0.141E+01 0.160E+01 hirs4_metop-a 13 9 0.694E-01 0.330E+00 0.382E+00 0.116E+00 0.364E+00 hirs4_metop-a 14 9 -0.369E-01 0.257E+00 0.294E+00 0.137E+00 0.260E+00 hirs4_metop-a 15 9 0.352E-01 0.289E+00 0.281E+00 0.107E+00 0.260E+00
This table should be checked in order to determine if the inflation is appropriate. The goal is to make the total ensemble spreads of priors (\(\sqrt{S+R}\) match the innovations (innov std) as much as possible. We can see in certain regions, particularly in the NH the differences are noteable. These differences are associated with higher observation error (\(\sqrt{R}\)). Users should consider tuning of inflation and localization, which is typically determined using cases with multiple assimilation cycles. Refer to section 4.2 of this Users Guide for more information on tuning. Domain and observation partition:
npts = 9030 min/max number of points per proc = 243 317 time to do model space decomp = 1.529277069494128E-003 nobstot = 53078 min/max number of obs per proc = 1658 1659 time to do ob space decomp = 3.935033455491066E-004 sending out observation prior ensemble perts from root ... nobstot*nanals 1061560 npts*ndim 2266530 ... took 1.730861375108361E-002 secs time in load_balance = 7.929413160309196E-002 on proc 0
The analysis variables and the observations are distributed to different processors. We can see in this case that the min/max number of points per processor are 243 and 317, respectively. Similarly, we can see that the min/max number of observations per processor are 1658 and 1659, respectively, indicating that the observations are well dispersed among the processors.
READGRIDDATA_ARW: U 1 -20.55222 20.72632 READGRIDDATA_ARW: U 2 -20.87878 21.20316 READGRIDDATA_ARW: U 3 -21.61046 22.36177 READGRIDDATA_ARW: U 4 -22.62561 24.08465 READGRIDDATA_ARW: U 5 -24.07165 26.47827 READGRIDDATA_ARW: U 6 -25.89237 32.81841 READGRIDDATA_ARW: U 7 -27.10455 35.81213 ... ...
Additionally, check the minimum and maximum values of the fields at each vertical level as a quick sanity check.
Diagnose EnKF Analysis Results¶
Spread inflation of the analysis ensemble
global ps prior std. dev min/max = 21.52301 172.3270 NH mean ps prior standard deviation = 57.82042 NH mean ps posterior standard deviation (before inflation)= 29.20955 NH mean ps posterior standard deviation (after inflation) = 54.58403 NH mean ps inflation = 2.421974 TR mean ps prior standard deviation = 31.09247 TR mean ps posterior standard deviation (before inflation)= 28.31872 TR mean ps posterior standard deviation (after inflation) = 30.79383 TR mean ps inflation = 1.106122 time in inflate_ens = 1.366839744150639E-002 on proc 0
This section is important for checking whether the values of the inflation are reasonable by looking at a summary of the maximum and minimum values. Also, viewing the regional averaged statistics before and after inflation. For this case we can see both the NH and TR show larger standard deviation after the inflation, which is more consistent with the prior standard deviation.Spread inflation of the analysis ensemble
innovation statistics for posterior: conventional obs region, obtype, nobs, bias, innov stdev, sqrt(S+R), sqrt(S), sqrt(R): NH all ps 14083 0.503E-03 0.757E+00 0.569E+00 0.146E+00 0.550E+00 TR all ps 50 -0.562E-01 0.398E+00 0.585E+00 0.198E+00 0.550E+00 NH all t 7766 -0.780E-01 0.145E+01 0.122E+01 0.263E+00 0.119E+01 TR all t 138 -0.107E+00 0.115E+01 0.118E+01 0.199E+00 0.117E+01 NH all uv 21680 0.406E-01 0.267E+01 0.246E+01 0.655E+00 0.237E+01 TR all uv 302 -0.259E+00 0.236E+01 0.246E+01 0.636E+00 0.238E+01 NH all q 2553 -0.174E-01 0.114E+00 0.138E+00 0.444E-01 0.131E+00 TR all q 68 -0.279E-01 0.168E+00 0.159E+00 0.413E-01 0.154E+00 satellite brightness temp instrument, channel #, nobs, bias, innov stdev, sqrt(S+R), sqrt(S), sqrt(R): amsua_n18 1 241 0.614E+02 0.915E+02 0.263E+01 0.823E+00 0.250E+01 amsua_n18 2 248 -0.132E+03 0.171E+03 0.223E+01 0.376E+00 0.220E+01 amsua_n18 3 248 -0.303E+02 0.398E+02 0.201E+01 0.244E+00 0.200E+01 amsua_n18 4 248 -0.231E+00 0.580E+01 0.555E+00 0.776E-01 0.550E+00 amsua_n18 5 248 -0.788E+00 0.199E+01 0.305E+00 0.575E-01 0.300E+00 amsua_n18 6 1203 -0.138E+01 0.158E+01 0.234E+00 0.423E-01 0.230E+00 amsua_n18 7 1225 -0.273E+01 0.287E+01 0.232E+00 0.322E-01 0.230E+00 amsua_n18 8 1155 -0.131E+01 0.186E+01 0.254E+00 0.437E-01 0.250E+00 amsua_n18 10 1211 -0.179E+01 0.196E+01 0.352E+00 0.374E-01 0.350E+00 amsua_n18 11 12 -0.130E+01 0.133E+01 0.404E+00 0.537E-01 0.400E+00 amsua_n18 15 245 -0.163E+03 0.198E+03 0.357E+01 0.693E+00 0.350E+01 hirs4_metop-a 3 1 0.118E+01 0.118E+01 0.533E+00 0.558E-01 0.530E+00 hirs4_metop-a 4 42 0.286E+00 0.542E+00 0.403E+00 0.455E-01 0.400E+00 hirs4_metop-a 5 7 -0.451E+00 0.487E+00 0.364E+00 0.562E-01 0.360E+00 hirs4_metop-a 6 7 -0.724E+00 0.755E+00 0.468E+00 0.852E-01 0.460E+00 hirs4_metop-a 7 9 -0.572E+00 0.817E+00 0.577E+00 0.871E-01 0.570E+00 hirs4_metop-a 8 9 0.122E+01 0.136E+01 0.100E+01 0.205E-01 0.100E+01 hirs4_metop-a 10 9 0.855E+00 0.100E+01 0.609E+00 0.104E+00 0.600E+00 hirs4_metop-a 11 9 -0.236E-01 0.543E+00 0.130E+01 0.507E+00 0.120E+01 hirs4_metop-a 12 34 -0.738E+00 0.153E+01 0.178E+01 0.790E+00 0.160E+01 hirs4_metop-a 13 9 0.121E+01 0.128E+01 0.375E+00 0.905E-01 0.364E+00 hirs4_metop-a 14 9 0.776E-01 0.256E+00 0.281E+00 0.106E+00 0.260E+00 hirs4_metop-a 15 9 0.281E+00 0.384E+00 0.273E+00 0.838E-01 0.260E+00
After the EnKF analysis, it is important to check the innovation statistics, much like the prior fit to all observation.
Running GSI Observer for Global Applications¶
Run Script¶
With both GSI and EnKF compiled (note that a separate global executable
must be generated for EnKF) and the background ensemble files and
observations acquired, the next step is to work with the gsi observer
run script, run_gsi_global.ksh
. The location of this script is under
comGSIv3.5_EnKFv1.1/run
. This run script is the same as the one used
for a GSI global analysis run, with a few specific options selected in
order to loop through all the ensemble members and generate the ensemble
observation priors for each member and the ensemble mean. For the global
examples, users must choose a case that corresponds to the spectral
resolution of the background ensemble. For this example, we will be
using T254. In addition to the GSI observer specific options, other
user-specific modifications need to be made:
- Set up batch queueing systemTo run GSI with multiple processors, a job queuing head must be added to the
run_gsi_global.ksh
script. The set up of the job queue is dependent on the machine and the job control system. Refer to the GSI Users Guide, section 3.2.2, for more examples of the setup section of this script. The following example is setup to run on a Linux cluster with LSF:# LSF batch script to run an MPI application #BSUB -P ??????? # project code #BSUB -W 00:30 # wall-clock time (hrs:mins) #BSUB -n 24 # number of tasks in job #BSUB -R "span[ptile=16]" # run 16 MPI tasks per node #BSUB -J gsi # job name #BSUB -o gsi.%J.out # output file name in which %J is replaced by the job ID #BSUB -e gsi.%J.err # error file name in which %J is replaced by the job ID #BSUB -q small # queue
Set up number of processors and the job queue system used. For this example,
LINUX_LSF
and 24 processors are used:GSIPROC=24 ARCH='LINUX_LSF'
- Set up the case data, analysis time, fix files, GSI executable, and CRTM coefficients:Set up analysis time and select global case:
ANAL_TIME=2014040512 GFSCASE=enkf_glb_t254
Set up working directory, which will hold the analysis results (all ensemble members will be run in this directory). This directory must have the proper write permissions as well as enough space to hold the output.
WORK_ROOT=/scratch/${user}/comGSIv3.5-EnKFv1.1/run/gsidiag_${GFS_CASE}
Set path to background ensemble files:
BK_ROOT=/scratch/${user}/casedata/enkf_glb_t254/bk
Set path to the observation directory and the PrepBUFR file within the observation directory. All observations to be assimilated should reside in this directory.
OBS_ROOT=/scratch/${user}/casedata/enkf_glb_t254/obs PREPBUFR=${OBS_ROOT}/gdas1.t12z.prepbufr.nr
Set the GSI system used for this case, including the paths of the fix files and the CRTM coefficients as well as the location of the GSI executable:
CRTM_ROOT=/scratch/${user}/CRTM_REL-2.2.3 GSI_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/ FIX_ROOT=${GSI_ROOT}/fix GSI_EXE=${GSI_ROOT}/run/gsi.exe GSI_NAMELIST=${GSI_ROOT}/run/comgsi_namelist_gfs.sh
Note that the GSI namelist generation script used for the global case (comgsi_namelist_gfs.sh
) is different from the one used for the regional example. Choose to run GSI observer, set up background ensemble information
if_observer=Yes # Yes ,or, No -- case sensitive!!! no_member=10 BK_FILE_mem=${BK_ROOT}/sfg_2014040506
The optionif_observer=Yes
is the switch that enablesrun_gsi_global.ksh
to run the GSI observer (rather than GSI analysis). In this example, 10 ensemble members are selected with the naming convention:sfg_${GUESS_DATE}_memnnnn
. Note that_memnnnn
which is assocated with each ensemble member (nnnn
), is not included and will be appended later in the script.Set the JCAP resolution for the case:
if [[ "$GFSCASE" = "T62" ]]; then JCAP=62 JCAP_B=62 elif [[ "$GFSCASE" = "T126" ]]; then JCAP=126 JCAP_B=126 elif [[ "$GFSCASE" = "enkf_glb_t254" ]]; then JCAP=254 JCAP_B=254 elif [[ "$GFSCASE" = "T254" ]]; then JCAP=254 JCAP_B=574 elif [[ "$GFSCASE" = "T574" ]]; then JCAP=574 JCAP_B=1534 else echo "INVALID case = $GFSCASE" exit fi LEVS=64
Note that this selection is filled based on selection of the GFSCASE. This example used theenkf_glb_t254
case. All cases use 64 levels.Further in the run scriptrun_gsi_global.ksh
, the resolution parameters are set based on the requested case (resolution). For this case, we are using JCAP=254:elif [[ "$JCAP" = "254" ]]; then LONA=512 LATA=256 DELTIM=1200 resol=2
Link satellite bias correction coefficients, background ensemble files, and observations:
if [[ "$GFSCASE" = "enkf_glb_t254" ]]; then cp $OBS_ROOT/gdas1.t12z.abias ./satbias_in cp $OBS_ROOT/gdas1.t12z.satang ./satbias_angle cp $BK_ROOT/sfcanl_2014040506_fhr03_ensmean ./sfcf03 cp $BK_ROOT/sfcanl_2014040506_fhr06_ensmean ./sfcf06 cp $BK_ROOT/sfcanl_2014040506_fhr06_ensmean ./sfcf09 cp $BK_ROOT/sfg_2014040506_fhr03_mem001 ./sigf03 cp $BK_ROOT/sfg_2014040506_fhr06_mem001 ./sigf06 cp $BK_ROOT/sfg_2014040506_fhr09_mem001 ./sigf09 ... ... # Link to the prepbufr data ln -s ${PREPBUFR} ./prepbufr # Link to the other observation data if [[ "$GFSCASE" = "enkf_glb_t254" ]]; then obsfile_amua=gdas1.t12z.1bamua.tm00.bufr_d obsfile_amub=gdas1.t12z.1bamub.tm00.bufr_d else
Past the arch selection, environment variable checks, and creation of working directory, users will find the location where the observations are linked. For this case, we can see that the conventional PrepBUFR observations have been linked, as well as AMSU-A and AMSU-B satellite radiance BUFR files. These files will be linked to the working directory and separate observation innovation (
diag
) files will be generated for each observation.
Run GSI Observer and Check Run Status¶
Once the run script is set up properly for the case and the machine, GSI can be run through the run script. The following command will submit the job:
$ bsub < run_gsi_global.ksh
While the job is running, move to the working directory and check the details. Given the following working directory setup:
WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/gsidiag_${GFSCASE}
Go to directory /scratch/${user}/comGSIv3.5_EnKFv1.1/
and check the
run directory. A directory named gsidiag_enkf_glb_t254
should have
been created. This directory is the run directory for this GSI observer
case study. The directory will be populated with many files such as:
AerosolCoeff.bin hirs4_n19.TauCoeff.bin sndrD1_g14.TauCoeff.bin
ahi_himawari8.SpcCoeff.bin iasi616_metop-a.SpcCoeff.bin sndrD1_g15.SpcCoeff.bin
ahi_himawari8.TauCoeff.bin iasi616_metop-a.TauCoeff.bin sndrD1_g15.TauCoeff.bin
airs281SUBSET_aqua.SpcCoeff.bin iasi616_metop-b.SpcCoeff.bin sndrD2_g11.SpcCoeff.bin
... ...
These files are CRTM coefficients that have been linked to this run directory through the GSI run script. Similar to running the GSI analysis, many other files are linked or copied to this run directory, such as:
gsiparm.anl : |
GSI namelist |
prepbufr : |
PrepBUFR file for conventional observation |
convinfo : |
data usage control for conventional data |
satbias_in : |
satellite bias correction coefficient file |
satinfo : |
data usage and channel control for satellite radiance data |
berror_stats : |
background error file |
errtable : |
observation error file |
Additionally, for the GSI observer many files are generated as a result
of the lread_obs_save
and lread_obs_skip
options in the namelist
for writing/reading collective observation selection information:
obs_input.nnnn :
|
During ensemble mean (save=True, skip=False), save all observation preprocessing to this file. Each file (nnnn) is for a different variable type |
pennnn.obs_setup :
|
When looping through ensemble members (save=False, skip=True),
all observation preprocessing is skipped, obs_input.nnnn
files are read and pennnn.obs_setup is written for each
processor (nnn) for all observations. |
pennnn.conv_01 :
|
Files generated after O-B, generated for conventional
observations as well as one file per sensor (e.g. AMSU-A n18).
These files are trimmed for resulting diag files and
cleaned. |
list_run_directory :
|
Directory listing after ensemble mean is run, before directory is cleaned for running the ensemble members. |
list_run_directory_memnnn : |
As as above, for each ensemble member (nnn). |
stdout : |
Standard output file for ensemble mean. |
stdout_memnnn : |
Standard output file for each ensemble member (nnn). |
The presence of the standard output files in the directory suggest the
GSI observer run scripts have successfully set up a run environment for
the GSI observer, properly looping through the ensemble members, and the
GSI executable is running on each ensemble member. Once GSI has finished
running, diag
files should be generated for each observation type
for each member as well as the ensemble mean:
diag_conv_ges.ensmean diag_conv_ges.mem004 diag_conv_ges.mem008
diag_conv_ges.mem001 diag_conv_ges.mem005 diag_conv_ges.mem009
diag_conv_ges.mem002 diag_conv_ges.mem006 diag_conv_ges.mem010
diag_conv_ges.mem003 diag_conv_ges.mem007
diag_amsua_metop-a_ges.ensmean diag_amsua_metop-a_ges.mem006
diag_amsua_metop-a_ges.mem001 diag_amsua_metop-a_ges.mem007
diag_amsua_metop-a_ges.mem002 diag_amsua_metop-a_ges.mem008
diag_amsua_metop-a_ges.mem003 diag_amsua_metop-a_ges.mem009
diag_amsua_metop-a_ges.mem004 diag_amsua_metop-a_ges.mem010
diag_amsua_metop-a_ges.mem005
diag_amsua_n18_ges.ensmean diag_amsua_n18_ges.mem004 diag_amsua_n18_ges.mem008
diag_amsua_n18_ges.mem001 diag_amsua_n18_ges.mem005 diag_amsua_n18_ges.mem009
diag_amsua_n18_ges.mem002 diag_amsua_n18_ges.mem006 diag_amsua_n18_ges.mem010
diag_amsua_n18_ges.mem003 diag_amsua_n18_ges.mem007
diag_amsua_n19_ges.ensmean diag_amsua_n19_ges.mem004 diag_amsua_n19_ges.mem008
diag_amsua_n19_ges.mem001 diag_amsua_n19_ges.mem005 diag_amsua_n19_ges.mem009
diag_amsua_n19_ges.mem002 diag_amsua_n19_ges.mem006 diag_amsua_n19_ges.mem010
diag_amsua_n19_ges.mem003 diag_amsua_n19_ges.mem007
Check for Successful GSI Completion¶
The presence of the diag
files and standard out files for the
ensemble mean and each member indicates that the GSI observer has run
without crashing, but does not necessary indicate a successful analysis.
It is important to check the stdout files in the run directory to make
sure the GSI observer completed each step without any obvious problems.
The following are several important areas of the standard out file to
check:
Read in the
anavinfo
and namelistREAD_FILES: analysis date,minutes 2014 4 5 12 0 19070640 gsi_metguess_mod*init_: 2D-MET STATE VARIABLES: ps z gsi_metguess_mod*init_: 3D-MET STATE VARIABLES: u v ... ... &SETUP GENCODE = 78.0000000000000 , FACTQMIN = 0.000000000000000E+000, FACTQMAX = 0.000000000000000E+000,
- Read in the background fieldThe following lines in the stdout immediately following the namelist section, indicate that GSI is reading the background fields. Checking that the range of the max and min values will indicate if particular background fields are normal.
GESINFO: jcap_b= 254, levs= 64, latb= 256, lonb= 512, ntrac= 3, ncldt= 1, idvc= 2, nvcoord= 2, idvm= 0, idsl= 0, idpsfc= 0, idthrm= 0 k,ak,bk,ck,tref= 1 0.00000000000 1.00000000000 0.00000000000 300.000000000 k,ak,bk,ck,tref= 2 0.00000000000 0.994671165943 0.00000000000 300.000000000 k,ak,bk,ck,tref= 3 0.574999988079E-03 0.988626599312 0.00000000000 300.000000000 k,ak,bk,ck,tref= 4 0.574100017548E-02 0.981742262840 0.00000000000 300.000000000 ... ...
- Read in observational dataThe
stdout
(ensemble mean) andstdout_memnnn
(ensemble members) contains distinct differences for the reading in the observational data controlled by thelread_obs_save
andlread_obs_skip
options in the namelist.Ensemble mean
read_obs_check: bufr file date is 2014040512 prepbufr t read_obs_check: bufr file date is 2014040512 prepbufr uv read_obs_check: bufr file date is 2014040512 prepbufr q read_obs_check: bufr file date is 2014040512 prepbufr ps ... ... number of extra processors 13 READ_OBS: read 1 ps ps using ntasks= 1 0 0 0 READ_OBS: read 2 t t using ntasks= 1 1 0 0 READ_OBS: read 3 q q using ntasks= 1 2 0 0 READ_OBS: read 4 pw pw using ntasks= 1 3 2738 0 ... ... READ_PREPBUFR : file=prepbufr type=pw sis=pw nread= 345 ithin= 0 rmesh=1450.000000 isfcalc= 0 ndata= 345 ntask= 1 READ_BUFRTOVS : file=amsuabufr type=amsua sis=amsua_n18 nread= 202575 ithin= 2 rmesh=1500.000000 isfcalc= 0 ndata= 675 ntask= 1 ... ... READ_OBS: write collective obs selection info to obs_input.common
For the ensemble mean, the same procedure for observation processing as the GSI analysis is followed (read_obs_check
,READ_OBS
,READ_BUFRTOVS
,READ_GPS
,READ_PREPBUFR
, …). Finally, it is indicated that the collective observations selection information is written out for use with the ensemble members.
- Ensemble membersFor the
stdout_memnnn
, the observation processing sets are skipped and observation information is read in from the ensemble mean:OBSERVER_SET: read collective obs selection info from obs_input.common
Finally, for both the ensemble mean and members, the following lines will appear:
OBS_PARA: ps 24 400 1649 2577 58 245 1496 506 82 402 616 251 36 67 328 864 44 224 5210 1397 69 414 506 629 OBS_PARA: t 48 403 3406 4875 174 488 4128 1825 84 724 2702 1146 4 215 882 2218 151 913 11864 3167 160 1622 1308 2381 OBS_PARA: q 16 370 3025 4749 154 383 3211 1737 51 594 2126 1102 0 161 697 1951 140 850 10310 2974 155 1525 1104 1506 OBS_PARA: pw 0 0 0 0 0 0 0 0 0 0 3 0 0 0 42 25 0 0 251 32 0 0 0 4 OBS_PARA: uv 52 410 4459 5664 239 1023 4994 1835 86 1553 3792 1316 3 184 877 2242 180 1576 11854 3595 264 3186 2231 2895 OBS_PARA: amsua n15 0 0 0 0 7 9 0 0 1 0 0 0 5 0 0 0 3 10 7 0 0 0 0 0 OBS_PARA: amsua n18 3 1 0 0 5 2 0 0 1 0 0 0 5 6 2 0 2 5 6 8 0 0 0 0 OBS_PARA: amsua n19 0 0 3 7 0 0 0 0 3 1 0 0 2 9 10 6 0 0 0 3 0 0 0 3 OBS_PARA: amsua metop-a 0 0 0 4 0 0 0 2 0 4 9 5 0 0 0 0 0 0 0 0 0 6 9 6 OBS_PARA: amsua metop-b 0 0 0 0 1 0 0 0 6 9 0 0 3 0 0 0 4 0 0 0 4 11 7 0
This table is important to check if the observations have been read in, which types of observations have been read in, and the distribution of observations in each subdomain.
- Indication that the GSI observer has successfully run:
glbsoi: complete [000]gsisub(): : complete. ENDING DATE-TIME JUL 15,2016 15:48:52.880 197 FRI 2457585 PROGRAM GSI_ANL HAS ENDED. * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
After looking over each section of the standard output files, it can be
concluded that the GSI global observer ran without issues. Note that
because the outer loop was set to 0 (miter=0
) for the GSI observer,
no minimization occurred and therefore no analysis results were
produced.
Running EnKF for Global Applications¶
Run Script¶
Once the GSI global observer has been run successfully, the next step is
to setup the EnKF run script, run_enkf_global.ksh
. This script is
located under comGSIv3.5_EnKFv1.1/run
. This run script uses the
diag
files generated by the GSI global observer script as
observation input, and generates the global EnKF analysis. Similar to
the GSI observer script, several user-specific modifications need to be
made:
Set up batch queueing system
#BSUB -P ?????? # project code #BSUB -W 00:30 # wall-clock time (hrs:mins) #BSUB -n 80 # number of tasks in job #BSUB -R "span[ptile=16]" # run 16 MPI tasks per node #BSUB -J gsi # job name #BSUB -o gsi.%J.out # output file name in which %J is replaced by the job ID #BSUB -e gsi.%J.err # error file name in which %J is replaced by the job ID #BSUB -q small # queue
One difference from the GSI observer script is that the number of processors used should be greater than at least the number of ensemble members for running the EnKF. In this case we have 10 ensemble members and have requested 80 cores due to the large global domain.Set up the analysis time, global case, fixed files, EnKF executable
ANAL_TIME=2014040512 GUESS_TIME=2014040506 GFSCASE=enkf_glb_t254 WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/enkf_${GFSCASE} BK_ROOT=/scratch/${user}/casedata/enkf_glb_t254/bk GSI_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1 FIX_ROOT=${GSI_ROOT}/fix ENKF_EXE=${GSI_ROOT}/src/main/enkf/global_enkf CRTM_ROOT=/scratch/${user}/CRTM_2.2.3 ENKF_NAMELIST=${GSI_ROOT}/run/enkf_gfs_namelist.sh
Set location of the diag files (from GSI observer)
DIAG_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/gsidiag_${GFSCASE}
One modifications to note for the EnKF is theDIAG_ROOT
. TheDIAG_ROOT
points to the working directory where the GSI observer runs. This directory containsdiag*
files which will be linked to the EnKF working directory.Select ensemble and case parameters
NMEM_ENKF=10 LEVS=64 NVARS=5 ... ... elif [[ "$GFSCASE" = "enkf_glb_t254" ]]; then JCAP=254 JCAP_B=254 ... ... elif [[ "$JCAP" = "254" ]]; then LONA=512 LATA=256 DELTIM=1200 resol=2
This section sets information about the number of background ensemble members, domain specifications. This information mirrors that of therun_gsi_global.ksh
script.Read in namelist
#Build EnKF namelist on-the-fly . $ENKF_NAMELIST cat << EOF > enkf.nml $enkf_namelist EOF
Similar to the regional enkf (run_enkf_wrf.ksh
), the global EnKF builds the namelist from an outside script for namelist generation (enkf_gfs_namelist.sh
).Select observations
list="conv amsua_metop-a amsua_n18 amsua_n15"
The previous line is contained within the ensemble looping section. For the observations to be assimilated, be aware that only observations/platforms that have been run through the GSI observer (and therefore a
diag*
file exists) are valid. For this example case, we will only run EnKF for conventional observations.
Run EnKF and Check Run Status¶
Once the run script is set up properly for the case and the machine, EnKF can be run through the run script. The following command will submit the job:
$ bsub < run_enkf_global.ksh
While the job is running, move to the working directory and check the details. Given the following working directory setup:
WORK_ROOT=/scratch/${user}/comGSIv3.5_EnKFv1.1/run/enkf_${GFSCASE}
Go to directory /scratch/${user}/comGSIv3.5_EnKFv1.1/
and check the
run directory. A directory named enkf_enkf_glb_t264
should have been
created. This directory is the run directory for this EnKF case study.
The directory will be populated with many links:
diag_conv_ges.ensmean diag_conv_ges.mem004 diag_conv_ges.mem008
diag_conv_ges.mem001 diag_conv_ges.mem005 diag_conv_ges.mem009
diag_conv_ges.mem002 diag_conv_ges.mem006 diag_conv_ges.mem010
diag_conv_ges.mem003 diag_conv_ges.mem007
diag_amsua_metop-a_ges.ensmean diag_amsua_metop-a_ges.mem006
diag_amsua_metop-a_ges.mem001 diag_amsua_metop-a_ges.mem007
diag_amsua_metop-a_ges.mem002 diag_amsua_metop-a_ges.mem008
diag_amsua_metop-a_ges.mem003 diag_amsua_metop-a_ges.mem009
diag_amsua_metop-a_ges.mem004 diag_amsua_metop-a_ges.mem010
diag_amsua_metop-a_ges.mem005
diag_amsua_n18_ges.ensmean diag_amsua_n18_ges.mem004 diag_amsua_n18_ges.mem008
diag_amsua_n18_ges.mem001 diag_amsua_n18_ges.mem005 diag_amsua_n18_ges.mem009
diag_amsua_n18_ges.mem002 diag_amsua_n18_ges.mem006 diag_amsua_n18_ges.mem010
diag_amsua_n18_ges.mem003 diag_amsua_n18_ges.mem007
diag_amsua_n19_ges.ensmean diag_amsua_n19_ges.mem004 diag_amsua_n19_ges.mem008
diag_amsua_n19_ges.mem001 diag_amsua_n19_ges.mem005 diag_amsua_n19_ges.mem009
diag_amsua_n19_ges.mem002 diag_amsua_n19_ges.mem006 diag_amsua_n19_ges.mem010
diag_amsua_n19_ges.mem003 diag_amsua_n19_ges.mem007
The diag
files are linked from the GSI global observer working
directory. These links are specified in the run script. Note that ozone
diag
files (gome
, omi
, sbuv
) will appear in the
directory without links (zero length files) if ozone diags are not
present from the GSI observer.
sfg_2014040512_fhr06_ensmean sfg_2014040512_fhr06_mem004 sfg_2014040512_fhr06_mem008
sfg_2014040512_fhr06_mem001 sfg_2014040512_fhr06_mem005 sfg_2014040512_fhr06_mem009
sfg_2014040512_fhr06_mem002 sfg_2014040512_fhr06_mem006 sfg_2014040512_fhr06_mem010
sfg_2014040512_fhr06_mem003 sfg_2014040512_fhr06_mem007
The first guess files are also linked into the directory from the run script, pointing to the path of the ensemble mean and ensemble members designated in the setup section.
Similar to running the GSI observer, many other static files are linked or copied to this run directory, such as:
enkf.nml: |
EnKF namelist |
convinfo: |
data usage control for conventional data |
satbias_in: |
satellite bias correction coefficient file |
satbias_angle: |
satellite bias correction angle file |
satinfo: |
data usage and channel control for satellite radiance data |
ozinfo: |
data usage control from ozone data |
covinflate.dat: |
Three-dimensional multiplicative inflation factor fields |
stdout: |
EnKF standard output file |
The presence of the standard output file in the directory suggests the EnKF run script has successfully set up a run environment for the EnKF, properly linking the first guess and diag files from the GSI observer, and the EnKF executable is running. Once EnKF has finished running, analysis files should be generated for each member:
sanl_2014040512_mem001 sanl_2014040512_mem005 sanl_2014040512_mem009
sanl_2014040512_mem002 sanl_2014040512_mem006 sanl_2014040512_mem010
sanl_2014040512_mem003 sanl_2014040512_mem007
sanl_2014040512_mem004 sanl_2014040512_mem008
Check for Successful EnKF Completion¶
The presence of the EnKF analysis files for each member as well as the standard out file indicates that the EnKF has run without crashing, but does not necessary indicate a successful analysis. It is important to check the stdout file in the run directory to make sure the EnKF completed each step without any obvious problems. The following are several important areas of the standard out file to check:
Check namelist has been properly read in and configuration is correct:
namelist parameters: -------------------- &NAM_ENKF DATEIN = 2014040512, DATAPATH = ./ IASSIM_ORDER = 0, COVINFLATEMAX = 100.0000 , COVINFLATEMIN = 1.000000 , DETERMINISTIC = T, SORTINC = T, CORRLENGTHNH = 2000.000 , CORRLENGTHTR = 2000.000 , CORRLENGTHSH = 2000.000 , ... ...
Check analysis time, number of ensemble members, as well as the actual analysis variables and the background type. The maximum and minimum values for surface pressure are printed for a sanity check:
analysis time 2014040512 10 members number of background forecast times to be updated = 1 first-guess forecast hours for analysis = 06 5 3d vars to update total of 321 2d grids will be updated (including ps) using multiplicative inflation based on Pa/Pb Vars in Rad-Jacobian (dims) -------------------------- sst 0 enkf_main: number of threads 1 ensemble mean first guess surface pressure: 516.109537854119 1051.19857562343
- Statistics of the ensemble priorAfter many lines describing the bias correction coefficents, the inventory of the observation number, observation types from the input
diag*
files, and the time spent reading in each observation type, statistics of the ensemble priors fit to all observations are provided for each region (NH, SH, TR):innovation statistics for prior: conventional obs region, obtype, nobs, bias, innov stdev, sqrt(S+R), sqrt(S), sqrt(R): NH all ps 12059 -0.700E-01 0.122E+01 0.151E+01 0.499E+00 0.143E+01 TR all ps 2184 0.402E+00 0.118E+01 0.141E+01 0.476E+00 0.133E+01 SH all ps 849 -0.102E+00 0.125E+01 0.148E+01 0.563E+00 0.137E+01 NH all t 20598 0.929E-01 0.165E+01 0.121E+01 0.387E+00 0.115E+01 TR all t 6126 0.218E+00 0.153E+01 0.113E+01 0.418E+00 0.105E+01 SH all t 1169 0.141E+00 0.172E+01 0.113E+01 0.417E+00 0.105E+01 NH all uv 49010 0.199E+00 0.312E+01 0.250E+01 0.951E+00 0.231E+01 TR all uv 20980 0.651E-01 0.335E+01 0.265E+01 0.127E+01 0.232E+01 SH all uv 4682 -0.153E+00 0.375E+01 0.247E+01 0.114E+01 0.219E+01 NH all q 7936 -0.335E-01 0.181E+00 0.207E+00 0.534E-01 0.200E+00 TR all q 2951 -0.119E-01 0.180E+00 0.214E+00 0.770E-01 0.200E+00 SH all q 543 -0.163E-01 0.212E+00 0.213E+00 0.730E-01 0.200E+00
This table should be checked in order to determine if the inflation is appropriate. As mentioned in the regional EnKF example, the goal is to make the total ensemble spreads of priors (\(\sqrt{(S+R)}\) match the innovations (innov std
) as much as possible. Because this is a global run using only conventional observation data, we can see that all regions (NH,TR,SH) are listed and there are no statistics for radiances. We can see certain obtypes, particularly in the uv has fairly close values. On the other hand, there are still regions/obtypes that have noteable differences, mainly associated with larger observation errors (\(\sqrt{(R)}\)). Users should consider tuning of inflation and localization, which is typically determined using cases with multiple assimilation cycles. Refer to section 4.2 of this Users Guide for more information on tuning. Domain and observation partition:
npts = 131072 min/max number of points per proc = 1598 1710 time to do model space decomp = 1.697483193129301E-002 nobstot = 129087 min/max number of obs per proc = 1613 1614 time to do ob space decomp = 5.538591649383307E-004 sending out observation prior ensemble perts from root ... nobstot*nanals 1290870 npts*ndim 42074112 ... took 6.416106596589088E-003 secs
The analysis variables and the observations are distributed to different processors. We can see in this case that the min/max number of points per processor are 1598 and 1710, respectively. Similarly, we can see that the min/max number of observations per processor are 1613 and 1614, respectively, indicating that the observations are well dispersed among the processors.
min/max pressi 1 516.1096 1051.199 min/max pressi 2 513.3593 1045.597 min/max pressi 3 510.2454 1039.249 min/max pressi 4 506.7440 1032.063 min/max pressi 5 502.8375 1023.943 ... ...
Additionally, check the minimum and maximum values of the fields at each vertical level as a quick sanity check.
Diagnose EnKF Analysis Results¶
At the bottom of the standard output file, there are several output statistics and tables that are helpful for users to diagnose the quality of the EnKF analysis.
Statistics of the ensemble analysis
global ps prior std. dev min/max = 7.2904140E-02 3.055338 NH mean ps prior standard deviation = 0.5657563 NH mean ps posterior standard deviation (before inflation)= 0.3681831 NH mean ps posterior standard deviation (after inflation) = 0.5323756 NH mean ps inflation = 1.653412 SH mean ps prior standard deviation = 0.6475344 SH mean ps posterior standard deviation (before inflation)= 0.6009755 SH mean ps posterior standard deviation (after inflation) = 0.6399752 SH mean ps inflation = 1.086266 TR mean ps prior standard deviation = 0.4878409 TR mean ps posterior standard deviation (before inflation)= 0.4240963 TR mean ps posterior standard deviation (after inflation) = 0.4770048 TR mean ps inflation = 1.170121 time in inflate_ens = 5.236316868104041E-002 on proc 0
This section is important for checking whether the values of the inflation are reasonable by looking at a summary of the maximum and minimum values. Also, viewing the regional averaged statistics before and after inflation. For this case we can see for all regions, larger standard deviation after the inflation are reported, which is more consistent with the prior standard deviation.Spread inflation of the analysis ensemble
innovation statistics for posterior: conventional obs region, obtype, nobs, bias, innov stdev, sqrt(S+R), sqrt(S), sqrt(R): NH all ps 12059 0.538E-01 0.912E+00 0.144E+01 0.200E+00 0.143E+01 TR all ps 2184 0.243E+00 0.102E+01 0.136E+01 0.285E+00 0.133E+01 SH all ps 849 -0.755E-01 0.978E+00 0.142E+01 0.388E+00 0.137E+01 NH all t 20598 0.590E-01 0.147E+01 0.117E+01 0.196E+00 0.115E+01 TR all t 6126 0.177E+00 0.135E+01 0.107E+01 0.222E+00 0.105E+01 SH all t 1169 0.100E+00 0.148E+01 0.108E+01 0.246E+00 0.105E+01 NH all uv 49010 0.765E-01 0.269E+01 0.236E+01 0.478E+00 0.231E+01 TR all uv 20980 0.116E-02 0.274E+01 0.240E+01 0.605E+00 0.232E+01 SH all uv 4682 -0.981E-01 0.303E+01 0.227E+01 0.591E+00 0.219E+01 NH all q 7936 -0.333E-01 0.166E+00 0.202E+00 0.268E-01 0.200E+00 TR all q 2951 0.369E-02 0.148E+00 0.203E+00 0.370E-01 0.200E+00 SH all q 543 -0.188E-01 0.187E+00 0.204E+00 0.393E-01 0.200E+00
After the EnKF analysis, it is important to check the innovation statistics, as previously discussed in the regional test case.