.. _gsi_run:

.. include:: inclusion.txt

3. Running GSI
===============

This chapter describes how to run GSI. It starts with an introduction to the input data required to run GSI, then proceeds with a detailed explanation of an example GSI run script and an introduction to the files produced by a successful GSI run. It concludes with some frequently used options from the GSI namelist.

.. _sec3.1:

Input Data Required to Run GSI
-------------------------------

In most cases, three types of input data (background, observations, and fixed files) must be available before running GSI. In some special idealized cases, such as a pseudo single observation test, GSI can be run without any observations. If running GSI with the 3D EnVar hybrid option, global or regional ensemble forecasts are also needed.

Background or First Guess Field
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As with other data analysis systems, the background or first guess fields may come from a model forecast conducted separately or from a previous data assimilation cycle. The following is a list of the types of background files that can be used by this release version of GSI:

a) WRF-NMM input fields in binary format
b) WRF-NMM input fields in NetCDF format
c) WRF-ARW input fields in binary format
d) WRF-ARW input fields in NetCDF format
e) GFS input fields in binary format or through NEMS I/O
f) NEMS-NMMB input fields
g) RTMA input files (2-dimensional binary format)
h) WRF-Chem GOCART input fields in NetCDF format
i) CMAQ binary file

The Weather Research and Forecasting (WRF) community modeling system includes two dynamical cores: the Advanced Research WRF (ARW) and the Nonhydrostatic Mesoscale Model (NMM). The GFS (Global Forecast System), NEMS (National Environmental Modeling System)-NMMB (Nonhydrostatic Mesoscale Model B-Grid), and RTMA (Real-Time Mesoscale Analysis) are operational systems at the National Centers for Environmental Prediction (NCEP).
The DTC mainly supports GSI for regional WRF applications. Therefore, most of the multiple platform tests were conducted using WRF NetCDF background files (d). The DTC also supports the GSI in global and chemical applications with limited resources. The following backgrounds have been tested for this release:

#. ARW NetCDF (d) was tested with multiple cases
#. GFS (e) was tested with multiple NCEP cases
#. WRF-Chem NetCDF (h) was tested with a single case
#. NEMS-NMMB (f) was tested with a single case

Observations
~~~~~~~~~~~~

GSI can analyze many types of observational data, including conventional data, satellite radiance observations, GPS Radio Occultations, and radar data, among others. The default observation file names are given in the released GSI namelist, with corresponding observations included in each file. Sample BUFR files available for download from the NCEP website are listed in table `[t31] <#t31>`__.

The observations are complex, and many need format conversion and quality control before being used by GSI. GSI ingests observations saved in BUFR format (with NCEP specified features). The NCEP processed PrepBUFR and BUFR files can be used directly. If users need to introduce their own data into GSI, please check the following website for the Users Guide and examples of BUFR/PrepBUFR processing:

http://www.dtcenter.org/com-GSI/BUFR/index.php

DTC supports BUFR/PrepBUFR data processing and quality control as part of the GSI community tasks.

GSI can analyze all of the data types in table `[t31] <#t31>`__, but each GSI run (for both operational and case study purposes) only uses a subset of the data. Some data may be outdated and not available, some are in monitoring mode, and some may have quality issues during certain periods. Users are encouraged to check data quality prior to running an analysis.
The following NCEP links provide resources that include data quality history:

| http://www.emc.ncep.noaa.gov/mmb/data_processing/Satellite_Historical_Documentation.htm
| http://www.emc.ncep.noaa.gov/mmb/data_processing/Non-satellite_Historical_Documentation.htm

Because the current regional models do not have ozone as a prognostic variable, ozone data are not assimilated on the regional scale.

GSI can be run without any observations to see how the moisture constraint modifies the first guess (background) field. GSI can also be run in a pseudo single observation mode, which does not require any BUFR observation files. In this mode, users should specify observation information in the namelist section SINGLEOB_TEST (see Section `[sec4.2] <#sec4.2>`__ for details). As more data files are used, additional information will be added through the GSI analysis.

.. list-table:: GSI observation file names, content, and examples
   :header-rows: 1

   * - GSI Name
     - Content
     - Example file names
   * - prepbufr
     - Conventional observations, including ps, t, q, pw, uv, spd, dw, sst
     - gdas1.t12z.prepbufr.nr
   * - satwndbufr
     - Satellite wind observations
     - gdas1.t12z.satwnd.tm00.bufr_d
   * - amsuabufr
     - AMSU-A 1b radiance (brightness temperatures) from satellites NOAA-15, 16, 17, 18, 19 and METOP-A/B
     - gdas1.t12z.1bamua.tm00.bufr_d
   * - amsubbufr
     - AMSU-B 1b radiance (brightness temperatures) from satellites NOAA-15, 16, 17
     - gdas1.t12z.1bamub.tm00.bufr_d
   * - radarbufr
     - Radar radial velocity Level 2.5 data
     - ndas.t12z.radwnd.tm12.bufr_d
   * - gpsrobufr
     - GPS radio occultation and bending angle observation
     - gdas1.t12z.gpsro.tm00.bufr_d
   * - ssmirrbufr
     - Precipitation rate observations from SSM/I
     - gdas1.t12z.spssmi.tm00.bufr_d
   * - tmirrbufr
     - Precipitation rate observations from TMI
     - gdas1.t12z.sptrmm.tm00.bufr_d
   * - sbuvbufr
     - SBUV/2 ozone observations from satellite NOAA-16, 17, 18, 19
     - gdas1.t12z.osbuv8.tm00.bufr_d
   * - hirs2bufr
     - HIRS2 1b radiance from satellite NOAA-14
     - gdas1.t12z.1bhrs2.tm00.bufr_d
   * - hirs3bufr
     - HIRS3 1b radiance observations from satellite NOAA-16, 17
     - gdas1.t12z.1bhrs3.tm00.bufr_d
   * - hirs4bufr
     - HIRS4 1b radiance observation from satellite NOAA-18, 19 and METOP-A/B
     - gdas1.t12z.1bhrs4.tm00.bufr_d
   * - msubufr
     - MSU observation from satellite NOAA-14
     - gdas1.t12z.1bmsu.tm00.bufr_d
   * - airsbufr
     - AMSU-A and AIRS radiances from satellite AQUA
     - gdas1.t12z.airsev.tm00.bufr_d
   * - mhsbufr
     - Microwave Humidity Sounder observation from NOAA-18, 19 and METOP-A/B
     - gdas1.t12z.1bmhs.tm00.bufr_d
   * - ssmitbufr
     - SSMI observation from satellite f13, f14, f15
     - gdas1.t12z.ssmit.tm00.bufr_d
   * - amsrebufr
     - AMSR-E radiance from satellite AQUA
     - gdas1.t12z.amsre.tm00.bufr_d
   * - ssmisbufr
     - SSMIS radiances from satellite f16
     - gdas1.t12z.ssmis.tm00.bufr_d
   * - gsnd1bufr
     - GOES sounder radiance (sndrd1, sndrd2, sndrd3, sndrd4) from GOES-11, 12, 13, 14, 15
     - gdas1.t12z.goesfv.tm00.bufr_d
   * - l2rwbufr
     - NEXRAD Level 2 radial velocity
     - ndas.t12z.nexrad.tm12.bufr_d
   * - gsndrbufr
     - GOES sounder radiance from GOES-11, 12
     - gdas1.t12z.goesnd.tm00.bufr_d
   * - gimgrbufr
     - GOES imager radiance from GOES-11, 12
     -
   * - omibufr
     - Ozone Monitoring Instrument (OMI) observation NASA Aura
     - gdas1.t12z.omi.tm00.bufr_d
   * - iasibufr
     - Infrared Atmospheric Sounding Interferometer sounder observations from METOP-A/B
     - gdas1.t12z.mtiasi.tm00.bufr_d
   * - gomebufr
     - The Global Ozone Monitoring Experiment (GOME) ozone observation from METOP-A/B
     - gdas1.t12z.gome.tm00.bufr_d
   * - mlsbufr
     - Aura MLS stratospheric ozone data from Aura
     - gdas1.t12z.mlsbufr.tm00.bufr_d
   * - tcvitl
     - Synthetic Tropical Cyclone-MSLP observation
     - gdas1.t12z.syndata.tcvitals.tm00
   * - seviribufr
     - SEVIRI radiance from MET-08, 09, 10
     - gdas1.t12z.sevcsr.tm00.bufr_d
   * - atmsbufr
     - ATMS radiance from Suomi NPP
     - gdas1.t12z.atms.tm00.bufr_d
   * - crisbufr
     - CRIS radiance from Suomi NPP
     - gdas1.t12z.cris.tm00.bufr_d
   * - modisbufr
     - MODIS aerosol total column AOD observations from AQUA and TERRA
     -

[t31]

Fixed Files (Statistics and Control Files)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A GSI analysis also needs to read specific information from statistic files, configuration files, bias correction files, and CRTM coefficient files. We refer to these files as fixed files. Except for the CRTM coefficients, they are located in a directory called ``fix/`` in the release package.

Table `[t32] <#t32>`__ lists fixed files required for a GSI run, the content of the files, and corresponding example files from the regional and global applications.

Because most of those fixed files have hardwired names inside the GSI, a GSI run script needs to copy or link those files (right column in table `[t32] <#t32>`__) from the ``./fix`` directory to the GSI run directory with the file name required in GSI (left column in table `[t32] <#t32>`__). For example, if GSI runs with an ARW background, the following line should be in the run script (here ``FIX_ROOT`` is the path of the fix directory):

::

    cp ${FIX_ROOT}/anavinfo_arw_netcdf anavinfo

Note that in this release, there is a strict rule that the numbers of vertical levels in the file ``anavinfo`` must match the background file (for example, ``wrfinput_d01``) for the 3-dimensional variables. Otherwise GSI will fail. To identify the correct numbers of vertical levels, users can dump out (use ``ncdump -h``) the dimensions from the NetCDF background file and find the number for ``bottom_top`` and ``bottom_top_stag``.
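The dimension lookup described above can be scripted. The following is a minimal sketch, not part of the released run scripts: ``get_dim`` is a hypothetical helper name, and it assumes the header text has the usual ``name = value ;`` layout printed by ``ncdump -h``. The demonstration runs on sample header text so that it works without a NetCDF file. On a real case, the input would come from ``ncdump -h`` applied to the background file.

```shell
# Hypothetical helper (an illustration, not part of the GSI release):
# read `ncdump -h` output on standard input and print the size of the
# dimension whose name is given as the first argument.
get_dim() {
  awk -v name="$1" '$1 == name && $2 == "=" { print $3 }'
}

# Self-contained demonstration on sample header text. With a real
# background file one would instead run, for example:
#   ncdump -h wrfinput_d01 | get_dim bottom_top
sample_header='    bottom_top = 50 ;
    bottom_top_stag = 51 ;'

nlev=$(printf '%s\n' "${sample_header}" | get_dim bottom_top)
nlev_stag=$(printf '%s\n' "${sample_header}" | get_dim bottom_top_stag)

echo "3D variables need ${nlev} levels in anavinfo; prse needs ${nlev_stag}"
```

The exact match on the first field keeps ``bottom_top`` from also matching ``bottom_top_stag``.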
For example, if the dimensions for the background file are:

::

    bottom_top = 50 ;
    bottom_top_stag = 51 ;

then the corresponding ``anavinfo`` file should have 51 levels for ``prse`` (3-dimensional pressure field) and 50 levels for other three-dimensional variables such as u, v, tv, q, oz, cw, etc. For details, users can dump out the global attributes of the background file and find the number of vertical levels for each variable. The following shows part of the ``anavinfo`` file for the above background:

::

    state_derivatives::
    !var     level  src
     ps        1    met_guess
     u        50    met_guess
     v        50    met_guess
     tv       50    met_guess
     q        50    met_guess
     oz       50    met_guess
     cw       50    met_guess
     prse     51    met_guess
    ::

.. list-table:: GSI fixed files, content, and examples
   :header-rows: 1

   * - GSI Name
     - Content
     - Example file names
   * - anavinfo
     - Information file to set control and analysis variables
     - | anavinfo_arw_netcdf
       | anavinfo_ndas_netcdf
       | global_anavinfo.l64.txt
   * - berror_stats
     - Background error covariance
     - | nam_nmmstat_na.gcv
       | nam_glb_berror.f77.gcv
       | global_berror.l64y386.f77
   * - errtable
     - Observation error table
     - | nam_errtable.r3dv
       | prepobs_errtable.global
   * - convinfo
     - Conventional observation information file
     - | global_convinfo.txt
       | nam_regional_convinfo.txt
   * - satinfo
     - Satellite channel information file
     - global_satinfo.txt
   * - pcpinfo
     - Precipitation rate observation information file
     - global_pcpinfo.txt
   * - ozinfo
     - Ozone observation information file
     - global_ozinfo.txt
   * - satbias_angle
     - Satellite scan angle dependent bias correction file
     - global_satangbias.txt
   * -
     - Satellite mass bias correction coefficient file
     - sample.satbias
   * -
     - Combined satellite angle dependent and mass bias correction coefficient file
     - gdas1.t00z.abias.new
   * - t_rejectlist, w_rejectlist,..
     - Rejection list for T, wind, et al. in RTMA
     - | new_rtma_t_rejectlist
       | new_rtma_w_rejectlist

[t32]

Each operational system, such as GFS, NAM, RAP, and RTMA, has its own set of fixed files. For your specific GSI runs, you need to get the correct set of fixed files. Fixed files for regional applications are included in this GSI/EnKF release under the *fix/* directory. Fixed files for global applications are not included in this release in order to save space. Please download |comGSIversion| ``comGSIv3.7_EnKFv1.3_fix_global.tar.gz`` if you need to run global cases. Note that little endian background error covariance files are no longer supported.

Each release version of the GSI calls a certain version of the CRTM library and needs corresponding CRTM coefficients to do radiance data assimilation. This version of GSI uses CRTM 2.2.3. The coefficient files are listed in table `[t34] <#t34>`__.

.. list-table:: List of radiance coefficients used by CRTM
   :header-rows: 1

   * - File name used in GSI
     - Content
     - Example Files
   * - | Nalli.IRwater.EmisCoeff.bin
       | NPOESS.IRice.EmisCoeff.bin
       | NPOESS.IRsnow.EmisCoeff.bin
       | NPOESS.IRland.EmisCoeff.bin
       | NPOESS.VISice.EmisCoeff.bin
       | NPOESS.VISland.EmisCoeff.bin
       | NPOESS.VISsnow.EmisCoeff.bin
       | NPOESS.VISwater.EmisCoeff.bin
       | FASTEM6.MWwater.EmisCoeff.bin
     - IR surface emissivity coefficients
     - | Nalli.IRwater.EmisCoeff.bin
       | NPOESS.IRice.EmisCoeff.bin
       | NPOESS.IRsnow.EmisCoeff.bin
       | NPOESS.IRland.EmisCoeff.bin
       | NPOESS.VISice.EmisCoeff.bin
       | NPOESS.VISland.EmisCoeff.bin
       | NPOESS.VISsnow.EmisCoeff.bin
       | NPOESS.VISwater.EmisCoeff.bin
       | FASTEM6.MWwater.EmisCoeff.bin
   * - AerosolCoeff.bin
     - Aerosol coefficients
     - AerosolCoeff.bin
   * - CloudCoeff.bin
     - Cloud scattering and emission coefficients
     - CloudCoeff.bin
   * - ${satsen}.SpcCoeff.bin
     - Sensor spectral response characteristics
     - ${satsen}.SpcCoeff.bin
   * - ${satsen}.TauCoeff.bin
     - Transmittance coefficients
     - ${satsen}.TauCoeff.bin

[t34]

GSI Run Script
--------------

In this release version, three sample run scripts are available for different GSI applications:

- ``ush/comgsi_run_regional.ksh`` for regional GSI
- ``ush/comgsi_run_global.ksh`` for global GSI (GFS)
- ``ush/comgsi_run_chem.ksh`` for chemical analysis

The run scripts call the following scripts to generate the GSI namelists:

- ``ush/comgsi_namelist.sh`` for regional GSI
- ``ush/comgsi_namelist_gfs.sh`` for global GSI (GFS)
- ``ush/comgsi_namelist_chem.sh`` for GSI chemical analysis

We will introduce the regional run script (``comgsi_run_regional.ksh``) in detail in the following sections and introduce the global run script when we discuss the GSI global application in the Advanced GSI Users Guide.
Note there is also a run script for regional EnKF (``comenkf_run_regional.ksh``), a run script for global EnKF (``comenkf_run_global.ksh``), and the EnKF namelist script (``comenkf_namelist.sh``) in the same directory, which will be introduced in the EnKF Users Guide.

Steps in the GSI Run Script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The GSI run script creates the run time environment necessary to run the GSI executable. A typical GSI run script includes the following steps:

#. Request computer resources to run GSI.
#. Set environmental variables for the machine architecture.
#. Set experimental variables (such as experiment name, analysis time, background, and observation).
#. Set the script that generates the GSI namelist.
#. Check the definitions of required variables.
#. Generate a run directory for GSI (sometimes called a working or temporary directory).
#. Copy the GSI executable to the run directory.
#. Copy the background file to the run directory and create an index file listing the location and name of ensemble members if running with a hybrid setup.
#. Link observations to the run directory.
#. Link fixed files (statistic, control, and coefficient files) to the run directory.
#. Generate the namelist for GSI.
#. Run the GSI executable.
#. Post-process: save analysis results, generate diagnostic files, and clean the run directory.
#. Run GSI as observation operator for EnKF, only for ``if_observer=Yes``.

Typically, users only need to modify specific parts of the run script (steps 1, 2, and 3) to fit their specific computer environment and point to the correct input/output files and directories. Users may also need to modify step 4 if changes are made to the namelist script and it is under a different name or at a different location. The next section (`3.2.2 <#sec3.2.2>`__) covers each of these modifications for steps 1 to 3. Section `3.2.3 <#sec3.2.3>`__ will dissect a sample regional GSI run script and introduce each piece of it.
Users should start with the run script provided in the same release package with the GSI executable and modify it for their own run environment and case configuration.

.. _sec3.2.2:

Customization of the GSI Run Script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section focuses on step 1 of the run script: modifying the machine specific entries. Specifically, this consists of setting Unix/Linux environment variables and selecting the correct parallel run time environment (batch system with options).

GSI can be run with the same parallel environments as other MPI programs, for example:

- IBM supercomputer using LSF (Load Sharing Facility)
- IBM supercomputer using LoadLeveler
- Linux clusters using PBS (Portable Batch System)
- Linux clusters using LSF
- Linux workstation (no batch system)
- Intel Mac Darwin workstation with PGI compiler (no batch system)

Two queuing systems are listed below as examples:

.. list-table::
   :header-rows: 1

   * - Machine & queue system
     - Linux Cluster with LSF
     - Linux Cluster with PBS
     - Workstation
   * - example
     - ::

         #BSUB -P ????????
         #BSUB -W 00:10
         #BSUB -n 4
         #BSUB -R "span[ptile=16]"
         #BSUB -J gsi
         #BSUB -o gsi.%J.out
         #BSUB -e gsi.%J.err
         #BSUB -q small
     - ::

         #PBS -l procs=4
         #PBS -n
         #PBS -o gsi.out
         #PBS -e gsi.err
         #PBS -N GSI
         #PBS -l walltime=00:20
         #PBS -A ??????
     - No batch system, skip this step

[t35]

In both of the examples above, environment variables are set specifying system resource management, such as the number of processors, the name/type of queue, maximum wall clock time allocated for the job, options for standard out and standard error, etc. Some platforms need additional definitions to specify Unix environment variables that further define the run environment.
These variable settings can significantly impact the GSI run efficiency and accuracy of the GSI results. Please check with your system administrator for optimal settings for your computer system. Note that while the GSI can be run with any number of processors, it will not scale well with the increase of processor numbers after a certain threshold based on the case configuration and GSI application types.

There are only two options to define in this block.

::

    # GSIPROC = processor number used for GSI analysis
    #------------------------------------------------
      GSIPROC=4
      ARCH='LINUX_LSF'
    # Supported configurations:
    #            IBM_LSF,
    #            LINUX, LINUX_LSF, LINUX_PBS,
    #            DARWIN_PGI

The option ``ARCH`` selects the machine architecture. It is a function of platform type and batch queuing system. The option ``GSIPROC`` sets the number of cores used in the run. This option also decides if the job is run as a multiple core job or as a single core run. Several choices of the option ``ARCH`` are listed in the sample run script. Please check with your system administrator about running parallel MPI jobs on your system.

.. list-table::
   :header-rows: 1

   * - Option ARCH
     - Platform
     - Compiler
     - Batch queuing system
   * - IBM_LSF
     - IBM AIX
     - xlf, xlc
     - LSF
   * - LINUX
     - Linux workstation
     - Intel/PGI/GNU
     - mpirun if ``GSIPROC`` > 1
   * - LINUX_LSF
     - Linux cluster
     - Intel/PGI/GNU
     - LSF
   * - LINUX_PBS
     - Linux cluster
     - Intel/PGI/GNU
     - PBS
   * - DARWIN_PGI
     - MAC DARWIN
     - PGI
     - mpirun if ``GSIPROC`` > 1

[t36]

This section discusses setting up variables specific to a given case, such as analysis time, working directory, background and observation files, location of fixed files and CRTM coefficients, the GSI executable file, and the script generating the GSI namelist.
::

    #####################################################
    # case set up (users should change this part)
    #####################################################
    #
    # ANAL_TIME= analysis time  (YYYYMMDDHH)
    # WORK_ROOT= working directory, where GSI runs
    # PREPBUFR = path of PrepBUFR conventional obs
    # BK_FILE  = path and name of background file
    # OBS_ROOT = path of observations files
    # FIX_ROOT = path of fix files
    # GSI_EXE  = path and name of the gsi executable
    # ENS_ROOT = path where ensemble background files exist
      ANAL_TIME=2017051318
      JOB_DIR=the_job_directory
         #normally you put run scripts here and submit jobs from here,
         #require a copy of gsi.x at this directory
      RUN_NAME=a_descriptive_run_name_such_as_case05_3denvar_etc
      OBS_ROOT=the_directory_where_observation_files_are_located
      BK_ROOT=the_directory_where_background_files_are_located
      GSI_ROOT=the_comgsi_main_directory_where_src_ush_fix_etc_are_located
      CRTM_ROOT=the_CRTM_directory
      ENS_ROOT=the_directory_where_ensemble_backgrounds_are_located
          #ENS_ROOT is not required if not running hybrid EnVAR
      HH=`echo $ANAL_TIME | cut -c9-10`
      GSI_EXE=${JOB_DIR}/gsi.x  #assume you have a copy of gsi.x here
      WORK_ROOT=${JOB_DIR}/${RUN_NAME}
      FIX_ROOT=${GSI_ROOT}/fix
      GSI_NAMELIST=${GSI_ROOT}/ush/comgsi_namelist.sh
      PREPBUFR=${OBS_ROOT}/nam.t${HH}z.prepbufr.tm00
      BK_FILE=${BK_ROOT}/wrfinput_d01.${ANAL_TIME}

When picking the observation BUFR files, please be aware of the following:

- A GSI run will stop if the time in the background file does not match the cycle time in the observation BUFR file used for the run (there is a namelist option to turn this verification step off).
- Even if their contents are identical, PrepBUFR/BUFR files will differ if they were created on platforms with different endian byte order specification (Linux vs. IBM). Appendix A.1 discusses the conversion tool SSRC used to byte-swap observation files. Since release version 3.2, GSI compiled with PGI and Intel can automatically handle byte order issues in PrepBUFR and BUFR files. Users can directly link BUFR files of either byte order when working with the Intel or PGI compilers.

The next part of this block focuses on additional options that specify important aspects of the GSI configuration.

::

    #------------------------------------------------
    # bk_core= which WRF core is used as background (NMM or ARW or NMMB)
    # bkcv_option= which background error covariance and parameter will be used
    #              (GLOBAL or NAM)
    # if_clean = clean  : delete temporary files in working directory (default)
    #            no     : leave running directory as is (this is for debug only)
    # if_observer = Yes  : only used as observation operator for enkf
    # if_hybrid   = Yes  : Run GSI as 3D/4D EnVar
    # if_4DEnVar  = Yes  : Run GSI as 4D EnVar
    # if_nemsio = Yes    : The GFS background files are in NEMSIO format
    # if_oneob  = Yes    : Do single observation test
      if_hybrid=No     # Yes, or, No -- case sensitive !
      if_4DEnVar=No    # Yes, or, No -- case sensitive (set if_hybrid=Yes first)!
      if_observer=No   # Yes, or, No -- case sensitive !
      if_nemsio=No     # Yes, or, No -- case sensitive !
      if_oneob=No      # Yes, or, No -- case sensitive !
      bk_core=ARW
      bkcv_option=NAM
      if_clean=clean
    #
    # setup whether to do single obs test
      if [ ${if_oneob} = Yes ]; then
        if_oneobtest='.true.'
      else
        if_oneobtest='.false.'
      fi
    #
    # setup for GSI 3D/4D EnVar hybrid
      if [ ${if_hybrid} = Yes ] ; then
        PDYa=`echo $ANAL_TIME | cut -c1-8`
        cyca=`echo $ANAL_TIME | cut -c9-10`
        gdate=`date -u -d "$PDYa $cyca -6 hour" +%Y%m%d%H` #guess date is 6hr ago
        gHH=`echo $gdate |cut -c9-10`
        datem1=`date -u -d "$PDYa $cyca -1 hour" +%Y-%m-%d_%H:%M:%S` #1hr ago
        datep1=`date -u -d "$PDYa $cyca 1 hour"  +%Y-%m-%d_%H:%M:%S` #1hr later
        if [ ${if_nemsio} = Yes ]; then
          if_gfs_nemsio='.true.'
          ENSEMBLE_FILE_mem=${ENS_ROOT}/gdas.t${gHH}z.atmf006s.mem
        else
          if_gfs_nemsio='.false.'
          ENSEMBLE_FILE_mem=${ENS_ROOT}/sfg_${gdate}_fhr06s_mem
        fi

        if [ ${if_4DEnVar} = Yes ] ; then
          BK_FILE_P1=${BK_ROOT}/wrfout_d01_${datep1}
          BK_FILE_M1=${BK_ROOT}/wrfout_d01_${datem1}

          if [ ${if_nemsio} = Yes ]; then
            ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/gdas.t${gHH}z.atmf009s.mem
            ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/gdas.t${gHH}z.atmf003s.mem
          else
            ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/sfg_${gdate}_fhr09s_mem
            ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/sfg_${gdate}_fhr03s_mem
          fi
        fi
      fi

    # The following two only apply when if_observer = Yes, i.e. run observation operator for EnKF
    # no_member     number of ensemble members
    # BK_FILE_mem   path and base for ensemble members
      no_member=20
      BK_FILE_mem=${BK_ROOT}/wrfarw.mem
    #

Option ``if_hybrid`` controls whether to run a hybrid ensemble/variational data analysis. If ``if_hybrid=Yes``, option ``if_4DEnVar=Yes`` indicates a hybrid 4D-EnVar analysis will be run, while ``if_4DEnVar=No`` indicates a hybrid 3D-EnVar analysis will be run. Option ``if_observer`` determines whether GSI is run as an observation operator for EnKF.

Option ``bk_core`` indicates the specific dynamic core used to create the background files and specifies the core in the namelist. Option ``bk_core`` can be ARW or NMMB. Option ``bkcv_option`` specifies the background error covariance to be used in the case. Two regional background error covariance matrices are provided with the release, one from the NCEP global data assimilation system (GDAS) and one from the NAM data assimilation system (NDAS). Please check Section `[sec4.8] <#sec4.8>`__ for more details about GSI background error covariance. Option ``if_clean`` tells the script if it needs to delete temporary intermediate files in the working directory after a GSI run is completed.

In most cases, users should only make minor changes after the following:

::

    #####################################################
    # Users should NOT change script after this point
    #####################################################
    #
    BYTE_ORDER=Big_Endian
    # BYTE_ORDER=Little_Endian

.. _sec3.2.3:

Description of the Sample Regional Run Script to Run GSI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Listed below is an annotated regional run script with explanations of each function block.

For further details on the first three blocks of the script that users need to change, see sections 3.2.2.1, 3.2.2.2, and 3.2.2.3:

::

    #!/bin/ksh
    #####################################################
    # machine set up (users should change this part)
    #####################################################

    set -x
    #
    # GSIPROC = processor number used for GSI analysis
    #------------------------------------------------
      GSIPROC=1
      ARCH='LINUX_LSF'

    # Supported configurations:
    #            IBM_LSF,
    #            LINUX, LINUX_LSF, LINUX_PBS,
    #            DARWIN_PGI
    #
    #####################################################
    # case set up (users should change this part)
    #####################################################
    #
    # ANAL_TIME= analysis time  (YYYYMMDDHH)
    # WORK_ROOT= working directory, where GSI runs
    # PREPBUFR = path of PrepBUFR conventional obs
    # BK_FILE  = path and name of background file
    # OBS_ROOT = path of observations files
    # FIX_ROOT = path of fix files
    # GSI_EXE  = path and name of the gsi executable
    # ENS_ROOT = path where ensemble background files exist
      ANAL_TIME=2017051318
      JOB_DIR=the_job_directory
         #normally you put run scripts here and submit jobs from here,
         #require a copy of gsi.x at this directory
      RUN_NAME=a_descriptive_run_name_such_as_case05_3denvar_etc
      OBS_ROOT=the_directory_where_observation_files_are_located
      BK_ROOT=the_directory_where_background_files_are_located
      GSI_ROOT=the_comgsi_main_directory_where_src_ush_fix_etc_are_located
      CRTM_ROOT=the_CRTM_directory
      ENS_ROOT=the_directory_where_ensemble_backgrounds_are_located
          #ENS_ROOT is not required if not running hybrid EnVAR
      HH=`echo $ANAL_TIME | cut -c9-10`
      GSI_EXE=${JOB_DIR}/gsi.x  #assume you have a copy of gsi.x here
      WORK_ROOT=${JOB_DIR}/${RUN_NAME}
      FIX_ROOT=${GSI_ROOT}/fix
      GSI_NAMELIST=${GSI_ROOT}/ush/comgsi_namelist.sh
      PREPBUFR=${OBS_ROOT}/nam.t${HH}z.prepbufr.tm00
      BK_FILE=${BK_ROOT}/wrfinput_d01.${ANAL_TIME}
    #
    #------------------------------------------------
    # bk_core= which WRF core is used as background (NMM or ARW or NMMB)
    # bkcv_option= which background error covariance and parameter will be used
    #              (GLOBAL or NAM)
    # if_clean = clean  : delete temporary files in working directory (default)
    #            no     : leave running directory as is (this is for debug only)
    # if_observer = Yes  : only used as observation operator for enkf
    # if_hybrid   = Yes  : Run GSI as 3D/4D EnVar
    # if_4DEnVar  = Yes  : Run GSI as 4D EnVar
    # if_nemsio = Yes    : The GFS background files are in NEMSIO format
    # if_oneob  = Yes    : Do single observation test
      if_hybrid=No     # Yes, or, No -- case sensitive !
      if_4DEnVar=No    # Yes, or, No -- case sensitive (set if_hybrid=Yes first)!
      if_observer=No   # Yes, or, No -- case sensitive !
      if_nemsio=No     # Yes, or, No -- case sensitive !
      if_oneob=No      # Yes, or, No -- case sensitive !
      bk_core=ARW
      bkcv_option=NAM
      if_clean=clean
    #
    # setup whether to do single obs test
      if [ ${if_oneob} = Yes ]; then
        if_oneobtest='.true.'
      else
        if_oneobtest='.false.'
      fi
    #
    # setup for GSI 3D/4D EnVar hybrid
      if [ ${if_hybrid} = Yes ] ; then
        PDYa=`echo $ANAL_TIME | cut -c1-8`
        cyca=`echo $ANAL_TIME | cut -c9-10`
        gdate=`date -u -d "$PDYa $cyca -6 hour" +%Y%m%d%H` #guess date is 6hr ago
        gHH=`echo $gdate |cut -c9-10`
        datem1=`date -u -d "$PDYa $cyca -1 hour" +%Y-%m-%d_%H:%M:%S` #1hr ago
        datep1=`date -u -d "$PDYa $cyca 1 hour"  +%Y-%m-%d_%H:%M:%S` #1hr later
        if [ ${if_nemsio} = Yes ]; then
          if_gfs_nemsio='.true.'
          ENSEMBLE_FILE_mem=${ENS_ROOT}/gdas.t${gHH}z.atmf006s.mem
        else
          if_gfs_nemsio='.false.'
          ENSEMBLE_FILE_mem=${ENS_ROOT}/sfg_${gdate}_fhr06s_mem
        fi

        if [ ${if_4DEnVar} = Yes ] ; then
          BK_FILE_P1=${BK_ROOT}/wrfout_d01_${datep1}
          BK_FILE_M1=${BK_ROOT}/wrfout_d01_${datem1}

          if [ ${if_nemsio} = Yes ]; then
            ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/gdas.t${gHH}z.atmf009s.mem
            ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/gdas.t${gHH}z.atmf003s.mem
          else
            ENSEMBLE_FILE_mem_p1=${ENS_ROOT}/sfg_${gdate}_fhr09s_mem
            ENSEMBLE_FILE_mem_m1=${ENS_ROOT}/sfg_${gdate}_fhr03s_mem
          fi
        fi
      fi

    # The following two only apply when if_observer = Yes, i.e. run observation operator for EnKF
    # no_member     number of ensemble members
    # BK_FILE_mem   path and base for ensemble members
      no_member=20
      BK_FILE_mem=${BK_ROOT}/wrfarw.mem
    #
    #

At this point, users should be able to run the GSI for simple cases without changing the scripts. However, some advanced users may need to change some of the following blocks for special applications, such as use of radiance data, cycled runs, specifying certain namelist variables, or running GSI on a platform not tested by the DTC.

::

    #####################################################
    # Users should NOT change script after this point
    #####################################################

The next block sets the run command for GSI on multiple platforms. The ``ARCH`` variable is set at the beginning of the script. Option ``BYTE_ORDER`` has been set to ``Big_Endian`` because GSI compiled with Intel and PGI can read a Big_Endian background error file, BUFR files, and CRTM coefficient files.
::

    #####################################################
    # Users should NOT make changes after this point
    #####################################################
    #
    BYTE_ORDER=Big_Endian
    # BYTE_ORDER=Little_Endian

    case $ARCH in
       'IBM_LSF')
          ###### IBM LSF (Load Sharing Facility)
          RUN_COMMAND="mpirun.lsf " ;;

       'LINUX')
          if [ $GSIPROC = 1 ]; then
             #### Linux workstation - single processor
             RUN_COMMAND=""
          else
             ###### Linux workstation -  mpi run
             RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach "
          fi ;;

       'LINUX_LSF')
          ###### LINUX LSF (Load Sharing Facility)
          RUN_COMMAND="mpirun.lsf " ;;

       'LINUX_PBS')
          #### Linux cluster PBS (Portable Batch System)
          RUN_COMMAND="mpirun -np ${GSIPROC} " ;;

       'DARWIN_PGI')
          ### Mac - mpi run
          if [ $GSIPROC = 1 ]; then
             #### Mac workstation - single processor
             RUN_COMMAND=""
          else
             ###### Mac workstation -  mpi run
             RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach "
          fi ;;

       * )
          print "error: $ARCH is not a supported platform configuration."
          exit 1 ;;
    esac

The next block checks if all the variables needed for a GSI run are properly defined. These variables should have been defined in the first three parts of this script.

::

    ##################################################################################
    # Check GSI needed environment variables are defined and exist
    #

    # Make sure ANAL_TIME is defined and in the correct format
    if [ ! "${ANAL_TIME}" ]; then
      echo "ERROR: \$ANAL_TIME is not defined!"
      exit 1
    fi

    # Make sure WORK_ROOT is defined and exists
    if [ ! "${WORK_ROOT}" ]; then
      echo "ERROR: \$WORK_ROOT is not defined!"
      exit 1
    fi

    # Make sure the background file exists
    if [ ! -r "${BK_FILE}" ]; then
      echo "ERROR: ${BK_FILE} does not exist!"
      exit 1
    fi

    # Make sure OBS_ROOT is defined and exists
    if [ ! "${OBS_ROOT}" ]; then
      echo "ERROR: \$OBS_ROOT is not defined!"
      exit 1
    fi
    if [ ! -d "${OBS_ROOT}" ]; then
      echo "ERROR: OBS_ROOT directory '${OBS_ROOT}' does not exist!"
      exit 1
    fi

    # Set the path to the GSI static files
    if [ ! "${FIX_ROOT}" ]; then
      echo "ERROR: \$FIX_ROOT is not defined!"
      exit 1
    fi
    if [ ! -d "${FIX_ROOT}" ]; then
      echo "ERROR: fix directory '${FIX_ROOT}' does not exist!"
      exit 1
    fi

    # Set the path to the CRTM coefficients
    if [ ! "${CRTM_ROOT}" ]; then
      echo "ERROR: \$CRTM_ROOT is not defined!"
      exit 1
    fi
    if [ ! -d "${CRTM_ROOT}" ]; then
      echo "ERROR: CRTM directory '${CRTM_ROOT}' does not exist!"
      exit 1
    fi

    # Make sure the GSI executable exists
    if [ ! -x "${GSI_EXE}" ]; then
      echo "ERROR: ${GSI_EXE} does not exist!"
      exit 1
    fi

    # Check to make sure the number of processors for running GSI was specified
    if [ -z "${GSIPROC}" ]; then
      echo "ERROR: The variable \$GSIPROC must be set to contain the number of processors to run GSI"
      exit 1
    fi

The next block creates a working directory (workdir) in which GSI will run. The directory should have enough disk space to hold all the files needed for this run. This directory is cleaned before each run; therefore, save all the files needed from the previous run before rerunning GSI.

::

    ##################################################################################
    # Create the work directory and cd into it
    workdir=${WORK_ROOT}
    echo " Create working directory:" ${workdir}

    if [ -d "${workdir}" ]; then
      rm -rf ${workdir}
    fi
    mkdir -p ${workdir}
    cd ${workdir}

    #
    ##################################################################################

    echo " Copy GSI executable, background file, and link observation bufr to working directory"

    # Save a copy of the GSI executable in the workdir
    cp ${GSI_EXE} gsi.exe

    # Bring over background field (it's modified by GSI so we can't link to it)
    cp ${BK_FILE} ./wrf_inout
    if [ ${if_4DEnVar} = Yes ] ; then
      cp ${BK_FILE_P1} ./wrf_inou3
      cp ${BK_FILE_M1} ./wrf_inou1
    fi

Note: You can link observation files to the working directory because GSI will not overwrite these files.
The observations that can be analyzed in GSI are listed in the column "dfile" of the GSI namelist section OBS_INPUT, as specified in ``run/comgsi_namelist.sh``. Most of the conventional observations are in one single file named prepbufr, while different radiance data are in separate files based on satellite instruments, such as AMSU-A or HIRS. All these observation files must be linked as GSI recognized file names in "dfile." Please check table `[t31] <#t31>`__ for a detailed explanation of links and the meanings of each file name listed below. :: # Link to the prepbufr data ln -s ${PREPBUFR} ./prepbufr # ln -s ${OBS_ROOT}/gdas1.t${HH}z.sptrmm.tm00.bufr_d tmirrbufr # Link to the radiance data srcobsfile[1]=${OBS_ROOT}/gdas1.t${HH}z.satwnd.tm00.bufr_d gsiobsfile[1]=satwnd srcobsfile[2]=${OBS_ROOT}/gdas1.t${HH}z.1bamua.tm00.bufr_d gsiobsfile[2]=amsuabufr srcobsfile[3]=${OBS_ROOT}/gdas1.t${HH}z.1bhrs4.tm00.bufr_d gsiobsfile[3]=hirs4bufr srcobsfile[4]=${OBS_ROOT}/gdas1.t${HH}z.1bmhs.tm00.bufr_d gsiobsfile[4]=mhsbufr srcobsfile[5]=${OBS_ROOT}/gdas1.t${HH}z.1bamub.tm00.bufr_d gsiobsfile[5]=amsubbufr srcobsfile[6]=${OBS_ROOT}/gdas1.t${HH}z.ssmisu.tm00.bufr_d gsiobsfile[6]=ssmirrbufr # srcobsfile[7]=${OBS_ROOT}/gdas1.t${HH}z.airsev.tm00.bufr_d gsiobsfile[7]=airsbufr srcobsfile[8]=${OBS_ROOT}/gdas1.t${HH}z.sevcsr.tm00.bufr_d gsiobsfile[8]=seviribufr srcobsfile[9]=${OBS_ROOT}/gdas1.t${HH}z.iasidb.tm00.bufr_d gsiobsfile[9]=iasibufr srcobsfile[10]=${OBS_ROOT}/gdas1.t${HH}z.gpsro.tm00.bufr_d gsiobsfile[10]=gpsrobufr srcobsfile[11]=${OBS_ROOT}/gdas1.t${HH}z.amsr2.tm00.bufr_d gsiobsfile[11]=amsrebufr srcobsfile[12]=${OBS_ROOT}/gdas1.t${HH}z.atms.tm00.bufr_d gsiobsfile[12]=atmsbufr srcobsfile[13]=${OBS_ROOT}/gdas1.t${HH}z.geoimr.tm00.bufr_d gsiobsfile[13]=gimgrbufr srcobsfile[14]=${OBS_ROOT}/gdas1.t${HH}z.gome.tm00.bufr_d gsiobsfile[14]=gomebufr srcobsfile[15]=${OBS_ROOT}/gdas1.t${HH}z.omi.tm00.bufr_d gsiobsfile[15]=omibufr 
srcobsfile[16]=${OBS_ROOT}/gdas1.t${HH}z.osbuv8.tm00.bufr_d gsiobsfile[16]=sbuvbufr srcobsfile[17]=${OBS_ROOT}/gdas1.t${HH}z.eshrs3.tm00.bufr_d gsiobsfile[17]=hirs3bufrears srcobsfile[18]=${OBS_ROOT}/gdas1.t${HH}z.esamua.tm00.bufr_d gsiobsfile[18]=amsuabufrears srcobsfile[19]=${OBS_ROOT}/gdas1.t${HH}z.esmhs.tm00.bufr_d gsiobsfile[19]=mhsbufrears srcobsfile[20]=${OBS_ROOT}/rap.t${HH}z.nexrad.tm00.bufr_d gsiobsfile[20]=l2rwbufr srcobsfile[21]=${OBS_ROOT}/rap.t${HH}z.lgycld.tm00.bufr_d gsiobsfile[21]=larcglb ii=1 while [[ $ii -le 21 ]]; do if [ -r "${srcobsfile[$ii]}" ]; then ln -s ${srcobsfile[$ii]} ${gsiobsfile[$ii]} echo "link source obs file ${srcobsfile[$ii]}" fi (( ii = $ii + 1 )) done The following block copies constant fixed files from the fix/ directory and links CRTM coefficients. Please check Section 3.1 for the meanings of each fixed file. :: ################################################################################## echo " Copy fixed files and link CRTM coefficient files to working directory" # Set fixed files # berror = forecast model background error statistics # specoef = CRTM spectral coefficients # trncoef = CRTM transmittance coefficients # emiscoef = CRTM coefficients for IR sea surface emissivity model # aerocoef = CRTM coefficients for aerosol effects # cldcoef = CRTM coefficients for cloud effects # satinfo = text file with information about assimilation of brightness temperatures # satangl = angle dependent bias correction file (fixed in time) # pcpinfo = text file with information about assimilation of prepcipitation rates # ozinfo = text file with information about assimilation of ozone data # errtable = text file with obs error for conventional data (regional only) # convinfo = text file with information about assimilation of conventional data # bufrtable= text file ONLY needed for single obs test (oneobstest=.true.) # bftab_sst= bufr table for sst ONLY needed for sst retrieval (retrieval=.true.) 
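The observation-linking loop above creates a link only when the source file is readable, so observation types that are missing from ``OBS_ROOT`` are skipped without stopping the run. The following is a minimal, self-contained sketch of that pattern, not part of the released run script. The temporary directories and the ``sample.*`` file names are hypothetical stand-ins for ``OBS_ROOT``, the working directory, and the real BUFR files, and the ``linked`` counter and "skip" message are additions for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the "link only if readable" pattern used by the run script.
# All directory and file names below are hypothetical stand-ins.

obs_root=$(mktemp -d)    # stands in for ${OBS_ROOT}
workdir=$(mktemp -d)     # stands in for the GSI working directory
touch "${obs_root}/sample.1bamua.bufr_d"        # only this source file exists

srcobsfile[1]=${obs_root}/sample.1bamua.bufr_d
gsiobsfile[1]=amsuabufr
srcobsfile[2]=${obs_root}/sample.1bmhs.bufr_d   # deliberately missing
gsiobsfile[2]=mhsbufr

cd "${workdir}" || exit 1
linked=0
ii=1
while [[ $ii -le 2 ]]; do
  if [ -r "${srcobsfile[$ii]}" ]; then
    # Link under the name GSI expects (the "dfile" entry in OBS_INPUT)
    ln -s "${srcobsfile[$ii]}" "${gsiobsfile[$ii]}"
    echo "link source obs file ${srcobsfile[$ii]}"
    (( linked += 1 ))
  else
    # Added for illustration: report what was skipped
    echo "skip ${gsiobsfile[$ii]}: source file not found"
  fi
  (( ii += 1 ))
done
echo "linked ${linked} of 2 observation files"
```

Reporting the skipped files, as in this sketch, can make it easier to notice when an expected observation type is absent before GSI starts.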
Note: For background error covariances, observation errors, and analysis variable information, we provide two sets of fixed files. One set is based on GFS statistics and another is based on NAM statistics. For this release there is an additional setting in the ANAVINFO file for "bk_core" for both GFS and NAM statistics. :: if [ ${bkcv_option} = GLOBAL ] ; then echo ' Use global background error covariance' BERROR=${FIX_ROOT}/${BYTE_ORDER}/nam_glb_berror.f77.gcv OBERROR=${FIX_ROOT}/prepobs_errtable.global if [ ${bk_core} = NMM ] ; then ANAVINFO=${FIX_ROOT}/anavinfo_ndas_netcdf_glbe fi if [ ${bk_core} = ARW ] ; then ANAVINFO=${FIX_ROOT}/anavinfo_arw_netcdf_glbe fi if [ ${bk_core} = NMMB ] ; then ANAVINFO=${FIX_ROOT}/anavinfo_nems_nmmb_glb fi else echo ' Use NAM background error covariance' BERROR=${FIX_ROOT}/${BYTE_ORDER}/nam_nmmstat_na.gcv OBERROR=${FIX_ROOT}/nam_errtable.r3dv if [ ${bk_core} = NMM ] ; then ANAVINFO=${FIX_ROOT}/anavinfo_ndas_netcdf fi if [ ${bk_core} = ARW ] ; then ANAVINFO=${FIX_ROOT}/anavinfo_arw_netcdf fi if [ ${bk_core} = NMMB ] ; then ANAVINFO=${FIX_ROOT}/anavinfo_nems_nmmb fi fi SATINFO=${FIX_ROOT}/global_satinfo.txt CONVINFO=${FIX_ROOT}/global_convinfo.txt OZINFO=${FIX_ROOT}/global_ozinfo.txt PCPINFO=${FIX_ROOT}/global_pcpinfo.txt # copy Fixed fields to working directory cp $ANAVINFO anavinfo cp $BERROR berror_stats cp $SATINFO satinfo cp $CONVINFO convinfo cp $OZINFO ozinfo cp $PCPINFO pcpinfo cp $OBERROR errtable # # # CRTM Spectral and Transmittance coefficients CRTM_ROOT_ORDER=${CRTM_ROOT}/${BYTE_ORDER} emiscoef_IRwater=${CRTM_ROOT_ORDER}/Nalli.IRwater.EmisCoeff.bin emiscoef_IRice=${CRTM_ROOT_ORDER}/NPOESS.IRice.EmisCoeff.bin emiscoef_IRland=${CRTM_ROOT_ORDER}/NPOESS.IRland.EmisCoeff.bin emiscoef_IRsnow=${CRTM_ROOT_ORDER}/NPOESS.IRsnow.EmisCoeff.bin emiscoef_VISice=${CRTM_ROOT_ORDER}/NPOESS.VISice.EmisCoeff.bin emiscoef_VISland=${CRTM_ROOT_ORDER}/NPOESS.VISland.EmisCoeff.bin emiscoef_VISsnow=${CRTM_ROOT_ORDER}/NPOESS.VISsnow.EmisCoeff.bin 
emiscoef_VISwater=${CRTM_ROOT_ORDER}/NPOESS.VISwater.EmisCoeff.bin emiscoef_MWwater=${CRTM_ROOT_ORDER}/FASTEM6.MWwater.EmisCoeff.bin aercoef=${CRTM_ROOT_ORDER}/AerosolCoeff.bin cldcoef=${CRTM_ROOT_ORDER}/CloudCoeff.bin ln -s $emiscoef_IRwater ./Nalli.IRwater.EmisCoeff.bin ln -s $emiscoef_IRice ./NPOESS.IRice.EmisCoeff.bin ln -s $emiscoef_IRsnow ./NPOESS.IRsnow.EmisCoeff.bin ln -s $emiscoef_IRland ./NPOESS.IRland.EmisCoeff.bin ln -s $emiscoef_VISice ./NPOESS.VISice.EmisCoeff.bin ln -s $emiscoef_VISland ./NPOESS.VISland.EmisCoeff.bin ln -s $emiscoef_VISsnow ./NPOESS.VISsnow.EmisCoeff.bin ln -s $emiscoef_VISwater ./NPOESS.VISwater.EmisCoeff.bin ln -s $emiscoef_MWwater ./FASTEM6.MWwater.EmisCoeff.bin ln -s $aercoef ./AerosolCoeff.bin ln -s $cldcoef ./CloudCoeff.bin # Copy CRTM coefficient files based on entries in satinfo file for file in `awk '{if($1!~"!"){print $1}}' ./satinfo | sort | uniq` ;do ln -s ${CRTM_ROOT_ORDER}/${file}.SpcCoeff.bin ./ ln -s ${CRTM_ROOT_ORDER}/${file}.TauCoeff.bin ./ done # Only need this file for single obs test bufrtable=${FIX_ROOT}/prepobs_prep.bufrtable cp $bufrtable ./prepobs_prep.bufrtable # for satellite bias correction # Users may need to use their own satbias files for correct bias correction cp ${GSI_ROOT}/fix/comgsi_satbias_in ./satbias_in cp ${GSI_ROOT}/fix/comgsi_satbias_pc_in ./satbias_pc_in Please note that in the above sample script, two files related to radiance bias correction are copied to the work directory: :: cp ${GSI_ROOT}/fix/comgsi_satbias_in ./satbias_in cp ${GSI_ROOT}/fix/comgsi_satbias_pc_in ./satbias_pc_in There are two options on how to perform the radiance bias correction. The first method is to do the angle dependent bias correction offline and do the mass bias correction inside the GSI analysis, therefore requiring two input files: ``satbias_angle``, corresponding to the angle dependent bias correction file and ``satbias_in``, being the input file for mass bias correction. 
The second method is to combine the angle dependent and mass bias correction together and do it within the GSI analysis, requiring one combined input file: ``satbias_in``. Note that the input bias correction coefficients file, ``satbias_in``, is different for the two options, therefore it is important to use the appropriate input file for each method. The sample input files for the first method are provided with this release: ``global_satangbias.txt`` and ``sample.satbias``. To use the second option - combined angle dependent and mass bias correction, a sample file, ``gdas1.t00z.abias_pc.20150617``, is also provided. As a starting point, users may also download a GDAS satbias coefficient file from the NOMADS ftp site as the input file (starting in spring 2015, the GDAS ``satbias`` files have adopted the following format): ftp://nomads.ncdc.noaa.gov/GDAS/YYYYMM/YYYYMMDD/gdas1.tHHz.abias In order to use the combined angle dependent and mass bias correction, users also need to set ``adp_anglebc=.true.`` in the ``&SETUP`` section of the GSI namelist (``comgsi_namelist.sh``). For more details about the namelist, please see Appendix C in this document. Set up some constants used in the GSI namelist. Please note that ``bkcv_option`` is set for background error tuning. They should be set based on specific applications. Here we provide three sample sets of the constants for different background error covariance options, one set is used in the NAM operations, one for the GFS operations and one for the NMMB operations. In this release, the capability of NMMB application is included and therefore the namelist settings for NMMB are provided in addition to NMM and ARW applications. 
::

   ##################################################################################
   # Set some parameters for use by the GSI executable and to build the namelist
   echo " Build the namelist "

   # default is NAM
   #   as_op='1.0,1.0,0.5 ,0.7,0.7,0.5,1.0,1.0,'
   vs_op='1.0,'
   hzscl_op='0.373,0.746,1.50,'
   if [ ${bkcv_option} = GLOBAL ] ; then
   #   as_op='0.6,0.6,0.75,0.75,0.75,0.75,1.0,1.0'
      vs_op='0.7,'
      hzscl_op='1.7,0.8,0.5,'
   fi
   if [ ${bk_core} = NMMB ] ; then
      vs_op='0.6,'
   fi

   # default is NMM
   bk_core_arw='.false.'
   bk_core_nmm='.true.'
   bk_core_nmmb='.false.'
   bk_if_netcdf='.true.'
   if [ ${bk_core} = ARW ] ; then
      bk_core_arw='.true.'
      bk_core_nmm='.false.'
      bk_core_nmmb='.false.'
      bk_if_netcdf='.true.'
   fi
   if [ ${bk_core} = NMMB ] ; then
      bk_core_arw='.false.'
      bk_core_nmm='.false.'
      bk_core_nmmb='.true.'
      bk_if_netcdf='.false.'
   fi

The following section specifies the number of outer loops and whether to save the observations read by GSI, based on the setting of ``if_observer``.

::

   if [ ${if_observer} = Yes ] ; then
      nummiter=0
      if_read_obs_save='.true.'
      if_read_obs_skip='.false.'
   else
      nummiter=2
      if_read_obs_save='.false.'
      if_read_obs_skip='.false.'
   fi

The following section of the script generates the GSI namelist, called ``gsiparm.anl``, in the working directory. A detailed explanation of each variable can be found in Section 3.4 and Appendix C.

::

   # Build the GSI namelist on-the-fly
   . $GSI_NAMELIST

The following block modifies the ``anavinfo`` file so that its vertical levels are consistent with the ``wrf_inout`` file for WRF ARW or NMM. Users no longer need to manually modify the ``anavinfo`` file.
:: # modify the anavinfo vertical levels based on wrf_inout for WRF ARW and NMM if [ ${bk_core} = ARW ] || [ ${bk_core} = NMM ] ; then bklevels=`ncdump -h wrf_inout | grep "bottom_top =" | awk '{print $3}' ` bklevels_stag=`ncdump -h wrf_inout | grep "bottom_top_stag =" | awk '{print $3}' ` anavlevels=`cat anavinfo | grep ' sf ' | tail -1 | awk '{print $2}' ` # levels of sf, vp, u, v, t, etc anavlevels_stag=`cat anavinfo | grep ' prse ' | tail -1 | awk '{print $2}' ` # levels of prse sed -i 's/ '$anavlevels'/ '$bklevels'/g' anavinfo sed -i 's/ '$anavlevels_stag'/ '$bklevels_stag'/g' anavinfo fi The following block runs GSI and checks if GSI has successfully completed. :: ################################################### # run GSI ################################################### echo ' Run GSI with' ${bk_core} 'background' case $ARCH in 'IBM_LSF') ${RUN_COMMAND} ./gsi.exe < gsiparm.anl > stdout 2>&1 ;; * ) ${RUN_COMMAND} ./gsi.exe > stdout 2>&1 ;; esac ################################################################## # run time error check ################################################################## error=$? if [ ${error} -ne 0 ]; then echo "ERROR: ${GSI} crashed Exit status=${error}" exit ${error} fi The following block saves the analysis results with an understandable name and adds the analysis time to some output file names. Among them, "stdout" contains runtime output of GSI and ``wrf_inout`` is the resulting analysis file. :: ################################################################## # # GSI updating satbias_in # # GSI updating satbias_in (only for cycling assimilation) # Copy the output to more understandable names ln -s stdout stdout.anl.${ANAL_TIME} ln -s wrf_inout wrfanl.${ANAL_TIME} ln -s fort.201 fit_p1.${ANAL_TIME} ln -s fort.202 fit_w1.${ANAL_TIME} ln -s fort.203 fit_t1.${ANAL_TIME} ln -s fort.204 fit_q1.${ANAL_TIME} ln -s fort.207 fit_rad1.${ANAL_TIME} The following block collects the diagnostic files. 
The diagnostic files are merged and categorized based on outer loop and data type. Setting "write_diag" to true in the namelist directs GSI to write out diagnostic information for each observation. This information is very useful to check analysis details. Please check Appendix A.2 for the tool to read and analyze these diagnostic files. :: # Loop over first and last outer loops to generate innovation # diagnostic files for indicated observation types (groups) # # NOTE: Since we set miter=2 in GSI namelist SETUP, outer # loop 03 will contain innovations with respect to # the analysis. Creation of o-a innovation files # is triggered by write_diag(3)=.true. The setting # write_diag(1)=.true. turns on creation of o-g # innovation files. # loops="01 03" for loop in $loops; do case $loop in 01) string=ges;; 03) string=anl;; *) string=$loop;; esac # Collect diagnostic files for obs types (groups) below # listall="conv amsua_metop-a mhs_metop-a hirs4_metop-a hirs2_n14 msu_n14 \ # sndr_g08 sndr_g10 sndr_g12 sndr_g08_prep sndr_g10_prep sndr_g12_prep \ # sndrd1_g08 sndrd2_g08 sndrd3_g08 sndrd4_g08 sndrd1_g10 sndrd2_g10 \ # sndrd3_g10 sndrd4_g10 sndrd1_g12 sndrd2_g12 sndrd3_g12 sndrd4_g12 \ # hirs3_n15 hirs3_n16 hirs3_n17 amsua_n15 amsua_n16 amsua_n17 \ # amsub_n15 amsub_n16 amsub_n17 hsb_aqua airs_aqua amsua_aqua \ # goes_img_g08 goes_img_g10 goes_img_g11 goes_img_g12 \ # pcp_ssmi_dmsp pcp_tmi_trmm sbuv2_n16 sbuv2_n17 sbuv2_n18 \ # omi_aura ssmi_f13 ssmi_f14 ssmi_f15 hirs4_n18 amsua_n18 mhs_n18 \ # amsre_low_aqua amsre_mid_aqua amsre_hig_aqua ssmis_las_f16 \ # ssmis_uas_f16 ssmis_img_f16 ssmis_env_f16 mhs_metop_b \ # hirs4_metop_b hirs4_n19 amusa_n19 mhs_n19" listall=`ls pe* | cut -f2 -d"." 
| awk '{print substr($0, 0, length($0)-3)}' | sort | uniq` for type in $listall; do count=`ls pe*${type}_${loop}* | wc -l` if [[ $count -gt 0 ]]; then cat pe*${type}_${loop}* > diag_${type}_${string}.${ANAL_TIME} fi done done The following scripts clean the temporary intermediate files: :: # Clean working directory to save only important files ls -l * > list_run_directory if [[ ${if_clean} = clean && ${if_observer} != Yes ]]; then echo ' Clean working directory after GSI run' rm -f *Coeff.bin # all CRTM coefficient files rm -f pe0* # diag files on each processor rm -f obs_input.* # observation middle files rm -f siganl sigf03 # background middle files rm -f fsize_* # delete temperal file for bufr size fi The following block of the script runs only for ``if_observer=Yes``, which runs GSI as an observation operator for EnKF and without doing minimization. The script first renames the previous diagnostics files and GSI analysis file by appending ``.ensmean`` to the filenames to avoid these files being overwritten by the new GSI run. :: ################################################# # start to calculate diag files for each member ################################################# # if [ ${if_observer} = Yes ] ; then string=ges for type in $listall; do count=0 if [[ -f diag_${type}_${string}.${ANAL_TIME} ]]; then mv diag_${type}_${string}.${ANAL_TIME} diag_${type}_${string}.ensmean fi done mv wrf_inout wrf_inout_ensmean Next, the script generates the namelist for each ensemble member. :: # Build the GSI namelist on-the-fly for each member nummiter=0 if_read_obs_save='.false.' if_read_obs_skip='.true.' . 
$GSI_NAMELIST The rest of the script loops through the ensemble members to get the background ready, run GSI, and check the run status: :: # Loop through each member loop="01" ensmem=1 while [[ $ensmem -le $no_member ]];do rm pe0* print "\$ensmem is $ensmem" ensmemid=`printf %3.3i $ensmem` # get new background for each member if [[ -f wrf_inout ]]; then rm wrf_inout fi BK_FILE=${BK_FILE_mem}${ensmemid} echo $BK_FILE ln -s $BK_FILE wrf_inout # run GSI echo ' Run GSI with' ${bk_core} 'for member ', ${ensmemid} case $ARCH in 'IBM_LSF') ${RUN_COMMAND} ./gsi.exe < gsiparm.anl > stdout_mem${ensmemid} 2>&1 ;; * ) ${RUN_COMMAND} ./gsi.exe > stdout_mem${ensmemid} 2>&1 ;; esac # run time error check and save run time file status error=$? if [ ${error} -ne 0 ]; then echo "ERROR: ${GSI} crashed for member ${ensmemid} Exit status=${error}" exit ${error} fi ls -l * > list_run_directory_mem${ensmemid} The following lines generate the diagnostics files for each member. :: # generate diag files for type in $listall; do count=`ls pe*${type}_${loop}* | wc -l` if [[ $count -gt 0 ]]; then cat pe*${type}_${loop}* > diag_${type}_${string}.mem${ensmemid} fi done The following section is to move on to the next ensemble member and run GSI. :: # next member (( ensmem += 1 )) done fi If this point is reached, the GSI successfully finishes and exits with status "0": :: exit 0 .. _sec3.3: GSI Analysis Result Files in Run Directory ------------------------------------------ Once the GSI run script is set up, it is ready to be submitted like any other batch job. When completed, GSI will create a number of files in the run directory. Below is an example of the files generated in the run directory from one of the GSI test case runs. This case was run to perform a regional GSI analysis with a WRF-ARW NetCDF background using conventional (prepbufr), radiance (AMSU-A, HIRS4, and MHS), and GPSRO data. The analysis time is 1200Z on 13 May 2017. Four processors were used. 
To make the run directory more readable, we turned on the clean option in the run script, which deleted all temporary intermediate files. :: amsuabufr fort.206 hirs3bufrears amsuabufrears fort.207 hirs4bufr anavinfo fort.208 l2rwbufr atmsbufr fort.209 larcglb berror_stats fort.210 list_run_directory convinfo fort.211 mhsbufr diag_amsua_n15_anl.2017051312 fort.212 mhsbufrears diag_amsua_n15_ges.2017051312 fort.213 omibufr diag_amsua_n18_anl.2017051312 fort.214 ozinfo diag_amsua_n18_ges.2017051312 fort.215 pcpbias_out diag_amsua_n19_anl.2017051312 fort.217 pcpinfo diag_amsua_n19_ges.2017051312 fort.218 prepbufr diag_conv_anl.2017051312 fort.219 prepobs_prep.bufrtable diag_conv_ges.2017051312 fort.220 radar_supobs_from_level2 diag_hirs4_n19_anl.2017051312 fort.221 satbias_angle diag_hirs4_n19_ges.2017051312 fort.223 satbias_ang.out diag_mhs_n18_anl.2017051312 fort.224 satbias_in diag_mhs_n18_ges.2017051312 fort.225 satbias_out diag_mhs_n19_anl.2017051312 fort.226 satbias_out.int diag_mhs_n19_ges.2017051312 fort.227 satbias_pc_in errtable fort.228 satbias_pc.out fit_p1.2017051312 fort.229 satinfo fit_q1.2017051312 fort.230 satwnd fit_rad1.2017051312 fort.232 sbuvbufr fit_t1.2017051312 fort.233 seviribufr fit_w1.2017051312 fort.234 ssmirrbufr fort.201 gimgrbufr stdout fort.202 gomebufr stdout.anl.2017051312 fort.203 gpsrobufr wrfanl.2017051312 fort.204 gsi.exe wrf_inout fort.205 gsiparm.anl It is important to know which files hold the GSI analysis results, standard output, and diagnostic information. We will introduce these files and their contents in detail in the following chapter. The following is a brief list of what these files contain: - *stdout* or *stdout.anl.(time)*: standard text output file. *stdout.anl.(time)* is a link to *stdout* with the analysis time appended. This is the most commonly used file to check the GSI analysis processes and contains basic and important information about the analyses. 
We will explain the contents of the *stdout* file in Section 4.1, and users are encouraged to read this file in detail to become familiar with the order of GSI analysis processing.

- *wrf_inout* or *wrfanl.(time)*: analysis results if GSI completes successfully. It exists only if using WRF for the background. The *wrfanl.(time)* file is a link to *wrf_inout* with the analysis time appended. The format is the same as the background file.
- *diag_conv_anl.(time)*: binary diagnostic files for conventional and GPS RO observations at the final analysis step (analysis departure for each observation).
- *diag_conv_ges.(time)*: binary diagnostic files for conventional and GPS RO observations before the initial analysis step (background departure for each observation).
- *diag_(instrument_satellite)_anl*: diagnostic files for satellite radiance observations at the final analysis step.
- *diag_(instrument_satellite)_ges*: diagnostic files for satellite radiance observations before the initial analysis step.
- *gsiparm.anl*: GSI namelist, generated by the run script.
- *fit_(variable).(time)*: links to fort.2?? with meaningful names (variable name plus analysis time). They hold statistics of observation departures from the background and from the analysis, grouped by observation variable. Please see Section 4.5 for more details.
- *fort.220*: output from the inner loop minimization (in *pcgsoi.f90*). Please see Section 4.6 for details.
- *anavinfo*: info file to set up control, state, and background variables. Please see the Advanced GSI Users Guide for details.
- *\*info* (*convinfo*,\ *satinfo*, …): info files that control data usage. Please see Section `[sec4.3] <#sec4.3>`__ for details.
- *berror_stats* and *errtable*: background error file (binary) and observation error file (text).
- *\*bufr*: observation BUFR files linked to the run directory. Please see Section 3.1 for details.
- *satbias_in*: the input coefficients of bias correction for satellite radiance observations.
- *satbias_out*: the output coefficients of bias correction for satellite radiance observations after the GSI run.
- *satbias_pc_in*: the input coefficients of bias correction for passive satellite radiance observations.
- *list_run_directory*: the complete list of files in the run directory before cleaning takes place. This is generated by the GSI run script.

The ``diag`` files, such as ``diag_(instrument_satellite)_anl.(time)`` and ``diag_conv_anl.(time)``, contain important information about the data used in the GSI, including the observation departure from the analysis for each observation (O-A). Similarly, ``diag_conv_ges.(time)`` and ``diag_(instrument_satellite)_ges.(time)`` include the observation innovation for each observation (O-B). These files can be very helpful in understanding the detailed impact of data on the analysis. A tool to process these files is introduced in Appendix A.2.

Many intermediate files are present in this directory while GSI is running or if the run crashes. The complete list of files in the directory (prior to cleaning) is saved in the file ``list_run_directory``. Some knowledge of the content of these files is very helpful for debugging if the GSI run crashes. Please check table `[t37] <#t37>`__ for the meaning of these files. (Note: you may not see all the files in the list because different observational data are used. Also, the fixed files prepared for a GSI run, such as CRTM coefficient files, are not included.)

.. table:: List of GSI intermediate files

   ================================================== ================================================================================
   File name                                          Content
   ================================================== ================================================================================
   sigf03                                             This is a temporary file, holding binary format background files (typically sigf03, sigf06 and sigf09 if FGAT is used). When you see this file, at the minimum, a background file was successfully read in.
   siganl                                             Analysis results in binary format. When this file exists, the analysis has finished.
   pe????.(conv or instrument_satellite)_(outer loop) Diagnostic files for conventional and satellite radiance observations at each outer loop and each sub-domain (????=subdomain id).
   obs_input.????                                     Observation scratch files (each file contains observations for one observation type within the whole analysis domain and time window. ????=observation type id in namelist).
   pcpbias_out                                        Output precipitation bias correction file.
   ================================================== ================================================================================

[t37]

Introduction to Frequently Used GSI Namelist Options
----------------------------------------------------

The complete namelist options and their explanations are listed in Appendix A of the Advanced GSI Users Guide. For most GSI analysis applications, only a few namelist variables need to be changed.
Here we introduce frequently used variables for regional analyses:

Set Up the Number of Outer and Inner Loops
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To change the number of outer loops and the number of inner iterations in each outer loop, the following three variables in the namelist need to be modified:

- ``miter``: number of outer analysis loops.
- ``niter(1)``: maximum number of inner loop iterations for the 1\ :sup:`st` outer loop. The inner loop will stop when it reaches this maximum number, when it reaches the convergence threshold, or when it fails to converge.
- ``niter(2)``: maximum number of inner loop iterations for the 2\ :sup:`nd` outer loop.
- If ``miter`` is larger than two, repeat ``niter`` with a larger index.

Set Up the Analysis Variable for Moisture
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two options for the moisture analysis variable, selected by the namelist variable ``qoption`` (1 or 2):

- If ``qoption=1``, the moisture analysis variable is pseudo-relative humidity. The saturation specific humidity, qsatg, is computed from the guess and held constant during the inner loop. Thus, the relative humidity control variable can only change via changes in specific humidity, q.
- If ``qoption=2``, the moisture analysis variable is normalized relative humidity. This formulation allows relative humidity to change in the inner loop via changes to surface pressure, temperature, or specific humidity.

Set Up the Background File
~~~~~~~~~~~~~~~~~~~~~~~~~~

The following five variables define which background field will be used in the GSI analyses:

- ``regional``: if true, perform a regional GSI run using either ARW or NMM inputs as the background. If false, perform a global GSI analysis. If either ``wrf_nmm_regional`` or ``wrf_mass_regional`` is true, it will be set to true.
- ``wrf_nmm_regional``: if true, the background comes from WRF-NMM. When using other background fields, set it to false.
- ``wrf_mass_regional``: if true, the background comes from WRF-ARW. When using other background fields, set it to false. - ``nems_nmmb_regional``: if true, the background comes from NMMB. When using other background fields, set it to false. - ``netcdf``: if true, WRF files are in NetCDF format, otherwise WRF files are in binary format. This option only works for a regional GSI analysis. Set Up the Output of Diagnostic Files ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following variables tell the GSI to write out diagnostic results in certain loops: - ``write_diag(1)``: if true, write out diagnostic data in the beginning of the analysis, so that we can have information on observation :math:`-` background (O-B) differences. - ``write_diag(2)``: if true, write out diagnostic data at the end of the 1\ :sup:`st` outer loop (before the 2\ :sup:`nd` outer loop starts). - ``write_diag(3)``: if true, write out diagnostic data at the end of the 2\ :sup:`nd` outer loop (after the analysis finishes if the outer loop number is two), so that we can have information on observation :math:`-` analysis (O-A) differences. Please check appendix A.2 for the tools to read the diagnostic files. Set Up the GSI Recognized Observation Files ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following sets up the GSI recognized observation files for GSI observation ingest: :: OBS_INPUT:: ! dfile dtype dplat dsis dval dthin dsfcalc prepbufr ps null ps 1.0 0 0 prepbufr t null t 1.0 0 0 prepbufr q null q 1.0 0 0 prepbufr pw null pw 1.0 0 0 satwndbufr uv null uv 1.0 0 0 prepbufr uv null uv 1.0 0 0 prepbufr spd null spd 1.0 0 0 prepbufr dw null dw 1.0 0 0 radarbufr rw null rw 1.0 0 0 prepbufr sst null sst 1.0 0 0 gpsrobufr gps_ref null gps 1.0 0 0 ssmirrbufr pcp_ssmi dmsp pcp_ssmi 1.0 -1 0 - ``dfile``: GSI recognized observation file name. The observation file contains observations used for a GSI analysis. This file can include several observation variables from different observation types. 
The file name listed by this parameter will be read in by GSI. This name can be changed as long as the name in the link from the BUFR/PrepBUFR file in the run scripts also changes correspondingly. - ``dtype``: analysis variable name that GSI can read in. Please note this name should be consistent with that used in the GSI code. - ``dplat``: sets up the observation platform for a certain observation, which will be read in from the file ``dfile``. - ``dsis``: sets up the data name (including both data type and platform name) used inside GSI. Please see Section 4.3 for examples and explanations of these variables. Set Up Observation Time Window ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In the namelist section ``OBS_INPUT``, use ``time_window_max`` to set the maximum half time window (hours) for all data types. In the ``convinfo`` file, you can use the column "twindow" to set the half time window for a certain data type (hours). For conventional observations, only observations within the smaller window of these two will be kept for further processing. For others, observations within ``time_window_max`` will be kept for further processing. Set Up Data Thinning ~~~~~~~~~~~~~~~~~~~~ 1) Radiance data thinning Radiance data thinning is controlled through two GSI namelist variables in the section ``&OBS_INPUT``. Below is an example: :: &OBS_INPUT dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=30,time_window_max=1.5,ext_sonde=.true., / OBS_INPUT:: ! dfile dtype dplat dsis dval dthin dsfcalc prepbufr ps null ps 1.0 0 0 gpsrobufr gps_ref null gps 1.0 0 0 ssmirrbufr pcp_ssmi dmsp pcp_ssmi 1.0 -1 0 tmirrbufr pcp_tmi trmm pcp_tmi 1.0 -1 0 hirs3bufr hirs3 n17 hirs3_n17 6.0 1 0 hirs4bufr hirs4 metop-a hirs4_metop-a 6.0 2 0 The two namelist variables that control the radiance data thinning are real array "dmesh" in the 1\ :sup:`st` line and the "dthin" values in the 6\ :sup:`th` column. 
The ``dmesh`` array sets the mesh sizes of the radiance thinning grids in kilometers, while ``dthin`` defines whether the data type it represents needs to be thinned and which thinning grid (mesh size) to use. If the value of ``dthin`` is:

- an integer less than or equal to zero, no thinning is needed
- an integer larger than zero, this kind of radiance data will be thinned using the mesh size defined as ``dmesh(dthin)``.

The following list provides several thinning examples defined by the above sample ``&OBS_INPUT`` section:

- Data type ``ps`` from prepbufr: no thinning because ``dthin=0``
- Data type ``gps_ref`` from gpsrobufr: no thinning because ``dthin=0``
- Data type ``pcp_ssmi`` from dmsp: no thinning because ``dthin=-1``
- Data type ``hirs3`` from NOAA-17: thinning in a 120 km grid because ``dthin=1`` and ``dmesh(1)=120``
- Data type ``hirs4`` from metop-a: thinning in a 60 km grid because ``dthin=2`` and ``dmesh(2)=60``

2) Conventional data thinning

Conventional data can also be thinned. However, the thinning setup is not in the namelist. To give users a complete picture of data thinning, conventional data thinning is briefly introduced here.

There are three columns, ``ithin``, ``rmesh``, and ``pmesh``, in the ``convinfo`` file (more details on this file are in Section 4.3) to configure conventional data thinning:

- ``ithin``: 0 = no thinning; 1 = thinning with the grid mesh decided by ``rmesh`` and ``pmesh``
- ``rmesh``: horizontal thinning grid size in km
- ``pmesh``: vertical thinning grid size in mb; if 0, then use the background vertical grid.

Set Up Background Error Factor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the namelist section ``BKGERR``, ``vs`` sets the scale factor for the vertical correlation length and ``hzscl`` sets the scale factors for horizontal smoothing. The scale factors for the variance of each analysis variable are set in the ``anavinfo`` file.
The typical values used in operations for regional and global background error covariances are provided in the run scripts and sample ``anavinfo`` files, and are selected according to the chosen background error covariance.

Single Observation Test
~~~~~~~~~~~~~~~~~~~~~~~

To do a single observation test, the following namelist option has to be set to true:

::

   oneobtest=.true.

Then go to the namelist section ``SINGLEOB_TEST`` to set up the single observation location and the variable to be tested. Please see Section 4.2 for an example and details on the single observation test.
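
As a recap of the radiance thinning rules described in the data thinning section above, the following is a minimal Python sketch of how a ``dthin`` value selects a mesh size from ``dmesh``. It is an illustration only and not part of GSI: the function name ``thinning_mesh_km`` and the dictionary layout are assumptions made for this example, and the values are copied from the sample ``&OBS_INPUT`` section.

```python
# Illustrative sketch only; GSI itself implements this logic in Fortran.
# thinning_mesh_km is a hypothetical helper, not a GSI routine.

def thinning_mesh_km(dthin, dmesh):
    """Return the thinning mesh size in km, or None when no thinning applies.

    dthin: integer from the dthin column of the OBS_INPUT table.
    dmesh: dict mapping the 1-based dmesh index to a mesh size in km.
    """
    if dthin <= 0:
        # Zero or negative dthin means the data type is not thinned.
        return None
    # A positive dthin selects the mesh size dmesh(dthin).
    return dmesh[dthin]


# Values from the sample namelist: dmesh(1)=120.0, dmesh(2)=60.0, dmesh(3)=30
dmesh = {1: 120.0, 2: 60.0, 3: 30.0}

# (dsis, dthin) pairs taken from the sample OBS_INPUT table.
sample_rows = [
    ("ps", 0),
    ("gps", 0),
    ("pcp_ssmi", -1),
    ("pcp_tmi", -1),
    ("hirs3_n17", 1),
    ("hirs4_metop-a", 2),
]

# Resolve the mesh size for each data type in the sample table.
resolved = {dsis: thinning_mesh_km(dthin, dmesh) for dsis, dthin in sample_rows}

for dsis, mesh in resolved.items():
    if mesh is None:
        print(f"{dsis}: no thinning")
    else:
        print(f"{dsis}: thinned on a {mesh} km grid")
```

With the sample values, ``hirs3_n17`` resolves to the 120 km grid and ``hirs4_metop-a`` to the 60 km grid, while the remaining entries are not thinned, matching the examples listed in the data thinning section.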