Tutorial: PREPARE RUN SCRIPT FOR BASIC CASES

Submitted by dashlineiap on Wed, 09/22/2021 - 20:13

Hello:

I followed the online tutorial (GETTING STARTED > PREPARE BASE RUN SCRIPT), edited run_gsi_regional.ksh_basic, and ran it. I got the stdout file, but its contents look wrong. Comparing it against the User Guide, I found that my stdout is missing the following information:

* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
PROGRAM GSI_ANL HAS BEGUN. COMPILED 1999232.55 ORG: NP23
STARTING DATE-TIME NOV 06,2018 11:13:35.994 310 TUE 2458429

.....

ENDING DATE-TIME NOV 06,2018 11:15:55.503 310 TUE 2458429
PROGRAM GSI_ANL HAS ENDED.
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
*****************RESOURCE STATISTICS*******************************
The total amount of wall time = 139.509586
The total amount of time in user mode = 135.700000
The total amount of time in sys mode = 3.472000
The maximum resident set size (KB) = 519420
Number of page faults without I/O activity = 1471061
Number of page faults with I/O activity = 0
Number of times filesystem performed INPUT = 0
Number of times filesystem performed OUTPUT = 0
Number of Voluntary Context Switches = 5723
Number of InVoluntary Context Switches = 3159
*****************END OF RESOURCE STATISTICS*************************

1.png , 2.png

Hi,

Your run appears to get fairly far, into the minimization, but there seems to be an MPI problem earlier on, which is likely what causes the crash. Could you tell us more about your HPC environment (MPI settings, compiler and library versions, etc.) and your run submission options? When the run completes successfully, the timing information you mention will appear, and new links will be created in the run directory, like the ones described in section 3.3 of the user guide.
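The details requested above could be gathered with a small script along these lines. This is only a sketch: the tool names it checks (ifort, icc, mpif90, mpicc, cmake, nc-config, yhrun) are assumptions based on a typical Intel/MPICH setup and should be adjusted to the actual system, and report_tool is a hypothetical helper, not part of GSI.

```shell
#!/bin/sh
# Sketch: print where each build/run tool resolves on the current PATH.
# report_tool is a hypothetical helper; the tool list below is an assumption.
report_tool() {
    tool_path=$(command -v "$1" 2>/dev/null)
    if [ -n "$tool_path" ]; then
        printf '%-10s %s\n' "$1" "$tool_path"
    else
        printf '%-10s %s\n' "$1" "not found"
    fi
}

for tool in ifort icc mpif90 mpicc cmake nc-config yhrun; do
    report_tool "$tool"
done
```

On systems that use environment modules, the output of module list is a useful companion to this, since it shows which compiler and MPI modules were loaded when the paths were resolved.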

Will

In reply to by wmayfield

Hi, I think the problem may be in the build and compile step. I got some errors or warnings after running the command: cmake path_to_the_comGSI_directory. However, the build still reached 100% and produced gsi.x and enkf_wrf.x, which confuses me.

My environment: 1) Intel_compiler/18.0.4 ; 2) mkl/18.0.4 ; 3) MPI/mpich/intel2018 ; 4) hdf5/1.8.11 ; 5) netcdf/4.4 ; 6) cmake/3.20.3 ; 7) curl/7.59.0 . 

ARCH is TIANHE.

'TIANHE')
      RUN_COMMAND="yhrun  -n ${GSIPROC} " ;;

My submission command is: yhrun -n 12 -p TH_SR1 run_gsi_regional.ksh_basic
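For readers unfamiliar with this pattern, the fragment above can be placed in context with a minimal sketch of how a run script might pick its MPI launcher from ARCH. Only the 'TIANHE' branch is taken from the post; the values assigned to ARCH and GSIPROC, the fallback branch, and the final echo are assumptions added for illustration and are not copied from run_gsi_regional.ksh_basic.

```shell
#!/bin/sh
# Sketch of how a run script might choose its MPI launcher from ARCH.
# Only the 'TIANHE' branch comes from the post; everything else is assumed.
ARCH='TIANHE'
GSIPROC=12

case $ARCH in
    'TIANHE')
        RUN_COMMAND="yhrun  -n ${GSIPROC} " ;;
    *)
        echo "Unknown ARCH: ${ARCH}" >&2
        exit 1 ;;
esac

# Print the launch line instead of executing it. gsi.x is the executable
# named in the post; how the real script invokes it is not shown there.
echo "Launch line: ${RUN_COMMAND}./gsi.x"
```

Printing the launch line rather than running it makes it easy to confirm which launcher and process count the script would use before submitting a job.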

I also tried a different compilation environment, but I still got a similar error:

GLEX_ERR(cn1557): _init_glex(544), _create_ep: No enough endpoint resources
..... (the line above appears 12 times in total)
Fatal error in MPI_Init: Other MPI error, error stack:
MPIDI_nem_glex_init_glex(546): Cannot create GLEX endpoint.
..... (the two lines above appear 12 times in total)
yhrun: error: cn1557: tasks 0-11: Exited with exit code 1
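When a log contains many repeats of the same message, as this one does, it can be condensed into counts before posting. The sketch below uses the standard sort and uniq utilities; summarize_log is a hypothetical helper name, and the sample file is built from lines quoted in this post purely for illustration.

```shell
#!/bin/sh
# Sketch: collapse repeated lines of a log into "count line" pairs,
# most frequent first. summarize_log is a hypothetical helper name.
summarize_log() {
    sort "$1" | uniq -c | sort -rn
}

# Example with a small sample file built from lines quoted in the post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
GLEX_ERR(cn1557): _init_glex(544), _create_ep: No enough endpoint resources
GLEX_ERR(cn1557): _init_glex(544), _create_ep: No enough endpoint resources
Fatal error in MPI_Init: Other MPI error, error stack:
EOF
summarize_log "$sample"
rm -f "$sample"
```

In a real case the argument would be the file holding the job's captured output; comparing the count of each message with the number of MPI tasks shows whether every task failed in the same way.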

 

The CMake configuration output is as follows:

[tianxj@ln1%tianhe build2]$ cmake -DBUILD_CORELIBS=ON -DCURL_INCLUDE_DIR=/vol6/software/curl-7.59.0/include/curl /vol6/home/tianxj/yhluo/gsi/comGSIv3.7_EnKFv1.3
-- The C compiler identification is Intel 17.0.4.20170411
-- The CXX compiler identification is Intel 17.0.4.20170411
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /vol6/software/intel2017.4/compilers_and_libraries_2017.4.196/linux/bin/intel64/icc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /vol6/software/intel2017.4/compilers_and_libraries_2017.4.196/linux/bin/intel64/icpc - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The Fortran compiler identification is Intel 17.0.4.20170411
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /vol6/software/intel2017.4/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort - skipped
-- Checking whether /vol6/software/intel2017.4/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort supports Fortran 90
-- Checking whether /vol6/software/intel2017.4/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort supports Fortran 90 - yes
Build the EnKF with WRF module
CMake Deprecation Warning at CMakeLists.txt:71 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

Control path is
The hostname is  ln1
-- BUILD_CORELIBS manually-specified as ON
Setting paths for Generic System
/vol6/home/tianxj/yhluo/gsi/comGSIv3.7_EnKFv1.3
Setting Intel flags
Compiler version is 17.0.4
Compiler version is 17.0.4
Using installed FindMPI
-- Found MPI_C: /vol-th/software/mpi/mpi-intel2017/lib/libmpich.a (found version "3.0")
-- Found MPI_CXX: /vol-th/software/mpi/mpi-intel2017/lib/libmpichcxx.a (found version "3.0")
-- Found MPI_Fortran: /vol-th/software/mpi/mpi-intel2017/lib/libmpichf90.a (found version "3.0")
-- Found MPI: TRUE (found version "3.0")
include dirs are /vol-th/software/mpi/mpi-intel2017/include
include PATH  /vol-th/software/mpi/mpi-intel2017/include
MPI version is 3.0
MPI f90 version is TRUE
MPI f08 version is FALSE
CMake Warning (dev) at cmake/Modules/FindNetCDF.cmake:52 (find_program):
  Policy CMP0109 is not set: find_program() requires permission to execute
  but not to read.  Run "cmake --help-policy CMP0109" for policy details.
  Use the cmake_policy command to set the policy and suppress this warning.

  The file

    /vol6/software/io_tools/netcdf/mpi/4.5.0/include/netcdf_meta.h

  is readable but not executable.  CMake is using it for compatibility.
Call Stack (most recent call first):
  CMakeLists.txt:193 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found NetCDF: /vol6/software/io_tools/netcdf/mpi/4.5.0/lib/libnetcdff.so;/vol6/software/io_tools/netcdf/mpi/4.5.0/lib/libnetcdf.so
-- Found ZLIB: /usr/lib64/libz.so (found version "1.2.3")
-- Found CURL: /vol6/software/curl-7.59.0/lib/libcurl.so (found version "7.59.0")
 trying to find lapack, GENERIC,
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - not found
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Looking for Fortran sgemm
CMake Warning at /vol6/home/tianxj/yhluo/gsi/build2/CMakeFiles/CMakeTmp/CMakeLists.txt:14 (add_executable):
  Cannot generate a safe runtime search path for target cmTC_9d1f4 because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libmkl_intel_lp64.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib
    runtime library [libmkl_core.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib

  Some of these libraries may not be found correctly.
 

-- Looking for Fortran sgemm - found
-- Found BLAS: /vol6/software/intel2017.4/mkl/lib/intel64_lin/libmkl_intel_lp64.so;/vol6/software/intel2017.4/mkl/lib/intel64_lin/libmkl_intel_thread.so;/vol6/software/intel2017.4/mkl/lib/intel64_lin/libmkl_core.so;/vol6/lib/libguide.so;-lpthread;-lm;-ldl
-- Looking for Fortran cheev
CMake Warning at /vol6/home/tianxj/yhluo/gsi/build2/CMakeFiles/CMakeTmp/CMakeLists.txt:14 (add_executable):
  Cannot generate a safe runtime search path for target cmTC_6a522 because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libmkl_intel_lp64.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib
    runtime library [libmkl_core.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib

  Some of these libraries may not be found correctly.
 

-- Looking for Fortran cheev - found
-- Found LAPACK: /vol6/software/intel2017.4/mkl/lib/intel64_lin/libmkl_intel_lp64.so;/vol6/software/intel2017.4/mkl/lib/intel64_lin/libmkl_intel_thread.so;/vol6/software/intel2017.4/mkl/lib/intel64_lin/libmkl_core.so;/vol6/lib/libguide.so;-lpthread;-lm;-ldl;-lpthread;-lm;-ldl
CMake Deprecation Warning at libsrc/wrflib/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.
.....

-- Configuring done
CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0110 is not set: add_test() supports arbitrary characters in test
  names.  Run "cmake --help-policy CMP0110" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.

  The following name given to add_test() is invalid if CMP0110 is not set or
  set to OLD:

    `
            arw_binary´

This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning at src/CMakeLists.txt:110 (add_executable):
  Cannot generate a safe runtime search path for target gsi.x because files
  in some directories may conflict with libraries in implicit directories:

    runtime library [libmkl_intel_lp64.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib
    runtime library [libmkl_core.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib

  Some of these libraries may not be found correctly.

CMake Warning at src/enkf/CMakeLists.txt:73 (add_executable):
  Cannot generate a safe runtime search path for target enkf_wrf.x because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libmkl_intel_lp64.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib
    runtime library [libmkl_core.so] in /vol6/software/intel2017.4/mkl/lib/intel64_lin may be hidden by files in:
      /vol6/lib

  Some of these libraries may not be found correctly.

-- Generating done
-- Build files have been written to: /vol6/home/tianxj/yhluo/gsi/build2
 

The error points to an MPI problem. I noticed that the output reports "MPI f08 version is FALSE", but I don't know whether that is the root cause.
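To compare what CMake actually detected with the modules loaded at run time, the relevant lines can be pulled out of the saved configuration output. This is a sketch only: it assumes the output was saved to a file named cmake.log (for example by piping the cmake command through tee), and cmake_summary is a hypothetical helper name. The search patterns match the "compiler identification" and "Found MPI" lines that appear in the output above.

```shell
#!/bin/sh
# Sketch: pull the compiler and MPI detection lines out of saved CMake output.
# cmake_summary is a hypothetical helper; cmake.log is an assumed file name.
cmake_summary() {
    grep -E 'compiler identification|Found MPI' "$1"
}

# Only run the summary if the assumed log file is present.
if [ -f cmake.log ]; then
    cmake_summary cmake.log
fi
```

If the compiler or MPI versions printed here differ from the modules loaded when the job is submitted, that mismatch is worth mentioning when reporting the problem.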
