Downloading CCPP-SCM

Overview

Teaching: 0 min
Exercises: 0 min

Questions

How do I download CCPP-SCM?

Objectives

We will use GMU Hopper to run this model. We will now download the model and all the setup files needed for it. These are one-time steps.

Go to your /home/username/classes/clim670/ directory

$ cd /home/cstan/classes/clim670

Download the CCPP-SCM from Github

$ module load git
$ git clone --recursive -b v7.0.0 https://github.com/NCAR/ccpp-scm ccpp-scm-7.0 

This will take a while …

Change to the ccpp-scm-7.0 directory

$ cd ccpp-scm-7.0

Let’s take a look in that directory. The recursive option in the git clone command clones the main ccpp-scm repository and all subrepositories (ccpp-physics and ccpp-framework). Using this option, there is no need to execute git submodule init and git submodule update.

Key Points

Building CCPP-SCM

Overview

Teaching: 0 min
Exercises: 0 min

Questions

How do I setup the SCM?

Objectives

You have now downloaded all the model components. Let’s take a look: From your /home/username/classes/clim670/ccpp-scm-7.0 directory, see what is there.

$ ls

Organization of the CCPP-SCM Directory

scm - Top level SCM directory: contains the dynamical core
ccpp - Top level CCPP directory: contains physics packages
CMakeModules - Contains modules required to run the code
contrib - Contains scripts that download data files required to run the model
docker - Top directory for building CCPP-SCM container (not used in this class)
test - Top directory of a test case
tutorial_files - Examples used by developers in tutorials

Using existing libraries

The Python environment must provide the f90nml module for the SCM scripts to function. Users can test if f90nml is installed using this command in the shell:

$ python -c "import f90nml"

If f90nml is installed, this command will succeed silently, otherwise an ImportError: No module named f90nml will be printed to screen. To install the f90nml (v0.19) Python module, use:

$ module load anaconda3
$ pip install --user f90nml ==1.4.4

Platform-specific scripts are provided to load modules and set the user environment for preconfigured platforms. These scripts load compiler modules (Fortran 2008-compliant), the NetCDF module, Python environment, etc. and set compiler and environment variables.

$ cd /home/username/classes/clim670/ccpp-scm-7.0/scm/etc/
$ ls 

Hopper is not one of the preconfigured platforms. To get the corresponding file for Hopper:

$ cp /home/cstan/scm_sandbox/ccpp-scm-7.0/scm/etc/Hopper_setup_gnu.sh . 

Let’s take a look: From your /home/username/classes/clim670/ccpp-scm-7.0/scm/etc/ directory, see what is there.

$ ls

Now you have two new files that will configure the building environmeent for Hopper. One can be used for the t/csh shell and the other for the bash shell. From the top-level code directory (ccpp-scm-7.0), source the bash script for Hopper:

$ cd ../../
$ source scm/etc/Hopper_setup_gnu.sh

Ignore the warnings.

The first step in compiling the CCPP and SCM is to properly setup your user environment as described in sections above. The second step is to download the lookup tables and other large datasets (large binaries, <1 GB) needed by the physics schemes and place them in the correct directory. I have downloaded these files and you will create symbolic links (Note: Do not change username ‘cstan’ with your username:

$ cd /home/username/classes/clim670/ccpp-scm-7.0/scm/data/
$ ln -fs /home/cstan/classes/clim670/ccpp-scm/scm/data/comparison_data/
$ ln -fs /home/cstan/classes/clim670/ccpp-scm/scm/data/physics_input_data/
$ ln -fs /home/cstan/classes/clim670/ccpp-scm/scm/data/processed_case_input/
$ ln -fs /home/cstan/classes/clim670/ccpp-scm/scm/data/raw_case_input/

Above were the one-time setup setps, now we move on to the steps you will do everytime to setup a new model experiment and run it.

From the top level code-directory (ccpp-scm-7.0), change directory to the top-level SCM directory:

$ cd scm

Make a build directory and change into it

$ mkdir bin
$ cd bin

Invoke cmake on the source code

$ cmake ../src

This will take a while. The end of a succssful build will look like

-- Configuring done
-- Generating done
-- Build files have been written to: /home/username/classes/clim670/ccpp-scm-7.0/scm/bin

What happened when we ran `cmake`?

Files necssary for building the executable have been written to bin/
The CCPP physics and framwork prebuild scripts are ran to match required physics variables with those available from the dynamical core (SCM) and to generate physcis caps and makefile segments
Software caps are generated for each physics group defined in the supplied Suite Definition Files (SDFs) and generate a static library that becomes part of the SCM executable.
ccpp_prebuild.err contains a list of all files that have been used in this step

We are now ready to compile. This step will create the executable.

$ make

This will take a while and the progress is shown as a percentage.

What happened when we ran `make`?

The executable scm is created and written in bin/ directory.
The run script run_scm.py is being copied to bin/ directory.

Now what?

If your compilation has completed, then you are ready to run the case.

Key Points

Running CCPP-SCM

Overview

Teaching: 0 min
Exercises: 0 min

Questions

How do I setup a case and run the SCM?

Objectives

You have now build the model. There are several test cases provided with this version of the SCM. For all cases, the SCM will go through the time steps, applying forcing and calling the physics defined in the chosen suite definition file using physics configuration options from an associated namelist. The model is executed through a Python run script that is pre-staged into the bin directory: run_scm.py. It can be used to run one integration or several integrations serially, depending on the command line arguments supplied.

Running a case requires four pieces of information:

The case to run (consisting of initial conditions, geolocation, forcing data, etc.),
The physics suite to use (through a CCPP suite definition file),
A physics namelist (that specifies configurable physics options to use), and
A tracer configuration file.

Cases are set up via their own namelists in ../etc/case_config. A default physics suite is provided as a user-editable variable in the script and default namelists and tracer configurations are associated with each physics suite (through ../src/suite_info.py), so, technically, one must only specify a case to run with the SCM when running just one integration. For running multiple integrations at once, one need only specify one argument (-m) which runs through all permutations of supported suites from ../src/suite_info.py and cases from ../src/supported_cases.py.

The run script’s options are described below where option abbreviations are included in brackets.

--case [-c]

This or the --multirun option are the minimum required arguments. The case should correspond to the name of a case in ../etc/case_config (without the .nml extension).

--suite [-s]

The suite should correspond to the name of a suite in ../ccpp/suites (without the .xml) extension that was supplied in the cmake or ccpp_prebuild step.

--namelist [-n]

The namelist should correspond to the name of a file in ../ccpp/physics_namelists (WITH the .nml extension). If this argument is omitted, the default namelist for the given suite in ../src/suite_info.py will be used.

--tracers [-t]

The tracers file should correspond to the name of a file in ../etc/tracer_config (WITH the .txt extension). If this argument is omitted, the default tracer configuration for the given suite in ../src/suite_info.py will be used.

--multirun [-m]

This or the –case option are the minimum required arguments. When used alone, this option runs through all permutations of supported suites from ../src/suite_info.py and cases from ../src/supported_cases.py. When used in conjunction with the – file option, only the runs configured in the file will be run.

--file [-f]

This option may be used in conjunction with the –multirun argument. It specifies a path and filename to a python file where multiple runs are configured.

--gdb [-g]

Use this to run the executable through the gdb debugger (if it is installed on the system).

--docker [-d]

Use this argument when running in a docker container in order to successfully mount a volume between the host machine and the Docker container instance and to share the output and plots with the host machine.

--runtime

Use this to override the runtime provided in the case configuration namelist.

--runtime_mult

Use this to override the runtime provided in the case configuration namelist by multiplying the runtime by the given value. This is used, for example, in regression testing to reduce total runtimes.

--levels [-l]

Use this to change the number of vertical levels.

--npz_type

Use this to change the type of FV3 vertical grid to produce (see src/scm_vgrid.F90 for valid values).

--vert_coord_file

Use this to specify the path/filename of a file containing the a_k and b_k coefficients for the vertical grid generation code to use.

--bin_dir

Use this to specify the path to the build directory.

--run_dir

Use this to specify the path to the run directory.

--case_data_dir

Use this to specify the path to the directory containing the case data file (useful for using the DEPHY case repository).

--n_itt_out

Use this to specify the period of writing instantaneous output in timesteps (if different than the default specified in the script).

--n_itt_diagt

Use this to specify the period of writing instantaneous and time-averaged diagnostic output in timesteps (if different than the default specified in the script).

--timestep [-dt]

Use this to specify the timestep to use (if different than the default specified in ../src/suite_info.py).

--verbose [-v]

Use this option to see additional debugging output from the run script and screen output from the executable.

When invoking the run script, the only required argument is the name of the case to run. The case name used must match one of the case configuration files located in ../etc/case_config (without the .nml extension!). If specifying a suite other than the default, the suite name used must match the value of the suite name in one of the suite definition files located in ../../ccpp/suites (Note: not the filename of the suite definition file).

As part of the v7 CCPP release, the following suite names are valid:

SCM_GFS_v16
SCM_GFS_v16_RRTMGP
SCM_GFS_v17_p8_ugwpv1
SCM_HRRR_gf
SCM_WoFS_v0

Let’s create a case. From the top level code-directory (ccpp-scm-7.0), change directory to the top-level SCM directory and create a directory were to run all cases and then a directory for each case

$ mkdir cases
$ cd cases
$ mkdir gfs_v16_bomex

Create a script (e.g., create_gfs_v16_bomex_case.sh) with the following

#!/bin/bash

export case=bomex
export suite=SCM_GFS_v16
export namelist=input_GFS_v16.nml
export RUN_TIME=86400
export ITT_OUT=1
export BIN_DIR=/home/cstan/scm_sandbox/ccpp-scm-7.0/scm/bin
export RUN_DIR=/scratch/cstan/clim670/ccpp-scm-7.0  # Make sure RUN_DIR exists

python ${BIN_DIR}/run_scm.py -c ${case} -s ${suite} -n ${namelist} --runtime ${RUN_TIME} --n_itt_out ${ITT_OUT} --bin_dir ${BIN_DIR} --run_dir ${RUN_DIR}

Make sure the script is executable. To check

$ ls -l create_gfs_v16_bomex_case.sh

-rwxr--r-- 1 cstan users 456 Feb  5 17:10 create_gfs_v16_bomex_case.sh

If the sequence -rwxr--r-- does not have the execute (x) permission we can add it:

$ chmod u+x create_gfs_v16_bomex_case.sh

Check if your script has the execute permission

Does your file have the execute (x) permsion?

Before executing the script, we need to set the user environment for Hopper.

From the top-level code directory (ccpp-scm-7.0), source the bash script for Hopper:

$ source scm/etc/Hopper_setup_gnu.sh

Now we are ready to strat running our gfs_v16_bomex_case.

$ cd scm/cases/gfs_v16_bomex/

Get a compute node with:

$ salloc -p interactive -C amd -n 1

Run:

$ ./create_gfs_v16_bomex_case.sh

A succssful run will generate a message like:

INFO: Process "(case=bomex, suite=SCM_GFS_v16, namelist=input_GFS_v16.nml" completed successfully
INFO:     Elapsed time: 4.579s

A NetCDF output file is generated in a directory located in the RUN_DIR.

$ cd /scratch/cstan/clim670/ccpp-scm-7.0/output_bomex_SCM_GFS_v16
$ ls

bomex_SCM_GFS_v16.nml  logfile  output.nc

The output.nc contains the output written with the frequency set by the --n_itt variable.

What is in this file?

We will look at the file using ncdump -h to understand what is in the file. In my case ncdump is not loaded when I connect to Hopper using the ‘Clusters’ tab on the Dashboard. It works after loading the following modules:
$ module load OneAPI/2022.1.2 compiler/2022.0.2 netcdf-fortran/4.5.3-kc
What variables are in the file?

How many times are in the output file?

We can read the file using Python xarray. This is a small file. If you need help, here is an example

The namlist file .nml contains the configuration namelist that contains parameters for the SCM infrastructure and the physics configuration namelist.

The case_config namelist expects the following parameters:

case_name

Identifier for which dataset (initialization and forcing) to load. This string must correspond to a dataset included in the directory ccpp-scm/scm/data/processed_case_input/ (without the file extension).
runtime

Specify the model runtime in seconds (integer). This should correspond with the forcing dataset used. If a runtime is specified that is longer than the >supplied forcing, the forcing is held constant at the last specified values.
thermo_forcing_type

An integer representing how forcing for temperature and moisture state vari- ables is applied (1 = total advective tendencies, 2 = horizontal advective ten- dencies with prescribed vertical motion, 3 = relaxation to observed profiles with vertical motion prescribed)
mom_forcing_type

An integer representing how forcing for horizontal momentum state variables is applied (1 = total advective tendencies; not implemented yet, 2 = hori- zontal advective tendencies with prescribed vertical motion, 3 = relaxation to observed profiles with vertical motion prescribed)
relax_time

A floatingpoint number representing the timescale inseconds for the relaxation forcing (only used if thermo_forcing_type = 3 or mom_forcing_type = 3)
sfc_flux_spec

A boolean set to .true. if surface flux are specified from the forcing data (there is no need to have surface schemes in a suite definition file if so)
sfc_roughness_length_cm

Surface roughness length in cm for calculating surface-related fields from specified surface fluxes (only used if sfc_flux_spec is True).
sfc_type

An integer representing the character of the surface (0 = sea surface, 1 = land surface, 2 = sea-ice surface)
reference_profile_choice

An integer representing the choice of reference profile to use above the supplied initialization and forcing data (1 = “McClatchey” profile, 2 = mid-latitude summer standard atmosphere)
year

An integer representing the year of the initialization time
month

An integer representing the month of the initialization time
day

An integer representing the day of the initialization time
hour

An integer representing the hour of the initialization time
column_area

A list of floating point values representing the characteristic horizontal domain area of each atmospheric column in square meters (this could be analogous to a 3D model’s horizontal grid size or the characteristic horizontal scale of an observation array; these values are used in scale-aware schemes; if using multiple columns, you may specify an equal number of column areas)
model_ics

A boolean set to .true. if UFS atmosphere initial conditions are used rather than field campaign-based initial conditions
C_RES

An integer representing the grid size of the UFS atmosphere initial conditions; the integer represents the number of grid points in each horizontal direction of each cube tile
input_type

0 => original DTC format, 1 => DEPHY-SCM format.

Optional variables (that may be overridden via run script command line arguments) are:
vert_coord_file

File containing FV3 vertical grid coefficients.
dt

Time step in seconds (floating point)
n_itt_out

Specify the period of the instantaneous model output in number of timesteps (integer).
n_itt_diag

Specify the period of the instantaneous and time-averaged diagnostic output in number of timesteps (integer).
n_levels

Specify the integer number of vertical levels.

The physics_config expects the following parameters:

physics_nml

The name should correspond to the name of a file in ../ccpp/physics_namelists (WITH the .nml extension)
physics_suite

The suite should correspond to the name of a suite in ../ccpp/suites (without the .xml)

The logfile contains information from the running time.

Key Points

NCAR DTC CCPP-SCM

Downloading CCPP-SCM

Overview

Key Points

Building CCPP-SCM

Overview

Organization of the CCPP-SCM Directory

Using existing libraries

What happened when we ran cmake?

What happened when we ran make?

Now what?

Key Points

Running CCPP-SCM

Overview

Check if your script has the execute permission

What is in this file?

Key Points

What happened when we ran `cmake`?

What happened when we ran `make`?