Set up a CESM 1.0.x benchmark run on a generic system
This is a cookbook for setting up a CESM 1.0 benchmark run on a generic (Linux) system.
See also "Porting CESM" in the CESM user's guide:
http://www.cesm.ucar.edu/models/cesm1.0/cesm/cesm_doc/c2161.html
System requirements
- Compilers known to work: intel 10.1, pgi 7.2, 8.0, 9.0, (pathscale 3.2)
- MPI implementations known to work: openmpi 1.4, 1.5, mvapich2 1.4, 1.5
- tcsh: the CESM setup scripts are written in tcsh
Compile NETCDF (Requirement)
- Install NETCDF
make install
Download CESM source code
Adapt configuration files
- Meaning of filenames
Filename | Purpose |
env_machopts.* | Set environment: Can be used to set paths to compiler, MPI library, NETCDF library |
Macros.* | Set compiler name and paths to MPI library, NETCDF library. Set compiler options |
mkbatch.* | Settings for the queuing system |
where * stands for the machine name.
- Add a configuration tag for your machine (your_machine) to config_machines.xml. Only the important lines are listed below:
<machine MACH="your_machine"
DESC="Test System"
EXEROOT="/scratch/$CCSMUSER/$CASE"
OBJROOT="$EXEROOT"
INCROOT="$EXEROOT/lib/include"
DIN_LOC_ROOT_CSMDATA="/scratch/cesm1/inputdata"
DIN_LOC_ROOT_CLMQIAN="/scratch/cesm1/inputdata/atm/datm7/atm_forcing.datm7.Qian.T62.c080727"
BATCHQUERY="qstat -f"
BATCHSUBMIT="qsub"
GMAKE_J="4"
MAX_TASKS_PER_NODE="4"
MPISERIAL_SUPPORT="FALSE" />
Please set the following variables:
EXEROOT= # working directory, final location of binary and output files
DIN_LOC_ROOT_CSMDATA= # input data, data will be downloaded on the fly
DIN_LOC_ROOT_CLMQIAN= # input data, data will be downloaded on the fly
MAX_TASKS_PER_NODE= # number of cores per node
Create a new case
- Change to the scripts directory
cd cesm1_0_2
cd scripts
Change the layout
- The layout defines how many cores each model component (ATM, LND, ICE, OCN, CPL) runs on.
- To change the layout you don't have to recreate the case (but you can if you like).
- Change into case directory
cd $CASE
- Clean the case
./configure -cleanmach
- Configure layout (all components will run on all cores)
NTASKS=16 #or# NTASKS=32 #or# NTASKS=64 #or# NTASKS=128 #or# NTASKS=256 #or# NTASKS=512
./xmlchange -file env_mach_pes.xml -id NTASKS_ATM -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_LND -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_ICE -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_OCN -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_CPL -val $NTASKS
./xmlchange -file env_mach_pes.xml -id NTASKS_GLC -val 1
./xmlchange -file env_mach_pes.xml -id TOTALPES -val $NTASKS
./xmlchange -file env_mach_pes.xml -id ROOTPE_OCN -val 0
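The xmlchange calls above all follow the same pattern, so they can be generated in a loop. The following is a minimal bash sketch, not part of CESM; the function name print_layout_cmds is hypothetical. It only prints the commands so that they can be checked before they are run in the case directory.

```shell
#!/bin/bash
# Hypothetical helper (not part of CESM): print the xmlchange calls that put
# every component on the same NTASKS cores, as in the commands listed above.
print_layout_cmds() {
    local ntasks=$1
    local comp
    for comp in ATM LND ICE OCN CPL; do
        echo "./xmlchange -file env_mach_pes.xml -id NTASKS_${comp} -val ${ntasks}"
    done
    # GLC stays on a single task; the ocean starts at the first core.
    echo "./xmlchange -file env_mach_pes.xml -id NTASKS_GLC -val 1"
    echo "./xmlchange -file env_mach_pes.xml -id TOTALPES -val ${ntasks}"
    echo "./xmlchange -file env_mach_pes.xml -id ROOTPE_OCN -val 0"
}

print_layout_cmds 64
```

Once the printed commands look right, they can be applied by piping the output to a shell from inside the case directory, for example print_layout_cmds 64 | sh.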
- Exception: For resolution T31_gx3v7 with NTASKS=32 or NTASKS=64
NTASKS=32 #or# NTASKS=64
NTASKS_ATM=24
NTASKS_OCN=8
NTASKS_LND=24
NTASKS_ICE=20
NTASKS_CPL=24
NTASKS_TOTAL=32
F=$(( $NTASKS / $NTASKS_TOTAL ))
./xmlchange -file env_mach_pes.xml -id ROOTPE_GLC -val 0
./xmlchange -file env_mach_pes.xml -id NTASKS_GLC -val 1
./xmlchange -file env_mach_pes.xml -id ROOTPE_ATM -val 0
./xmlchange -file env_mach_pes.xml -id NTASKS_ATM -val $(( $NTASKS_ATM * $F ))
./xmlchange -file env_mach_pes.xml -id ROOTPE_LND -val 0
./xmlchange -file env_mach_pes.xml -id NTASKS_LND -val $(( $NTASKS_LND * $F ))
./xmlchange -file env_mach_pes.xml -id ROOTPE_ICE -val 0
./xmlchange -file env_mach_pes.xml -id NTASKS_ICE -val $(( $NTASKS_ICE * $F ))
./xmlchange -file env_mach_pes.xml -id ROOTPE_OCN -val $(( $NTASKS_ATM * $F ))
./xmlchange -file env_mach_pes.xml -id NTASKS_OCN -val $(( $NTASKS_OCN * $F ))
./xmlchange -file env_mach_pes.xml -id ROOTPE_CPL -val 0
./xmlchange -file env_mach_pes.xml -id NTASKS_CPL -val $(( $NTASKS_CPL * $F ))
./xmlchange -file env_mach_pes.xml -id TOTALPES -val $NTASKS
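To make the arithmetic of this exception concrete: the base layout is defined for 32 cores (24 cores shared by ATM, LND, ICE and CPL, plus 8 cores for OCN starting right after them), and the factor F scales it to the requested core count. The bash sketch below prints the resulting values; the function name show_t31_layout is hypothetical, and the numbers are taken from the settings above.

```shell
#!/bin/bash
# Hypothetical helper: print the scaled T31_gx3v7 layout for NTASKS=32 or 64.
show_t31_layout() {
    local ntasks=$1
    local base_total=32                 # NTASKS_TOTAL of the base layout
    local f=$(( ntasks / base_total ))  # scaling factor F
    echo "F=${f}"
    echo "ATM: ROOTPE=0 NTASKS=$(( 24 * f ))"
    echo "LND: ROOTPE=0 NTASKS=$(( 24 * f ))"
    echo "ICE: ROOTPE=0 NTASKS=$(( 20 * f ))"
    echo "CPL: ROOTPE=0 NTASKS=$(( 24 * f ))"
    # The ocean gets its own cores, starting right after the atmosphere's.
    echo "OCN: ROOTPE=$(( 24 * f )) NTASKS=$(( 8 * f ))"
    echo "TOTALPES=$(( 24 * f + 8 * f ))"
}

show_t31_layout 64
```

For NTASKS=64 this gives F=2, so the atmosphere uses cores 0 to 47 and the ocean uses the 16 cores starting at core 48, which adds up to the 64 cores set in TOTALPES.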
Set simulation length
- In general, CESM is hardwired to generate monthly average data. In principle this can be turned off, but that requires many code changes, so it is not considered here. The following two cases are suggested instead:
Configure the case
- Configure case
./configure -case
Change history file output frequency of atmosphere model (cam)
Build the case
- Build and compile the model
./$CASE.$MACH.build
Run the model
- Run the model, for example with the LSF queuing system
bsub < $CASE.$MACH.run
- To start without a queuing system just execute:
./$CASE.$MACH.run
Change the resolution
- Recommended resolutions are T31_gx3v7 (~3°), 1.9x2.5_gx1v6 (2°), 0.9x1.25_gx1v6 (1°)
- Create a separate case for each resolution!
Produce a summary
- Create a performance matrix for CASE 1 and CASE 2. Fill in the model throughput in simulated_years/day.
- CASE 1: Run a short simulation that produces almost no output (I/O). Values are from brutus.ethz.ch
resolution / layout (NTASKS) | 16 | 32 | 64 | 128 | 192 | 256 | 512 |
T31_gx3v7 | 12.0 | 18.3 | 26.9 | -- | -- | -- | -- |
1.9x2.5_gx1v6 | -- | 2.27/2.60 | | | | | -- |
0.9x1.25_gx1v6 | -- | -- | 1.88 | | | | |
- CASE 2: Run a longer simulation that produces hourly output data. Values are from brutus.ethz.ch
resolution / layout (NTASKS) | 16 | 32 | 64 | 128 | 192 | 256 | 512 |
T31_gx3v7 | 7.9 | 10.8 | 13.6 | -- | -- | -- | -- |
1.9x2.5_gx1v6 | -- | | | | | | -- |
0.9x1.25_gx1v6 | -- | -- | | | | | |
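If only the simulated length and the wall-clock time of a run are at hand, the throughput in simulated_years/day can be derived from them. The bash sketch below shows the conversion; the function name throughput is hypothetical, a 365-day model year is assumed, and the numbers in the example call are made up for illustration and are not measurements.

```shell
#!/bin/bash
# Hypothetical helper: convert a measured run into simulated_years/day.
# Arguments: simulated days, wall-clock seconds.
# Assumption: one model year has 365 days.
throughput() {
    awk -v days="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", (days / 365) / (secs / 86400) }'
}

# Illustrative numbers only: 5 simulated days in 100 s of wall-clock time.
throughput 5 100   # prints 11.84
```

The same function can be applied to every cell of the two tables, which keeps the values comparable across resolutions and layouts.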