Multi-Node Jobs
Large parallel Abaqus jobs can be run across multiple nodes using MPI.
Tip
Multi-node parallelism is best-suited to Abaqus/Explicit. Abaqus/Standard (implicit) does not scale well across multiple nodes.
Job Script Template
Here is an example job submission script for a multi-node Abaqus job:
 1  #!/usr/bin/bash -l
 2  #
 3  #SBATCH --job-name=my_job
 4  #SBATCH --nodes=2
 5  #SBATCH --ntasks-per-node=28
 6  #SBATCH --cpus-per-task=1
 7  #SBATCH --time=0:10:00
 8  #SBATCH --mem-per-cpu=4000M
 9  #SBATCH --account=aero012345
10
11  # Load modules
12  module load apps/abaqus/2018
13  module load languages/Intel-OneAPI/2022.2.0 # BlueCrystal (Phase 4) and BluePebble
14
15  # Unset SLURM's Global Task ID for ABAQUS's PlatformMPI to work
16  unset SLURM_GTIDS
17
18  # Get allocated nodes for Abaqus
19  env_file=abaqus_v6.env
20  node_list=$(scontrol show hostname "${SLURM_NODELIST}" | sort -u)
21  mp_host_list="["
22  for host in ${node_list}; do
23      mp_host_list="${mp_host_list}['$host', ${SLURM_CPUS_ON_NODE}],"
24  done
25  mp_host_list=$(echo "${mp_host_list}" | sed -e "s/,$/]/")
26  echo "mp_host_list=${mp_host_list}" >> "${env_file}"
27
28  # Launch Abaqus
29  abaqus job=<job-name> cpus=$((SLURM_NTASKS_PER_NODE*SLURM_NNODES)) user=<usub-file> mp_mode=mpi double=both interactive
There are a number of important differences from the single-node job script example:
More than one node is requested
We request multiple tasks per node (distributed parallelism) instead of multiple CPUs per task (thread-based parallelism)
We have extra lines to tell Abaqus's Platform MPI which nodes the job scheduler has allocated
We must use mp_mode=mpi for multi-node parallelism
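The host-list step in the script can be tried outside a Slurm job to see what ends up in the Abaqus environment file. The following is a minimal sketch, assuming two made-up host names (node001, node002) and 28 cores per node in place of the values Slurm would supply through SLURM_NODELIST and SLURM_CPUS_ON_NODE. The function name build_mp_host_list is hypothetical and is not part of Abaqus or Slurm.

```shell
#!/usr/bin/env bash
# Sketch only: build the string that the job script writes as mp_host_list.
# Host names and core count are illustrative stand-ins for Slurm's values.

build_mp_host_list() {
    local cpus_per_node=$1
    shift
    local list="["
    local host
    for host in "$@"; do
        # One ['hostname', cores] entry per allocated node
        list="${list}['${host}', ${cpus_per_node}],"
    done
    # Drop the trailing comma (if any) and close the outer bracket
    printf '%s\n' "${list%,}]"
}

# Hypothetical two-node allocation with 28 cores on each node
echo "mp_host_list=$(build_mp_host_list 28 node001 node002)"
```

With these sample values the snippet prints mp_host_list=[['node001', 28],['node002', 28]], which has the same shape as the line the job script appends to abaqus_v6.env.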
How to use
Follow the same steps as described for the single-node example, except that you can scale the parallelism by changing the number of nodes (--nodes, line 4 of the script).
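As a rough illustration of how the core count passed to Abaqus follows the node count, the sketch below repeats the arithmetic from the cpus= argument in the script. The function name total_cpus is hypothetical, and the figure of 28 tasks per node is simply the value used in the template above.

```shell
#!/usr/bin/env bash
# Sketch only: the cpus= value is (number of nodes) x (tasks per node).

total_cpus() {
    local nodes=$1
    local tasks_per_node=$2
    echo $((nodes * tasks_per_node))
}

total_cpus 2 28   # template as written: prints 56
total_cpus 4 28   # after changing --nodes to 4: prints 112
```

Inside a real job the script reads these two numbers from SLURM_NNODES and SLURM_NTASKS_PER_NODE, so changing --nodes should be the only edit needed to scale up.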