System Overview¶
Compute Nodes¶
There are 8 compute nodes in the Sakura system. Each node has the following specifications:
| CPU | Xeon D-1571, 16 cores @ 1.30 GHz |
| Memory | 32 GB |
| Accelerators | PEZY-SC2, 700 MHz, 64 GB |
| Interconnect | InfiniBand EDR |
| Storage | NFS /home |
File Systems¶
A single storage area is mounted at /home and shared with other users. /home has 100 GB of capacity. There are no quotas, so please use the space considerately.
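Since no quotas are enforced, it is worth checking your usage by hand from time to time. A minimal sketch using standard tools (df and du; the /home path is the one above):

```shell
# Show free space on the shared /home file system.
df -h /home
# Show how much of it your own directory is using.
du -sh "$HOME"
```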
Connecting¶
Connect with SSH¶
| System Name | Hostname | RSA Key Fingerprints |
| Sakura | matsu.exascaler.co.jp | SHA256:gywcL7XDCOgXm4UXV4m0hi2Xzo2I4XLUD2CDnqeDPlA |
For example, to connect to Sakura login node from a UNIX-based system, use the following command::
[your machine] $ ssh-add                              # Add your SSH key to the auth agent
[your machine] $ ssh -A userid@matsu.exascaler.co.jp  # Agent forwarding (-A) is required
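To avoid typing the options each time, you can add an entry to your SSH client configuration. A sketch of ~/.ssh/config, assuming the hostname from the table above (userid is a placeholder for your account name):

```
Host sakura
    HostName matsu.exascaler.co.jp
    User userid
    ForwardAgent yes
```

With this entry, `ssh sakura` connects with agent forwarding enabled.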
You are now on the login node. To connect to the Sakura head node from the login node, use the following command::
[ssh] $ ssh esfe
On first login, you should change your password. Use the yppasswd command instead of passwd::
[esfe] $ yppasswd
Compiling¶
..(snip)..
Running Jobs¶
We use Slurm to schedule user jobs.
Use the --exclusive option to reserve an entire node.
Single Node Job¶
Sample Job File:
#!/bin/bash
#SBATCH -p debug # partition name
#SBATCH -N 1 # number of nodes
#SBATCH -n 1 # number of processes
#SBATCH --exclusive
./myprogram
To submit the job, use the sbatch command::
$ sbatch job.sh
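Putting the steps above together, a minimal sketch of writing and sanity-checking the sample job file from the shell (myprogram and the debug partition are taken from the sample above; the sbatch line is commented out because it only works on the cluster):

```shell
# Write the sample single-node job script.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH -p debug        # partition name
#SBATCH -N 1            # number of nodes
#SBATCH -n 1            # number of processes
#SBATCH --exclusive
./myprogram
EOF

# Quick sanity check: the script contains four #SBATCH directives.
grep -c '^#SBATCH' job.sh
# sbatch job.sh         # submit on a Sakura node
```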
Partitions¶
There are several partitions in the system. To see partition information, use the sinfo command::
$ sinfo
Choose a partition name in your job script with the -p option.
MPI Job¶
You can use Open MPI. The location of mpivars.sh is as follows.
openmpi:
$ source /usr/mpi/gcc/openmpi-1.8.4/bin/mpivars.sh
MPI with Slurm¶
Sample Job File:
#!/bin/bash
#SBATCH -p debug
#SBATCH -N 1 # number of nodes
#SBATCH -n 4 # number of processes
#SBATCH --ntasks-per-node=4
#SBATCH --exclusive
source /usr/mpi/gcc/openmpi-1.8.4/bin/mpivars.sh
mpirun ./myprogram
More Information about Slurm¶
See the official Slurm man pages for more information about sbatch, srun, squeue, and scancel.
Debugging¶
You can start an interactive session with the following::
$ srun -N 1 --exclusive --pty bash
If you want to use 2 nodes::
$ srun -N 2 --exclusive --pty bash