How to use the computer systems/Smith_en - First-principles molecular dynamics program STATE Senri Wiki

Smith †

"Smith" is a computer cluster based on the Intel and Intel-compatible CPUs.

Login nodes †

To use the "Smith" system, log in to one of the following nodes:

[smith] 133.1.116.161
[rafiki] 133.1.116.162
[tiamat] 133.1.116.211

To use the "sb100" system, use the following node:

[sb100] 133.1.116.165

How to log in to the login nodes †

To log in to "smith", type either of the following:

$ ssh -l [userID] 133.1.116.161

$ ssh [userID]@133.1.116.161

To enable X11 forwarding, use either of the following:

$ ssh -Y -l [userID] 133.1.116.161

$ ssh -Y [userID]@133.1.116.161
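If you log in often, the options above can be kept in the OpenSSH client configuration instead of being typed each time. The following is a minimal sketch, assuming an OpenSSH client on your local machine; the function name print_smith_ssh_config and the Host alias "smith" are illustrative and not part of the original setup. ForwardX11 together with ForwardX11Trusted corresponds to the -Y option.

```shell
# Print an OpenSSH client stanza for the "smith" login node.
# Hypothetical helper: the function name and the Host alias are illustrative.
print_smith_ssh_config() {
    user_id=$1
    cat <<EOF
Host smith
    HostName 133.1.116.161
    User ${user_id}
    ForwardX11 yes
    ForwardX11Trusted yes
EOF
}
```

For example, after running print_smith_ssh_config [userID] >> ~/.ssh/config once, typing ssh smith should be enough to log in.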

Currently, the following message appears upon login:

-bash: /usr/local/g09/D01/g09/bsd/g09.profile: Permission denied

In most cases it does not affect your work and can be ignored.

NOTE: When you log in for the first time, change your initial password by typing

$ yppasswd

How to compile and run the program †

In the latest environment (as of October 2020), the compiler and library settings are managed with the module command.

To check the available modules, type

$ module available

To load specific modules, type, for example,

$ module load intel/2020.2.254
$ module load intelmpi/2020.2.254
$ module load python/3.8

Note that modules loaded this way remain in effect only for the current session. To have them loaded at every login, add the following lines to ~/.bashrc:

module load intel/2020.2.254
module load intelmpi/2020.2.254
module load python/3.8

Make sure that the old settings are deleted or commented out, for example:

# source /home/opt/settings/2017.4/intel-compiler.sh
# source /home/opt/settings/2017.4/intel-mpi.sh

Also make sure to load the same modules in your job script.
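Because the same modules have to be loaded both in ~/.bashrc and in every job script, it can help to keep the loading logic in one place. The following is a minimal sketch, assuming that the module command is already defined in the shell (as it is on the login nodes); the function name load_smith_modules is hypothetical.

```shell
# Load each module given as an argument.
# Hypothetical helper; assumes `module` is already defined in the shell.
# Stops as soon as `module load` returns a non-zero status.
load_smith_modules() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; is the module environment initialized?" >&2
        return 1
    fi
    for m in "$@"; do
        module load "$m" || return 1
    done
}
```

It would be called as load_smith_modules intel/2020.2.254 intelmpi/2020.2.254 python/3.8, using the same list in ~/.bashrc and in the job script.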

How to submit your jobs †

To run your program, submit it to the queueing system, usually by means of a job script (see below). For instance, to run the script "job.sh" on one node (24 cores) in group 10, type

$ qsub -q xh1.q -pe x24 24 job.sh

Note that the queue (group) and the number of cores can also be specified in the job script. To see the job status, type

$ qstat

To see the job status of a specific user, type

$ qstat -u [userID]

To cancel a job, use

$ qdel [job ID]

where the job ID can be obtained by using qstat (the number appearing in the first column).
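When jobs are submitted from a script, the job ID can also be taken directly from the message that qsub prints. The following is a minimal sketch, assuming the queueing system is Grid Engine and prints its usual confirmation of the form: Your job 12345 ("job.sh") has been submitted. The function name parse_job_id is hypothetical.

```shell
# Read qsub's confirmation message on stdin and print only the job ID.
# Assumes the wording: Your job <ID> ("<name>") has been submitted
# Prints nothing if the input does not match (e.g. an error message).
parse_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}
```

A possible use is:

$ job_id=$(qsub job.sh | parse_job_id)
$ qdel "$job_id"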

Examples of job script †

Example job scripts for each group (queue) are listed below. Since each script specifies the queue and the parallel environment itself, you just type

$ qsub job.sh

and do not have to specify the queue group and number of processors explicitly.
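The lines beginning with #$ in the scripts below are read by qsub as options; to the shell they are ordinary comments. The following annotated sketch summarizes the options that appear in the examples, assuming the queueing system is Grid Engine (which the #$ prefix and the qsub options suggest). The function report_allocation is illustrative only and stands in for the mpirun line.

```shell
# Run the job script with bash.
#$ -S /bin/bash
# Start the job in the directory from which qsub was run.
#$ -cwd
# Queue to use (see the table of nodes and queues).
#$ -q xe1.q
# Parallel environment, followed by the total number of slots (cores).
#$ -pe x8 8
# Job name shown by qstat.
#$ -N JOB_NAME
# Merge standard error into standard output.
#$ -j y

# Illustrative stand-in for the mpirun line: report what was allocated.
# NSLOTS and NHOSTS are set by the queueing system inside a job; the
# defaults of 1 only let this run outside a job.
report_allocation() {
    echo "slots=${NSLOTS:-1} hosts=${NHOSTS:-1}"
}
report_allocation
```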

Group 4

#$ -S /bin/bash
#$ -cwd
#$ -q xe1.q
#$ -pe x8 8
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
mpirun ./a.out < input.dat > output.dat

Group 5

#$ -S /bin/bash
#$ -cwd
#$ -q xe2.q
#$ -pe x12 12
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
mpirun ./a.out < input.dat > output.dat

Group 7 (sb100)

Hybrid parallelization (e.g., 12 cores on 2 nodes, with 1 MPI process per node and 6 OpenMP threads per process)

#$ -S /bin/bash
#$ -cwd
#$ -q sb.q
#$ -pe x6 12
#$ -N JOB_NAME
module load intel/2021.2.0
module load intelmpi/2021.2.0
# Above settings should be consistent with those used in the compilation
export OMP_NUM_THREADS=6
mpirun -perhost 1 -np $NHOSTS ./a.out < input.dat > output.dat

Flat parallelization (12 cores)

#$ -S /bin/bash
#$ -cwd
#$ -q sb.q
#$ -pe x6 12
#$ -N JOB_NAME
module load intel/2021.2.0
module load intelmpi/2021.2.0
# Above settings should be consistent with those used in the compilation
mpirun -np $NSLOTS ./a.out < input.dat > output.dat
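In the hybrid example above, 12 slots with OMP_NUM_THREADS=6 give 2 MPI processes, one per 6-core node, which is what -perhost 1 -np $NHOSTS requests. In the flat example, every slot runs one MPI process. The arithmetic can be sketched as follows; the function name mpi_ranks is hypothetical, and the check assumes that the slot count should be an exact multiple of the thread count.

```shell
# Print the number of MPI processes for a given slot and thread count.
# Hypothetical helper: $1 = total slots (cores), $2 = OpenMP threads per process.
mpi_ranks() {
    slots=$1
    threads=$2
    if [ "$threads" -le 0 ] || [ $((slots % threads)) -ne 0 ]; then
        echo "slots ($slots) must be a positive multiple of threads ($threads)" >&2
        return 1
    fi
    echo $((slots / threads))
}
```

Here mpi_ranks 12 6 prints 2 (the hybrid case) and mpi_ranks 12 1 prints 12 (the flat case).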

Group 8

#$ -S /bin/bash
#$ -cwd
#$ -q xs2.q
#$ -pe x16 16
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_ADJUST_ALLGATHERV=2
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat

Group 9

#$ -S /bin/bash
#$ -cwd
#$ -q xi1.q
#$ -pe x16 16
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_ADJUST_ALLGATHERV=2
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
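The awk line in the scripts for groups 8 and above converts $PE_HOSTFILE, which the queueing system writes with one line per allocated node, into lines of the form host:count, where count is the number of slots on that node divided by OMP_NUM_THREADS. The following sketch shows what it produces for a made-up two-node host file. The node names follow the network diagram, but the file contents are an assumption about the usual Grid Engine layout (host name, slot count, queue, processor range), and the function name make_hostfile is hypothetical.

```shell
# Same awk program as in the job scripts, reading a file given as $1.
# OMP_NUM_THREADS must be exported so that awk can see it in ENVIRON.
make_hostfile() {
    awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' "$1"
}

# Made-up stand-in for $PE_HOSTFILE (assumed layout, two 16-core nodes).
sample=$(mktemp)
cat > "$sample" <<'EOF'
xi01 16 xi1.q@xi01 UNDEFINED
xi02 16 xi1.q@xi02 UNDEFINED
EOF

export OMP_NUM_THREADS=2
make_hostfile "$sample"
rm -f "$sample"
```

With OMP_NUM_THREADS=2 this prints xi01:8 and xi02:8, that is, 8 MPI processes per node. With OMP_NUM_THREADS=1, as in the scripts above, the count equals the number of slots on each node.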

Group 10

#$ -S /bin/bash
#$ -cwd
#$ -q xh1.q
#$ -pe x24 48
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:ofa
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat

Group 11

#$ -S /bin/bash
#$ -cwd
#$ -q xh2.q
#$ -pe x24 48
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:ofa
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat

Group 13

#$ -S /bin/bash
#$ -cwd
#$ -q xb1.q
#$ -pe x32 32
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:ofa
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat

Group 14

#$ -S /bin/bash
#$ -cwd
#$ -q x17.q
#$ -pe x32 32
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:dapl
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat

Computer nodes and queues †

Group	Processor	Cores/CPUs per node	Submission node	Queue	Parallel environment	Inter-node network
4	xeon	8/2	smith/rafiki/tiamat	xe1.q	x8
5	xeon	12/2	smith/rafiki/tiamat	xe2.q	x12
7	core i7 sandy-bridge	6/1	sb100	all.q	x6
8	xeon sandy-bridge	16/2	smith/rafiki/tiamat	xs2.q	x16
9	xeon ivy-bridge	16/2	smith/rafiki/tiamat	xi1.q	x16
10	xeon Haswell	24/2	smith/rafiki/tiamat	xh1.q	x24	infini-band
11	xeon Haswell	24/2	smith/rafiki/tiamat	xh2.q	x24	infini-band
13	xeon Broadwell	32/2	smith/rafiki/tiamat	xb1.q	x32	infini-band
14	xeon Skylake	32/2	smith/rafiki/tiamat	x17.q	x32	infini-band

NOTE:

To submit a job to group 7 nodes, log in to sb100 and run qsub.
To submit a job to nodes in any other group, log in to smith (or rafiki or tiamat) and run qsub.
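The table above can also be expressed as a small lookup, so that the queue and parallel environment do not have to be remembered for each group. This is a minimal sketch with hypothetical function names (queue_for_group, qsub_command); the values are copied from the table, and group 7 is left out because it is submitted from sb100 rather than from smith.

```shell
# Print "<queue> <parallel environment> <cores per node>" for a group number.
# Values are copied from the table of nodes and queues; group 7 is omitted.
queue_for_group() {
    case "$1" in
        4)  echo "xe1.q x8 8" ;;
        5)  echo "xe2.q x12 12" ;;
        8)  echo "xs2.q x16 16" ;;
        9)  echo "xi1.q x16 16" ;;
        10) echo "xh1.q x24 24" ;;
        11) echo "xh2.q x24 24" ;;
        13) echo "xb1.q x32 32" ;;
        14) echo "x17.q x32 32" ;;
        *)  echo "unknown group: $1" >&2; return 1 ;;
    esac
}

# Print (without running it) the qsub command for one node of a group.
# $1 = group number, $2 = job script.
qsub_command() {
    spec=$(queue_for_group "$1") || return 1
    read -r queue pe cores <<EOF
$spec
EOF
    echo "qsub -q $queue -pe $pe $cores $2"
}
```

For example, qsub_command 10 job.sh prints qsub -q xh1.q -pe x24 24 job.sh, the same command as in the job submission section.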

Group 4, 5 "xe" system †

The "xe" system consists of nodes with 2 Xeon CPUs each: 8 cores per node in group 4 and 12 cores per node in group 5. The parallel environments are x8 (group 4) and x12 (group 5).

Group 7 "sb100" system †

The "sb100" system is based on Core i7 CPUs with the Sandy Bridge architecture. Each node has 1 CPU (6 cores) and 16 GB of memory. The AVX instructions make fast calculations possible. The parallel environment is x6.

Group 8 "xs" system †

The "xs" system is based on Xeon CPUs with the Sandy Bridge architecture. Each node has 2 CPUs (16 cores) and 32 GB of memory. The AVX instructions make fast calculations possible. The parallel environment is x16.

Group 9 "xi" system †

The "xi" system is based on Xeon CPUs with the Ivy Bridge architecture. Each node has 2 CPUs (16 cores) and 128 GB of memory. The AVX instructions make fast calculations possible. This system is recommended for Gaussian calculations. The parallel environment is x16.

Group 10, 11, 12 "xh" system †

The "xh" system consists of nodes with 2 Xeon Haswell CPUs (24 cores in total) and 64 GB of memory. The parallel environment is x24.

Group 13 "xb" system †

The "xb" system consists of nodes with 2 Xeon Broadwell CPUs (32 cores in total) and 64 GB of memory. The parallel environment is x32.

Group 14 "x17" system †

The "x17" system consists of nodes with 2 Xeon Skylake CPUs (32 cores in total) and 64 GB of memory. The parallel environment is x32.

Network structure †

"|" indicates a network connection, "[]" name, for the computer node

+ Engineering intranet, ODINS network
|
|           Backbone network (no access outside of engineering network)
|               |
+- [smith] -----+                          133.1.116.161 Login & application server & backup server & file server 
+- [rafiki] ----+                          133.1.116.162 Login & application server & backup server 
+- [tiamat] ----+                          133.1.116.211 Login & Application server
|               |
|               +-- [xe00], [xe01]         Calc. node, group 4 (each node has 8 cores (2CPUs)) paral. env.=x8 queue=xe1.q
|               +-- [xe02]-[xe06]          Calc. node, group 5 (each node has 12 cores (2CPUs)) paral. env.=x12 queue=xe2.q
|               |
|               +-- [xs01]-[xs18]          Calc. node, group 8 (each node has 16 cores (2CPUs)) paral. env.=x16 queue=xs2.q
|               |
|               +-- [xi01]-[xi12]          Calc. node, group 9 (each node has 16 cores (2CPUs)) paral. env.=x16 queue=xi1.q
|               |
|               +-- [xh01]-[xh17]
|               +-- [xh19]-[xh34]          Calc. node, group 10 (each node has 24 cores (2CPUs)) paral. env.=x24 queue=xh1.q
|               +-- [xh18],[xh35]-[xh43]   Calc. node, group 11 (each node has 24 cores (2CPUs)) paral. env.=x24 queue=xh2.q
|               +-- [xb01]-[xb14]          Calc. node, group 13 (each node has 32 cores (2CPUs)) paral. env.=x32 queue=xb1.q
|               +-- [x1701]-[x1706]        Calc. node, group 14 (each node has 32 cores (2CPUs)) paral. env.=x32 queue=x17.q
|               |
|               |
+- [sb100] -----+                          133.1.116.165 Login node for group 7
                |
                +-- [sb101]-[sb120]        Calc. node, group 7 (each node has 6 cores (1 CPU))
