"Smith" is a computer cluster based on the Intel and Intel-compatible CPUs.
To use the "Smith" system, log in to the following nodes:
To use the "sb100" system, use the following node:
To login "smith" type
$ ssh -l [userID] 133.1.116.161
or
$ ssh [userID]@133.1.116.161
If you need X11 forwarding, use
$ ssh -Y -l [userID] 133.1.116.161
or
$ ssh -Y [userID]@133.1.116.161
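If you log in frequently, it may be convenient to add an entry to ~/.ssh/config on your local machine; the host alias "smith" below is just an example name, and [userID] is your own account:
Host smith
    HostName 133.1.116.161
    User [userID]
    ForwardX11 yes
    ForwardX11Trusted yes
With such an entry you can simply type "ssh smith".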
Currently, you may see the following message upon login:
-bash: /usr/local/g09/D01/g09/bsd/g09.profile: Permission denied
but in most cases it does not affect your work.
NOTE: When you log in for the first time, change your initial password by typing
$ yppasswd
In the latest environment (as of October 2020), the module command is used to set up compilers and libraries.
To check the available modules, type
$ module available
and to load specific modules, type, e.g.,
$ module load intel/2020.2.254
$ module load intelmpi/2020.2.254
$ module load python/3.8
Note that these modules are loaded only for the current session. To load them automatically at login, add the following lines to ~/.bashrc:
module load intel/2020.2.254
module load intelmpi/2020.2.254
module load python/3.8
Make sure that the old settings are deleted and/or commented out, e.g.:
# source /home/opt/settings/2017.4/intel-compiler.sh
# source /home/opt/settings/2017.4/intel-mpi.sh
Also make sure to load the same modules in your job script.
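To check which modules are currently loaded (for example, at the top of a job script when debugging the environment), you can type
$ module list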
To execute your program, use the queueing system, usually via a job script (see below). For instance, to execute a script "job.sh" on a node (24 cores) in group 10, type
$ qsub -q xh1.q -pe x24 24 job.sh
Note that the group (queue) and the number of cores can also be specified in the job script. To see the job status, type
$ qstat
To see the job status of a specific user, type
$ qstat -u [user ID]
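If you want the status to refresh automatically, one convenient option is, e.g.,
$ watch -n 30 qstat -u [user ID]
which reruns qstat every 30 seconds (press Ctrl-C to stop).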
To cancel a job, use
$ qdel [job ID]
where the job ID can be obtained by using qstat (the number appearing in the first column).
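If you ever need to cancel all of your own jobs at once, one possible one-liner (assuming the default Grid Engine qstat layout, where the job ID is the first column after two header lines) is
$ qstat -u [user ID] | awk 'NR>2 {print $1}' | sort -u | xargs -r qdel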
In the following, example job scripts for each group (queue) are listed. With one of these scripts you can simply type
$ qsub job.sh
without having to specify the queue or the number of processors explicitly.
Group 4 (queue xe1.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xe1.q
#$ -pe x8 8
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
mpirun ./a.out < input.dat > output.dat
Group 5 (queue xe2.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xe2.q
#$ -pe x12 12
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
mpirun ./a.out < input.dat > output.dat
sb100 system (queue sb.q, hybrid MPI/OpenMP):
#$ -S /bin/bash
#$ -cwd
#$ -q sb.q
#$ -pe x6 12
#$ -N JOB_NAME
module load intel/2021.2.0
module load intelmpi/2021.2.0
# Above settings should be consistent with those used in the compilation
export OMP_NUM_THREADS=6
mpirun -perhost 1 -np $NHOSTS ./a.out < input.dat > output.dat
sb100 system (queue sb.q, flat MPI):
#$ -S /bin/bash
#$ -cwd
#$ -q sb.q
#$ -pe x6 12
#$ -N JOB_NAME
module load intel/2021.2.0
module load intelmpi/2021.2.0
# Above settings should be consistent with those used in the compilation
mpirun -np $NSLOTS ./a.out < input.dat > output.dat
Group 8 (queue xs2.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xs2.q
#$ -pe x16 16
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_ADJUST_ALLGATHERV=2
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
Group 9 (queue xi1.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xi1.q
#$ -pe x16 16
#$ -N JOB_NAME
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_ADJUST_ALLGATHERV=2
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
Group 10 (queue xh1.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xh1.q
#$ -pe x24 48
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:ofa
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
Group 11 (queue xh2.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xh2.q
#$ -pe x24 48
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:ofa
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
Group 13 (queue xb1.q):
#$ -S /bin/bash
#$ -cwd
#$ -q xb1.q
#$ -pe x32 32
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:ofa
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
Group 14 (queue x17.q):
#$ -S /bin/bash
#$ -cwd
#$ -q x17.q
#$ -pe x32 32
#$ -N JOB_NAME
#$ -j y
module load intel/2020.2.254
module load intelmpi/2020.2.254
# Above settings should be consistent with those used in the compilation
MPI_COMMAND=mpirun
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:dapl
export OMP_NUM_THREADS=1
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat
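As noted in the scripts above, the modules loaded at run time should match those used when compiling the program. A minimal compilation sketch with the Intel MPI wrappers (the source file name is only an example):
$ module load intel/2020.2.254
$ module load intelmpi/2020.2.254
$ mpiifort -O2 -o a.out mycode.f90    # Fortran; use mpiicc/mpiicpc for C/C++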
Group | Processor | #Cores/#CPUs per node | Submission node | Queue | Parallel env. | Inter-node network |
4 | Xeon | 8/2 | smith/rafiki/tiamat | xe1.q | x8 | |
5 | Xeon | 12/2 | smith/rafiki/tiamat | xe2.q | x12 | |
7 | Core i7 Sandy Bridge | 6/1 | sb100 | all.q | x6 | |
8 | Xeon Sandy Bridge | 16/2 | smith/rafiki/tiamat | xs2.q | x16 | |
9 | Xeon Ivy Bridge | 16/2 | smith/rafiki/tiamat | xi1.q | x16 | |
10 | Xeon Haswell | 24/2 | smith/rafiki/tiamat | xh1.q | x24 | InfiniBand |
11 | Xeon Haswell | 24/2 | smith/rafiki/tiamat | xh2.q | x24 | InfiniBand |
13 | Xeon Broadwell | 32/2 | smith/rafiki/tiamat | xb1.q | x32 | InfiniBand |
14 | Xeon Skylake | 32/2 | smith/rafiki/tiamat | x17.q | x32 | InfiniBand |
NOTE:
The "xe" system is composed of the nodes with the Xeon CPU, which have 2 CPUs (8 or 12 cores) per node. The parallel environment is x8 and x12.
The "sb100" system is based on the Core i7 CPUs with the Sandy-bridge architecture. Each node has 1 CPU (6cores) with 16 GB memory. Fast calculations are possible thanks to the AVX function. The parallel environment is x6.
The "xs" system is based on the Xeon CPUs with the Sandy-bridge architecture. Each node has 1 CPU (6cores) with 32 GB memory. Fast calculations are possible thanks to the AVX function. The parallel environment is x16.
The "xi" system is based on the Xeon CPUs with the Ivy-bridge architecture. Each node has 2 CPUs (16cores) with 128 GB memory. Fast calculations are possible thanks to the AVX function. It is recommend to use this system for Gaussian calculations. The parallel environment is x16.
The "xh" system is composed of the nodes with 2 Xeon CPUs (24 cores in total) and 64 GB memory. The parallel environment is x24.
The "xb" system is composed of the nodes with 2 Xeon Broadwell CPUs (32 cores in total) and 64 GB memory. The parallel environment is x32.
The "x17" system is composed of the nodes with 2 Xeon Skylake CPUs (32 cores in total) and 64 GB memory. The parallel environment is x32.
"|" indicates a network connection, "[]" name, for the computer node
+ Engineering intranet, ODINS network
|
|  Backbone network (no access from outside the engineering network)
|
+- [smith]  -----+   133.1.116.161   Login & application server & backup server & file server
+- [rafiki] -----+   133.1.116.162   Login & application server & backup server
+- [tiamat] -----+   133.1.116.211   Login & application server
|                |
|                +-- [xe00], [xe01]                Calc. nodes, group 4  (8 cores (2 CPUs) per node)   paral. env.=x8   queue=xe1.q
|                +-- [xe02]-[xe06]                 Calc. nodes, group 5  (12 cores (2 CPUs) per node)  paral. env.=x12  queue=xe2.q
|                +-- [xs01]-[xs18]                 Calc. nodes, group 8  (16 cores (2 CPUs) per node)  paral. env.=x16  queue=xs2.q
|                +-- [xi01]-[xi12]                 Calc. nodes, group 9  (16 cores (2 CPUs) per node)  paral. env.=x16  queue=xi1.q
|                +-- [xh01]-[xh17], [xh19]-[xh34]  Calc. nodes, group 10 (24 cores (2 CPUs) per node)  paral. env.=x24  queue=xh1.q
|                +-- [xh18], [xh35]-[xh43]         Calc. nodes, group 11 (24 cores (2 CPUs) per node)  paral. env.=x24  queue=xh2.q
|                +-- [xb01]-[xb14]                 Calc. nodes, group 13 (32 cores (2 CPUs) per node)  paral. env.=x32  queue=xb1.q
|                +-- [x1701]-[x1706]               Calc. nodes, group 14 (32 cores (2 CPUs) per node)  paral. env.=x32  queue=x17.q
|
+- [sb100] -----+   133.1.116.165   Login node for other groups
                 +-- [sb101]-[sb120]               Calc. nodes, group 7  (6 cores (1 CPU) per node)    paral. env.=x6