"Smith" is a computer cluster based on Intel and Intel-compatible CPUs.
To use the "Smith" system, log in to one of the following nodes:
smith (133.1.116.161)
rafiki (133.1.116.162)
tiamat (133.1.116.211)
To use the "sb100" system, use the following node:
sb100 (133.1.116.165)
To log in to "smith", type
ssh -l [userID] 133.1.116.161
or
ssh [userID]@133.1.116.161
To enable X11 forwarding, use
ssh -Y -l [userID] 133.1.116.161
or
ssh -Y [userID]@133.1.116.161
Currently, the following message appears upon login
-bash: /usr/local/g09/D01/g09/bsd/g09.profile: Permission denied
but it does not affect most work.
NOTE: When you log in for the first time, change your initial password by typing
passwd
To execute your program, use the queueing system, usually via a job script (see below). For instance, to run the script "job.sh" on one node (24 cores) of group 10, type
qsub -q xh1.q -pe x24 24 job.sh
Note that the group (queue) and the number of cores can also be specified in the job script. To see the job status, type
qstat
To see the job status of a specific user, type
qstat -u [userID]
To cancel a job, use
qdel [job ID]
where the job ID can be obtained by using qstat (the number appearing in the first column).
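When scripting around the queue, it can be convenient to capture the job ID at submission time instead of reading it from the qstat listing. The following is a minimal sketch, assuming that qsub prints the usual Grid Engine message of the form: Your job 12345 ("job.sh") has been submitted. The function name job_id_from_qsub_output is hypothetical, and the message format should be checked on the login node before relying on it.

```shell
#!/bin/bash
# Hypothetical helper: read qsub's submission message on stdin and print the job ID.
# Assumes the message looks like: Your job 12345 ("job.sh") has been submitted
job_id_from_qsub_output() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Intended use on a login node (shown as comments, not executed here):
#   job_id=$(qsub -q xh1.q -pe x24 24 job.sh | job_id_from_qsub_output)
#   qstat -u "$USER"      # check the status of your jobs
#   qdel "$job_id"        # cancel the job if needed
```

If the message does not match the assumed form, the function prints nothing, so an empty job ID should be treated as a failed submission.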
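Example job script:
#$ -S /bin/bash
#$ -cwd
#$ -q xh1.q
#$ -pe x24 24
#$ -N CO
# Set up the Intel compiler and Intel MPI environments
source /opt/setting/2016.4/intel-compiler.sh
source /opt/setting/2016.4/intel-mpi.sh
MPI_COMMAND=mpirun
# Export these variables so that mpirun and awk (via ENVIRON) can see them
export I_MPI_PIN=1
export I_MPI_FABRICS=shm:dapl
export OMP_NUM_THREADS=1
# Write one "host:processes" entry per allocated node
cat $PE_HOSTFILE | awk '{ print $1":"$2/ENVIRON["OMP_NUM_THREADS"] }' > hostfile.$JOB_ID
$MPI_COMMAND ./a.out < input.dat > output.dat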
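The awk command in the job script converts the host list provided by the queueing system into "host:processes" entries, dividing the slots on each node by the number of OpenMP threads per process. The sketch below reproduces that step on a made-up allocation. It assumes each line of $PE_HOSTFILE has the form "hostname slots queue processor-range" (the usual Grid Engine layout); the function name write_mpi_hostfile and the sample lines are for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: convert a PE_HOSTFILE-style file ($1) into "host:processes" lines.
# OMP_NUM_THREADS must be exported, otherwise awk's ENVIRON lookup is empty
# and the division fails.
write_mpi_hostfile() {
    awk '{ print $1 ":" $2 / ENVIRON["OMP_NUM_THREADS"] }' "$1"
}

# Example: a made-up two-node allocation with 24 slots per node, 2 threads per process
export OMP_NUM_THREADS=2
sample=$(mktemp)
printf '%s\n' 'xh01 24 xh1.q@xh01 UNDEFINED' 'xh02 24 xh1.q@xh02 UNDEFINED' > "$sample"
write_mpi_hostfile "$sample"   # prints xh01:12 and xh02:12 on separate lines
rm -f "$sample"
```

With OMP_NUM_THREADS=1, as in the job script, each node keeps all of its slots as MPI processes.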
Group | Processor            | Cores/CPUs per node | Submission node     | Queue | Parallel env. | Inter-node network |
4     | Xeon                 | 8/2                 | smith/rafiki/tiamat | xe1.q | x8            |                    |
5     | Xeon                 | 12/2                | smith/rafiki/tiamat | xe2.q | x12           |                    |
7     | Core i7 Sandy Bridge | 6/1                 | sb100               | all.q | x6            |                    |
8     | Xeon Sandy Bridge    | 16/2                | smith/rafiki/tiamat | xs2.q | x16           |                    |
9     | Xeon Ivy Bridge      | 16/2                | smith/rafiki/tiamat | xi1.q | x16           |                    |
10    | Xeon Haswell         | 24/2                | smith/rafiki/tiamat | xh1.q | x24           | InfiniBand         |
11    | Xeon Haswell         | 24/2                | smith/rafiki/tiamat | xh2.q | x24           | InfiniBand         |
13    | Xeon Broadwell       | 32/2                | smith/rafiki/tiamat | xb1.q | x32           | InfiniBand         |
14    | Xeon Skylake         | 32/2                | smith/rafiki/tiamat | x17.q | x32           | InfiniBand         |
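The table above can be turned into a small lookup so that the queue and parallel environment do not have to be typed by hand. This is only a sketch: the function name qsub_opts_for_group is hypothetical, and it assumes a job that uses all cores of a single node, following the "-pe x24 24" example for group 10. Whether other slot counts are accepted for a given parallel environment is not stated here and should be checked.

```shell
#!/bin/bash
# Hypothetical helper: print qsub options for one full node of the given group,
# using the queue and parallel-environment names from the table above.
qsub_opts_for_group() {
    case "$1" in
        4)  echo "-q xe1.q -pe x8 8" ;;
        5)  echo "-q xe2.q -pe x12 12" ;;
        7)  echo "-q all.q -pe x6 6" ;;      # submit from sb100
        8)  echo "-q xs2.q -pe x16 16" ;;
        9)  echo "-q xi1.q -pe x16 16" ;;
        10) echo "-q xh1.q -pe x24 24" ;;
        11) echo "-q xh2.q -pe x24 24" ;;
        13) echo "-q xb1.q -pe x32 32" ;;
        14) echo "-q x17.q -pe x32 32" ;;
        *)  echo "unknown group: $1" >&2; return 1 ;;
    esac
}

# Intended use on a login node (shown as a comment, not executed here):
#   qsub $(qsub_opts_for_group 10) job.sh
```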
NOTE: In the diagram below, "|" indicates a network connection and "[ ]" encloses the name of a computer node.
+ Engineering intranet, ODINS network
|
| Backbone network (no access outside of engineering network)
|
+- [smith] -----+ 133.1.116.161 Login & application server & backup server & file server
+- [rafiki] ----+ 133.1.116.162 Login & application server & backup server
+- [tiamat] ----+ 133.1.116.211 Login & application server
|               |
|               +-- [xe00], [xe01]         Calc. node, group 4 (each node has 8 cores (2 CPUs)) paral. env.=x8 queue=xe1.q
|               +-- [xe02]-[xe06]          Calc. node, group 5 (each node has 12 cores (2 CPUs)) paral. env.=x12 queue=xe2.q
|               |
|               +-- [xs01]-[xs18]          Calc. node, group 8 (each node has 16 cores (2 CPUs)) paral. env.=x16 queue=xs2.q
|               |
|               +-- [xi01]-[xi12]          Calc. node, group 9 (each node has 16 cores (2 CPUs)) paral. env.=x16 queue=xi1.q
|               |
|               +-- [xh01]-[xh17]
|               +-- [xh19]-[xh34]          Calc. node, group 10 (each node has 24 cores (2 CPUs)) paral. env.=x24 queue=xh1.q
|               +-- [xh18],[xh35]-[xh43]   Calc. node, group 11 (each node has 24 cores (2 CPUs)) paral. env.=x24 queue=xh2.q
|               +-- [xb01]-[xb14]          Calc. node, group 13 (each node has 32 cores (2 CPUs)) paral. env.=x32 queue=xb1.q
|               +-- [x1701]-[x1706]        Calc. node, group 14 (each node has 32 cores (2 CPUs)) paral. env.=x32 queue=x17.q
|
|
+- [sb100] -----+ 133.1.116.165 Login node for other groups
                |
                +-- [sb101]-[sb120]        Calc. node, group 7 (each node has 6 cores (1 CPU))