Euler

Overview of EULER Cluster

euler.lbl.gov is a computing cluster consisting of one frontend node and nine dedicated compute nodes. The cluster runs Rocks Cluster 6.6, a collection of cluster management tools built on top of CentOS 6.6 (a standard Red Hat Enterprise Linux rebuild, like Scientific Linux). Job scheduling is handled by Sun Grid Engine (SGE) 6.2u4.

The first time you log in to the system you may be asked to generate an SSH key; the process starts automatically and you will be prompted for a passphrase. This key is used only to migrate jobs from the frontend to the various compute nodes. It is recommended that you enter an empty passphrase for this key; otherwise you will need an active ssh-agent process running at all times or your batch jobs will not start. There are no security implications in leaving the passphrase for this particular key empty (as long as you do not distribute the private key residing in ~/.ssh, of course).


Hardware

  • The frontend has a 2.1GHz Intel Xeon E5-2620 processor (thirty-two cores) with 64GB of RAM. It has a 4TB disk, plus a 23TB RAID array.
  • Three compute nodes have two 2.5GHz quad-core Xeon processors (eight cores) with 16GB of RAM and a 750GB local disk.
  • Six compute nodes have two 2.4GHz six-core Xeon processors (twelve cores) with 24GB of RAM and a 1TB local disk.
  • Only the frontend is directly connected to the outside network. The compute nodes are interconnected with a dedicated Gigabit ethernet network.

Storage Configuration

  • /home -- The user home directories reside on the frontend's 23TB disk and are NFS exported to all the compute nodes.
  • /home/data3 -- The frontend's 23TB disk array is NFS exported to the compute nodes at /home/data3. Users can create subdirectories under /home/data3 as on a standard scratch disk and the contents will be accessible from any compute node.
  • /home/data1 -- A large (5.4TB) network-mounted disk with a 4Gbit/sec connection to the cluster. The disk is configured as a standard scratch disk, just like /home/data3.
  • /home/data2 -- A large (4.6TB) network-mounted disk with a 4Gbit/sec connection to the cluster. The disk is configured as a standard scratch disk, just like /home/data3.
  • /home/data -- For historical reasons, /home/data3/data is just a link to /home/data.
  • /state/partition1 -- Each compute node has a local working area mounted at /state/partition1. These areas are local to each node and are not visible to the other nodes. If your job is disk intensive, you may wish to have your jobscript copy files onto the local work space at the beginning of the job. If your job writes files into the local work space, be sure to have your jobscript move them out when it is done. This is for two reasons. First, you will not necessarily know what node your job ran on, so it may be difficult to find the files later. Second, it leaves the node clean for the next user.
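
As an illustration, a jobscript that stages data through the local work area might look like the following sketch. The paths, file names, and program name are placeholders, not a recipe specific to this cluster; $JOB_ID is set by SGE for each job.

#!/bin/bash
#$ -S /bin/bash
# make a private work area on the node's local disk
WORKDIR=/state/partition1/$USER/$JOB_ID
mkdir -p $WORKDIR
# stage input from shared storage onto the local disk
cp /home/data3/$USER/input.dat $WORKDIR/
cd $WORKDIR
# run the job (myanalysis, input.dat and output.dat are placeholders)
$HOME/myanalysis input.dat output.dat
# move the results back to shared storage and leave the node clean
mv output.dat /home/data3/$USER/
cd /
rm -rf $WORKDIR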

Running Jobs -- Job Control

The policy regarding usage of the cluster is currently evolving. We will find out what works best and implement it based on user needs and preferences. For the moment, here is a rough outline. Management has attempted to accommodate the user comments provided so far.

Interactive Jobs

You may run interactive jobs on the frontend. As long as the job is not too CPU intensive, it should not disrupt the cluster. Tasks such as editing, building, and debugging code are most simply accomplished by logging into euler.lbl.gov and working directly on the frontend.

If you have a CPU-intensive interactive job, you can use the command qlogin. This will select a free compute node and log you into it. This is a good procedure for programs like ROOT or for interactive jobs that require occasional user input.
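
A typical qlogin session might look like this (the macro name is only a placeholder):

qlogin
# SGE picks a free compute node (e.g. compute-2-3) and logs you into it
root -l myanalysis.C
# ... work interactively ...
exit            # return to the frontend when done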

If you must bypass the node selection provided by qlogin, it is possible to ssh directly into the compute nodes. They are named compute-1-0 through compute-1-2 and compute-2-0 through compute-2-5, and are accessible only from the frontend (euler.lbl.gov). This is not recommended unless you have a good reason.

Batch Jobs

After reading this section, please visit the new examples page.

  • Batch jobs are submitted to the cluster via the command qsub. There are a total of 110 slots.
  • Currently the cluster is partitioned into two queues: long and short.
  • You can submit jobs to these queues with 'qsub -q long myscript.sh' or 'qsub -q short myscript.sh'.



Some examples:

  • qsub myscript.sh will start the job myscript.sh on any free compute node.
  • qsub -q short myscript.sh will start the job on a free node assigned to the short queue; qsub -q long myscript.sh does the same for the long queue.

The command passed to qsub in this form must be a script (not a binary). If you need to pass a binary directly to qsub, use the syntax qsub -q QUEUENAME -b y myprogram. You can see the status of the batch queues with qstat -f, and suppress the listing of empty queues with qstat -f -ne.
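
For reference, a minimal job script might look like the sketch below. The #$ lines are standard SGE directives embedded in the script; the queue choice, output file names, and program name are illustrative.

#!/bin/bash
#$ -S /bin/bash      # interpret the script with bash on the execution node
#$ -cwd              # run in the directory the job was submitted from
#$ -q short          # request a queue (short or long)
#$ -o myjob.out      # standard output file
#$ -e myjob.err      # standard error file
./myprogram          # placeholder for your executable or commands

Submit it with qsub myscript.sh and check its progress with qstat.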

There is also a graphical interface to the batch system. Run qmon on the frontend to start it. It may or may not be of interest.

Management is in the process of defining additional batch queues to allow finer-grained control (days versus nights versus weekends, high priority versus low priority). This is a work in progress. User feedback is welcome.

Important note: The batch system does NOT automatically "parallelize" your job scripts. One qsub invocation will start your job on a single processor. To take advantage of the multiple processors available in the cluster, you must break your job up into pieces and submit each piece via qsub.
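
For example, a task that processes many independent input files can be split into one batch job per file with a small loop on the frontend (a sketch; the script and file names are placeholders):

for i in $(seq 1 10); do
    qsub -q short myscript.sh input_$i.dat    # each qsub call starts one single-processor job
done

Inside myscript.sh the file name passed on the qsub command line is available as $1.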

The exception is code that is actually written for a parallel environment, specifically using MPI. In that case you can start it on multiple processors with the mpirun command. See the MPI documentation to find out how to do this.
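
Under SGE an MPI job is typically submitted through a parallel environment. The sketch below assumes a parallel environment named 'mpi' and an executable called my_mpi_program; check which parallel environments are actually configured here with qconf -spl before relying on that name.

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -pe mpi 8                          # request 8 slots in the (assumed) "mpi" parallel environment
mpirun -np $NSLOTS ./my_mpi_program   # NSLOTS is set by SGE to the number of slots granted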

Software Tools

All editing and compiling tools should be available on the frontend. If you need something that is not installed, inform Jeff Anderson.

The processors in the cluster are 64-bit, as is the Linux distribution running on them. To take full advantage of this you should compile your applications directly on the cluster. Applications built on the regular Linux workstations will be 32-bit; they should run on the cluster, but performance will likely be better if you rebuild them on the cluster whenever possible.

Various HEP software is installed on all nodes under /share/apps. All the versions installed there are 64-bit. This could cause problems if you have dynamically linked 32-bit applications imported from elsewhere; the usual solution is to rebuild your application on the cluster.

Python

A special note about Python is in order. Several versions of Python are installed; 2.4 is the default.

You can select your Python version using the 'module' command. For example,

module avail

shows the available versions, and

module load python/2.7.2
module unload python
module switch python/2.6.6

load, unload, or switch between them.

You can see all the software versions managed by module with

module avail
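
Module commands can also go inside a batch script so the job runs with the intended Python version. A sketch (the script name is a placeholder; if the module command is not defined in batch shells, you may first need to source the modules initialization script, commonly /etc/profile.d/modules.sh):

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
module load python/2.7.2    # select the Python version for this job
python myscript.py          # placeholder for your Python program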

In Progress

Please visit this page regularly to find out the latest status. All requested software has now been installed. The initial batch queue structure has been set up according to user comments. Further feedback about how it is working is welcome.