Documentation/Policies and procedures for cluster use

Policies and procedures for cluster use

Please follow these common-sense practices when submitting jobs to the clusters.

Suggested practices

When you run jobs, check them regularly to see:

  • How much memory your jobs are using.
  • How much time they are taking to run.
  • How much disk space they are using (for writing files).
  • How many processors you are using at one time.
  • How many jobs you have waiting in the queue.
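
One way to keep an eye on the last item is to count your jobs by state. The sketch below is illustrative only: it assumes the default tabular layout printed by the Grid Engine qstat command, in which the fifth column holds the job state (for example r for running, qw for queued and waiting). The function name count_job_states and the sample text are hypothetical and not part of any cluster tool.

```python
from collections import Counter


def count_job_states(qstat_output: str) -> Counter:
    """Count jobs per state in qstat-style tabular text.

    Assumes the fifth whitespace-separated column is the job state.
    """
    counts = Counter()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job rows start with a numeric job ID; the header and the
        # dashed separator line do not, so they are skipped.
        if len(fields) >= 5 and fields[0].isdigit():
            counts[fields[4]] += 1
    return counts


# Made-up sample text for illustration; real output may differ in detail.
SAMPLE = """\
job-ID  prior   name       user     state submit/start at     queue          slots ja-task-ID
-----------------------------------------------------------------------------------------------
   101 0.55500 align.sh   alice    r     01/15/2024 10:02:11 all.q@node01   1
   102 0.55500 align.sh   alice    qw    01/15/2024 10:02:15                1
   103 0.55500 sort.sh    alice    qw    01/15/2024 10:03:40                1
"""

if __name__ == "__main__":
    states = count_job_states(SAMPLE)
    print("running:", states["r"], "waiting:", states["qw"])
```

On a machine with Grid Engine installed, the input text could instead be captured from qstat -u followed by your user name, for example with subprocess.run from the Python standard library.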

Rules

Failure to abide by these rules may result in disciplinary action.

  • Do not store files on the cluster node disks. Clean up any temporary files you write there. These disks are temporary storage and are overwritten when a cluster node is rebooted.
  • Do not run multiple computationally intensive tasks at the same time under a single job, unless the job is running in a parallel environment (e.g. mpich or smp).
  • Do not run jobs on the head node or login nodes. These nodes are only for submitting jobs. Test and run jobs on the cluster nodes. After you submit a job to the cluster, check that it is running on a cluster node and not on the head node.
  • You can log in to individual backend cluster nodes from the head node using qrsh or qlogin. See the submit(1) man page and the Sun Grid Engine documentation (http://gaia.c2b2.columbia.edu/roll-documentation/sge/4.2/) for information on logging in and submitting jobs to the cluster. Never access the backend nodes by any other means.
  • Do not attempt to circumvent security measures, restrictions, or resource allotments.
  • This is a shared-use system. Everyone must respect other users and their right to use the system. Any attempt to "hog" resources or to prevent other users from fair use of the system will not be tolerated.
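
Two of the rules above (clean up temporary files, and do not run work on the head or login nodes) can be built into a job script. The following is a minimal sketch, not an official template: the host names in RESTRICTED_HOSTS are placeholders that must be replaced with your site's actual head and login node names, and the helper names is_restricted_host and run_with_scratch are hypothetical.

```python
import os
import shutil
import socket
import tempfile

# Placeholder host names (an assumption): replace with the real
# head and login node names for your cluster.
RESTRICTED_HOSTS = {"head", "login1"}


def is_restricted_host(hostname, restricted=RESTRICTED_HOSTS):
    """Return True if the short host name (before the first dot) is restricted."""
    return hostname.split(".")[0].lower() in restricted


def run_with_scratch(task):
    """Run task(scratch_dir) in a fresh temporary directory.

    The directory is deleted afterwards, even if the task raises.
    """
    scratch = tempfile.mkdtemp(prefix="job_scratch_")
    try:
        return task(scratch)
    finally:
        shutil.rmtree(scratch, ignore_errors=True)


def example_task(scratch):
    """Write an intermediate file to scratch space and return its size in bytes."""
    path = os.path.join(scratch, "partial.txt")
    with open(path, "w") as fh:
        fh.write("intermediate data\n")
    return os.path.getsize(path)


def main():
    # Stop early rather than doing heavy work on a head or login node.
    if is_restricted_host(socket.gethostname()):
        raise SystemExit("Refusing to run on a head or login node; submit this as a job.")
    print("bytes written:", run_with_scratch(example_task))


if __name__ == "__main__":
    main()
```

Because tempfile.mkdtemp honours the TMPDIR environment variable, the scratch directory is created wherever the job's environment points temporary files. Results that need to be kept should be copied to permanent storage before the task returns.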