TORQUE Architecture
A TORQUE cluster consists of one head node and many compute nodes. The head node runs the pbs_server daemon and the compute nodes run the pbs_mom daemon. Client commands for submitting and managing jobs can be installed on any host (including hosts not running pbs_server or pbs_mom).
The head node also runs a scheduler daemon. The scheduler interacts with pbs_server to make local policy decisions for resource usage and to allocate nodes to jobs. A simple FIFO scheduler, and code to construct more advanced schedulers, are provided in the TORQUE source distribution. Most TORQUE users choose a packaged, advanced scheduler such as Maui or Moab.
Users submit jobs to pbs_server using the qsub command. When pbs_server receives a new job, it informs the scheduler. When the scheduler finds nodes for the job, it sends instructions to run the job with the node list to pbs_server. Then, pbs_server sends the new job to the first node in the node list and instructs it to launch the job. This node is designated the execution host and is called Mother Superior. Other nodes in a job are called sister moms.
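The role assignment described above can be illustrated with a small shell sketch. It is not part of TORQUE: the function name assign_roles and the node names are hypothetical, and the sketch only labels the hosts of a node list the way the text describes (first host is Mother Superior, the rest are sister moms).

```shell
#!/bin/sh
# Illustrative only: label the hosts in a scheduler-provided node list.
# "assign_roles" and the node names below are hypothetical, not TORQUE commands.
assign_roles() {
    first=1
    for node in "$@"; do
        if [ "$first" -eq 1 ]; then
            # The first node in the list becomes the execution host.
            echo "$node mother_superior"
            first=0
        else
            echo "$node sister_mom"
        fi
    done
}

assign_roles node01 node02 node03
```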
Build the distribution on the machine that will act as the TORQUE server - the machine which monitors and controls all compute nodes by running the pbs_server daemon.
The built distribution package works only on compute nodes of a similar architecture. Nodes with different architecture must have the installation package built on them individually.
- Download the TORQUE distribution file from the Internet.
- Extract the packaged file and navigate to the unpackaged directory.
# tar -xzvf torque-2.3.4.tar.gz
# cd torque-2.3.4/
- Configure the package.
(For information on customizing the build at configure time, see the configure options list.)
# ./configure
- Run make and make install.
(TORQUE must be installed by a root user.)
# make
# make install
After installation, verify that your PATH environment variable includes /usr/local/bin/ and /usr/local/sbin/.
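One way to perform this check is sketched below. The helper name path_contains is an assumption made for this example; it simply tests whether a directory appears as an entry in a colon-separated path string, ignoring a trailing slash.

```shell
#!/bin/sh
# path_contains DIR [PATHSTRING]: succeed if DIR is an entry of PATHSTRING
# (defaults to $PATH). A trailing slash on either side is ignored.
# "path_contains" is a hypothetical helper, not a TORQUE tool.
path_contains() {
    dir=${1%/}
    list=${2-$PATH}
    old_ifs=$IFS
    IFS=:
    for entry in $list; do
        if [ "${entry%/}" = "$dir" ]; then
            IFS=$old_ifs
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

for d in /usr/local/bin /usr/local/sbin; do
    if path_contains "$d"; then
        echo "$d is in PATH"
    else
        echo "$d is missing; consider: export PATH=\"\$PATH:$d\""
    fi
done
```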
By default, make install creates a directory at /var/spool/torque. This directory is referred to as $TORQUEHOME. $TORQUEHOME has several sub-directories, including server_priv/, server_logs/, mom_priv/, mom_logs/, and other directories used in the configuration and running of TORQUE. TORQUE 2.0p2 and later include a torque.spec file for building your own RPMs. You can also use the checkinstall program to create your own RPM, tgz, or deb package.
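To confirm that the installation created the expected layout, a check along the following lines may help. It is a sketch: the function name missing_torque_dirs is hypothetical, and it looks only for the four sub-directories named above, not for every directory TORQUE may create.

```shell
#!/bin/sh
# missing_torque_dirs HOME_DIR: print each expected sub-directory that is absent.
# Only the four sub-directories named in the text are checked (an assumption
# of this sketch; a real installation contains more).
missing_torque_dirs() {
    home=$1
    for sub in server_priv server_logs mom_priv mom_logs; do
        [ -d "$home/$sub" ] || echo "$sub"
    done
}

# Fall back to the default location when TORQUEHOME is not set.
TORQUEHOME=${TORQUEHOME:-/var/spool/torque}
missing=$(missing_torque_dirs "$TORQUEHOME")
if [ -z "$missing" ]; then
    echo "All expected sub-directories exist under $TORQUEHOME"
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing under $TORQUEHOME:" $missing
fi
```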
In Compute Nodes
Use the Cluster Resources tpackage system to create self-extracting tarballs which can be distributed and installed on compute nodes. The tpackages are customizable. See the INSTALL file for additional options and features.
To create tpackages:
1. Configure and make as normal, and then run make packages.
# make packages
Building ./torque-package-clients-linux-i686.sh ...
Building ./torque-package-mom-linux-i686.sh ...
Building ./torque-package-server-linux-i686.sh ...
Building ./torque-package-gui-linux-i686.sh ...
Building ./torque-package-devel-linux-i686.sh ...
Done.
The package files are self-extracting packages that can be copied and executed on your production machines. Use --help for options.
2. Copy the desired packages to a shared location.
# cp torque-package-mom-linux-i686.sh /shared/storage/
# cp torque-package-clients-linux-i686.sh /shared/storage/
3. Install the tpackages on the compute nodes.
Cluster Resources recommends you use a distributed shell to install tpackages on remote systems. The command is dsh -f <FILENAME> <COMMAND>, where <FILENAME> is a file that lists, one per line, the hosts on which you want to run the command. Set up SSH keys so you are not required to supply a password for each host.
The only required package for the compute nodes is mom-linux. Additional packages are recommended so you can use client commands and submit jobs from compute nodes.
# dsh -f <FILENAME> torque-package-mom-linux-i686.sh --install
# dsh -f <FILENAME> torque-package-clients-linux-i686.sh --install
(You can use a tool like xCAT instead of dsh.)
1. Copy the tpackage to the nodes.
# prcp torque-package-linux-i686.sh noderange:/destinationdirectory/
2. Install the tpackage.
# psh noderange /tmp/torque-package-linux-i686.sh --install
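If neither dsh nor xCAT is available, the same per-host install can be approximated with a plain ssh loop. The sketch below is an assumption-laden illustration, not a Cluster Resources tool: install_on_hosts and the DRY_RUN variable are hypothetical names, skipping blank lines and # comments in the host file is a convenience of this sketch, and passwordless SSH keys are assumed. By default it only prints the commands it would run.

```shell
#!/bin/sh
# install_on_hosts HOSTFILE PACKAGE: run "PACKAGE --install" on every host
# listed in HOSTFILE (one host per line; blank lines and # comments skipped).
# With DRY_RUN=1 (the default here) the ssh commands are printed, not executed.
# "install_on_hosts" and DRY_RUN are hypothetical names for this sketch.
install_on_hosts() {
    hostfile=$1
    package=$2
    while IFS= read -r host || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $host $package --install"
        else
            # -n keeps ssh from consuming the rest of the host file on stdin.
            ssh -n "$host" "$package --install"
        fi
    done < "$hostfile"
}
```

For example, running install_on_hosts with a host file and /shared/storage/torque-package-mom-linux-i686.sh prints one ssh line per host; setting DRY_RUN=0 executes them instead.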
Alternatively, users with RPM-based Linux distributions can build RPMs from the source tarball in two ways.
- To use the default settings, use the rpmbuild command.
# rpmbuild -ta torque-2.3.4.tar.gz
- If configure options are required, untar and build as normal, and then use the make rpms command instead.
Enabling TORQUE as a service (optional)
The method for enabling TORQUE as a service is dependent on the Linux variant you are using. Startup scripts are provided in the contrib/init.d/ directory of the source package.
- Red Hat (as root)
# cp contrib/init.d/pbs_mom /etc/init.d/pbs_mom
# chkconfig --add pbs_mom
- SUSE (as root)
# cp contrib/init.d/suse.pbs_mom /etc/init.d/pbs_mom
# insserv -d pbs_mom
- Debian (as root)
# cp contrib/init.d/debian.pbs_mom /etc/init.d/pbs_mom
# update-rc.d pbs_mom defaults
These options can be added to the self-extracting packages. For more details, see the INSTALL file.
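The three per-distribution recipes above differ only in the startup script copied and the registration command run, so they can be captured in one lookup. The following is a sketch; init_script_for and the family labels redhat, suse, and debian are names chosen for this example, and detecting the family on a given host is left to the reader.

```shell
#!/bin/sh
# init_script_for FAMILY: print "SCRIPT|REGISTER_COMMAND" for one distribution
# family, using the script paths and commands listed in the text.
# The function name and the family labels are hypothetical.
init_script_for() {
    case $1 in
        redhat) echo "contrib/init.d/pbs_mom|chkconfig --add pbs_mom" ;;
        suse)   echo "contrib/init.d/suse.pbs_mom|insserv -d pbs_mom" ;;
        debian) echo "contrib/init.d/debian.pbs_mom|update-rc.d pbs_mom defaults" ;;
        *)      echo "unsupported distribution family: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the two commands for a Debian host.
entry=$(init_script_for debian) && {
    echo "cp ${entry%%|*} /etc/init.d/pbs_mom"
    echo "${entry#*|}"
}
```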