Brief notes on how to submit jobs to a Condor pool using GridSAM

Introduction

These notes are intended to supplement the GridSAM documentation. The best place to start is the Quick Start Guide, following on from the sections on GridSAM installation. The Quick Start Guide doesn't cover the use of Condor, which is dealt with in the Deployment Guide. I didn't test anything in the "Advance Job Submission" section of the Quick Start Guide, which covers transferring input and output files via FTP or HTTP.

Job descriptions

The Job Submission Description Language (JSDL) document that I used to test GridSAM is shown in the attachment Test1.xml.

The JSDL specification can be downloaded from the project Web site (https://forge.gridforum.org/projects/jsdl-wg/document/draft-ggf-jsdl-spec/en/28). Note that the WorkingDirectory line in the JSDL document is commented out, because I did not manage to get this feature to work. That is why full paths are given for all the files involved in the job. The executable file, test, is a C program that writes out all the character strings that it finds in the command line arguments and in the standard input stream. Test1.xml specifies that file test1.in is sent to the standard input stream, and that the standard output stream is written to test1.out.
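
For readers who don't have the test executable, a stand-in that behaves as described above might look like the following C sketch. This is not the original program. The function names echo_args and echo_stream, the "arg:" and "stdin:" output prefixes, the 255-character limit per string, and the use of whitespace to separate strings in the input stream are all assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write each command line argument on its own line.
 * argv[0] (the program name) is skipped.
 * Returns the number of arguments written. */
int echo_args(int argc, char **argv, FILE *out)
{
    int count = 0;
    for (int i = 1; i < argc; i++) {
        fprintf(out, "arg: %s\n", argv[i]);
        count++;
    }
    return count;
}

/* Hypothetical helper: read whitespace-separated strings from `in`
 * and write each one on its own line.
 * Returns the number of strings written. */
int echo_stream(FILE *in, FILE *out)
{
    char word[256];
    int count = 0;
    /* %255s leaves room for the terminating NUL, so an over-long
     * string is split into pieces instead of overflowing the buffer. */
    while (fscanf(in, "%255s", word) == 1) {
        fprintf(out, "stdin: %s\n", word);
        count++;
    }
    return count;
}
```

A main function would call echo_args(argc, argv, stdout) followed by echo_stream(stdin, stdout). Run under the job described in Test1.xml, the strings read from test1.in would then appear in test1.out after any command line arguments.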

Submitting jobs to a Condor pool

The default launching mechanism for GridSAM is fork, where the job is simply launched as a process on the machine where the job was submitted. To change the launch mechanism to Condor, the file jobmanager.xml must be changed. This configuration file can be found in the following subdirectory of the OMII Server home directory:

jakarta-tomcat-5.0.25/webapps/gridsam/WEB-INF/classes

Sample files for all the launching mechanisms supported by GridSAM can also be found in this directory. The original version of the sample jobmanager file for Condor, jobmanager-condor.xml, has an important omission: it lacks the line specifying the path to the spooler directory used by GridSAM during the execution of the job. The spooler directory is described in the Condor section of the GridSAM Deployment Guide (http://gridsam.sourceforge.net/1.1/deploymentguide/condor.html). The jobmanager file I used to test GridSAM with Condor is given in the attachment jobmanager-condor.xml, which has the spooler directory line added. To use this file, copy it to a file named jobmanager.xml in the directory given above.

Note that each time jobmanager.xml is changed, the OMII server must be restarted by running the stopomii.sh and startomii.sh scripts. To submit a job to a remote Condor pool, an extra section relating to ssh must be added to jobmanager.xml, as in the attachment jobmanager-condor_ssh.xml.

Note that all of the paths (except the path to the private ssh key file at the bottom) now refer to the remote Condor submit host, which is called Machine C in the diagrams in the GridSAM documentation. Before these tests were carried out, my public ssh key had already been added to ~/.ssh/authorized_keys on the remote Condor submit machine, enabling me (or my GridSAM processes) to log onto that machine without entering a password.

I should admit that this remote Condor submission test didn't quite work. The gridsam-status command showed an error relating to the transfer of the executable file to the remote machine, so the job was never actually submitted to the remote Condor pool. I did manage to run a job on the remote machine using ssh only, by using the jobmanager file in attachment jobmanager-ssh.xml, which has the same ssh settings as jobmanager-condor_ssh.xml.

-- DanBretherton - 24 Feb 2006

Topic attachments

| Attachment | Size | Date | Who | Comment |
| Test1.xml | 0.6 K | 24 Feb 2006 - 18:42 | DanBretherton | JSDL document to run C program test |
| jobmanager-condor.xml | 1.8 K | 24 Feb 2006 - 18:35 | DanBretherton | jobmanager file to submit on local Condor submit machine |
| jobmanager-condor_ssh.xml | 1.8 K | 27 Feb 2006 - 15:07 | DanBretherton | jobmanager file to submit on remote Condor submit machine |
| jobmanager-ssh.xml | 1.3 K | 27 Feb 2006 - 15:13 | DanBretherton | jobmanager file to submit on remote machine using ssh only |
Topic revision: r3 - 27 Feb 2006 - 15:13:56 - DanBretherton