
Sun Grid Engine – how to send a job to specific hosts?

One way is to use queues: create a dedicated queue for each host in your SGE grid (using qconf -aq) and pass that queue name when you submit the job:

qsub -q <queue_name> $SGE_ROOT/examples/jobs/simple.sh

If you want to submit jobs to the grid from an application (C or Java), SGE supports a dedicated API: the Distributed Resource Management Application API (DRMAA). Here are some examples in C++ and Java that may help to figure this out. There are also the SGE DRMAA Javadocs, the drmaa package Javadocs and the general help, where the C library functions are listed in section 3. To specify a queue name, use the drmaa_set_attribute function as shown below:

drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, "-q queue_name", error, DRMAA_ERROR_STRING_BUFFER - 1);

Another way to route a job to a specific host is to specify request attributes in qsub: qsub -l <request_attr_name> (for a Java example, see below). You may also add a "soft" or "hard" resource requirement modifier (for more, see the SGE glossary: hard/soft resource requirements).

drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, "-hard -q queue_name", error, DRMAA_ERROR_STRING_BUFFER - 1);

Here is a listing of a DRMAA C++ example that runs a job on a specified queue. To build it you may use the simple bash script listed below. It works on Solaris 10; for Linux I suppose it is better to use the g++ compiler:

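INC=-I$SGE_ROOT/include
LIB=-L$SGE_ROOT/lib/sol-x86/
LIB_NAME=-ldrmaa
# Libraries are listed after the source file so that order-sensitive linkers can resolve the DRMAA symbols
cc $INC sge_drmaa_test_example.c $LIB $LIB_NAME -o sge_drmaa_test_example.out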

If you get the error below when you run this example:

ld.so.1: sge_drmaa_test_example.out: fatal: libdrmaa.so.1.0: open failed: No such file or directory
Killed

please check the LD_LIBRARY_PATH environment variable. On Solaris 10 x86 it should be set like this:

export LD_LIBRARY_PATH=$SGE_ROOT/lib/sol-x86/

The Java implementation also uses DRMAA, but it looks a little different from C++: instead of drmaa_set_attribute you call JobTemplate::setNativeSpecification:

job_template.setNativeSpecification("-hard -q " + queue_name);
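
The native specification is an ordinary string of qsub options, so it can be assembled and checked before it reaches DRMAA. Here is a minimal sketch of that idea. The QueueSpec class and its forQueue method are hypothetical helpers made up for illustration; they are not part of the DRMAA API, and the resulting string is only meant to be handed to setNativeSpecification.

```java
/** Hypothetical helper (not part of DRMAA): builds the qsub options for a queue request. */
final class QueueSpec {

    /** Returns "-hard -q <queue>" or "-soft -q <queue>". */
    static String forQueue(boolean hard, String queueName) {
        if (queueName == null || queueName.trim().isEmpty()) {
            throw new IllegalArgumentException("queue name must not be empty");
        }
        String modifier = hard ? "-hard" : "-soft";
        return modifier + " -q " + queueName.trim();
    }

    public static void main(String[] args) {
        // Each printed string is what would be passed to setNativeSpecification(...)
        System.out.println(forQueue(true, "queue_name"));
        System.out.println(forQueue(false, "queue_name"));
    }
}
```

Rejecting an empty queue name early keeps a malformed option string from being submitted to the grid.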

Another way to run a job on a particular host is to specify the hostname as a request attribute. It looks like this:

jt.setNativeSpecification("-l hostname=dev-host1");
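
The hostname request can be combined with the hard/soft modifier in the same way. The sketch below assumes a small helper of my own naming (HostSpec.forHost is hypothetical, not a DRMAA call). It only builds the option string and refuses host names containing characters that would split or corrupt it.

```java
/** Hypothetical helper (not part of DRMAA): builds a hostname resource request. */
final class HostSpec {

    /** Returns "-hard -l hostname=<host>" or "-soft -l hostname=<host>". */
    static String forHost(boolean hard, String hostName) {
        // Letters, digits, dots, dashes and underscores only; no spaces or commas
        if (hostName == null || !hostName.matches("[A-Za-z0-9._-]+")) {
            throw new IllegalArgumentException("unexpected host name: " + hostName);
        }
        return (hard ? "-hard" : "-soft") + " -l hostname=" + hostName;
    }

    public static void main(String[] args) {
        // The printed string is what would be passed to setNativeSpecification(...)
        System.out.println(forHost(true, "dev-host1"));
    }
}
```

With -soft the scheduler treats the host as a preference rather than a requirement, so the job may still be placed elsewhere.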

Here is the Java source for the SGE DRMAA example, or the Java DRMAA example archive: the zip contains the source file, an Eclipse project and compiled binaries. To create the jar you may use Eclipse export or run the following inside the bin folder:

jar cf SgeDrmaaJobRunner.jar net/bokov/sge/*.class

To run this jar (and run /tools/job.sh, which is already deployed on all execution hosts) on Solaris 10, I use this command:

java -cp $SGE_ROOT/lib/drmaa.jar:SgeDrmaaJobRunner.jar -Djava.library.path=$LD_LIBRARY_PATH net.bokov.sge.SgeDrmaaJobRunner soft host not_wait /tools/job.sh host2-dev-net
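
If the native library cannot be found, the Java run fails in much the same way as the C example, so it may be worth checking java.library.path first. The following is only a sketch: NativeLibCheck is a hypothetical class, and it assumes the Java binding loads a library named "drmaa" (libdrmaa.so on Solaris and Linux).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical pre-flight check (not part of DRMAA): looks for a library file on a path list. */
final class NativeLibCheck {

    /** Returns the first directory in pathList that contains fileName, if any. */
    static Optional<Path> findDirectory(String pathList, String fileName) {
        if (pathList == null || pathList.isEmpty()) {
            return Optional.empty();
        }
        for (String entry : pathList.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            Path dir = Paths.get(entry);
            if (Files.isRegularFile(dir.resolve(fileName))) {
                return Optional.of(dir);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: the binding loads a library called "drmaa"
        String fileName = System.mapLibraryName("drmaa");
        String pathList = System.getProperty("java.library.path");
        Optional<Path> dir = findDirectory(pathList, fileName);
        System.out.println(dir.isPresent()
                ? fileName + " found in " + dir.get()
                : fileName + " not found on java.library.path");
    }
}
```

Run it with the same -Djava.library.path value as the job runner to see which directory, if any, provides the library.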

You can also specify not just one queue name but a list of queue names as the parameter:

qsub -q queue_1,queue_2 $SGE_ROOT/examples/jobs/simple.sh

At least qsub allows this syntax :-)
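
On the command line the queue list has to be a single argument, so the names are joined with commas and no spaces. A minimal sketch of building such a list for a native specification follows. QueueList.toOption is a hypothetical helper, and whether a queue list behaves through DRMAA exactly as it does with qsub is an assumption that has not been verified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of DRMAA): builds a "-q" option from several queue names. */
final class QueueList {

    /** Joins queue names with commas and no spaces, e.g. "-q queue_1,queue_2". */
    static String toOption(List<String> queueNames) {
        List<String> cleaned = new ArrayList<>();
        for (String name : queueNames) {
            if (name != null && !name.trim().isEmpty()) {
                cleaned.add(name.trim()); // drop blanks and surrounding spaces
            }
        }
        if (cleaned.isEmpty()) {
            throw new IllegalArgumentException("at least one queue name is required");
        }
        return "-q " + String.join(",", cleaned);
    }

    public static void main(String[] args) {
        System.out.println(toOption(Arrays.asList("queue_1", " queue_2")));
    }
}
```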

