Velocity Scheduler:
How to Run a Serial Batch Job in Linux

This document provides step-by-step instructions to run a simple serial batch job on a CTC compute node.  The instructions are followed by a sample session demonstrating the same steps.

Instructions

1. The scheduler is installed on all CTC login nodes (e.g. linuxlogin1.tc.cornell.edu). Connect to one of these nodes using ssh, as documented in Accessing CTC Machines.

2. Run the following script before submitting your first batch job. You will not need to run it again unless the files it creates are inadvertently deleted or changed. The script creates ssh keys and an MPI file required for batch runs.

/ctc/tools/setup_ssh_mpd_linux.sh

Complete details on what the script does are in the Appendix of the document Important vsched Information for Linux.

3. Prepare a file named your_job.xml in the format shown here. All of the XML tags shown in the example are required. This file specifies the number of minutes, the number of nodes, and the type of job (batch or interactive), as well as:

    • what you want to run (e.g. your perl script)
    • where you want to run it (e.g. the v2linux dev pool of nodes)

<run>...</run> tags specify what you want to run on the nodes. It can be any script or executable.

<type>...</type> tags must contain either batch or interactive.

If the job type is interactive, the run tags may be left blank; anything specified in the run tags will be ignored.

<?xml version="1.0" ?>

<!-- Sample XML Job File -->

<job>
<nodes>1</nodes>
<minutes>10</minutes>
<type>batch</type>
<affiliation>v2linuxdev</affiliation>
<run>/bin/sh $HOME/test/MyJob.sh</run>
</job>
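
As a minimal sketch, a job file like the sample above could be generated from the shell with a here-document and then checked for the required tags. The helper names write_job_file and has_required_tags, the temporary output path, and the grep-based check are illustrative assumptions and are not part of vsched; the tag values are copied from the sample.

```shell
#!/bin/sh
# Hypothetical helper: write a batch job file like the sample above.
# Arguments: output path, nodes, minutes, pool name, run command.
write_job_file() {
    cat > "$1" <<EOF
<?xml version="1.0" ?>
<job>
<nodes>$2</nodes>
<minutes>$3</minutes>
<type>batch</type>
<affiliation>$4</affiliation>
<run>$5</run>
</job>
EOF
}

# Hypothetical check: succeed only if every required tag appears in the file.
has_required_tags() {
    for tag in nodes minutes type affiliation run; do
        grep -q "<$tag>.*</$tag>" "$1" || return 1
    done
}

job_file="${TMPDIR:-/tmp}/your_job.$$.xml"
# Single quotes keep $HOME literal, as it appears in the sample job file.
write_job_file "$job_file" 1 10 v2linuxdev '/bin/sh $HOME/test/MyJob.sh'
has_required_tags "$job_file" && echo "job file looks complete: $job_file"
```

The check only confirms that the tags are present; it does not validate their contents the way the scheduler might.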

4. The previous step stated that within the <run>...</run> tags you can specify any script or executable.  We have used a .sh script for this example.  In general, a batch script is used to:

  1. set up a temporary directory on the compute node for your files
  2. copy the files your job needs to the temporary directory
  3. run your program(s)
  4. copy the output files back to the CTC fileservers
  5. delete your files in the temporary directory on the compute node
  6. end the batch session

The MyJob.sh script used in this example follows these steps:

#!/bin/sh

# create a local directory on /tmp
mkdir -v /tmp/$USER

# copy the files
cp $HOME/test/karpc.exe /tmp/$USER/karpc.exe
cp $HOME/test/values /tmp/$USER/values

cd /tmp/$USER

# run the executable from local disk
./karpc.exe >&karpc.stdout

# Copy output files to your output folder
cp -f /tmp/$USER/karpc.std* $HOME/test

# delete all remaining files on /tmp/$USER
rm -r /tmp/$USER

# end the batch session
vsched -c
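
The staging pattern in the script above (create a scratch directory, copy in, run, copy out, clean up) can also be sketched in a more defensive form. This is an illustration under assumptions, not the CTC script: the function name stage_and_run, the output file name job.stdout, and the use of mktemp are hypothetical choices, and a stand-in command (cat values) takes the place of the real executable. The final step, ending the batch session with vsched, is left out because it needs the scheduler.

```shell
#!/bin/sh
# Sketch of the staging pattern with a stand-in program.
# Usage: stage_and_run SRC_DIR OUT_DIR COMMAND [ARGS...]
stage_and_run() {
    src_dir=$1
    out_dir=$2
    shift 2

    # 1. create a private scratch directory (mktemp avoids name clashes)
    work_dir=$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX") || return 1

    # 2. copy the input files to the scratch directory
    cp -R "$src_dir"/. "$work_dir"/ || { rm -rf "$work_dir"; return 1; }

    # 3. run the command from local disk, capturing stdout and stderr
    ( cd "$work_dir" && "$@" > job.stdout 2>&1 )
    status=$?

    # 4. copy the output back, 5. delete the scratch directory
    cp -f "$work_dir"/job.stdout "$out_dir"/
    rm -rf "$work_dir"
    return $status
}

# Demonstration with a stand-in for the real executable and input file.
demo_src=$(mktemp -d) && demo_out=$(mktemp -d)
echo 10 > "$demo_src/values"
stage_and_run "$demo_src" "$demo_out" cat values
cat "$demo_out/job.stdout"
```

Returning the command's exit status lets the caller tell a failed run from a successful one even though the scratch directory is removed either way.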

5. Submit the XML file from the command prompt. vsched returns the JobID. No files are read or copied until your job actually starts. The warning indicates that vsched did not have permission to verify the program you specified in the run tags.

    -sh-3.00$ vsched -submit MyJob.xml
    WARNING: Run statement cannot be verified
    47694
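
If a script needs the JobID for later commands, it can be picked out of the submission output. The following is a minimal sketch that assumes the JobID is the only line made up entirely of digits, as in the output shown above; the helper name extract_job_id is hypothetical, and the sample text stands in for real vsched output.

```shell
#!/bin/sh
# Hypothetical helper: print the last input line that consists only of digits.
extract_job_id() {
    grep -E '^[0-9]+$' | tail -n 1
}

# Sample text copied from the submission output shown above.
sample_output='WARNING: Run statement cannot be verified
47694'

job_id=$(printf '%s\n' "$sample_output" | extract_job_id)
echo "JobID: $job_id"
```

On a login node the same helper could presumably be fed from vsched -submit MyJob.xml through a pipe. Whether the warning is written to standard output or standard error is not stated here, so that detail would need to be checked first.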

6. Your job should now be either running or waiting in the queue to start. At this point you can simply wait for it to finish, or you can view the queue:

-sh-3.00$ vsched -q

or cancel your job:

-sh-3.00$ vsched -c <JobID>

or use ssh to log into the node where your job is running, either to check that the job is running properly or to issue commands.
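
To wait for a job from a script, the queue listing can be searched for its JobID. The sketch below assumes that the first field of each queue line is the JobID, which matches the queue line in the sample session later in this document but is otherwise unconfirmed; the helper name job_in_queue is hypothetical, and a copied sample line stands in for live vsched -q output.

```shell
#!/bin/sh
# Hypothetical helper: read a queue listing on stdin and succeed if any
# line's first field equals the given JobID (assumed column layout).
job_in_queue() {
    awk -v id="$1" '$1 == id { found = 1 } END { exit !found }'
}

# Queue line copied from the sample session in this document.
sample_queue='47819 susan 1 0:10 B C 14:06 01/17 VII0001 v2linuxdev'

if printf '%s\n' "$sample_queue" | job_in_queue 47819; then
    echo "job 47819 is listed"
fi
```

Matching on the first field rather than using a plain grep avoids false hits when the same digits appear elsewhere on a line, for example in a time or date column.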

Example

This sample session begins after you have logged into a CTC Linux login node.

-sh-3.00$ cd $HOME

-sh-3.00$ cat /ctc/tools/velocity.pub >> $HOME/.ssh/authorized_keys
-sh-3.00$ chmod 700 .ssh
-sh-3.00$ chmod 600 .ssh/authorized_keys

-sh-3.00$ cd test

-sh-3.00$ dir -lt

total 17

-rwxr-xr-x 1 susan techies 441 Jan 17 13:44 MyJob.sh
-rwxr-xr-x 1 susan techies 440 Jan 17 13:39 MyJob.sh~
-rwxr-xr-x 1 susan techies 14820 Jan 17 12:02 karpc.exe
-rwxr-xr-x 1 susan techies 10 Jan 17 11:52 values
-rwxr-xr-x 1 susan techies 211 Jan 17 10:08 MyJob.xml
-rwxr-xr-x 1 susan techies 222 Jan 17 10:02 MyJob.xml~

-sh-3.00$ cat MyJob.xml

<?xml version="1.0" ?>
<!-- Sample XML Job File -->
<job>
<nodes>1</nodes>
<minutes>10</minutes>
<type>batch</type>
<affiliation>v2linuxdev</affiliation>
<run>/bin/sh $HOME/test/MyJob.sh</run>
</job>

-sh-3.00$ cat MyJob.sh

#!/bin/sh

# create a local directory on /tmp
mkdir -v /tmp/$USER

# copy the files

cp $HOME/test/karpc.exe /tmp/$USER/karpc.exe
cp $HOME/test/values /tmp/$USER/values

cd /tmp/$USER

# run the executable from local disk
./karpc.exe >&karpc.stdout

# Copy output files to your output folder
# This includes all output files from your
cp -f /tmp/$USER/karpc.std* $HOME/test

# delete all remaining files on /tmp/$USER
rm -r /tmp/$USER

vsched -c

-sh-3.00$ vsched -submit MyJob.xml

WARNING: Run statement cannot be verified
47819

-sh-3.00$ vsched -q | grep v2linuxdev

47819 susan 1 0:10 B C 14:06 01/17 VII0001 v2linuxdev

-sh-3.00$ dir -lt

total 18
-rw-r--r-- 1 susan techies 153 Jan 17 13:56 karpc.stdout
-rwxr-xr-x 1 susan techies 441 Jan 17 13:44 MyJob.sh
-rwxr-xr-x 1 susan techies 440 Jan 17 13:39 MyJob.sh~
-rwxr-xr-x 1 susan techies 14820 Jan 17 12:02 karpc.exe
-rwxr-xr-x 1 susan techies 10 Jan 17 11:52 values
-rwxr-xr-x 1 susan techies 211 Jan 17 10:08 MyJob.xml
-rwxr-xr-x 1 susan techies 222 Jan 17 10:02 MyJob.xml~

-sh-3.00$ cat karpc.stdout

Approximation interval is 10
sum, err = 3.14243, 8.333314e-04
Approximation interval is 100
sum, err = 3.14160, 8.333333e-06
Approximation interval is 0
-sh-3.00$