
Verifying your HTCondor installation

Basic Quick Test

This test checks basic HTCondor functionality: proper credentials, user account privileges, and assigned hostnames / IP addresses. It should complete successfully before you run the extended test. The test requires two ASCII text files:

Create a file “test.bat” with the following content:

CODE
REM Simple test file to test batching in condor
Echo This is just the %2 %1

Create a text file “simple.sub” with the following content:

CODE
Universe   = vanilla
Executable = test.bat
Arguments  = test first
Log        = simple.log
Output     = simple.out
Error      = simple.error
# OS requirements 
Requirements = (OpSys == "Windows" || OpSys == "WINNT61" ) && (Arch == "X86_64" || Arch == "INTEL") && (SlotID == 1)

# Add Ranking requirements to select machines to run this job based on rank
# By default we want the machine that is the fastest and with the most available memory
# Reduce the rank for the machine that matched last time but didn't finish the job
Rank = kflops + memory*1024 - (Machine =?= LastRemoteHost)*500000

# Be sure to copy files back and forth to the node (linux disables this by default)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

On_exit_hold = ( ExitCode != 0 )
On_exit_remove = (ExitBySignal == False) && (ExitCode == 0)
Queue

Put both files in a test folder, e.g. “C:\temp\CondorTest”. Open a Windows CMD shell, switch to the test directory and run “condor_submit simple.sub”. The submission should complete without any error messages. Check the simple.log, simple.out and simple.error files generated in the test directory. If none of these files reports an error, HTCondor is basically working. In case of errors, verify:

  • Whether the user credentials are valid (condor_store_cred query).

  • Whether multiple network interface cards (NICs) are available or activated. If so, specify the IP address of the preferred NIC in the condor_config file by adding:

    • NETWORK_INTERFACE = <IP address of desired interface>

  • The privileges of the run account. When HTCondor is about to start a job, the condor_starter daemon creates a “temporary” run account on the machine with a login name of condor-slot<X>, where <X> is the slot number of the condor_starter. This account is added to the Users group by default.
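
The check of the three result files described above can also be scripted. The following is a minimal Python sketch, not part of HTCondor. The function name check_basic_test is hypothetical, and the script assumes that the job log marks completion with a "Job terminated" event and that the job ran with the arguments from simple.sub, so that test.bat echoes "This is just the first test".

```python
from pathlib import Path

# Line that test.bat echoes when run with "Arguments = test first"
# (%1 = test, %2 = first).
EXPECTED_LINE = "This is just the first test"


def check_basic_test(test_dir):
    """Return a list of problems found in the result files (empty list = OK)."""
    test_dir = Path(test_dir)
    problems = []

    out_file = test_dir / "simple.out"
    if not out_file.is_file():
        problems.append("simple.out is missing (has the job finished?)")
    elif EXPECTED_LINE not in out_file.read_text(errors="replace"):
        problems.append("simple.out does not contain the expected echo line")

    # An absent or empty error file is treated as "no errors reported".
    err_file = test_dir / "simple.error"
    if err_file.is_file() and err_file.read_text(errors="replace").strip():
        problems.append("simple.error is not empty")

    # Assumption: the job log records completion with a "Job terminated" event.
    log_file = test_dir / "simple.log"
    if not log_file.is_file():
        problems.append("simple.log is missing (was the job submitted?)")
    elif "Job terminated" not in log_file.read_text(errors="replace"):
        problems.append("simple.log has no 'Job terminated' event yet")

    return problems


if __name__ == "__main__":
    import sys

    found = check_basic_test(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("\n".join(found) if found else "Basic test looks OK")
```

If the script is saved under a name of your choice, for example check_basic.py, it can be run from the test directory, or with the test directory as its only argument. An empty problem list only means the files look as expected; it does not replace reading the log when something goes wrong.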

Extended Test

Once the basic quick test passes, the HTCondor setup is validated to the extent that the submitter can talk to the pool manager and the nodes can pick up jobs. The next hurdle in a distributed computing setup is making sure the compute nodes can map network drives with the same permissions and paths as the submitter machine.

First, update “test.bat” as follows:

CODE
REM Simple test file to test batching in condor
Echo This is just the %2 %1

REM Mapping Share Drives with user and password
echo Map share drives
set USER=<enter user>
set PASSWORD=<enter password>

if not exist Y: net use Y: \\host\Data_Path %PASSWORD% /USER:%USER% /persistent:no

dir Y:\

Next, repeat the steps above and verify that the job runs without any errors and that the “simple.out” file contains a listing of the files found on the mapped drive.
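
Whether “simple.out” contains a listing of the mapped drive can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the helper names are hypothetical, the drive letter Y: is taken from the batch file above, and the "Directory of" header that it searches for is what the dir command prints on an English-language Windows installation (localized systems use different wording).

```python
import re
from pathlib import Path


def find_listed_directories(out_text):
    """Return the paths named in 'Directory of ...' header lines of dir output."""
    pattern = r"^[ \t]*Directory of (.+?)[ \t]*$"
    return re.findall(pattern, out_text, flags=re.MULTILINE)


def mapped_drive_listed(out_path, drive="Y:"):
    """True if the output file contains a dir listing for the given drive."""
    text = Path(out_path).read_text(errors="replace")
    listed = find_listed_directories(text)
    return any(path.upper().startswith(drive.upper()) for path in listed)
```

For example, mapped_drive_listed("simple.out") returns True only when a dir header for drive Y: is present. A False result means the drive was probably not mapped on the compute node, and the output and error files should then be inspected by hand.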

Validating the HTCondor setup with these two tests eliminates the majority of the problems users face when setting up a distributed job management system.
