asc:tutorials:2014october_connect
testRun submitDir
</code>
====== Lesson 5: Running a job on multiple cores ======
This example is not needed for ATLAS connect. If you still want to know how to run an ATLAS analysis job on several cores of your desktop,
look at [[asc:tutorials:...]].

====== Lesson 6: Using HTCondor and Tier2 ======
(In the Tier3 tutorial this topic is covered as "Lesson 5: Working on a Tier3 farm (Condor queue)".)
In this example we will use the HTCondor workload management system to run jobs on a Tier2 site.
Start from a new shell, set up the environment, and create the job script startJob.sh:
<file bash startJob.sh>
#!/bin/bash
export RUCIO_ACCOUNT=YOUR_CERN_USERNAME
export ATLAS_LOCAL_ROOT_BASE=/...
source ${ATLAS_LOCAL_ROOT_BASE}/...
localSetupFAX
source $AtlasSetup/...
export X509_USER_PROXY=x509up_u21183
unzip payload.zip
ls
rcSetup -u; rcSetup Base,...
rc find_packages
rc compile
cd MyAnalysis/...
rm submitDir

echo $1
sed -n $1,...
cp Inp_$1.txt inputdata.txt
cat inputdata.txt
echo "startdate $(date)"
testRun submitDir
echo "enddate $(date)"
</file>
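The sed line in startJob.sh is truncated on this page; its purpose is to cut this job's own slice out of the full input file list and save it as Inp_$1.txt. Below is a minimal sketch of that splitting, assuming equal-sized chunks; the file names and the way the job index and job count arrive (as $1 and $2) are illustrative reconstructions, not the tutorial's exact code:

```shell
#!/bin/bash
# Sketch of per-job input splitting (assumed reconstruction, not the original).
seq 1 100 > all_files.txt        # stand-in for the full input file list
process=3; jobs=10               # condor would pass these as $1 and $2

total=$(wc -l < all_files.txt)
per_job=$(( (total + jobs - 1) / jobs ))   # ceiling division
start=$(( process * per_job + 1 ))
end=$(( start + per_job - 1 ))

# Print only this job's line range, exactly as the sed -n ...p idiom does.
sed -n "${start},${end}p" all_files.txt > Inp_${process}.txt
cat Inp_${process}.txt
```

With 100 input lines and 10 jobs, job 3 gets lines 31 through 40.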
Make sure the RUCIO_ACCOUNT variable is properly set. Make this file executable, and create the file that describes what our job needs and that we will give to condor:
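A forgotten RUCIO_ACCOUNT only fails later, inside the running job. As a small convenience (not part of the original instructions, and the function name is made up for this sketch), a guard like this catches the placeholder before anything is submitted:

```shell
#!/bin/bash
# Illustrative guard (not from the tutorial): refuse to proceed when
# RUCIO_ACCOUNT is unset or still carries the placeholder value.
check_rucio_account() {
    if [ -z "${RUCIO_ACCOUNT}" ] || [ "${RUCIO_ACCOUNT}" = "YOUR_CERN_USERNAME" ]; then
        echo "RUCIO_ACCOUNT is not configured"
        return 1
    fi
    echo "RUCIO_ACCOUNT=${RUCIO_ACCOUNT}"
}

RUCIO_ACCOUNT="YOUR_CERN_USERNAME" check_rucio_account  # placeholder is rejected
RUCIO_ACCOUNT="jdoe" check_rucio_account                # a real value passes
```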
<file bash job.sub>
Jobs=10
getenv       = ...
executable   = startJob.sh
output       = MyAnal_$(Jobs).$(Process).output
error        = MyAnal_$(Jobs).$(Process).error
log          = MyAnal_$(Jobs).$(Process).log
arguments    = $(Process) $(Jobs)
environment  = "IFlist=$(IFlist)"
transfer_input_files = payload.zip,/...
universe     = ...
#...
queue $(Jobs)
</file>
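At queue time condor expands $(Process) to 0 … Jobs-1, so each of the ten jobs calls startJob.sh with a different first argument. A local, condor-free illustration of that fan-out (purely a sketch; nothing here talks to HTCondor):

```shell
#!/bin/bash
# Simulate the queue expansion: condor starts $(Jobs) copies and passes
# each one its own $(Process) index plus the total job count.
Jobs=10
for Process in $(seq 0 $((Jobs - 1))); do
    echo "job ${Process} runs: ./startJob.sh ${Process} ${Jobs}"
done
```

Together with the splitting inside startJob.sh, this is what divides the input list across the ten jobs.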
To access files using FAX, the jobs need a valid grid proxy; that is why we send one along with each job. The proxy is the file whose name starts with "x509up" (here x509up_u21183).
You need to pack all of the working directory, including startJob.sh, into a payload.zip file:

<code bash>
rc clean
rm -rf RootCoreBin
zip -r payload.zip *
</code>
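The proxy file name used above, x509up_u21183, embeds that particular user's numeric uid. If you prefer to locate your own proxy rather than hard-code the name, the conventional default location is /tmp/x509up_u&lt;uid&gt;; the sketch below assumes that convention (it can be overridden by X509_USER_PROXY, so treat it as a convention, not a guarantee):

```shell
#!/bin/bash
# Derive the conventional grid-proxy path for the current user.
unset X509_USER_PROXY            # ignore any override, for this demo only
uid=$(id -u)
proxy="${X509_USER_PROXY:-/tmp/x509up_u${uid}}"
echo "proxy file: ${proxy}"
```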
Now you may submit your task for execution and follow its status in this way:

<code bash>
chmod 755 ./startJob.sh
</code>
<code bash>
~> condor_submit job.sub
Submitting job(s)..........
10 job(s) submitted to cluster 49677.

~> condor_q ivukotic
-- Submitter: login.atlas.ci-connect.net : <...> : login.atlas.ci-connect.net
 ID       OWNER      ...
 49677.0  ivukotic   ...
 49677.1  ivukotic   ...
 49677.2  ivukotic   ...
 49677.3  ivukotic   ...
 49677.4  ivukotic   ...
 49677.5  ivukotic   ...
 49677.6  ivukotic   ...
 49677.7  ivukotic   ...
 49677.8  ivukotic   ...
 49677.9  ivukotic   ...

10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
</code>
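While the jobs run, the summary line at the bottom of condor_q is the quickest thing to watch. A throwaway sketch that extracts the running count from such a line (the line is hard-coded here so the example is self-contained):

```shell
#!/bin/bash
# Pull the "running" counter out of a condor_q summary line with sed.
summary="10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended"
running=$(printf '%s\n' "${summary}" | sed -n 's/.*, *\([0-9][0-9]*\) running.*/\1/p')
echo "running jobs: ${running}"
```

The same pattern works for any of the other counters by swapping the state name in the sed expression.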
asc/tutorials/2014october_connect.1412882080.txt.gz · Last modified: 2014/10/09 19:14 by asc