The ATLAS Tier3g environment is set up so that most analysis instructions available on the general ATLAS TWikis will work. This documentation orients you in the specific features of a standard Tier3g (T3g) and points to other relevant documentation. The model Tier3g at ANL ASC is described here; where your own T3g is likely to differ from ANL ASC, this is noted.
ATLAS Tier3g consists of the following elements:
ascint0y.hep.anl.gov ascint1y.hep.anl.gov
Your T3g account will have the bash shell as default. It is recommended that you stick with this. Given the limited manpower, we did not install the rebuilt special C-shell needed for ATLAS software, nor test any of the functionality from C-type shells.
Your home login area will normally be /export/home/your_user_name. In the case of ANL ASC it is /users/your_user_name, because the home area at ANL ASC is shared with another cluster. You may find a similar arrangement at your T3g.
Before you get to work, you will probably want to do the following for convenience. In .bash_profile, put:
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi
And in .bashrc, put:
# add ~/bin/ to path
PATH=$PATH:$HOME/bin:./
# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
This sources the default bashrc (which gives you a prompt showing user and node) and adds the current directory and your own ~/bin to your PATH; you are now also ready to put aliases and functions in .bashrc.
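If your site's default bashrc does not already show the user and node, a minimal prompt setting you could add to .bashrc is the following (purely an illustration; adjust to taste):

# show user, host and current directory in the prompt, e.g. "ryoshida@ascint1y:~/testarea$ "
export PS1='\u@\h:\w\$ '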
The ATLAS environment on a T3g is based on the [[https://twiki.atlas-canada.ca/bin/view/AtlasCanada/ATLASLocalRootBase][ATLAS Local Root Base]] package developed at ATLAS Canada. The original documentation resides on Canadian ATLAS pages; it will move to CERN (as will these pages) in the future and be maintained centrally.
The other part of your environment comes from the CVMFS file system, a web file system that maintains the Athena releases as well as conditions data centrally at CERN. CVMFS is part of the [[http://cernvm.cern.ch/cernvm/][CERNVM]] project, although the part used here has nothing to do with virtual machines.
The two environments are designed to work together.
To start your environment, you need to do the following:
export ATLAS_LOCAL_ROOT_BASE=/export/share/atlas/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
You can put this into .bashrc or, if you prefer, make a separate shell script that you can execute.
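If you prefer a separate script, a minimal sketch is shown below (the file name ~/bin/atlas_env.sh is just an example); note that you should source it, rather than run it in a sub-shell, so that the alias survives in your login shell:

# ~/bin/atlas_env.sh -- set up the ATLAS Local Root Base environment
# use it with:  source ~/bin/atlas_env.sh
export ATLAS_LOCAL_ROOT_BASE=/export/share/atlas/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'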
Now you can do:
setupATLAS
You should see the following output on your screen
...Type localSetupDQ2Client to use DQ2 Client
...Type localSetupGanga to use Ganga
...Type localSetupGcc to use alternate gcc
...Type localSetupGLite to use GLite
...Type localSetupPacman to use Pacman
...Type localSetupPandaClient to use Panda Client
...Type localSetupROOT to setup (standalone) ROOT
...Type localSetupWlcgClientLite to use wlcg-client-lite
...Type saveSnapshot [--help] to save your settings
...Type showVersions to show versions of installed software
...Type createRequirements [--help] to create requirements/setup files
...Type changeASetup [--help] to change asetup configuration
...Type setupDBRelease to use an alternate DBRelease
...Type diagnostics for diagnostic tools
This is the generally recommended way to run Athena at a Tier3. The Athena versions suitable for a Scientific Linux 5 (SL5) installation such as ANL ASC are under
/opt/atlas/software/i686_slc5_gcc43_opt/
Note: the /opt/atlas/ area is remotely mounted and cached locally; this means you should not run a recursive command (like ls -R) on these directories, or you could be waiting for a very long time.
You can do a simple “ls”, for example, to find the installed versions on CVMFS:
[test_user@ascwrk2 ~]$ ls /opt/atlas/software/i686_slc5_gcc43_opt/
15.6.3  15.6.4  15.6.5  15.6.6  gcc432_i686_slc4  gcc432_i686_slc5  gcc432_x86_64_slc5
You can look for patched versions in the following way:
[ryoshida@ascint1y ~]$ ls /opt/atlas/software/i686_slc5_gcc43_opt/15.6.6/AtlasProduction/
15.6.6  15.6.6.1  15.6.6.2  15.6.6.3  15.6.6.4
You can set up your test area as usual (the example here sets up 16.0.0):
mkdir ~/testarea
mkdir ~/testarea/16.0.0
export ATLAS_TEST_AREA=~/testarea/16.0.0
Now you need to set up the correct version of the C++ compiler for Athena and your installation (at ANL ASC it is 64-bit SL5) using the environment created by the ATLASLocalRootBase package. (This version of gcc will become the default in the future.)
localSetupGcc --gccVersion=gcc432_x86_64_slc5
Now you need to set up the version you want. (An alternative setup procedure using a cmthome directory is [[HowToCreateRequirements][HERE]].)
source /opt/atlas/software/i686_slc5_gcc43_opt/16.0.0/cmtsite/setup.sh -tag=16.0.0,AtlasOffline,32,opt,oneTest,setup
For patched versions, an example is
source /opt/atlas/software/i686_slc5_gcc43_opt/16.0.0/cmtsite/setup.sh -tag=16.0.0.1,AtlasProduction,32,opt,oneTest,setup
(Note that the setup.sh directory is that of the base release even for a patched version. Also note that the “tag” options are somewhat different for base and patched versions.)
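Putting the above steps together, a complete interactive setup for a patched release might look like the following sketch (adjust the release numbers to what is installed on your cluster):

setupATLAS
mkdir -p ~/testarea/16.0.0
export ATLAS_TEST_AREA=~/testarea/16.0.0
localSetupGcc --gccVersion=gcc432_x86_64_slc5
source /opt/atlas/software/i686_slc5_gcc43_opt/16.0.0/cmtsite/setup.sh -tag=16.0.0.1,AtlasProduction,32,opt,oneTest,setup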
You may also need the following definition to run some types of jobs (it defines how to access the conditions files and database):
export FRONTIER_SERVER="(proxyurl=http://vmsquid.hep.anl.gov:3128)(serverurl=http://squid-frontier.usatlas.bnl.gov:23128/frontieratbnl)"
In this, “vmsquid.hep.anl.gov” is specific to the ANL ASC cluster. Ask your administrator for the name of your local squid server.
(For recent versions of Atlas Local Root Base, it is no longer necessary to define the following: “export ATLAS_POOLCOND_PATH=/opt/atlas/conditions/poolcond/catalogue”)
Sometimes a job will require a specific recent database release which is not shipped with the Athena version. In this case it is possible to use the database releases installed on CVMFS. To see which versions are available:
[ryoshida@ascint1y ~]$ ls /opt/atlas/database/DBRelease
9.6.1  9.7.1  9.8.1  9.9.1  current
If you want to use one of these (9.6.1 in this example) instead of the one built into the Athena version, give the following commands:
export DBRELEASE_INSTALLDIR="/opt/atlas/database"
export DBRELEASE_VERSION="9.6.1"
export ATLAS_DB_AREA=${DBRELEASE_INSTALLDIR}
export DBRELEASE_OVERRIDE=${DBRELEASE_VERSION}
More information on database releases is HERE.
In order to check out packages from the CERN SVN using commands like “cmt co”, you need Kerberos authentication. If your local user_name is not the same as your CERN (lxplus) user name, you will need to create a file called
~/.ssh/config
This file should contain the following.
Host svn.cern.ch
    User your_cern_username
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
    Protocol 2
    ForwardX11 no
Then you can give the following commands (after setting up an Athena version):
kinit your_cern_username@CERN.CH    # give your lxplus password
export SVNROOT=svn+ssh://svn.cern.ch/reps/atlasoff
and you will have access to the svn repository at CERN.
If your usernames are the same, you only need to do the kinit command.
At this stage you are set up so that the examples in the ATLAS Computing Workbook should work (but skip the “setting up your account” section; you have done the equivalent already). The examples from the Physics Analysis Workbook should also work. The following is a small example to get you started.
cmt show versions PhysicsAnalysis/AnalysisCommon/UserAnalysis
cmt co -r UserAnalysis-nn-nn-nn PhysicsAnalysis/AnalysisCommon/UserAnalysis
cd PhysicsAnalysis/AnalysisCommon/UserAnalysis/run
get_files -jo HelloWorldOptions.py
athena.py HelloWorldOptions.py
The algorithm will first initialize and will then run ten times (during each run it will print various messages and echo the values given in the job options file). Then it will finalize, and stop. You should see something that includes this:
HelloWorld   INFO initialize()
HelloWorld   INFO MyInt = 42
HelloWorld   INFO MyBool = 1
HelloWorld   INFO MyDouble = 3.14159
HelloWorld   INFO MyStringVec[0] = Welcome
HelloWorld   INFO MyStringVec[1] = to
HelloWorld   INFO MyStringVec[2] = Athena
HelloWorld   INFO MyStringVec[3] = Framework
HelloWorld   INFO MyStringVec[4] = Tutorial
If so you have successfully run Athena !HelloWorld.
After doing
setupATLAS
give the command:
localSetupDQ2Client
You'll get a banner
************************************************************************
It is strongly recommended that you run DQ2 in a new session
It may use a different version of python from Athena.
************************************************************************
Continue ? (yes[no]) :
Say “yes”. It is safest to dedicate a window to DQ2, or to log out and back in after using DQ2 if you want to use Athena.
The usage and documentation on DQ2 tools are HERE.
As usual, you need to copy your certificate files userkey.pem and usercert.pem into the ~/.globus area. Instructions on obtaining certificates (for US users) are HERE.
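Once the certificate files are in place, a typical DQ2 session looks something like the sketch below (the dataset name is just the example used elsewhere on this page; substitute your own):

localSetupDQ2Client                               # in a fresh session, as recommended above
voms-proxy-init -voms atlas                       # create a grid proxy from the certificate in ~/.globus
dq2-ls user10.RikutaroYoshida.t3test.26Mar.v0     # list matching datasets
dq2-get user10.RikutaroYoshida.t3test.26Mar.v0    # copy the dataset files to the current directory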
After setting up for Athena as described above give the following command:
localSetupPandaClient
After the above setup you can follow the general Distributed Analysis on Panda instructions HERE. Skip the Setup section in that document since you have already done the equivalent for the T3g.
Your Tier3 will have local batch queues you can use to run over larger amounts of data. In general, one batch queue is more or less equivalent to one analysis slot at a Tier2 or Tier1. As an example, the ANL ASC cluster has 42 batch queues; this means that a job that runs in an hour in a Tier1/Tier2 analysis queue, when split into 42 jobs, will also run in about an hour at ANL ASC (assuming you are using all of the queues).
This is still in preliminary testing.
The batch nodes of your cluster may be configured to accept Pathena jobs in a similar way to Tier2 and Tier1 analysis queues. The description of Tier3 Panda is HERE.
The name of the Panda site at ANL ASC is ANALY_ANLASC. Pathena submission with the option --site=ANALY_ANLASC will submit jobs to the Condor queues described below.
A T3g is not part of the Grid, and the data storage you have locally is not visible to Panda.
The following is an example of a command submitting a job to ANALY_ANLASC. It submits the AnalysisSkeleton example from the Computing Workbook.
pathena --site=ANALY_ANLASC --pfnList=my_filelist --outDS=user10.RikutaroYoshida.t3test.26Mar.v0 AnalysisSkeleton_topOptions.py
Note that everything is the same as usual except that ANALY_ANLASC is chosen as the site and the input is specified by a file called my_filelist.
[ryoshida@ascint1y run]$ cat my_filelist
root://ascvmxrdr.hep.anl.gov//xrootd/mc08.105597.Pythia_Zprime_tt3000.recon.AOD.e435_s462_s520_r808_tid091860/AOD.091860._000003.pool.root.1
This file specifies the files you want to use as input, in a format that can be understood by the local system. The local data storage for your batch jobs is explained below under XRootD.
The output of the jobs cannot be registered with DQ2 as at a T2 or T1, but is kept locally in the area
/export/share/data/users/atlasadmin/2010/YourGridId
For example:
[ryoshida@ascint1y run]$ ls /export/share/data/users/atlasadmin/2010/RikutaroYoshida/
user10.RikutaroYoshida.t3test.22Mar.v0              user10.RikutaroYoshida.t3test.25Mar.v0
user10.RikutaroYoshida.t3test.22Mar.v1              user10.RikutaroYoshida.t3test.25Mar.v0_sub06230727
user10.RikutaroYoshida.t3test.22Mar.v1_sub06183810  user10.RikutaroYoshida.t3test.26Mar.v0
user10.RikutaroYoshida.t3test.22Mar.v2              user10.RikutaroYoshida.t3test.26Mar.v0_sub06253711
user10.RikutaroYoshida.t3test.22Mar.v2_sub06183821
Looking at the output of the job submitted by the command above:
ls /export/share/data/users/atlasadmin/2010/RikutaroYoshida/user10.RikutaroYoshida.t3test.26Mar.v0_sub06253711
user10.RikutaroYoshida.t3test.26Mar.v0.AANT._00001.root
user10.RikutaroYoshida.t3test.26Mar.v0._1055967261.log.tgz
The output root file and the zipped log files can be seen.
ArCond (Argonne Condor) is a wrapper around Condor which allows you to automatically parallelize your jobs (Athena and non-Athena) over the input data files and helps you concatenate your output at the end. Unlike pathena, this is a completely local submission system.
To start with !ArCond, do the following:
mkdir arctest
cd arctest
source /export/home/atlasadmin/condor/Arcond/etc/arcond/arcond_setup.sh
arc_setup
This will set up your arctest directory
[ryoshida@ascint1y arctest]$ arc_setup
Current directory=/users/ryoshida/asc/arctest
--- initialization is done ---
[ryoshida@ascint1y arctest]$ ls
DataCollector  Job  arcond.conf  example.sh  patterns  user
Now do
arc_ls /xrootd/
to see the files loaded into the batch area. You are now set up to run a test job from the arctest directory with the “arcond” command. We are working on a more complete description; however, [[https://atlaswww.hep.anl.gov/twiki/bin/view/Workbook/UsingPCF][these pages]], although they describe an installation on a different cluster, contain most of the needed information. A sketch of the workflow is given below.
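In outline, a typical !ArCond session from the arctest directory looks like this sketch (the configuration details are described on the pages linked above):

arc_ls /xrootd/     # check which data files are visible to the batch nodes
# edit arcond.conf (and the job script it points to, e.g. example.sh) for your input data and job
arcond              # submit; ArCond splits the input over the available Condor slots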
The batch system used by ANL ASC is Condor. Detailed documentation is available at the link. The information here is meant to let you look at the system and do simple submissions. We expect that you will do most of your job submission using the interfaces provided (pathena and !ArCond; see above), so that parallel processing of the data is automated.
There is no user setup necessary for using Condor.
To see what queues exist, give the following command:
[test_user@ascint1y ~]$ condor_status
Name                OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
[email protected]     LINUX  X86_64  Unclaimed  Idle      0.000   2411  1+19:38:32
[email protected]     LINUX  X86_64  Unclaimed  Idle      0.000   2411  1+19:42:55
...(abbreviated)
[email protected]     LINUX  X86_64  Claimed    Busy      0.000   2411  0+00:00:05
[email protected]     LINUX  X86_64  Claimed    Busy      0.010   2411  0+00:00:05
[email protected]     LINUX  X86_64  Claimed    Busy      0.010   2411  0+00:00:06
[email protected]     LINUX  X86_64  Claimed    Busy      0.010   2411  0+00:00:07
[email protected]     LINUX  X86_64  Claimed    Busy      0.010   2411  0+00:00:08
[email protected]     LINUX  X86_64  Claimed    Busy      0.010   2411  0+00:00:09
...(abbreviated)
[email protected].    LINUX  X86_64  Unclaimed  Idle      0.000   2411  0+14:10:17
[email protected].    LINUX  X86_64  Unclaimed  Idle      0.000   2411  0+14:10:18
[email protected].    LINUX  X86_64  Unclaimed  Idle      0.000   2411  0+14:10:11
[email protected].    LINUX  X86_64  Unclaimed  Idle      0.000   2411  0+14:10:12
                     Total Owner Claimed Unclaimed Matched Preempting Backfill
        X86_64/LINUX    45     3      14        28       0          0        0
               Total    45     3      14        28       0          0        0
This tells you that there are 45 queues (3 reserved for service jobs) and the status of each. Note that the queues (slots) on the ascwrk1 node are running jobs (Busy). To see the jobs in the queue:
[test_user@ascint1y ~]$ condor_q -global
-- Submitter: ascint1y.hep.anl.gov : <146.139.33.41:9779> : ascint1y.hep.anl.gov
 ID        OWNER      SUBMITTED     RUN_TIME   ST PRI SIZE CMD
76139.0    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.1    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.2    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.3    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.4    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.5    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.6    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.7    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.8    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.9    test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.10   test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.11   test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.12   test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
76139.13   test_user  3/18 10:53   0+00:00:00  I  20  0.0  run_athena_v2_1.sh
14 jobs; 14 idle, 0 running, 0 held
In this case, 14 jobs from the user “test_user” are in the idle state (just before they begin to run).
Prepare your submission file. An example is the following:
[test_user@ascint1y xrd_rdr_access_local]$ less run_athena_v2.sub
# Some incantation..
universe = vanilla
# This is the actual shell script that runs
executable = /export/home/test_user/condor/athena_test/xrd_rdr_access_local/run_athena_v2.sh
# The job keeps the environmental variables of the shell from which you submit
getenv = True
# Setting the priority high
Priority = +20
# Specifies the type of machine.
Requirements = ( (Arch == "INTEL" || Arch == "X86_64"))
# You can also specify the node on which it runs, if you want
#Requirements = ( (Arch == "INTEL" || Arch == "X86_64") && Machine == "ascwrk2.hep.anl.gov")
# The following files will be written out in the directory from which you submit the job
log = test_v2.$(Cluster).$(Process).log
# The next two will be written out at the end of the job; they are stdout and stderr
output = test_v2.$(Cluster).$(Process).out
error = test_v2.$(Cluster).$(Process).err
# Ask that you transfer any file that you create in the "top directory" of the job
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
# queue the job.
queue 1
# more than once if you want
# queue 14
The actual shell script that executes looks like this:
[test_user@ascint1y xrd_rdr_access_local]$ less run_athena_v2.sh
#!/bin/bash
## You will need this bit for every Athena job
# non-interactive shell doesn't do aliases by default. Set it so it does
shopt -s expand_aliases
# set up the aliases.
# note that ATLAS_LOCAL_ROOT_BASE (unlike the aliases) is passed from the shell from which you are submitting.
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
# now proceed normally to set up the other aliases
setupATLAS
# Condor works in a "sandbox" directory.
# We now want to create our Athena environment in this sandbox area.
mkdir testarea
mkdir testarea/15.6.3
export ATLAS_TEST_AREA=${PWD}/testarea/15.6.3
localSetupGcc --gccVersion=gcc432_x86_64_slc5
cd testarea/15.6.3
# Set up the Athena version
source /export/home/atlasadmin/temp/setupScripts/setupAtlasProduction_15.6.3.sh
# For this example, just copy the code from my interactive work area where I have the code running.
cp -r ~/cvmfs2test/15.6.3/NtupleMaker .
# compile the code
cd NtupleMaker/cmt
cmt config
gmake
source setup.sh
# cd to the run area and start running.
cd ../share
athena Analysis_data900GeV.py
# Just to see what we have at the end do an ls. This will end up in the *.out file
echo "ls -ltr"
ls -ltr
# copy the output file back up to the top directory to get it back from Condor into your submission directory.
cp Analysis.root ../../../../Analysis.root
Now to submit the job do
condor_submit run_athena_v2.sub
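You can then monitor and, if necessary, remove your jobs with the standard Condor commands (the cluster number 76139 is just the one from the condor_q example above):

condor_q             # list your queued and running jobs
condor_rm 76139      # remove the whole cluster of jobs
condor_rm 76139.3    # remove a single job from the cluster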
The baseline Tier3g configuration has several data storage options. The interactive nodes can be configured to have some local space. This space should be considered shared scratch space; local site policies will define how it is used. There is also space located on the standalone file server (also known as the NFS node). Due to limitations of NFS within Scientific Linux, XRootD is used to access the data on this node. In the baseline Tier3g setup, the majority of storage is located on the worker nodes. This storage space is managed and accessed via XRootD.
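For example, to copy a single file out of the XRootD-managed space to local scratch you can use xrdcp. This is just a sketch: the redirector and file name are taken from the my_filelist example above, and the destination path is arbitrary.

# copy one file from the XRootD space to /tmp
xrdcp root://ascvmxrdr.hep.anl.gov//xrootd/mc08.105597.Pythia_Zprime_tt3000.recon.AOD.e435_s462_s520_r808_tid091860/AOD.091860._000003.pool.root.1 /tmp/AOD.091860._000003.pool.root.1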