
Introduction to HEP/LCRC resources

In order to use HEP/LCRC computing resources, you will need an account associated with the ATLAS group. Go to https://accounts.lcrc.anl.gov/ and ask to join the “ATLAS-HEP-group” by providing your ANL user name. Users are usually placed into the “g-ATLAS” group.

Please also look at the description at https://collab.cels.anl.gov/display/HEPATLASLCRC

The description here uses the bash shell. If needed, go to https://accounts.lcrc.anl.gov/account.php to change your shell.

At the moment (Jan 2024), this resource cannot replace the ATLAS cluster and has several limitations:

  • LCRC resources are under maintenance on Mondays (each week?)
  • the HOME directory is small (100 GB) and some workarounds are needed to deal with it (not well tested)
  • logins require an ssh key (changing it often takes 1 day with LCRC assistance)
  • the file systems cannot be mounted on desktops

Available resources

The following interactive nodes can be used:

heplogin.lcrc.anl.gov   # login at random to either  hepd-0003 or hepd-0004
heplogin1.lcrc.anl.gov  # login directly to hepd-0003
heplogin2.lcrc.anl.gov  # login directly to hepd-0004

Each node has 72 CPUs and a lot of memory. After login, you will end up in a rather small “home” space which has a limit of 500 GB.

You cannot log in to these servers directly (since Aug 2024). First log in to Bebop:

ssh -i $HOME/.ssh/YOURKEY [USER]@bebop.lcrc.anl.gov -X

then ssh to heplogin1/heplogin2:

ssh -i  $HOME/.ssh/YOURKEY  heplogin1.lcrc.anl.gov  -X

or

ssh -i  $HOME/.ssh/YOURKEY  heplogin2.lcrc.anl.gov  -X
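If you log in this way often, a standard OpenSSH ProxyJump entry in your $HOME/.ssh/config can combine the two hops into a single command. This is only a sketch; the host aliases, user name and key file name are placeholders to be adapted to your setup:

# $HOME/.ssh/config (sketch; adjust [USER] and YOURKEY)
Host bebop
    HostName bebop.lcrc.anl.gov
    User [USER]
    IdentityFile ~/.ssh/YOURKEY

Host heplogin1
    HostName heplogin1.lcrc.anl.gov
    User [USER]
    IdentityFile ~/.ssh/YOURKEY
    ProxyJump bebop

# then a single command is enough:
ssh -X heplogin1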


Your home directory is

/home/[USER]

You can use the following location to keep code etc. (but not data):

/lcrc/group/ATLAS/users/[USER]

Please do not put large data in your /lcrc/group/ATLAS/users/[USER] area! Look at the sections below that describe where data should be stored.

Updating the password

LCRC is not a user-friendly system when it comes to logins and passwords. Changing a password is a “process”. After you have gone through all the steps of changing your ANL domain password, you will need to create an ssh public key, upload it to your account at https://accounts.lcrc.anl.gov/account.php, and then send an email to [email protected] with a request to update the public key for your account. Since this involves manual work by LCRC, do not expect to be able to log in to LCRC the same day.
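A typical way to create a new ssh key pair (a sketch; the key type and file name are up to you):

ssh-keygen -t ed25519 -f $HOME/.ssh/lcrc_key   # creates lcrc_key (private) and lcrc_key.pub (public)
cat $HOME/.ssh/lcrc_key.pub                    # paste this public key into the LCRC account page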


Setting up local HEP software

You can set up some pre-defined HEP software as:

source /soft/hep/hep_setup.sh

It sets up gcc 7.1, ROOT, FastJet, Pythia8, LHAPDF, etc. Look at the directory “/soft/hep/”. Note that it sets up ROOT 6.12 with all plugins included; PyROOT uses Python 2.7 compiled with gcc 7.1.
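A quick way to check what the script picked up after sourcing it (a sketch; the exact versions may differ from the ones quoted above):

source /soft/hep/hep_setup.sh
gcc --version            # compiler provided by the HEP stack
root-config --version    # ROOT version (6.12 at the time of writing)
which python             # Python that PyROOT will use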

It was noticed that you cannot use the debugger “dbg” with this setup, since the debugger relies on the system Python. To fix this, copy the script hep_setup.sh to your home area and comment out the Python setup.
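A minimal sketch of that workaround (the exact lines to comment out depend on the current content of hep_setup.sh):

cp /soft/hep/hep_setup.sh $HOME/my_hep_setup.sh
# edit $HOME/my_hep_setup.sh and comment out the lines that set up Python,
# then source your copy instead of the system-wide script:
source $HOME/my_hep_setup.sh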

If you need to set up TeX Live 2016, use:

source /soft/hep/hep_texlive.sh

Setting up LCRC software

You can set up more software packages using Lmod. Look at https://www.lcrc.anl.gov/for-users/software/.
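The usual Lmod commands apply (a sketch; the modules actually available on LCRC may differ):

module avail              # list available modules
module load StdEnv        # load the default environment (also used for batch jobs below)
module list               # show what is currently loaded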

Setting up ATLAS software

You can set up the ATLAS software as:

source /soft/hep/setupATLAS

or if you need RUCIO and the grid, use:

source /soft/hep/setupATLASg

In this case you need to put the grid certificate in place as described in the Grid certificate section.
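After sourcing setupATLASg with a valid certificate in place, a grid proxy is typically created in the usual ATLAS way (a sketch):

source /soft/hep/setupATLASg
voms-proxy-init -voms atlas    # create a VOMS proxy for the ATLAS VO
voms-proxy-info                # check the remaining lifetime of the proxy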

Data storages

Significant data from the grid should be put in the following locations:

/lcrc/project/ATLAS/data/              # for the group
/lcrc/group/ATLAS/atlasfs/local/[USER] # for users
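For example, a dataset fetched with RUCIO can be steered directly into the user data area (a sketch; the scope and dataset name are placeholders, and a valid grid proxy is assumed, see above):

rucio download --dir /lcrc/group/ATLAS/atlasfs/local/[USER] <scope>:<dataset_name>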

Allocating interactive nodes

If you need to run a job interactively, you can allocate a node to do this. Try this command:

srun --pty -p  bdwall   -A condo -t 24:00:00 /bin/bash

It will allocate a new node (with a bash shell) for 24 h. These nodes use Xeon(R) CPU E5-2695 v4 @ 2.10GHz (36 CPUs per node). More information can be found in Running jobs on Bebop. Note that you should keep the terminal open while jobs are running.

When you use the bdwall partition, your jobs will be accounted against the default CPU allocation (100k per 4 months). Therefore, when possible, please use the “hepd” partition. See the next section.

Running Batch job on HEP resources

srun --pty -p hepd -t 24:00:00 /bin/bash
module load StdEnv            # important to avoid a SLURM bug

Then you can set up ROOT etc. with “source /soft/hep/setup.sh”.

SLURM is used as the batch system. It schedules whole nodes (not individual cores)! If you run a single-core job, your allocation will still be charged for all 36 cores. Please see this page for details on how to use SLURM on LCRC: http://www.lcrc.anl.gov/for-users/using-lcrc/running-jobs/running-jobs-on-bebop/

The partition for the HEP nodes is hepd.

To run on non-HEP nodes, use the bdwall partition with the account ATLAS-HEP-group.
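A minimal batch script for the hepd partition might look like this (a sketch; the job name, time limit and payload script are placeholders):

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH -p hepd                 # use -p bdwall -A ATLAS-HEP-group for non-HEP nodes
#SBATCH -A condo
#SBATCH -N 1
#SBATCH -t 24:00:00

module load StdEnv
source /soft/hep/hep_setup.sh
./run_my_analysis.sh            # placeholder for the actual payload

Submit it with “sbatch myjob.sh” and follow it in the queue with “squeue -u [USER]”.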

Using interactive jobs

First, allocate a HEP node:

salloc -N 1 -p hepd -A condo  -t 00:30:00

This allocates it for 30 min, but you can allocate a node for up to 7 days. You can also allocate a node on Bebop:

salloc -N 1 -p bdwall --account=ATLAS-HEP-group -t 00:30:00

This does not log you in! Check which node you were allocated:

squeue -u [USER]

Now you know the node name. Log in to Bebop (first!) and then ssh to that node.

Another method is to use

srun --pty -p  bdwall  --account=ATLAS-HEP-group -t 00:30:00  /bin/bash

Running long interactive jobs

See more description in: https://www.lcrc.anl.gov/for-users/using-lcrc/running-jobs/running-jobs-on-bebop/

You should be able to do, for example (see the concrete command sketch after this list):

  • ssh to Bebop
  • start screen
  • salloc -N 1 -p hepd -A condo -t 96:00:00
  • ssh <nodename>
  • work on the interactive job for some amount of time...
  • disconnect from screen (different from exit, see the documentation)
  • log out
  • log in to the same login node where screen was started
  • screen -ls
  • connect to the screen session
  • continue where you left off (if the allocation is still active)
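As a concrete sketch of the same recipe (the session name “hep” is arbitrary):

ssh bebop
screen -S hep                                # start a named screen session
salloc -N 1 -p hepd -A condo -t 96:00:00
ssh <nodename>                               # node name reported by salloc/squeue
# ... work ...
# detach with Ctrl-a d, log out, later log in to the SAME login node
screen -ls                                   # list sessions
screen -r hep                                # reattach and continue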

See below for more details:

https://www.gnu.org/software/screen/

https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/

CVMFS repositories

The following CVMFS repositories are mounted on Bebop and Swing computing nodes:

/cvmfs/atlas.cern.ch
/cvmfs/atlas-condb.cern.ch
/cvmfs/grid.cern.ch
/cvmfs/oasis.opensciencegrid.org
/cvmfs/sft.cern.ch
/cvmfs/geant4.cern.ch
/cvmfs/spt.opensciencegrid.org
/cvmfs/dune.opensciencegrid.org
/cvmfs/larsoft.opensciencegrid.org
/cvmfs/config-osg.opensciencegrid.org
/cvmfs/fermilab.opensciencegrid.org
/cvmfs/icarus.opensciencegrid.org
/cvmfs/sbn.opensciencegrid.org
/cvmfs/sw.hsf.org

Note that they are not mounted on the login nodes.
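Since the repositories are only visible on the compute nodes, a quick check has to go through the batch system (a sketch):

srun -p hepd -A condo -t 00:05:00 ls /cvmfs/atlas.cern.ch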

Using Singularity

Running jobs on all LCRC resources with the ATLAS AnalysisBase releases requires Docker/Singularity. Yiming (Ablet) Abulaiti created a tutorial on how to do this.

Here are the suggested steps for the 21.2.51 release. First pull the Docker image:

docker pull atlas/analysisbase:21.2.51

Then make a Singularity image:

docker run -v /var/run/docker.sock:/var/run/docker.sock -v `pwd`:/output --privileged -t --rm singularityware/docker2singularity:v2.3 atlas/analysisbase:21.2.51

Currently, the image for AnalysisBase 21.2.51 is located here:

/soft/hep/atlas.cern.ch/repo/containers/images/singularity/atlas_analysisbase_21.2.51-2018-11-04-01795eabe66c.img

You can go inside this image as:

singularity exec /soft/hep/atlas.cern.ch/repo/containers/images/singularity/atlas_analysisbase_21.2.51-2018-11-04-01795eabe66c.img bash -l
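By default Singularity only sees a limited set of host directories; to work on files in the group area from inside the container, the /lcrc tree can be bind-mounted (a sketch):

singularity exec -B /lcrc /soft/hep/atlas.cern.ch/repo/containers/images/singularity/atlas_analysisbase_21.2.51-2018-11-04-01795eabe66c.img bash -l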

Using Singularity for cvmfsexec

One can also set up CVMFS on any LCRC node like this:

source /soft/hep/CVMFSexec/setup.sh

Then check:

ls /cvmfs/

You will see the mounted directories (SL7):

atlas-condb.cern.ch/      atlas.cern.ch/  cvmfs-config.cern.ch/  sft-nightlies.cern.ch/  sw.hsf.org/
atlas-nightlies.cern.ch/  cms.cern.ch/    projects.cern.ch/      sft.cern.ch/            unpacked.cern.ch/
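With /cvmfs/atlas.cern.ch mounted this way, the standard ATLAS environment can be used directly (a sketch, assuming the usual ATLASLocalRootBase layout):

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh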

Sergei&Doug&Rui 2018/01/04 13:36
