
Introduction

The baseline Tier 3g configuration has several data storage options. The interactive nodes can be configured to have some local space. This space should be considered shared scratch space; local site policies define how it is used. There is also space located on the standalone file server (also known as the nfs node). Due to limitations of nfs within Scientific Linux, !XrootD is used to access the data on this node. In the baseline Tier 3g setup, the majority of storage is located on the worker nodes. This storage space is managed and accessed through !XrootD.

Getting data to your site

The ATLAS dq2 client tools are used to fetch data to your site. If you plan to transfer a relatively small amount of data (50-100 GB/day), the basic dq2-get client is sufficient. If you are transferring larger amounts of data (100-1000 GB/day), you should use dq2-get with the FTS plugin to transfer the data into a gridftp server or servers.
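For the simple case, a minimal invocation looks like the following sketch; the scratch directory and data set name are placeholders, not values taken from this page:

setupATLAS
localSetupDQ2Client --skipConfirm
voms-proxy-init -voms atlas
cd /path/to/your/scratch/area         # any directory with enough space
dq2-get <your data set name>          # files land in a subdirectory named after the data set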

Installing the FTS plugin at your site

In the future these instructions will not be needed; they are valid for DQ2 client v0.1.36.2. Until then, these instructions should be followed.

As the =atlasadmin= account on the nfs node:

The following steps must be done by the =atlasadmin= account after the DQ2 client software has been installed.

%W% These instructions assume that ATLAS_LOCAL_ROOT_BASE has been defined

  • Edit =$ATLAS_LOCAL_ROOT_BASE/x86_64/DQ2Client/current/DQ2Clients/opt/dq2/etc/dq2.cfg=
  • Add the lines:
    [dq2-clients] 
    FTS=fts.FTSdownloader 
  • Edit =$ATLAS_LOCAL_ROOT_BASE/x86_64/DQ2Client/current/setup.sh=
  • Change the PYTHONPATH variable to be:
PYTHONPATH="/export/share/atlas/ATLASLocalRootBase/x86_64/DQ2Client/0.1.36.2/DQ2Clients/opt/dq2/lib/dq2/clientapi/cli/plugins:/export/share/atlas/ATLASLocalRootBase/x86_64/DQ2Client/0.1.36.2/DQ2Clients/opt/dq2/lib:$PYTHONPATH"

%W% This is a single very long line; make sure it is not wrapped when you copy it

  • Edit =$ATLAS_LOCAL_ROOT_BASE/x86_64/DQ2Client/current/setup.csh=
  • Change the PYTHONPATH variable to be:
if ($?PYTHONPATH) then 
        setenv PYTHONPATH /export/share/atlas/ATLASLocalRootBase/x86_64/DQ2Client/0.1.36.2/DQ2Clients/opt/dq2/lib/dq2/clientapi/cli/plugins:/export/share/atlas/ATLASLocalRootBase/x86_64/DQ2Client/0.1.36.2/DQ2Clients/opt/dq2/lib:${PYTHONPATH}
else
        setenv PYTHONPATH /export/share/atlas/ATLASLocalRootBase/x86_64/DQ2Client/0.1.36.2/DQ2Clients/opt/dq2/lib/dq2/clientapi/cli/plugins:/export/share/atlas/ATLASLocalRootBase/x86_64/DQ2Client/0.1.36.2/DQ2Clients/opt/dq2/lib
endif

%W% These are very long lines; make sure they are not wrapped when you copy them

Using the FTS plugin

To use the FTS plugin you need a grid proxy carrying the ATLAS VOMS attributes, for example:

attribute : /atlas/Role=NULL/Capability=NULL
attribute : /atlas/lcg1/Role=NULL/Capability=NULL
attribute : /atlas/usatlas/Role=NULL/Capability=NULL
  • To use dq2-get + FTS effectively you need to decide which gridftp server to use: the one that stores to the xrootd space on the worker nodes, or the one on the nfs node.
  • Use dq2-ls to determine the sites that contain your data set (use dq2-ls -r <dataset>). Select a site that is "near" your Tier 3 if possible.
  • Set up the ATLAS DQ2 libraries and get your grid proxy with voms extensions:
setupATLAS
localSetupDQ2Client --skipConfirm
voms-proxy-init -voms atlas
  • Run the dq2-get command (here "full name of your gridftp server" is the URL of the head node for transfers to the worker nodes, or the URL of the nfs node to transfer there); a filled-in example is shown after the command template below.
dq2-get -Y -L <Remote site with your data set>  -q FTS -o https://fts.usatlas.bnl.gov:8443/glite-data-transfer-fts/services/FileTransfer -S gsiftp://<full name of  your gridftp server>/atlas <your data set name>
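A filled-in example might look like the following sketch; the remote site name, gridftp host and data set are illustrative placeholders (only the FTS endpoint is taken from above), so substitute your own values:

dq2-get -Y -L MWT2_UC_LOCALGROUPDISK -q FTS -o https://fts.usatlas.bnl.gov:8443/glite-data-transfer-fts/services/FileTransfer -S gsiftp://aschead.hep.anl.gov/atlas user.bdouglas.TST.20101120_1415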

ATLAS Global name space

As part of the !XRootD demonstrator project, an ATLAS-wide global name space has been defined. By storing files at the Tier 3 under common !XRootD paths, and by using a global redirector, Tier 3 sites will be able to fetch files from other Tier 1, Tier 2 or Tier 3 sites as needed. With this system a Tier 3 holds the files from the ATLAS data sets that users actually want, because it fetches the files that users request in their analysis jobs. The idea is that files are stored at Tier 3g sites according to the ATLAS Global name space path and the file name.

Installing the script to decode ATLAS Global name space path

A python script has been written that, given an ATLAS data set name, prints the corresponding ATLAS global name space path.

In the future this script will be part of the DQ2 clients. Until then, these instructions should be followed.

As the =atlasadmin= account on the nfs node:

  • create the directory as needed
mkdir -pv /export/share/atlas/ddm
  • go to the new directory and fetch the python script
cd /export/share/atlas/ddm
svn export http://svnweb.cern.ch/guest/atustier3/post_install/trunk/scripts/decodeFilepath.py

%H% Note: the steps above should have been done when you initially configured the nfs node

Using the script to decode ATLAS Global name space path

To determine the ATLAS global name space path for a given data set, use the decodeFilepath.py script. The DQ2 libraries are needed.

%H% Note: this script takes a data set name and NOT a container name. Container names are lists of one or more data set names and end in /; data set names do not end in /. If you need to know which data set name(s) are associated with a container name, use the following command:

dq2-ls -r <container name>

This will not only list all the associated data set name(s) but also the site(s) where they exist.

  • To use this script, first set up the DQ2 libraries:
setupATLAS
localSetupDQ2Client --skipConfirm
  • Run the script, passing it your data set name:
python /export/share/atlas/ddm/decodeFilepath.py -d <your dataset name>
  • Here is an example using the data set - =user.bdouglas.LOG.20101111_1224=
[benjamin@ascint0y ~]$ python /export/share/atlas/ddm/decodeFilepath.py -d user.bdouglas.LOG.20101111_1224
/atlas/dq2/user/bdouglas/LOG/user.bdouglas.LOG.20101111_1224

Copying files into the !XRootD storage

In the baseline Tier 3g cluster there are two types of !XRootD storage (for more information on !XRootD, see the website: http://www.xrootd.org). The types of storage are:

 1 Storage on the worker nodes, accessed by contacting the !XRootD redirector running on the head node
 1 Storage on the stand alone file server (often referred to in the documentation as the nfs node), accessed by contacting that node directly

Using xrdcp to copy files into storage

Files can be copied into the !XRootD storage using the !XRootD command xrdcp. If you fetched the files using dq2-get (without the FTS plugin), the files will be located in a subdirectory named after the data set. To copy these files into !XRootD according to the ATLAS global name space convention, follow these steps.

  • Determine the ATLAS Global name space path from the data set name, using the example shown above
  • Get the list of files to copy
  • Create a new list with the ATLAS Global name space path prepended to each file name, and store this list in a file
  • Set up the !XRootD package installed via the OSG cache:
source /opt/osg_v1.2/setup.sh
  • Use the xprep command to speed up copying files into the !XRootD storage system:
xprep -w -f <file containing the ATLAS Global name space paths and file names> <redirector full name or nfs node full name>
  • Use xrdcp to copy the files into the xrootd storage system:
$ xrdcp -h
usage: xrdcp <source> <dest> [-d lvl] [-DSparmname stringvalue] ... [-DIparmname intvalue] [-s] [-ns] [-v] [-OS<opaque info>] [-OD<opaque info>] [-force] [-md5] [-adler] [-np] [-f] [-R] [-S]

<source> can be:
   a local file
   a local directory name suffixed by /
   an xrootd URL in the form root://user@host/<absolute Logical File Name in xrootd domain>
      (can be a directory. In this case the -R option can be fully honored only on a standalone server)
<dest> can be:
   a local file
   a local directory name suffixed by /
   an xrootd URL in the form root://user@host/<absolute Logical File Name in xrootd domain>
      (can be a directory LFN)

%W% Note: xrdcp respects the security policy set on the data servers

XRootD data security policy on data servers

When the data servers were initially setup the data security policy was established.

%INCLUDE{"Tier3gXrootdSetup" section="data security"}%

%H% users are allowed to write into =/atlas/local/<user name>/*=
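For example, a user could copy a private file into that area with xrdcp; the file name below is a placeholder, and the host is the redirector or nfs node name at your site:

xrdcp myNtuple.root root://<head node name or nfs node name>//atlas/local/<your user name>/myNtuple.root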

Example using an ATLAS data set stored on local disk on an interactive node

In this example we use the test data set =user.bdouglas.TST.20101120_1415=

  • Set up the environment:
setupATLAS
localSetupDQ2Client --skipConfirm
localSetupWlcgClientLite
source /opt/osg_v1.2/setup.sh
  • Determine the Global name space path using the data set name
$ python /export/share/atlas/ddm/decodeFilepath.py -d user.bdouglas.TST.20101120_1415
/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415
  • Use dq2-ls, grep and awk to build the file list for xprep:
$ dq2-ls -f user.bdouglas.TST.20101120_1415 | grep 'ad:' | awk '{print "/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/"  $3}' >> /tmp/xprep_file_list.txt
  • Let's look at the file list:
$ cat /tmp/xprep_file_list.txt 
/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/dummy_zero_file_0004.dat
/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/dummy_zero_file_0003.dat
/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/dummy_zero_file_0002.dat
/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/dummy_zero_file_0001.dat
/atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/dummy_zero_file_0000.dat
  • Now go to the directory containing the files and use find and xrdcp to copy them into the xrootd storage:
$ find . -name '*.dat' -exec xrdcp '{}' root://aschead.hep.anl.gov//atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415/'{}' ';'
[xrootd] Total 976.56 MB	|====================| 100.00 % [118.1 MB/s]
[xrootd] Total 976.56 MB	|====================| 100.00 % [117.6 MB/s]
[xrootd] Total 976.56 MB	|====================| 100.00 % [118.0 MB/s]
[xrootd] Total 976.56 MB	|====================| 100.00 % [117.9 MB/s]
[xrootd] Total 976.56 MB	|====================| 100.00 % [118.1 MB/s]

Using xrootdfs to view/manage the files within the !XRootD storage

Both the storage on the worker nodes and the storage on the stand alone file server (nfs node) are managed by !XRootD.

As a convenience to the users and the person(s) responsible for managing the data, we have installed xrootdfs (http://wt2.slac.stanford.edu/xrootdfs/xrootdfs.html).

Data storage on your stand alone file server (!XrootD based storage)

The baseline Tier 3g configuration has a !XrootD data server installed on the stand alone file server in your cluster (the nfs node). The files are located in the area =/local/xrootd/a/atlas/….= on the file server. The files are accessed using a URL of the form =root://<file server name>//atlas/…=

Looking at the files

xrootdfs allows users to view the files within the xrootd storage on the nfs node and on the worker nodes. To make this easier for users, the files located on the nfs node are seen under the directory =/<nfs node short name>/atlas/…= (for example, on the ASC cluster, =/ascnfs/atlas/…=). Users can use ls to see the files. %H% If the =ls= command returns =Transport endpoint is not connected=, then xrootdfs must be restarted.
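For example, on the ASC cluster the nfs node storage can be browsed with ordinary shell commands; the path below the mount point is illustrative:

ls -l /ascnfs/atlas/
ls -l /ascnfs/atlas/dq2/user/<your user name>/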

Data storage for your batch cluster (XrootD)

The data on which your batch jobs will run are stored on the batch nodes themselves in order to minimize network traffic during batch processing. The disks on the worker nodes are managed using the XrootD package (http://xrootd.slac.stanford.edu/).

The files are located in the area =/local/xrootd/a/atlas/….= on each worker node. The head node runs the !XrootD redirector for the !XrootD data servers running on the worker nodes. The files are accessed using a URL of the form =root://<head node name>//atlas/…=

At ANL ASC there are approximately 15 TB of space available, distributed over the three worker nodes.

It is recommended that loading of files into this analysis area be managed by your local data administrator.

Looking at the files

In order to see the files, do the following setup (some of it may already be done in the session you're in): code20
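The original listing (code20) is not reproduced here; a plausible setup, based on the commands used earlier on this page, would be the following (the exact steps may differ at your site):

setupATLAS
source /opt/osg_v1.2/setup.sh   # provides the xrdcp/xprep/xrd commands from the OSG cache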

Note that, as in the case of setting up DQ2, you probably want to log out and back in before you begin to work on Athena.

Now you can do: code21

which lists entries that are directories; the files in one of those directories are shown in code22. (Here ascvmxrdr.hep.anl.gov is particular to ANL ASC. Ask your administrator where your !XrootD redirector is.)
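The original listings (code21 and code22) are not reproduced here. Assuming the classic xrd client (which provides a dirlist subcommand) is available from the setup above, browsing through the redirector might look like this; the paths are illustrative:

xrd ascvmxrdr.hep.anl.gov dirlist /atlas/dq2
xrd ascvmxrdr.hep.anl.gov dirlist /atlas/dq2/user/bdouglas/TST/user.bdouglas.TST.20101120_1415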

Running Athena jobs on files stored on XrootD

The Athena package will automatically recognize the !XrootD format of specifying the input file(s). There is no need for special !XrootD setup when running Athena.

The XrootD files are specified (for example) as code23
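The original snippet (code23) is not reproduced here. In a typical Athena jobOptions file of this era the input files are given as root:// URLs, roughly as in the following sketch; the redirector name and file path are placeholders:

from AthenaCommon.AppMgr import ServiceMgr
ServiceMgr.EventSelector.InputCollections = [
    "root://<head node name>//atlas/dq2/<global name space path>/<your file>.root",
]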
