====== Tuning TCP on Linux ======
To achieve the maximum possible network speed, you must tune your Linux box (assuming that you have a 1 Gbit network switch).
Open the file /etc/sysctl.conf and add the lines:
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
and then run "sysctl -p". This increases the maximum TCP buffer size to 16 MB.
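The two steps above can also be done non-interactively; a sketch, to be run as root (the three numbers in each setting are the minimum, default, and maximum buffer sizes in bytes, so 16777216 is the 16 MB maximum):

```shell
# Append the TCP buffer settings to /etc/sysctl.conf (run as root).
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
EOF

# Load the new settings and print them back to verify.
sysctl -p
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
```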
Another thing that may help increase TCP throughput is to increase the size of the interface transmit queue. To do this, run:
ifconfig eth0 txqueuelen 1000
For large files, this tuning increases performance by a factor of 4-5 compared to non-tuned TCP. The above tuning was tested on Scientific Linux 4.7.
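Note that newer Linux distributions may not ship ifconfig; the same setting can be applied with iproute2 (assuming the interface is named eth0):

```shell
# Set the transmit queue length with iproute2 (requires root).
ip link set dev eth0 txqueuelen 1000

# Confirm the new value (look for the "qlen" field in the output).
ip link show eth0
```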
Please look at the ESnet TCP tuning recommendations for more details and updates.
===== Getting data from a Tier1/2 =====
The major challenge for the T3G is to get ATLAS data to the PC farms. A good starting point is to use dq2-get to copy the data to some temporary storage, split the data using ArCond, and then "scp" the data to the PC farm computers. This approach does not scale well and is not appropriate if you do not have large data storage. Therefore, we will discuss how to download data directly to each PC farm box in parallel. This can be done with ArCond and dq2-get.
This task is supposed to be done by a few people (root privileges are not needed) who will be responsible for maintaining data on the cluster. In a correctly configured PC farm, regular users should not be able to log in on the PC farm machines, and thus cannot put data sets on the cluster. Of course, they can keep their private data sets on the worker node, which should have enough scratch disk space.
We assume that you already have a functional T3G with a worker node and several PC farm computers. We also assume that all necessary software is correctly installed. Below we discuss the steps necessary for downloading data in parallel onto the PC farm boxes:
1. Set up passwordless login to all PC nodes using ssh-agent.
* Create a key pair on an interactive worker machine. Do this on the host that you want to connect from. Always protect your keys with a password.
$ ssh-keygen -t dsa -f ~/.ssh/id_dsa
Generating DSA keys: Key generation complete.
Enter passphrase (empty for no passphrase): USE-A-PASSPHRASE
Enter same passphrase again: USE-A-PASSPHRASE
Your identification has been saved in ~/.ssh/id_dsa
Your public key is:
1024 35 [really long string] you@example.com
Your public key has been saved in ~/.ssh/id_dsa.pub
Make sure the public key is in the ~/.ssh/authorized_keys file on the hosts you wish to connect to. You can use a password authenticated connection to do this:
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
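If your home directory is not shared between the machines (for example via AFS or NFS), the public key has to be installed on each host separately; a sketch using ssh-copy-id, with the same example host names used further below:

```shell
# Install the public key on each PC farm node; you will be asked
# for your login password once per host.
for host in atlas1.name.org atlas2.name.org atlas3.name.org; do
    ssh-copy-id -i ~/.ssh/id_dsa.pub "$host"
done
```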
Verify that DSA authentication works. Try to log in on any PC farm machine. Since your home directory is on AFS, you should see a prompt for your DSA passphrase:
$ ssh pcfarm@example.com
Enter passphrase for DSA key 'you@example.com': ^D
$
(If you do not get the prompt for your DSA key, then something has gone wrong.)
* Run ssh-agent to cache login credentials for the session. The simplest solution is to put the following lines into .bash_profile
SSHAGENT=/usr/bin/ssh-agent
SSHAGENTARGS="-s"
if [ -z "$SSH_AUTH_SOCK" -a -x "$SSHAGENT" ]; then
eval `$SSHAGENT $SSHAGENTARGS`
trap "kill $SSH_AGENT_PID" 0
fi
This brings SSH_AUTH_SOCK and SSH_AGENT_PID into the current shell as environment variables. The trap should kill off any remaining ssh-agent process on exit. If it does not, you can add this line to .logout:
kill $SSH_AGENT_PID
* Finally, add your key using ssh-add and type your passphrase when prompted.
$ ssh-add ~/.ssh/id_dsa
Need passphrase for /home/mah/.ssh/id_dsa (you@example.com).
Enter passphrase:
$
* Now, you should test it:
$ ssh pcfarm@example.com
No mail.
[pcfarm@example.com]$
As you can see, you need to type the passphrase only once, when adding the key with ssh-add; after that you can log in on any PC farm machine without a password. Next, create a file hosts-file listing the host names of the computers whose local disks will be used to keep the data. For example:
atlas1.name.org
atlas2.name.org
atlas3.name.org
Check your setup. Type:
arc_ssh -h hosts-file -l date
It will show that the bash command date is executed on each host specified in the file hosts-file. You are now ready to start downloading the files. Set up the ATLAS release, the OSG client, and ArCond, and obtain a grid proxy. Make a directory (any name), go inside, and type:
arc_setup
cp $ARCOND_SYS/etc/arcond/share/send_dq2.sh .
You will need to edit send_dq2.sh to specify:
* the input data set;
* the site from which the files will be copied;
* where the data set should be placed on your PC farm nodes;
* which computers should be used to store the data.
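As an illustration only, the four edits typically amount to setting shell variables like the ones below; every name and value here is a hypothetical placeholder, since the real variable names are defined inside send_dq2.sh itself:

```shell
# Hypothetical placeholders -- open send_dq2.sh and edit the real
# variables it defines.
DATASET="some.dataset.name"                               # input data set
SITE="SOME_GRID_SITE"                                     # site to copy the files from
DATA_DIR="/data/datasets"                                 # target directory on each farm node
HOSTS="atlas1.name.org atlas2.name.org atlas3.name.org"   # computers used to store the data
```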
Open the file send_dq2.sh and edit the corresponding lines. Finally, try to download the data set as:
arc_ssh -h hosts-file -l -t200 -o /tmp/log "exec /send_dq2.sh"
Here we tell arc_ssh that dq2-get needs 200 seconds to download the data on each Linux box. The path in front of send_dq2.sh is the directory where your send_dq2.sh script is located.
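Once the command returns, a quick way to check the result is to list the download area on every node with the same arc_ssh mechanism used above (a sketch; /data/datasets is a hypothetical path, use whatever directory you configured in send_dq2.sh):

```shell
# List the download directory on each host in hosts-file.
arc_ssh -h hosts-file -l "ls -l /data/datasets"
```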
Of course, one can also download data to a local central storage and distribute data uniformly. This can be done as described on the page Help for Administrators.
-- SergeiChekanov - 22 Apr 2009
-- SergeiChekanov - 02 Apr 2009