Tuning TCP on Linux

To achieve the maximum possible network speed, you need to tune your Linux box (assuming that you have a 1 Gb network switch).

Open the file /etc/sysctl.conf and add the lines:

  
 net.ipv4.tcp_rmem = 4096 87380 16777216
 net.ipv4.tcp_wmem = 4096 65536 16777216
 

and then run “sysctl -p”. This increases the maximum TCP buffer size to 16 MB.
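When you run “sysctl -p” the kernel echoes the settings it applies, and you can also read the values back from the proc filesystem as a sanity check (the output below assumes the settings given above):

 $ sysctl -p
 net.ipv4.tcp_rmem = 4096 87380 16777216
 net.ipv4.tcp_wmem = 4096 65536 16777216
 $ cat /proc/sys/net/ipv4/tcp_rmem
 4096    87380   16777216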

Another thing that may help increase TCP throughput is to increase the size of the interface transmit queue. To do this, run:

 
     ifconfig eth0 txqueuelen 1000

For large files, this can increase performance by a factor of 4-5 compared to untuned TCP. The above tuning was tested on Scientific Linux 4.7.
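You can confirm the new queue length in the interface report; the exact output format depends on your net-tools version:

      $ ifconfig eth0 | grep txqueuelen
      collisions:0 txqueuelen:1000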

Please look at the ESnet TCP-tuning recommendations for more details and updates.

Getting data from a Tier1/2

The major challenge for the T3G is to get ATLAS data to PC farms. A good starting point is to use dq2-get to copy the data to some temporary storage, split the data using ArCond, and then “scp” the data to the PC farm computers. This approach does not scale well, and is not appropriate if you do not have large data storage. Therefore, we will discuss how to download data directly to each PC farm box in parallel. This can be done with ArCond and dq2-get.
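For reference, a minimal sketch of this manual approach might look as follows (the data set name and host name are placeholders, not real examples):

         dq2-get user.SomeName.SomeDataSet/
         scp -r user.SomeName.SomeDataSet atlas1.name.org:/data1/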

This task is supposed to be done by a few people (who do not need to be root administrators) responsible for maintaining data on the cluster. In a correctly configured PC farm, ordinary users should not be able to log in on the PC farm machines, and thus should not be able to put data sets on the cluster. Of course, they can keep their private datasets on the worker node, which should have enough scratch disk space.

We assume that you already have a functional T3G with a worker node and several PC farm computers. We also assume that all necessary software is correctly installed. Below we discuss the steps necessary for downloading data in parallel on the PC farm boxes:

 1. Set up passwordless login to all PC nodes using ssh-agent.
              
              $ ssh-keygen -t dsa -f ~/.ssh/id_dsa
              Generating DSA keys:  Key generation complete.
              Enter passphrase (empty for no passphrase): USE-A-PASSPHRASE
              Enter same passphrase again: USE-A-PASSPHRASE
              Your identification has been saved in ~/.ssh/id_dsa
              Your public key is:
              1024 35 [really long string] user@host
              Your public key has been saved in ~/.ssh/id_dsa.pub

Make sure the public key is in the ~/.ssh/authorized_keys file on the hosts you wish to connect to. You can use a password-authenticated connection to do this:

              $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
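If sshd on the PC farm machines enforces strict file modes (the OpenSSH default), the .ssh directory and the authorized_keys file must not be writable by others, so it is worth checking the permissions:

              $ chmod 700 ~/.ssh
              $ chmod 600 ~/.ssh/authorized_keys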

Verify that DSA authentication works. Try to log in on any PC farm machine. Since your home directory is on AFS, the same key is visible on all machines, and you should see a prompt for your DSA key passphrase:

            $ ssh user@atlas1.name.org
            Enter passphrase for DSA key 'user@host': ^D
            $

(If you do not get the prompt for your DSA key, then something has gone wrong.) Next, arrange for ssh-agent to be started automatically when you log in, by adding the following lines to your shell startup file (e.g. ~/.bash_profile):

      SSHAGENT=/usr/bin/ssh-agent
      SSHAGENTARGS="-s"
      if [ -z "$SSH_AUTH_SOCK" -a -x "$SSHAGENT" ]; then
               eval `$SSHAGENT $SSHAGENTARGS`
               trap "kill $SSH_AGENT_PID" 0
      fi

This brings SSH_AUTH_SOCK and SSH_AGENT_PID into the current shell as environment variables. The trap should kill off any remaining ssh-agent process when the shell exits. If it does not, you can add this line to your .logout:

       kill $SSH_AGENT_PID
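At this point the agent is running but does not yet hold any identities; you can check what it holds at any time with ssh-add -l:

        $ ssh-add -l
        The agent has no identities.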
 
Now load your key into the agent and verify that you can log in without a password:

        $ ssh-add ~/.ssh/id_dsa
        Need passphrase for ~/.ssh/id_dsa (user@host).
        Enter passphrase:
        $ ssh user@atlas1.name.org
        No mail.
        [user@atlas1]$
As you can see, you need to type the passphrase only once, when adding the key with ssh-add; after that you can log in on any PC farm box without a password. Create a file, hosts-file, in which you put the host names of the computer boxes with local disks to keep data. For example:

 
        atlas1.name.org
        atlas2.name.org
        atlas3.name.org

Check that parallel ssh works. Type:

        arc_ssh -h hosts-file -l <your-user-name> date

This shows that the shell command date is executed on each host specified in the file hosts-file. You are now ready to download the files. Set up the ATLAS release, the OSG client, and ArCond, and obtain a Grid proxy. Then make a directory (any name), go inside, and type:

        arc_setup
        cp $ARCOND_SYS/etc/arcond/share/send_dq2.sh .
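The exact setup commands are site specific; for the Grid-proxy step a typical sequence might be (the setup-script path is an assumption for illustration):

        source /share/osg-client/setup.sh     # site-specific OSG client setup (assumed path)
        voms-proxy-init -voms atlas           # obtain a Grid proxy for the ATLAS VO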

You will need to edit send_dq2.sh to specify your settings: open the file and edit the corresponding lines (in particular, the name of the data set to download).
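
The details of send_dq2.sh depend on your ArCond release; the sketch below only illustrates the kind of commands such a script runs (the data set name and local storage directory are illustrative assumptions, not the actual contents):

        #!/bin/bash
        # send_dq2.sh (sketch): fetch a data set onto the local disk of this node
        STORAGE=/data1/datasets                  # assumed local data directory
        DATASET=user.SomeName.SomeDataSet/       # hypothetical data set name
        mkdir -p $STORAGE && cd $STORAGE
        dq2-get $DATASET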

Finally, try to download the data set as:

             arc_ssh -h hosts-file -l <your-user-name> -t200 -o /tmp/log "exec <full-path>/send_dq2.sh"

Here the option -t200 tells arc_ssh to allow 200 seconds for dq2-get to download the data on each Linux box, and <full-path> is the path to the directory where your send_dq2.sh script is located.
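Once the command finishes, you can check that the data actually arrived on each node by reusing arc_ssh (the data directory below is an assumption; use whatever path your send_dq2.sh writes to):

             arc_ssh -h hosts-file -l <your-user-name> "ls -l /data1/datasets"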

Of course, one can also download data to a local central storage and then distribute it uniformly. This can be done as described on the page Help for Administrators.

– SergeiChekanov - 22 Apr 2009
– SergeiChekanov - 02 Apr 2009