Title: gLite Data Management Commands
Section: gLite Advanced Practical
Tutor: Tony Calanducci
Authors: Valeria Ardizzone, Tony Calanducci, Diego Scardaci


File Catalog Interactions

Before you start interacting with the gLite File Catalog (LFC), you need a valid proxy with VOMS extensions. As you learned in the introductory exercise, you can check the validity of your proxy with:

voms-proxy-info -all

If you don't have a valid proxy yet or it is expired, type:

voms-proxy-init --voms gilda
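
If you prefer to script this check, the following is a minimal sketch and not part of the tutorial's official material. It assumes that voms-proxy-info accepts the --timeleft option and prints the remaining lifetime in seconds; the helper name proxy_expiring and the one-hour default threshold are hypothetical choices.

```shell
# Hypothetical helper: succeed when fewer than $2 seconds (default 3600)
# remain out of the $1 seconds reported for the proxy.
proxy_expiring() {
    [ "$1" -lt "${2:-3600}" ]
}

# Only query the proxy where the VOMS client tools are installed.
if command -v voms-proxy-info >/dev/null 2>&1; then
    # Assumption: --timeleft prints the remaining lifetime in seconds.
    left=$(voms-proxy-info --timeleft 2>/dev/null) || left=0
    # Treat empty or non-numeric output as "no valid proxy".
    case "$left" in ''|*[!0-9]*) left=0 ;; esac
    if proxy_expiring "$left"; then
        echo "Proxy missing or about to expire: run voms-proxy-init --voms gilda"
    else
        echo "Proxy valid for another $left seconds"
    fi
fi
```

On a machine without the VOMS clients the snippet prints nothing; the helper can still be called directly, e.g. proxy_expiring 10 succeeds while proxy_expiring 7200 fails.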

Make sure the following environment variables are set correctly:

  • export LFC_HOST=lfc-gilda.ct.infn.it
  • export LCG_CATALOG_TYPE=lfc
The first variable ($LFC_HOST) specifies the File Catalog server to use, while the second indicates the type of file catalog (currently only lfc is supported). These variables should already be set to the correct values on all your workstations.
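
A quick way to verify the two variables is sketched below. The function name check_grid_env is hypothetical (it is not a gLite command); the variable names and values are the ones given in this section.

```shell
# Hypothetical helper (not a gLite command): report which of the
# required catalog variables are unset or empty in the current shell.
check_grid_env() {
    missing=0
    for name in LFC_HOST LCG_CATALOG_TYPE; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return "$missing"
}

# Values taken from this tutorial; adjust them for another site.
export LFC_HOST=lfc-gilda.ct.infn.it
export LCG_CATALOG_TYPE=lfc
check_grid_env
```

The function returns a non-zero status when at least one variable is missing, so it can be used as a guard at the top of your own scripts.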

Listing the entries of an LFC directory

lfc-ls [-cdiLlRTu] [--comment] /grid/VO_NAME/<the_directory_you_want_to_list>
The LFC File Catalog uses a hierarchical directory structure, like a regular Unix filesystem. File Catalog administrators follow the convention of having a /grid root directory and, under it, one directory for each VO the File Catalog is supposed to support (e.g. gilda, eela, infngrid, ...). Starting from /grid/VO_NAME, users belonging to that VO can create their own directories and register their own entries. E.g.:

$ lfc-ls -l /grid/gilda/tony 
-rwx------ 1 101 102 611283 Mar 10 15:55 0511200001.JPG
-rw----r-- 1 101 102 463583779 Jun 26 23:27 UIPnPcomb3.tar.gz
-rwxr-x--- 1 101 102 0 Mar 08 17:18 empty.txt
-rw----r-- 1 101 102 340 Jun 12 13:54 hostname.jdl
-rw----r-- 1 101 102 385 Apr 03 19:03 mpi.jdl
-rw----r-- 1 101 102 1861870 Jul 07 19:47 pil1.dat
-rw----r-- 1 101 102 611283 Apr 24 17:55 prova3.jpg
-rw----r-- 1 101 102 38 Apr 23 19:17 runshell.sh
-rw----r-- 1 101 102 2367 Jul 10 13:12 simple.dat
-rw----r-- 1 101 102 787112 Jul 10 13:40 simple2.dat

lfc-ls has several options; man lfc-ls gives more information. A common one is -l, which enables long listing and shows permissions, owner, group, size and timestamp for each entry. -R lists directories recursively, but please use it carefully! Relative paths can also be used. For that purpose, you can define the environment variable $LFC_HOME to point to your home catalog directory. For example:
$ export LFC_HOME=/grid/gilda/tony
$ lfc-ls -l simple.dat
-rw----r-- 1 101 102 2367 Jul 10 13:12 simple.dat
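
To make the relative-path rule concrete, here is a small sketch of the path a relative name is expected to map to once $LFC_HOME is set. The helper lfc_abs_path is hypothetical and only manipulates strings; it does not contact the catalog.

```shell
# Hypothetical helper: show the absolute catalog path that a
# relative name is expected to resolve to when LFC_HOME is set.
lfc_abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;                  # already absolute
        *)  printf '%s\n' "${LFC_HOME%/}/$1" ;;    # relative to $LFC_HOME
    esac
}

export LFC_HOME=/grid/gilda/tony
lfc_abs_path simple.dat     # prints /grid/gilda/tony/simple.dat
lfc_abs_path /grid/gilda    # prints /grid/gilda
```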
TO DO: Take a look inside the catalog using lfc-ls

LFC Directory Management

lfc-mkdir [-m mode] [-p] dirname...
lfc-rm [-f] [-i] -r dirname...

The two commands are self-explanatory. Again, man can give you more details about the meaning of the flags.
We have already set up a directory in the LFC, called /grid/gilda/ischia06.

TO DO: Create your own home LFC directory under the /grid/gilda/ischia06 directory using your name or surname or account name.

You will use that dir for the following exercises.
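
One possible way to carry out this step is sketched below. It assumes your local account name is an acceptable directory name; the helper my_lfc_dir is hypothetical. The catalog commands are only run when the LFC client tools are installed.

```shell
# Base directory prepared for this tutorial (from the text above).
BASE_DIR=/grid/gilda/ischia06

# Hypothetical helper: build the path of your personal catalog directory.
# With no argument it falls back to the local account name.
my_lfc_dir() {
    printf '%s/%s\n' "$BASE_DIR" "${1:-$(id -un)}"
}

target=$(my_lfc_dir)
echo "Catalog directory to create: $target"

# Only call the catalog when the LFC client tools are installed.
if command -v lfc-mkdir >/dev/null 2>&1; then
    # -p creates missing parents; -l -d lists the directory entry itself.
    lfc-mkdir -p "$target" && lfc-ls -l -d "$target"
fi
```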

Summary of the LFC commands

For completeness here is a table with all the available LFC related commands.

lfc-chmod         Change access mode of a file/directory
lfc-chown         Change owner and group of a file/directory
lfc-delcomment    Delete the comment associated with a file/directory
lfc-getacl        Get file/directory access control lists
lfc-ln            Make a symbolic link to a file/directory
lfc-ls            List file/directory entries in a directory
lfc-mkdir         Create a directory
lfc-rename        Rename a file/directory
lfc-rm            Remove a file/directory
lfc-setacl        Set file/directory access control lists
lfc-setcomment    Add/replace a comment


Storage Elements Interactions

gLite provides another set of tools to deal with Storage Elements and File Catalogs: the so-called lcg_utils tools.
These commands allow users, and jobs running on Worker Nodes, to copy files between a User Interface or Worker Node and a Storage Element, to replicate files among Storage Elements, and to register every completed operation atomically in the File Catalog.

Make sure the following environment variable is set correctly:

  • export LCG_GFAL_INFOSYS=dualxeon.gs.unina.it:2170
This variable specifies the Information System server (the BDII) used by the lcg_utils tools.

Upload a file to a Storage Element and register it into the file catalog: lcg-cr (Copy & Register)

lcg-cr [-v | --verbose] -d <destination_host> -l <logicalFileName> --vo <vo_name> <src_file>
where:
  • destination_host is the fully qualified hostname of the destination SE
  • logicalFileName specifies the Logical File Name associated with the file
  • vo_name specifies the Virtual Organization the user belongs to
  • src_file specifies the source file name: the protocol can be file:/// or gsiftp:///
To discover which SEs the user is allowed to use, you can use the lcg-infosites command:
$ lcg-infosites --vo gilda se
Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
52950000 2860000 n.a grid038.ct.infn.it
2980000000 870000000 n.a aliserv6.ct.infn.it
69340000 3950000 n.a gildase.oact.inaf.it
28059764 3100016 n.a testbed005.cnaf.infn.it
60410000 4860000 n.a opteron.gs.unina.it
131660000 7080000 n.a grid-se.bio.dist.unige.it

The output is a list of SEs and related information on available/used space.
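
If you want to pick an SE automatically, the listing can be filtered with awk. The sketch below works on the sample output copied from above and assumes the four-column layout shown there (available space, used space, type, hostname); the helper name best_se is hypothetical.

```shell
# Sample copied from the listing above; on a User Interface you could
# pipe the live output of "lcg-infosites --vo gilda se" instead.
sample='Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
52950000 2860000 n.a grid038.ct.infn.it
2980000000 870000000 n.a aliserv6.ct.infn.it
69340000 3950000 n.a gildase.oact.inaf.it
28059764 3100016 n.a testbed005.cnaf.infn.it
60410000 4860000 n.a opteron.gs.unina.it
131660000 7080000 n.a grid-se.bio.dist.unige.it'

# Hypothetical helper: print the SE with the most available space.
# Header and separator lines are skipped because their first field
# is not a plain number.
best_se() {
    awk '$1 ~ /^[0-9]+$/ && NF == 4 && $1 + 0 > max { max = $1 + 0; se = $4 }
         END { if (se != "") print se }'
}

printf '%s\n' "$sample" | best_se    # prints aliserv6.ct.infn.it
```

Available space is only one criterion; in practice you may prefer an SE that is close to the site where your jobs run.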

lcg-cr usage example:

$ touch myTest.dat
$ lcg-cr -v -d opteron.gs.unina.it -l lfn:/grid/gilda/ischia06/tcaland/myTest.dat \
         --vo gilda file://$PWD/myTest.dat
Using grid catalog type: lfc
Using grid catalog : lfc-gilda.ct.infn.it
Source URL: file:///home/users/tcaland/myTest.dat
File size: 0
VO name: gilda
Destination specified: opteron.gs.unina.it
Destination URL for copy: gsiftp://opteron.gs.unina.it/opteron.gs.unina.it:/storage/gilda \
         /2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7.115.0
# streams: 1
# set timeout to 0 seconds
Alias registered in Catalog: lfn:/grid/gilda/ischia06/tcaland/myTest.dat
            0 bytes      0.00 KB/sec avg      0.00 KB/sec inst
Transfer took 9110 ms
Destination URL registered in Catalog: srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda \
         /generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7
guid:b1391277-52ed-4030-9ca2-56c0356d2c41
$ lfc-ls -l /grid/gilda/ischia06/tcaland/
-rw-rw-r-- 1 479 102 0 Jul 16 12:16 myTest.dat

In the previous example, we created a local file called myTest.dat, uploaded it to a grid Storage Element whose hostname is opteron.gs.unina.it, and registered the uploaded file in the File Catalog with the Logical File Name lfn:/grid/gilda/ischia06/tcaland/myTest.dat, inside the LFC directory /grid/gilda/ischia06/tcaland that I had created previously. I also used the -v flag to get more verbose output and the parameter --vo gilda because I am a member of the gilda Virtual Organization.
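
The upload command has many parts, so it can help to assemble it from its pieces before running it. The following sketch only prints the command line (a dry run); the helper build_upload_cmd and its argument order are hypothetical, while the flags are the ones from the synopsis above.

```shell
# Hypothetical helper: print the lcg-cr command line for a local file.
# Arguments: <destination SE> <LFC directory> <local file> [VO, default gilda]
build_upload_cmd() {
    se=$1; lfc_dir=$2; local_file=$3; vo=${4:-gilda}
    case "$local_file" in
        /*) abs=$local_file ;;          # already an absolute path
        *)  abs=$PWD/$local_file ;;     # file:// needs an absolute path
    esac
    # The logical name reuses the local file name inside the LFC directory.
    printf 'lcg-cr -v -d %s -l lfn:%s/%s --vo %s file://%s\n' \
        "$se" "${lfc_dir%/}" "$(basename "$local_file")" "$vo" "$abs"
}

cmd=$(build_upload_cmd opteron.gs.unina.it /grid/gilda/ischia06/tcaland myTest.dat)
echo "$cmd"
```

Printing the command first lets you check the SE, the logical name and the source URL before anything is copied; paste the printed line into your shell once it looks right.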

In the output of lcg-cr, please notice the GUID that was assigned to the file (guid:b1391277-52ed-4030-9ca2-56c0356d2c41), its SURL (srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda/generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7) and the TURL (gsiftp://opteron.gs.unina.it/opteron.gs.unina.it:/storage/gilda/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7.115.0).

As said during the theoretical part this morning, the GUID is a non-human-readable string generated by the File Catalog and guaranteed to be unique; the SURL (Site URL) tells you on which SE the file is actually stored; and the TURL (Transport URL or Temporary URL) tells you which protocol is used to access or transfer the file. It is temporary because the file's location inside the SE can be changed by the SE daemons according to the internal policy set up by the administrator (for example, the file may initially be stored on disk1 of a disk array and later be moved to disk2, or to a tape in the library, etc.).
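
The different kinds of names can be told apart by their prefix. The sketch below classifies a string using only the prefixes that appear in this tutorial's examples (other SEs may expose TURLs with other protocols); the helper grid_name_kind is hypothetical.

```shell
# Hypothetical helper: tell which kind of grid file name a string is,
# judging only by the prefixes used in this tutorial.
grid_name_kind() {
    case "$1" in
        lfn:*)      echo LFN  ;;    # Logical File Name, kept in the catalog
        guid:*)     echo GUID ;;    # unique identifier of the grid file
        srm://*)    echo SURL ;;    # where a replica lives on a given SE
        gsiftp://*) echo TURL ;;    # protocol-specific address for the transfer
        *)          echo unknown ;;
    esac
}

grid_name_kind lfn:/grid/gilda/ischia06/tcaland/myTest.dat    # prints LFN
grid_name_kind guid:b1391277-52ed-4030-9ca2-56c0356d2c41       # prints GUID
grid_name_kind srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda/generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7    # prints SURL
```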

TO DO: Create a new local file, or use an existing one on your UI, upload it to an SE, and register it in the Catalog using lcg-cr.

Definition: A grid file is a file that is stored on a Storage Element AND is registered in a File Catalog (has an assigned Logical File Name).


Make a replica of a grid file on another Storage Element: lcg-rep

lcg-rep [-v | --verbose] -d <destination_host> --vo <vo_name> <src_file>

where:
  • destination_host is the fully qualified hostname of the destination SE
  • vo_name specifies the Virtual Organization the user belongs to
  • src_file specifies the source file name of the file we want to replicate: the protocol can be LFN, GUID or SURL

For example, let's replicate on a different Storage Element the file that we have previously uploaded:

$ lcg-rep -v  -d aliserv6.ct.infn.it --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
Using grid catalog type: lfc
Using grid catalog : lfc-gilda.ct.infn.it
Source URL: lfn:/grid/gilda/ischia06/tcaland/myTest.dat
File size: 0
VO name: gilda
Destination specified: aliserv6.ct.infn.it
Source URL for copy: 
gsiftp://opteron.gs.unina.it/opteron.gs.unina.it:/storage/gilda \
         /2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7.115.0
Destination URL for copy: 
gsiftp://aliserv6.ct.infn.it/aliserv6.ct.infn.it:/gpfs/dpm/gilda \
        /2006-07-16/filed270060a-d8ee-4d59-8506-7cc68f47bdb8.46643.0
# streams: 1
# set timeout to 0
            0 bytes      0.00 KB/sec avg      0.00 KB/sec inst
Transfer took 12130 ms
Destination URL registered in LRC: srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda \
     /generated/2006-07-16/filed270060a-d8ee-4d59-8506-7cc68f47bdb8

To inspect how many replicas a Grid File has and where they are located, you can use lcg-lr (List Replicas):

lcg-lr --vo <vo_name> <src_file>

where
  • vo_name specifies the Virtual Organization the user belongs to
  • src_file specifies the Logical File Name, the Grid Unique IDentifier or the Site URL of the file whose replicas we want to list.

For example, let's check if the previously issued replica command was successful:

$ lcg-lr --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda \
    /generated/2006-07-16/filed270060a-d8ee-4d59-8506-7cc68f47bdb8
srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda \
    /generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7

Yes, it was: we do have two replicas (one on aliserv6 and one on opteron) of the grid file lfn:/grid/gilda/ischia06/tcaland/myTest.dat.
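
When a file has many replicas, it is handy to reduce the lcg-lr output to the list of SE hostnames. The sketch below assumes lcg-lr prints one SURL per line (the backslashes in the listing above only mark wrapping for display); the helper replica_hosts is hypothetical.

```shell
# Hypothetical helper: read SURLs (one per line) on standard input
# and print the hostname of the SE that holds each replica.
replica_hosts() {
    sed -n 's|^srm://\([^/:]*\).*|\1|p'
}

# SURLs from the example above, written on single lines.
surls='srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2006-07-16/filed270060a-d8ee-4d59-8506-7cc68f47bdb8
srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda/generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7'

printf '%s\n' "$surls" | replica_hosts
echo "replicas: $(printf '%s\n' "$surls" | replica_hosts | wc -l | tr -d ' ')"
```

On a User Interface you could feed the helper directly, for example with lcg-lr --vo gilda followed by your LFN and a pipe into replica_hosts.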

TO DO: Make two replicas of the file you have previously uploaded and registered.

Downloading a Grid file from an SE to a local destination (UI or WN)

If you want to download a Grid file saved on a Storage Element to your User Interface, or you need to do that from a job running on a Worker Node, you can use the following command:

lcg-cp [ -v | --verbose ] --vo <vo_name> <src_file> <dest_file>

where
  • vo_name specifies the Virtual Organization the user belongs to
  • src_file specifies the source file name: the protocol can be LFN, GUID, SURL or local file
  • dest_file specifies the destination: the protocol can be file:/// or gsiftp:///

Example:

$ lcg-cp -v --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat file://$PWD/myTest2.dat
Using grid catalog type: lfc
Using grid catalog : lfc-gilda.ct.infn.it
Source URL: lfn:/grid/gilda/ischia06/tcaland/myTest.dat
File size: 0
VO name: gilda
Source URL for copy: gsiftp://gildase.oact.inaf.it/gildase.oact.inaf.it:/data01/gilda \
         /2006-07-16/file4f2e4f41-5ef2-4afe-bf5d-4180836387e2.40.0
Destination URL: file:///home/users/tcaland/myTest2.dat
# streams: 1
# set timeout to 0 (seconds)
0 bytes 0.00 KB/sec avg 0.00 KB/sec inst
Transfer took 2030 ms
$ ls -la myTest*
-rw-r--r-- 1 tcaland users 0 Jul 16 13:30 myTest2.dat
-rw-r--r-- 1 tcaland users 0 Jul 16 12:16 myTest.dat

TO DO: Download to your home directory the file you have previously uploaded and registered.


How to delete replicas

lcg-del [ -a ] | [ -s se ] [ -v | --verbose ] --vo <vo_name> <file>

where
  • -a is used to delete all replicas of the given file. The entry in the file catalog will also be deleted
  • se specifies the SE from which you want to remove the replica
  • vo_name specifies the Virtual Organization the user belongs to
  • file specifies the Logical File Name, the Grid Unique IDentifier or the Site URL.

Example:
Let me add one more replica to myTest.dat (to have 3 replicas in total):

$ lcg-rep -v -d gildase.oact.inaf.it --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
$ lcg-lr --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda \
     /generated/2006-07-16/filed270060a-d8ee-4d59-8506-7cc68f47bdb8
srm://gildase.oact.inaf.it/dpm/oact.inaf.it/home/gilda \
     /generated/2006-07-16/file1f0333e4-89c6-4fcf-ae86-a5f9938206a8
srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda \
     /generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7

  • Let's delete just one replica:
$ lcg-del -v -s aliserv6.ct.infn.it --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
VO name: gilda
set timeout to 0 seconds
$ lcg-lr --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
srm://gildase.oact.inaf.it/dpm/oact.inaf.it/home/gilda \
     /generated/2006-07-16/file1f0333e4-89c6-4fcf-ae86-a5f9938206a8
srm://opteron.gs.unina.it/dpm/gs.unina.it/home/gilda \
     /generated/2006-07-16/file5fa84a9a-376d-44e9-9381-a4a8262731e7
  • Let's delete the last two replicas at once:

$ lcg-del -v -a --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
VO name: gilda
set timeout to 0 seconds
$ lcg-lr --vo gilda lfn:/grid/gilda/ischia06/tcaland/myTest.dat
lcg_lr: No such file or directory
$ lfc-ls -l /grid/gilda/ischia06/tcaland/
$
(myTest.dat is no longer listed)

TO DO: Delete the replicas of the file you have previously uploaded and registered and make sure that it is also removed from the File Catalog
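
The two deletion modes differ only in one flag, which the following dry-run sketch makes explicit. The helper build_delete_cmd is hypothetical and just prints the command; the VO is fixed to gilda as in this tutorial.

```shell
# Hypothetical helper: print the lcg-del command for one replica
# (when an SE is given) or for all replicas plus the catalog entry.
# Arguments: <file name (LFN, GUID or SURL)> [SE hostname]
build_delete_cmd() {
    if [ -n "${2:-}" ]; then
        # -s removes only the replica stored on the given SE.
        printf 'lcg-del -v -s %s --vo gilda %s\n' "$2" "$1"
    else
        # -a removes every replica and the File Catalog entry.
        printf 'lcg-del -v -a --vo gilda %s\n' "$1"
    fi
}

lfn=lfn:/grid/gilda/ischia06/tcaland/myTest.dat
build_delete_cmd "$lfn" aliserv6.ct.infn.it
build_delete_cmd "$lfn"
```

Because -a cannot be undone, printing the command before running it is a cheap safeguard.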

 
