ISSGC'06 :: Integrating Practical

Title:	Integrating Practical
Subtitle:	The Search for Knowledge - gLite flavour
Tutor:	Tony Calanducci, Diego Scardaci
Authors:	Tony Calanducci, Diego Scardaci, Valeria Ardizzone

The Search for Knowledge with gLite

The gLite approach to the Search for Knowledge problem is data intensive, so an additional step is needed to generate the data. For this purpose we provide an additional program, SampleGen, which creates samples of data, saves them on Storage Elements and registers them in the LFC File Catalog.

You can find all the needed files in the integrating.zip package. Add the full paths of the sfk.jar and gfal.jar packages to your CLASSPATH environment variable. Also add the directory containing libgfalfile.so to your LD_LIBRARY_PATH environment variable. gfal.jar and libgfalfile.so are needed because our Java classes use GFAL to remotely read/write files from/to Storage Elements.

Also make sure you have a valid proxy before using the SampleGen (and PillarFinder) programs; otherwise you will get an unknown exception.

export CLASSPATH=$HOME/integrating/gfal.jar:$HOME/integrating/sfk.jar:$CLASSPATH
export LD_LIBRARY_PATH=$HOME/integrating/lib/:$LD_LIBRARY_PATH
export LCG_GFAL_VO=gilda
export LCG_RFIO_TYPE=dpm
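
Before launching SampleGen or PillarFinder it can help to confirm that the four variables exported above are actually visible to the JVM. The following is a minimal sketch; EnvCheck is a hypothetical helper that is not part of sfk.jar, and it only checks that the variables are set, not that their values are correct.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// EnvCheck is a hypothetical helper, not part of sfk.jar.
class EnvCheck {
    // Variable names taken from the export lines above.
    static final String[] REQUIRED = {
        "CLASSPATH", "LD_LIBRARY_PATH", "LCG_GFAL_VO", "LCG_RFIO_TYPE"
    };

    // Returns the names of the required variables that are unset or empty.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All required variables are set");
        } else {
            System.out.println("Missing or empty: " + missing);
        }
    }
}
```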

SampleGen

SampleGen takes the following parameters:

center-x, x-coordinate from which to start sampling
center-y, y-coordinate from which to start sampling
radius, radius of the sampled area
step, how big of a jump to make while sampling
number of sample points
destination Storage Element
LogicalFileName to give to the generated Grid File
Optional threshold

Usage:  
       SampleGen <cx> <cy> <radius> <step> <samples> <SE_hostname> <LFN> [<threshold>]

Example:      
$ java uk.ac.nesc.training.sfk.SampleGen 9447.5925 -445.8208  10 5 10000  opteron.gs.unina.it lfn:/grid/gilda/scardaci/ischia1807_5.dat
Total written bytes: 1861870
File stored on the given SE and registered into the FC

$ lfc-ls -l /grid/gilda/scardaci | grep ischia1807_5.dat
-rw-rw-r--   1 508      102                 1861870 Jul 18 23:49 ischia1807_5.dat
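
To cover a larger region you may want to run SampleGen several times with different centers. The sketch below prints one command line per tile. SampleGrid is a hypothetical helper, not part of sfk.jar. It assumes that each run covers a square of side 2 * radius around its center; if the sampled area is in fact a circle, a smaller spacing would be needed to avoid gaps. The tile_N.dat naming scheme and the LFC directory placeholder are also made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// SampleGrid is a hypothetical helper, not part of sfk.jar.
class SampleGrid {
    // Returns nx * ny centers {cx, cy}, starting at (x0, y0) and spaced
    // 2 * radius apart. ASSUMPTION: each SampleGen run covers a square of
    // side 2 * radius around its center, so this spacing leaves no gaps.
    static List<double[]> centers(double x0, double y0, double radius, int nx, int ny) {
        List<double[]> result = new ArrayList<>();
        for (int j = 0; j < ny; j++) {
            for (int i = 0; i < nx; i++) {
                result.add(new double[] { x0 + i * 2 * radius, y0 + j * 2 * radius });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double radius = 10;
        int n = 0;
        for (double[] c : centers(9447.5925, -445.8208, radius, 2, 2)) {
            // The LFN is a made-up naming scheme; replace <your_dir> with your own LFC directory.
            System.out.println("java uk.ac.nesc.training.sfk.SampleGen " + c[0] + " " + c[1]
                + " 10 5 10000 opteron.gs.unina.it lfn:/grid/gilda/<your_dir>/tile_" + (n++) + ".dat");
        }
    }
}
```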

N.B.: If you need to generate a big surface (which produces a pretty big file), start java with the -Xms and -Xmx flags to increase the heap size:

java -Xms512m -Xmx512m uk.ac.nesc.training.sfk.SampleGen 9447.5925 -445.8208 10 5 10000 opteron.gs.unina.it lfn:/grid/gilda/scardaci/ischia1807_5.dat
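
To check that the heap flags are being picked up, you can run a small class with the same flags and look at the maximum heap size the JVM reports. This is only a sketch; HeapCheck is a hypothetical helper, not part of sfk.jar. Depending on the JVM and garbage collector, the reported value may be somewhat lower than the value passed to -Xmx.

```java
// HeapCheck is a hypothetical helper, not part of sfk.jar.
class HeapCheck {
    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() is the maximum amount of memory the JVM will attempt to use.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + toMiB(maxBytes) + " MiB");
    }
}
```

Run it as "java -Xms512m -Xmx512m HeapCheck" and compare the result with a run without the flags.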

N.B.2: Only DPM-based (Disk Pool Manager) Storage Elements can be used to save the sample files. Because of a bug that prevents the type of an SE from being displayed correctly, the list of usable SEs is given here:

opteron.gs.unina.it (installed here)
egee016.cnaf.infn.it
grid038.ct.infn.it
aliserv6.ct.infn.it
gildase.oact.inaf.it
trigrid-ce01.unime.it
grid-se.bio.dist.unige.it

PillarFinder

PillarFinder looks at a given sampled area and tells you if it contains a pillar or not. In case one or more pillars are found, it prints out a message with its/their possible location(s).

Again, our version of PillarFinder uses GFAL to read its input files remotely, so it takes as input the LogicalFileName of the file we want it to analyze.

Usage:
        PillarFinder <LFN> [<threshold>]
        e.g. PillarFinder lfn:/grid/gilda/tony/in.dat

Example:
$ java uk.ac.nesc.training.sfk.PillarFinder lfn:/grid/gilda/scardaci/ischia1807_5.dat
Possible pillar at [9453.167499999956,-440.1457999999993]-[9454.517499999936,-438.795799999999]

N.B.: If you need to analyze a big surface (stored in a pretty big file), start java with the -Xms and -Xmx flags to increase the heap size:

java -Xms512m -Xmx512m uk.ac.nesc.training.sfk.PillarFinder lfn:/grid/gilda/scardaci/ischia1807_5.dat
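
If you want to process the PillarFinder output in a script rather than copy the coordinates by hand, the line can be parsed with a regular expression. The sketch below assumes that every result line has exactly the form shown in the example above ("Possible pillar at [x1,y1]-[x2,y2]"); the format for several pillars is not shown here, so that case is not handled. PillarLineParser is a hypothetical helper, not part of sfk.jar. The coordinates are kept as strings so that no digits are lost.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// PillarLineParser is a hypothetical helper, not part of sfk.jar.
class PillarLineParser {
    // A signed decimal number, optionally with an exponent.
    private static final String NUM = "(-?\\d+(?:\\.\\d+)?(?:[eE][-+]?\\d+)?)";

    // ASSUMED line format, taken from the single example output:
    // Possible pillar at [x1,y1]-[x2,y2]
    private static final Pattern LINE = Pattern.compile(
        "Possible pillar at \\[" + NUM + "," + NUM + "\\]-\\[" + NUM + "," + NUM + "\\]");

    // Returns {x1, y1, x2, y2} as strings, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String line = "Possible pillar at [9453.167499999956,-440.1457999999993]"
            + "-[9454.517499999936,-438.795799999999]";
        String[] c = parse(line);
        if (c != null) {
            // Prints the four coordinates separated by spaces.
            System.out.println(String.join(" ", c));
        }
    }
}
```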

PillarReader

PillarReader is a program to display words on pillars from a specific area given by the surface provider. The background terrain is noisy but the pillar can be identified as it has a flat surface. A plaque is attached on the pillar, and a word is printed on the plaque. The plaque is a bit higher than the pillar surface and the print is either a bit higher (embossed) or lower (engraved) than the plaque surface.

If the given area is a pillar with a flat surface, the word on the plaque is written to the specified text file; otherwise a "no plaque found" message is reported.

So, after PillarFinder has indicated possible pillar locations, pass the coordinates it reports to PillarReader.

PillarReader takes seven parameters (the last one is optional):

lower-x: x-coordinate of the lower-left corner
lower-y: y-coordinate of the lower-left corner
upper-x: x-coordinate of the upper-right corner
upper-y: y-coordinate of the upper-right corner
step: Distance between two sample points. The smaller this is, the more sample points. Numbers like .001, give or take, are good choices.
output-file: file in which the visualization of the plaque is written, if a plaque was found. If no plaque is found, the file is not created.
threshold: the value of height which can be regarded as flat. If the difference between the height of two points is no greater than threshold, they are regarded as on a flat plane. It's optional, and defaults to 0.000000001. You probably don't need to change it.
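
The threshold rule described above can be written down in a few lines. This is only an illustration of the stated rule, not the actual PillarReader code; FlatnessCheck and its method name are hypothetical.

```java
// FlatnessCheck is a hypothetical illustration, not the PillarReader source.
class FlatnessCheck {
    // Default value quoted in the parameter description.
    static final double DEFAULT_THRESHOLD = 0.000000001;

    // Two points count as lying on a flat plane when their heights
    // differ by no more than the threshold.
    static boolean isFlat(double h1, double h2, double threshold) {
        return Math.abs(h1 - h2) <= threshold;
    }

    public static void main(String[] args) {
        System.out.println(isFlat(2.5, 2.5, DEFAULT_THRESHOLD));  // equal heights
        System.out.println(isFlat(2.5, 2.6, DEFAULT_THRESHOLD));  // heights differ by about 0.1
    }
}
```

A larger threshold makes the test more tolerant of noise, at the risk of treating rough terrain as flat.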

Usage:
        PillarReader <x1> <y1> <x2> <y2> <step> <output_filename> [<threshold>]

Example:
$ java uk.ac.nesc.training.sfk.PillarReader 9453.167499999956 -440.1457999999993 9454.517499999936 -438.795799999999 0.001 myplaque.txt

$ tac myplaque.txt 
                                                                                                                                 
                                                                                                                                 
                                                                                                                                 
                                                                                                                                 
     #########             ###                   #########          #########          ###         ###          #########        
     #########             ###                   #########          #########          ###         ###          #########        
     #########             ###                   #########          #########          ###         ###          #########        
  ###         ###          ###                ###         ###       ###      ###       ###         ###      ####         ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###      ####         ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###      ####         ###     
  ###                      ###                ###         ###       ###      ###       ###         ###      ####                 
  ###                      ###                ###         ###       ###      ###       ###         ###      ####                 
  ###                      ###                ###         ###       #########          ###         ###          #########        
  ###                      ###                ###         ###       #########          ###         ###          #########        
  ###                      ###                ###         ###       #########          ###         ###          #########        
  ###      ##########      ###                ###         ###       ###      ###       ###         ###                   ###     
  ###      ##########      ###                ###         ###       ###      ###       ###         ###                   ###     
  ###      ##########      ###                ###         ###       ###      ###       ###         ###                   ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###                   ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###                   ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###                   ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###      ####         ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###      ####         ###     
  ###         ###          ###                ###         ###       ###      ###       ###         ###      ####         ###     
     #########             ############          #########          #########             #########             #########        
     #########             ############          #########          #########             #########             #########

Hints

Hints on the possible positions of the pillars are given to help you decide where to start the search. Remember that the hints are incomplete and may not be accurate.

You can access the hints either from the AMGA /ischia06/hints collection or using OGSA-DAI.

Remember that, to use AMGA, you must properly configure your $HOME/.mdclient.config, as explained here.

Possible Solution

Once you have successfully tested the above programs on your Workstation (User Interface), you can implement a strategy to run SampleGen and PillarFinder on the grid.
You can submit a pool of SampleGen jobs with different arguments, together with PillarFinder jobs, using a DAG job: you can think of them as Producers and Consumers. See the gLite Introductory and Advanced sections for how to run a Java program on the grid and how to create and submit a DAG job.
To exchange messages between Producers and Consumers, you can use an AMGA collection, which could have the following schema:
- LFN : varchar
- Analyzed : int
- MagicNumber : int
Your script running SampleGen on the WN, after generating the sample and saving/registering it on an SE and in the LFC, can use the AMGA client (mdcli) to store the LogicalFileName of the newly created file in a suitable collection, with the Analyzed flag set to 0.
Similarly, the script running PillarFinder can query the same collection and select an LFN whose Analyzed flag equals 0. Before it starts PillarFinder, it should set the Analyzed flag to 1, so that other concurrent PillarFinder instances running on other WNs do not "consume" the same data. While updating the Analyzed flag, you should also set the MagicNumber attribute, to be sure that this instance of PillarFinder "owns" the piece of data. An example of how to do this is shown in the Summary Exercise page.
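
The Producer/Consumer handshake described above can be tried out locally before involving AMGA. The sketch below replaces the AMGA collection with an in-memory map, so it shows only the logic of the Analyzed and MagicNumber attributes, not the mdcli commands. ClaimSketch and all of its method names are hypothetical. Note that here the select and the update happen atomically because the methods are synchronized; against a real collection they are separate steps, which is why reading back the MagicNumber before starting PillarFinder is worthwhile.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for the AMGA collection; all names are hypothetical.
class ClaimSketch {
    // One entry of the collection: the Analyzed flag and the MagicNumber.
    static final class Row {
        int analyzed = 0;
        int magicNumber = 0;
    }

    // Keyed by LFN; insertion order is preserved.
    private final Map<String, Row> collection = new LinkedHashMap<>();

    // Producer side: register a newly generated file with Analyzed = 0.
    synchronized void publish(String lfn) {
        collection.put(lfn, new Row());
    }

    // Consumer side: pick the first entry with Analyzed == 0, set Analyzed
    // to 1 and stamp it with our magic number. Returns null if none is left.
    synchronized String claim(int myMagicNumber) {
        for (Map.Entry<String, Row> e : collection.entrySet()) {
            Row row = e.getValue();
            if (row.analyzed == 0) {
                row.analyzed = 1;
                row.magicNumber = myMagicNumber;
                return e.getKey();
            }
        }
        return null;
    }

    // Ownership check: does the entry carry our magic number?
    synchronized boolean owns(String lfn, int myMagicNumber) {
        Row row = collection.get(lfn);
        return row != null && row.analyzed == 1 && row.magicNumber == myMagicNumber;
    }

    public static void main(String[] args) {
        ClaimSketch amga = new ClaimSketch();
        amga.publish("lfn:/grid/gilda/scardaci/ischia1807_5.dat");

        String mine = amga.claim(111);   // first consumer gets the file
        String other = amga.claim(222);  // second consumer finds nothing left
        System.out.println("consumer 111 claimed: " + mine);
        System.out.println("consumer 222 claimed: " + other);
        System.out.println("consumer 111 owns it: " + amga.owns(mine, 111));
    }
}
```

In a real job script, each consumer would pick a magic number that is unlikely to collide with the others (for example one derived from its job identifier) and would run PillarFinder only if the ownership check succeeds.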
