ISC-2010, Hamburg, Germany

X-Com: Distributed Computing Software
Research Computing Center M.V. Lomonosov Moscow State University

HPC.MSU.RU


X-Com key features (1)
· X-Com takes into account the distinctive characteristics of distributed computational environments:
  - large scale
    · thousands of CPUs
  - geographically distributed resources
    · very high communication latency
  - variable set of available resources
    · a node is used as soon as it connects
    · the computation doesn't fail when a node disconnects
  - heterogeneous resources:
    · operating system (MS Windows, Linux, ...)
    · CPU (x86, x86_64, ia64, ...)


X-Com key features (2)
· Lightweight portable toolkit:
  - easy to install
    · Perl-based software
    · doesn't need administrator/root access
  - easy adaptation of applications
    · often only a few lines of Perl need to be written
  - simultaneous work on different platforms
    · Windows and Linux machines can be joined in one computation
  - distributed freely
    · BSD license


X-Com basic requirements
· Linux / Windows
· Perl 5.8.0+
  - for Windows:
    · Strawberry Perl (preferred)
    · ActivePerl
· Open TCP port for communications
  - if a firewall is used
· Disabled CPU time limits
  - if used on the head machine of a computing cluster


Structure of an application in the X-Com distributed environment
· Application:
  - must be able to split a task into a number of independent subtasks (portions)
  - more computing, less communicating
· X-Com: clients & server
  - the server is responsible for splitting tasks into portions and joining results
  - clients receive data from the server, perform computations and send results back


General idea of X-Com computing
[Diagram, top to bottom]
Server part of the application
  Server API
X-Com server
  Internet
X-Com clients
  Client API
Client (computing) part of the application


X-Com server APIs
· Files API
  - for quite simple applications
  - only need to specify input/output paths and a file list
· Perl API
  - for more complicated applications
  - need to create 6 functions in a Perl module


X-Com server APIs: Files
· Options of the server settings file (ini-file):
  - taskList
    · list of input files
  - taskIn
    · path where input files are stored
  - taskOut
    · output path


X-Com server APIs: Perl
· Functions to be written in a task server module:
  - initialize ($taskArg)
    · performs start-up actions, e.g. checking the arguments
  - getFirstPortionNumber ()
    · returns the number of the first portion (usually 1)
  - getLastPortionNumber ()
    · returns the number of the last portion (if the total number of portions is known)
  - isFinished ()
    · checks the finishing conditions and returns true if the work is over (if the total number of portions is not known)
  - getPortion ($N)
    · returns the content of the Nth portion
  - addPortion ($N, $data)
    · processes the result of the Nth portion, which is stored in $data
  - finalize ()
    · performs any final actions
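A minimal sketch of a task server module built from the functions listed above. The SumSquares task, its argument format, the result() helper and the in-process loop that plays the client role are all hypothetical, invented only to show how the callbacks fit together; a real module would be loaded by the X-Com server rather than driven by hand.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical task: portion N carries the number N, and the server
# collects the sum of the squares that come back as results.
package SumSquares;

my ($lastPortion, $total, $done);

# $taskArg is assumed to hold the number of portions.
sub initialize {
    my ($taskArg) = @_;
    return 0 unless defined $taskArg && $taskArg =~ /^[1-9]\d*$/;
    ($lastPortion, $total, $done) = ($taskArg, 0, 0);
    return 1;
}

sub getFirstPortionNumber { return 1; }
sub getLastPortionNumber  { return $lastPortion; }

# True once every portion has been answered.
sub isFinished { return $done >= $lastPortion ? 1 : 0; }

# The content of portion N is simply the number N.
sub getPortion { my ($n) = @_; return $n; }

# Accumulate the result received for portion N.
sub addPortion { my ($n, $data) = @_; $total += $data; $done++; return 1; }

sub finalize { print "sum of squares: $total\n"; return 1; }

# Hypothetical helper (not part of the listed API) for inspecting the result.
sub result { return $total; }

package main;

# Local dry run: compute each portion in-process instead of on a client.
SumSquares::initialize('4') or die "incorrect arguments";
for my $n (SumSquares::getFirstPortionNumber() .. SumSquares::getLastPortionNumber()) {
    my $portion = SumSquares::getPortion($n);
    SumSquares::addPortion($n, $portion ** 2);
}
SumSquares::finalize();
```

Keeping the state in file-scoped variables mirrors the style of the Pi sample module later in this tutorial.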


X-Com client APIs
· Processes API
  - only need to specify command lines for initializing and portion processing
· Perl API
  - need to create 2 Perl functions


X-Com client APIs: Processes
· Options of the server settings file (ini-file):
  - xCliInitCmd
    · command line that is run once before portion processing starts
  - xCliPortionCmd
    · command line used for processing each portion
  - xCliPortionIn
    · name of the file in which an incoming portion is stored
  - xCliPortionOut
    · name of the file from which the resulting data will be taken
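One way to picture what these four options describe, assuming the client writes each portion to the xCliPortionIn file, runs xCliPortionCmd and then reads xCliPortionOut. This is not X-Com's client code: the %opt hash, the process_portion helper and the sample commands (which upper-case the portion) are hypothetical, and the commands are given as argument lists rather than ini-file strings to avoid shell quoting.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical values for the four options.
my %opt = (
    xCliPortionIn  => "$dir/portion.in",
    xCliPortionOut => "$dir/portion.out",
);
# Initialization command: a no-op Perl one-liner.
$opt{xCliInitCmd} = [$^X, '-e', '1'];
# Portion command: copy the input file to the output file in upper case.
$opt{xCliPortionCmd} = [
    $^X, '-e',
    'open my $i, "<", $ARGV[0] or die; open my $o, ">", $ARGV[1] or die; print {$o} uc($_) while <$i>;',
    $opt{xCliPortionIn}, $opt{xCliPortionOut},
];

# Store the portion, run the command, return whatever the command produced.
sub process_portion {
    my ($data) = @_;
    open my $in, '>', $opt{xCliPortionIn} or die "cannot write portion: $!";
    print {$in} $data;
    close $in;
    system(@{ $opt{xCliPortionCmd} }) == 0 or die "portion command failed";
    open my $out, '<', $opt{xCliPortionOut} or die "cannot read result: $!";
    local $/;                      # slurp the whole result file
    my $result = <$out>;
    close $out;
    return $result;
}

system(@{ $opt{xCliInitCmd} }) == 0 or die "init command failed";
my $result = process_portion("hello x-com\n");
print $result;
```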


X-Com client APIs: Perl
· Functions to be written in a client module:
  - gcprepare ($addarg)
    · called once before any portion-processing actions, i.e. initializes the client part
    · $addarg is an additional parameter that can be passed to the X-Com client from the command line
  - gctask ($task, $taskarg, $portion, $din, $dout, $addarg)
    · called for every portion
    · $task - task name
    · $taskarg - task arguments
    · $portion - portion number
    · $din - file with the portion content
    · $dout - file from which the results will be read
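A small sketch of a client module that uses only the two functions and the parameters listed above. The task itself (counting the lines of a portion that contain the substring given in $taskarg) and the driver code with its temporary files are hypothetical; in a real run the X-Com client would supply $din and $dout.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Nothing to initialize in this sketch.
sub gcprepare {
    my ($addarg) = @_;
    return 1;
}

# Count the lines of the portion that contain the substring $taskarg
# and write the count to the result file.
sub gctask {
    my ($task, $taskarg, $portion, $din, $dout, $addarg) = @_;
    open my $in, '<', $din or return 0;
    my $count = grep { index($_, $taskarg) >= 0 } <$in>;
    close $in;
    open my $out, '>', $dout or return 0;
    print {$out} "$count\n";
    close $out;
    return 1;
}

# Driver standing in for the X-Com client (hypothetical file names).
my $dir = tempdir(CLEANUP => 1);
my ($din, $dout) = ("$dir/in.txt", "$dir/out.txt");
open my $fh, '>', $din or die "cannot create portion: $!";
print {$fh} "GET /index.html\nPOST /login\nGET /about.html\n";
close $fh;

gcprepare(undef) or die "gcprepare failed";
gctask('demo', 'GET', 1, $din, $dout, undef) or die "gctask failed";

open my $res, '<', $dout or die "cannot read result: $!";
my $matches = <$res>;
close $res;
chomp $matches;
print "matching lines: $matches\n";
```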


X-Com computing sequence using Perl API on both server and clients
1. Task initializing on server side
2. Task description request
3. Request for required files
4. Task initializing on clients
5. Portion request
6. Computations
7. Result sending
8. Finalizing
[Diagram labels]
Server part of the application: initialize(), getPortion(), addPortion(), finalize()
Messages between X-Com server and X-Com client: TSR, TFILE, REQ, ASW
Client part of the application: gcprepare(), gctask()
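The sequence above can be sketched as a single-process simulation. It assumes that TSR, REQ and ASW correspond to the task description request, the portion request and the result sending steps; the file request (TFILE) is left out. The stub callbacks, the client names and the simplified gctask (which takes the portion content directly instead of file names) are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @log;       # trace of calls and messages
my %results;   # portion number => result

# Server-side stubs standing in for a task server module.
my ($first, $last) = (1, 5);
sub initialize { push @log, 'initialize'; return 1; }
sub getPortion { my ($n) = @_; return "portion-$n"; }
sub addPortion { my ($n, $data) = @_; $results{$n} = $data; return 1; }
sub finalize   { push @log, 'finalize'; return 1; }

# Client-side stubs standing in for a client module.
sub gcprepare { return 1; }
sub gctask    { my ($data) = @_; return length $data; }

initialize();                                # step 1
my @clients = ('client-A', 'client-B');
for my $c (@clients) {
    push @log, "$c TSR";                     # step 2
    gcprepare();                             # step 4
}

my $next = $first;
while ($next <= $last) {
    for my $c (@clients) {
        last if $next > $last;
        my $n = $next++;
        push @log, "$c REQ $n";              # step 5
        my $result = gctask(getPortion($n)); # step 6
        push @log, "$c ASW $n";              # step 7
        addPortion($n, $result);
    }
}
finalize();                                  # step 8

print "$_\n" for @log;
```

In a real run the clients work concurrently, so REQ and ASW lines from different nodes interleave.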


X-Com files structure at a glance
· gserv - X-Com server
  - XServ.pl - executable of the server
  - *.pm - server components
  - *.ini - ini-files of the server
  - logs - path where log-files are saved
· tasks - applications
  - Pi - Pi calculation application (sample task)
    · Pi.pm - server part interface of the Pi app.
    · gctask - client part interface of the Pi app.
    · *.tar.gz - packed client parts
· gcli - X-Com client
  - client.pl - executable of the client
  - *.pm - client components
· utils - utilities
  - XLogProc.pl - log-files analyzing tool


Simple example: distributed grep. X-Com server settings & running
# dgrep.ini
# Server info
prSock   = /tmp/xcom.sock          # internal UNIX socket, needed by the server
ssHost   = newserv.srcc.msu.ru     # external TCP interface for clients: host...
ssPort   = 65002                   # ... and port
# Task info
taskName = dgrep                   # task name
taskAPI  = Files                   # task API
taskArgs = 212.192.244             # arguments (a substring to search in files)
taskList = /tmp/alogs.in/in.txt    # files list
taskIn   = /tmp/alogs.in           # input path (Apache log-files, by day)
taskOut  = /tmp/alogs.out          # output path

$ ./XServ.pl dgrep.ini


Simple example: distributed grep. Client part functions & running
sub gcprepare {
    return 1;
}

sub gctask {
    my ($task, $taskarg, $portion, $din, $dout) = @_;
    # the pattern is the second whitespace-separated field of $taskarg
    my ($file, $pattern) = split (' ', $taskarg);
    # run grep on the portion file and store the matches in the result file
    `/bin/grep '$pattern' $din > $dout`;
    return 1;
}
1;

$ ./client.pl -s http://newserv.srcc.msu.ru:65002


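Another example: Pi calculation using Monte Carlo method
[Figure: unit square with X and Y axes running from 0 to 1]
Let $X_n = (x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)$ be $n$ pairs of independent random values uniformly distributed on $[0, 1]$.
Let $K_{X_n}$ be the number of pairs for which $x_i^2 + y_i^2 < 1$.
Then $4K_{X_n}/n \to \pi$ as $n \to \infty$.
Accuracy for sufficiently large $n$: up to the $i$-th digit after the decimal point, where $i = \left[-\log_{10}\frac{2}{\sqrt{3n}}\right] - 1$.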
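A short sketch of the estimate and of the accuracy formula above. The function names, the seed and the sample size are illustrative choices, and the square brackets in the formula are read as rounding down, which is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(floor);

# Count the drops (x, y), uniform on the unit square, with x^2 + y^2 < 1.
sub count_hits {
    my ($n) = @_;
    my $hits = 0;
    for (1 .. $n) {
        my ($x, $y) = (rand(), rand());
        $hits++ if $x * $x + $y * $y < 1;
    }
    return $hits;
}

# i = [-log10(2 / sqrt(3n))] - 1, with [.] taken as floor (assumption).
sub digits_after_point {
    my ($n) = @_;
    my $log10 = log(2 / sqrt(3 * $n)) / log(10);
    return floor(-$log10) - 1;
}

srand(2010);                      # fixed seed, for repeatable runs on one machine
my $n = 200_000;
my $estimate = 4 * count_hits($n) / $n;
printf "n = %d, estimate = %.5f, reliable digits after the point: %d\n",
    $n, $estimate, digits_after_point($n);
```

For the run shown later in this tutorial (100 portions of 10**9 drops, so n = 10**11) the formula gives i = 4.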


Pi calculating: server part functions (1)
~/xcom/tasks/Pi/Pi.pm
sub initialize {
    ($portionsCount,$expo) = split (/ /, $_[0]) or die ("Pi error: incorrect arguments!");
    $totalSum = 0;
    $donePortions = 0;
    $outCSV = "../tasks/Pi/out/pi_esteem_$portionsCount"."_$expo.csv";
    $outRes = "../tasks/Pi/out/pi_$portionsCount"."_$expo.txt";
    open E, "> $outCSV" or die ("Pi error: can't write to $outCSV!");
    print E 'Portions;Drops;EstimatedPi;RealPi;Error'."\n";
    close E;
    print STDERR "Pi initialized: $portionsCount portions, ".(10**$expo)." (10**$expo) drops in portion\n";
    return 1;
}

sub getFirstPortionNumber {
    return 1;
}

sub getLastPortionNumber {
    return $portionsCount;
}


Pi calculating: server part functions (2)
~/xcom/tasks/Pi/Pi.pm
sub getPortion {
    my ($portion) = @_;
    return $portion;
}

sub addPortion {
    my ($n, $portionSum) = ($_[0], $_[1]);
    $totalSum = $totalSum + $portionSum;
    $donePortions += 1;
    my $dropsCount = $donePortions * (10 ** $expo);
    $thePI = $totalSum / $dropsCount;
    open E, ">> $outCSV" or die ("Pi error: can't write to $outCSV!");
    my $lc_numeric = setlocale(LC_NUMERIC);
    setlocale(LC_NUMERIC,'ru_RU');
    print E "$donePortions;$dropsCount;$thePI;$PI;".($thePI-$PI)."\n";
    setlocale(LC_NUMERIC,$lc_numeric);
    close E;
}


Pi calculating: server part functions (3)
~/xcom/tasks/Pi/Pi.pm
sub isFinished {
    return 0;
}

sub finalize {
    open F, "> $outRes" or die ("Pi error: can't write to $outRes");
    print F $thePI."\n";
    close F;
    return 1;
}


Pi calculating: client part functions
~/xcom/tasks/Pi/gctask
sub gcprepare {
    return 1;
}

sub gctask {
    my ($task,$taskarg,$portion,$din,$dout) = @_;
    my ($n, $expo) = split (/ /, $taskarg);
    my $cmd = "Pi $expo";
    $cmd = "./$cmd" if ($^O ne 'MSWin32');
    my $s = readpipe($cmd);
    if ($s==-1) {
        return 0;
    }
    open (F,"> $dout");
    print F "$s";
    close(F);
    return 1;
}


Pi calculating: X-Com server settings
~/xcom/gserv/Pi.ini
# Pi.ini
# Server info
prHost   = 212.192.244.31    # host-port pair for internal server purposes...
prPort   = 65000             # ... that can be used instead of UNIX socket
ssHost   = 212.192.244.31    # external (client) interface: host...
ssPort   = 65001             # ... and port
# Task info
taskName = Pi
taskAPI  = Perl
taskArgs = 100 9             # 100 portions, 10^9 random drops in each


X-Com server console: starting
~/xcom/gserv
$ ./XServ.pl Pi.ini
XServ: starting
Processor: 212.192.244.31:65000
Subservers: 212.192.244.31:65001
Download prefix: {http://212.192.244.31:65001/}
Task API: Perl
Task name: Pi
Task arguments: 100 9
Pi initialized: 100 portions, 1000000000 (10**9) drops in portion
XServProcessor: TestPi initialized (1 -> 100)


X-Com server console: basic monitoring of computing process
P> msk.skif_mgu.node-00-04.3: TSR
P> msk.skif_mgu.node-00-04.1: TSR
P> msk.skif_mgu.node-00-03.4: TSR
P> msk.skif_mgu.node-00-04.4: TSR
P> msk.skif_mgu.node-00-04.2: TSR
P> msk.skif_mgu.node-00-03.2: TSR
P> msk.skif_mgu.node-00-03.1: TSR
P> msk.skif_mgu.node-00-03.3: TSR
P> msk.skif_mgu.node-00-03.4: REQ   1
P> msk.skif_mgu.node-00-03.1: REQ   2
P> msk.skif_mgu.node-00-03.2: REQ   3
P> msk.skif_mgu.node-00-04.2: REQ   4
P> msk.skif_mgu.node-00-04.4: REQ   5
P> msk.skif_mgu.node-00-04.1: REQ   6
P> msk.skif_mgu.node-00-03.3: REQ   7
P> msk.skif_mgu.node-00-04.3: REQ   8
P> msk.skif_mgu.node-00-03.4: ASW   1
P> msk.skif_mgu.node-00-03.1: ASW   2
P> msk.skif_mgu.node-00-03.4: REQ   9
P> msk.skif_mgu.node-00-03.1: REQ   10
...


Web-based monitoring
http:///


X-Com server console: finishing
...
P> msk.skif_mgu.node-33-08.1: ASW
P> msk.skif_mgu.node-33-08.1: REQ
P> msk.skif_mgu.node-33-03.1: ASW
P> msk.skif_mgu.node-33-03.1: REQ
P> msk.skif_mgu.node-33-05.1: ASW
P> msk.skif_mgu.node-33-05.1: REQ
P> msk.skif_mgu.node-33-07.1: ASW
P> msk.skif_mgu.node-33-07.1: REQ
P> msk.skif_mgu.node-33-06.1: ASW
P> msk.skif_mgu.node-33-06.1: REQ
P> msk.skif_mgu.node-33-02.1: ASW
P> Finishing communications...
XServSubserver: cannot connect to 212.192.244.31:65000, retrying (1)...
XServSubserver: cannot connect to 212.192.244.31:65000, retrying (2)...
Killed
$
Portion numbers shown in the right-hand column of the REQ/ASW lines, in order: 96 97 97 98 98 99 99 100 95 100 100


Analysis of the finished calculations
~/xcom/utils/XLogProc.pl


X-Com additional features: model (testing) task
· A task that models real applications for a preliminary assessment of environment characteristics and application behavior
· Parameters:
  - size of input/output data
  - portion processing time
  - computational modules


X-Com additional features: proxy (buffering) servers
· Unlimited levels of hierarchy
· Buffering of data (portions and files)
  - fewer network connections
  - less traffic
· Optimization of server load


X-Com additional features: XQServ - task management subsystem
· High-level user interface
  - almost like a common batch system
· Multiuser and multitasking
· Task scheduling:
  - consecutive queue
  - parallel running
  - using specified task conditions:
    · CPU type, frequency, performance
    · OS type
    · RAM size
[Diagram labels: Web Browser, XQServ Client, XQServ Server, three X-Com Servers, task condition CPU:x86_64]


X-Com additional features: working on computing clusters
· Exclusive use of nodes
· Use of node idle time
· Use of batch systems:
  - Torque
  - LoadLeveler
  - Cleo
  - Unicore


X-Com additional features: Flash-based computing visualization


Real distributed computing practice (1)
Computing of matrix coefficients for electromagnetic field diffraction on homogeneous dielectric objects of arbitrary shape (together with Penza State University)
Computing resources: 6 HPC sites
  - Moscow, Chelyabinsk, Ufa, Saint Petersburg, Novosibirsk, Tomsk
    · Linux OS
    · Intel Xeon, Intel Itanium, AMD Opteron CPUs
    · exclusive use of nodes
  - 2199 cores
  - 24.5 TFlops
Task splitting:
  - 280525 portions, about 5 minutes of processing each
Time:
  - total: 11 hours 46 minutes
  - CPU: 2.7 years
Efficiency:
  - server: 98.81%
  - environment: 98.77%
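As a rough consistency check of the figures above, the stated CPU time follows from the portion count and the approximate per-portion time (taking 365 days per year, and treating "about 5 minutes" as exactly 5):

```latex
\[
280\,525 \times 5\ \text{min} = 1\,402\,625\ \text{min}
\approx 23\,377\ \text{h}
\approx 974\ \text{days}
\approx 2.7\ \text{years}
\]
```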


Real distributed computing practice (2)
Virtual screening, i.e. searching for inhibitors for cancer proteins (together with the "Molecular Technologies" company)
Source:
  - 41 tasks, 318 portions in each
  - average portion processing time: 6 hours
Computing resources: SKIF MSU "Chebyshev"
  - simultaneous processing of several tasks
  - 160 cores (40 nodes) per task
  - working together with the Cleo batch system
Time:
  - total: 7.5 days
  - CPU: 10.6 years


Contacts
· http://X-Com.parallel.ru (in Russian)
  - latest versions:
    · http://x-com.parallel.ru/download/xcom2.tar.gz
    · http://x-com.parallel.ru/download/xcom2.zip
· x-com@parallel.ru