Lib-DVM. Interface description. Part 1 (1-5)
Lib-DVM interface description (contents): Part 1 (1-5), Part 2 (6-7), Part 3 (8-11), Part 4 (12-13), Part 5 (14-15), Part 6 (16-18), Part 7 (19)
created: February 2001 - last edited 03.05.01

1 Introduction

Before proceeding to the DVM Run-Time Library functions, let us briefly describe the model of parallel computation. A parallel C-DVM (or Fortran DVM) program is translated into a program in standard C (or Fortran 77) extended with calls to Run-Time Library functions, and is executed according to the SPMD model on each processor assigned to the task.

On startup the program has a single branch (control flow). This branch is executed, starting from the first program statement, on all processors of the processor system.

Let us define the processor system (or system of processors) as the computing machine assigned to the user program by the hardware and the base system software. For example, for computers with distributed memory the computing machine can be an MPI machine. In this case, the processor system is the group of MPI processes created when the program is started. The number of processors in the processor system, as well as its representation as a multidimensional grid, is specified in the command line that starts the program. All declared variables are replicated over all the processors. The only exception is arrays explicitly declared as "distributed".

On entering a parallel loop, the branch is split into a number of parallel branches. Each of them is executed on a separate processor of the processor system.

On leaving a parallel construct, all parallel branches are merged into the original branch, the one that was executing before the parallel construct was entered. At this moment all changes to replicated variables made by the parallel branches become visible to all processors (that is, the variables are brought to a coherent state).

2 Run-Time System initialization and completion

Initialization in C program:

long rtl_init ( long InitParam,
                int argc,
                char *argv[] );

Initialization in Fortran program:

long linit_ (long *InitParamPtr);

InitParam or *InitParamPtr - parameter of Run-Time System initialization.
argc - number of string parameters in the command line.
argv - array of pointers to the string parameters in the command line.

The functions rtl_init and linit_ initialize the Run-Time System internal structures according to the modes of interprocessor exchange, statistics and trace accumulation, and so on, defined in the configuration files.

The initialization parameter can be:

0 - default initialization;
1 - initialization with blocked dynamic control (in this case dynamic control specified in Run-Time System startup parameters is suppressed).

The function returns zero.

long lexit_ (long *UserResPtr);

*UserResPtr - value returned by user program.

The function lexit_ correctly completes the execution of Run-Time System: it frees the memory used by Run-Time System, writes the statistics and trace information to a disk file, and so on.

The function does not return control.

Note. Starting a user program on a processor system requires specifying (as startup parameters) the following characteristics of the processor system as a multidimensional array: the rank of the processor system and the sizes of all its dimensions.

Let the rank of the processor system be n, and the size of its k-th dimension be PSSizek (1 ≤ k ≤ n). When Run-Time System is initialized, an internal number ProcNumberint is assigned to each processor. This number is computed from the processor index values Ik, where:

Ik - processor index value along the k-th dimension of the processor system index space (0 ≤ Ik ≤ PSSizek - 1).

So the internal number is the linear index of the processor in the index space of the processor system.
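The formula for ProcNumberint is not preserved in this copy of the text. As a minimal sketch only, assuming the linear index is taken in row-major order (the last dimension varies fastest, as in C arrays), the computation could look as follows. The function name proc_linear_index is hypothetical and is not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helper, not a Lib-DVM function.
 * ASSUMPTION: row-major ordering (last dimension varies fastest).
 * sizes[k] plays the role of PSSize(k+1), index[k] the role of I(k+1). */
long proc_linear_index(int rank, const long sizes[], const long index[])
{
    long number = 0;
    for (int k = 0; k < rank; k++) {
        /* Each index must satisfy 0 <= Ik <= PSSizek - 1. */
        assert(index[k] >= 0 && index[k] < sizes[k]);
        number = number * sizes[k] + index[k];
    }
    return number;
}
```

Under this assumption, in a 2 x 3 processor grid the processor with indices (1, 2) gets the internal number 1*3 + 2 = 5, and the processor (0, 0) gets 0.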

In interprocessor exchanges a processor identifier ProcIdent is used as the processor address. The correspondence

ProcNumberint => ProcIdent

is defined by Message Passing System and returned to Run-Time System when it is initialized.

Among the processors assigned to a task there are two functionally special processors: the input/output processor and the central processor. The input/output processor deals with the file system directly (see section 16), and its internal number is zero. The central processor computes the reduction functions (see section 11) and is defined by the index vector ([PSSize1/2], ... ,[PSSizen/2]).
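The index vector of the central processor can be illustrated with a short C sketch. It assumes that the square brackets denote the integer part, so that [PSSizek/2] is plain integer division. The function name central_proc_index is hypothetical and is not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helper, not a Lib-DVM function.
 * ASSUMPTION: [x] means the integer part of x.
 * Fills center[k] with [PSSize(k+1) / 2] for every dimension. */
void central_proc_index(int rank, const long sizes[], long center[])
{
    for (int k = 0; k < rank; k++) {
        assert(sizes[k] > 0);        /* every dimension has at least one processor */
        center[k] = sizes[k] / 2;    /* integer division truncates */
    }
}
```

For a 4 x 3 processor system this gives the index vector (2, 1).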

3 Creating abstract machine representations

An abstract machine concept is introduced for two-step mapping of a parallel program onto a real parallel computer. First, a programmer creates an abstract machine, most suitable for his program (that is, the abstract machine realizing all potential program parallelism). Then, the programmer defines the mapping of his computations and data onto this machine, and he also defines the rules of mapping this abstract machine onto a real parallel computer. Therefore, an abstract machine is a hierarchy of abstract parallel subsystems. Each of these subsystems can be represented as a multidimensional array of subsystems of the next hierarchy level. Several different representations for each subsystem may co-exist.

The C-DVM and Fortran-DVM languages have no notion of an "abstract machine". The term "template" ("TEMPLATE") is used instead. Each template described in the program is represented as an abstract machine in Run-Time System. For each explicitly distributed array (that is, an array specified with the DVM directive "DISTRIBUTE") a corresponding abstract machine is created too.

3.1 Requesting current abstract machine

AMRef getam_(void);

This function returns a reference to the current abstract machine, that is, the abstract machine onto which the current program branch is mapped. Only one abstract machine (the top level of the hierarchy) exists when the program starts. This initial abstract machine is mapped onto the processor system assigned by the operating system (OS) for program execution. Therefore, all these processors execute the initial program branch (mapped onto the initial abstract machine). All abstract machines that the program creates later are descendants of the initial abstract machine. An abstract machine becomes the current one when control enters a parallel branch (a subtask or a parallel loop iteration) mapped onto this abstract machine, or when control exits from the parallel construct.

3.2 Creating abstract machine representation

AMViewRef crtamv_ ( AMRef *AMRefPtr,
                    long *RankPtr,
                    long SizeArray[],
                    long *StaticSignPtr );
     
*AMRefPtr - a reference to the abstract machine.
*RankPtr - a rank of created representation.
SizeArray - array whose i-th element is the size of the (i+1)-th dimension of the created representation (0 ≤ i ≤ *RankPtr - 1).
*StaticSignPtr - the flag of static representation creation.

The function crtamv_ creates a representation of the specified abstract machine as an array of abstract machines of the next hierarchy level and returns a reference to the created representation. Representing the abstract machine as an array allows data arrays and parallel constructs to be mapped onto it. An abstract machine can have several representations, each of them an array of abstract machines of the next hierarchy level.

The parent abstract machine, specified by the reference *AMRefPtr, must be the current abstract machine or its (direct or indirect) descendant. If AMRefPtr = NULL or *AMRefPtr = 0, a representation of the current abstract machine is created.

If the static representation flag *StaticSignPtr is not equal to zero, the created representation is not deleted automatically when control exits the program block (see section 8). Such a representation can be deleted only explicitly, using the function delamv_ described below.

3.3 Requesting reference to an element of abstract machine representation

AMRef getamr_ ( AMViewRef *AMViewRefPtr,
                long IndexArray[] );
     
*AMViewRefPtr - a reference to the abstract machine representation.
IndexArray - array whose i-th element is the index value of the requested element (that is, of an abstract machine) along the (i+1)-th dimension.

The size of the array IndexArray must be equal to the rank of specified representation of the abstract machine.

3.4 Deleting abstract machine representation

long delamv_(AMViewRef *AMViewRefPtr);

*AMViewRefPtr – the reference to the abstract machine representation.

The function deletes an abstract machine representation created by the function crtamv_. When the representation is deleted, all representations of the abstract machines included in it and all distributed arrays mapped onto it are also deleted.

After the representation is deleted, the reference can be used by the user program for its own purposes.

A representation of abstract machine can be deleted by delamv_ function only if it was created in the current subtask and in the current program block (or in its subblock) (see sections 8 and 10).

To delete an abstract machine representation the function delobj_ can also be used (see section 17.5).

The function returns zero.

4 Processor systems

4.1 Requesting reference to the processor system

PSRef getps_ (AMRef *AMRefPtr);

*AMRefPtr – a reference to the abstract machine.

The function getps_ returns a reference to the processor system the specified abstract machine is mapped onto. The parameters of the processor system (the rank and the size of each dimension) can be obtained through the functions getrnk_ and getsiz_ (see section 17).

If AMRefPtr = NULL or *AMRefPtr = 0, the function returns the reference to the processor system onto which the current abstract machine is mapped (i.e. the reference to the current processor system).

If *AMRefPtr = -1, the function returns the reference to the initial processor system.

The returned reference is equal to zero if the specified abstract machine is not mapped onto any processor system.

4.2 Creating subsystem of specified processor system

PSRef crtps_ ( PSRef *PSRefPtr,
               long InitIndexArray[],
               long LastIndexArray[],
               long *StaticSignPtr );
     
*PSRefPtr - reference to the (source) processor system whose subsystem will be created.
InitIndexArray - array whose i-th element is the initial value of the index along the (i+1)-th dimension of the source processor system.
LastIndexArray - array whose i-th element is the last value of the index along the (i+1)-th dimension of the source processor system.
*StaticSignPtr - the flag of the static subsystem creation.

The function crtps_ creates a subsystem of the same rank as the source processor system and returns a reference to the subsystem. The sizes of the arrays InitIndexArray and LastIndexArray must be equal to the rank of the source processor system (and thus of the subsystem being created).

All processors of the source processor system, specified by the reference *PSRefPtr, must be members of the current processor system. If the pointer PSRefPtr is equal to NULL or the reference *PSRefPtr has a zero value, the current processor system is used as the source.

Internal values of element coordinates of any processor system are numbered from 0. Therefore element (P1, ... , Pj, ... , Pn) of the created processor subsystem is the element (P1+InitIndexArray[0], ... , Pj+InitIndexArray[j-1], ... , Pn+InitIndexArray[n-1]) of the source processor system (n is the rank of source and created systems). The size of i-th dimension of the created subsystem is equal to LastIndexArray[i-1]-InitIndexArray[i-1]+1.
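The index arithmetic described above can be sketched in C. This is only an illustration of the two rules stated in the text (subsystem sizes and the coordinate shift); the function names subsystem_size and to_source_coords are hypothetical and are not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helpers, not Lib-DVM functions.
 * init[] and last[] correspond to InitIndexArray and LastIndexArray. */

/* Size of dimension i (0-based here) of the created subsystem:
 * LastIndexArray[i] - InitIndexArray[i] + 1. */
long subsystem_size(const long init[], const long last[], int i)
{
    assert(last[i] >= init[i]);
    return last[i] - init[i] + 1;
}

/* Maps element sub[] of the subsystem to element src[] of the source
 * processor system: src[j] = sub[j] + InitIndexArray[j]. */
void to_source_coords(int rank, const long init[], const long sub[], long src[])
{
    for (int j = 0; j < rank; j++)
        src[j] = sub[j] + init[j];
}
```

For example, with InitIndexArray = {1, 2} and LastIndexArray = {2, 5}, the subsystem has sizes 2 and 4, and its element (1, 3) is the element (2, 5) of the source processor system.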

If the static subsystem flag *StaticSignPtr is not equal to zero, the created processor system is not deleted automatically when control exits the program block (see section 8). Such a subsystem can be deleted only explicitly, using the function delps_ described below.

Computational coordinate weights of created subsystem processors will be set to 1.

Note that processor systems created by the crtps_ function may overlap, that is, share processors.

4.3 Reconfiguring (changing shape of) processor system

PSRef psview_ ( PSRef *PSRefPtr,
                long *RankPtr,
                long SizeArray[],
                long *StaticSignPtr );
     
*PSRefPtr - reference to the source (to be reconfigured) processor system.
*RankPtr - rank of the target (reconfigured) processor system.
SizeArray - array whose i-th element is the size of the (i+1)-th dimension of the target processor system.
*StaticSignPtr - flag of static target processor system.

The function psview_ creates a new processor system from the elements of the source processor system and returns a reference to the target system. The number of elements in the source and target systems must be the same.

All processors of the source processor system, specified by the reference *PSRefPtr, must be members of the current processor system. If the pointer PSRefPtr is equal to NULL or the reference *PSRefPtr has a zero value, the current processor system is used as the source.
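The requirement on the number of elements can be checked before calling psview_. The following C sketch shows one way to do it; the function names element_count and same_element_count are hypothetical and are not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helpers, not Lib-DVM functions. */

/* Number of elements of a processor system: the product of its dimension sizes. */
long element_count(int rank, const long sizes[])
{
    long count = 1;
    for (int i = 0; i < rank; i++) {
        assert(sizes[i] > 0);
        count *= sizes[i];
    }
    return count;
}

/* Returns 1 if the source and target shapes hold the same number of elements. */
int same_element_count(int src_rank, const long src_sizes[],
                       int dst_rank, const long dst_sizes[])
{
    return element_count(src_rank, src_sizes) == element_count(dst_rank, dst_sizes);
}
```

For example, a 4 x 3 processor system can be viewed as 2 x 6 or as 2 x 2 x 3, but not as 5 x 2.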

Computational coordinate weights of created subsystem processors will be set to 1.

4.4 Deleting processor system

long delps_ (PSRef *PSRefPtr);

*PSRefPtr - reference to the processor system to be deleted.

The function deletes a processor system created by the function crtps_ (or psview_). When the processor system is deleted, all its subsystems and all abstract machine representations and distributed arrays mapped onto it are also deleted. Abstract machines mapped onto the deleted processor system by the mapam_ function are not deleted, but the subtasks created by that function cease to exist (see section 10).

The processor system can be deleted by delps_ function only if it was created in the current subtask and in the current program block (or in its sub-block) (see sections 8 and 10). Initial processor system can't be deleted.

To delete a processor system the function delobj_ can also be used (see section 17.5).

The function returns zero.

4.5 Weights of processor system elements

Let a processor system be an n-dimensional array, and for every dimension i let a function WEIGHTi be defined on the index values of that dimension, taking real values greater than or equal to 1. The value WEIGHTi(Pi) is called the weight of coordinate Pi (1 ≤ i ≤ n, 0 ≤ Pi < PSSIZEi, where PSSIZEi is the size of the i-th dimension of the processor system).

The weight of processor (P1, ... , Pi, ... , Pn) is then defined in terms of its coordinate weights WEIGHT1(P1), ... , WEIGHTn(Pn).
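The defining formula for the processor weight is not preserved in this copy of the text. As a minimal sketch only, assuming the processor weight is the product of its coordinate weights, it could be computed as follows. The function name proc_weight is hypothetical and is not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helper, not a Lib-DVM function.
 * ASSUMPTION: processor weight = product of its coordinate weights.
 * coord_weight[i] holds WEIGHT(i+1)(P(i+1)), already looked up for
 * the processor in question. */
double proc_weight(int rank, const double coord_weight[])
{
    double weight = 1.0;
    for (int i = 0; i < rank; i++) {
        /* Coordinate weights are greater than or equal to 1 by definition. */
        assert(coord_weight[i] >= 1.0);
        weight *= coord_weight[i];
    }
    return weight;
}
```

Under this assumption a processor whose coordinate weights are 1.0, 2.0 and 1.5 has the weight 3.0.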

The coordinate weights of the elements of the initial processor system (and therefore the processor weights) are parameters of Run-Time System startup. While a user program is running, the coordinate weights of the processors of a processor system can be assigned (changed) by the function

long setpsw_ ( PSRef *PSRefPtr,
               AMViewRef *AMViewRefPtr,
               double CoordWeightArray[] );
     
*PSRefPtr - reference to the processor system, whose element coordinate weights will be assigned.
*AMViewRefPtr - reference to the abstract machine representation, to be mapped onto specified processor system with the assigned coordinate weights.
CoordWeightArray - array, containing processor coordinate weights.

Depending on the value of the AMViewRefPtr parameter, the setpsw_ function is executed in one of two ways.

1. AMViewRefPtr ≠ NULL and *AMViewRefPtr ≠ 0.

The assigned processor coordinate weights are intended only for mapping or remapping the given abstract machine representation onto the given processor system.

When the setpsw_ function is called, the (parent) abstract machine whose representation is specified by the reference *AMViewRefPtr must already be mapped.

All processors of the system specified by the reference *PSRefPtr must belong to the elementary intersection of the current processor system with the processor system onto which the parent abstract machine is mapped. A NULL value of the pointer PSRefPtr or a zero value of the reference *PSRefPtr means that the coordinate weights of the current processor system elements are assigned.

2. AMViewRefPtr = NULL or *AMViewRefPtr = 0.

The assigned processor coordinate weights, like the weights specified at Run-Time System startup, will be used in mapping or remapping all abstract machine representations onto the given processor system (except those for which their own processor coordinate weights were or will be assigned by the setpsw_ function).

All processors of the system specified by the reference *PSRefPtr must be members of the current processor system. A NULL value of the pointer PSRefPtr or a zero value of the reference *PSRefPtr means that the coordinate weights of the current processor system elements are assigned.

The weight of coordinate Pi is specified by the value of the

$$\left(\sum_{k=1}^{i-1} PSSIZE_k + P_i\right)\text{-th}$$

element of the array CoordWeightArray. The number of elements of the array CoordWeightArray, containing processor coordinate weights, must be equal to the sum of the sizes of all processor system dimensions. The coordinate weights in the array CoordWeightArray may be any positive numbers. When executing the function setpsw_, Run-Time System updates every weight Pi by dividing it by the minimal weight of the coordinate in the array CoordWeightArray
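The layout of CoordWeightArray can be illustrated with a short C sketch. It assumes that the element number given by the formula is a 0-based C array index (so the weight of coordinate 0 of the first dimension is CoordWeightArray[0]). The function names coord_weight_offset and coord_weight_count are hypothetical and are not part of Lib-DVM.

```c
#include <assert.h>

/* Hypothetical helpers, not Lib-DVM functions.
 * sizes[k-1] plays the role of PSSIZEk. */

/* Position in CoordWeightArray of the weight of coordinate p along
 * dimension i (i is 1-based, p is 0-based):
 * PSSIZE1 + ... + PSSIZE(i-1) + p.
 * ASSUMPTION: the result is used as a 0-based C array index. */
long coord_weight_offset(const long sizes[], int i, long p)
{
    assert(i >= 1 && p >= 0 && p < sizes[i - 1]);
    long offset = 0;
    for (int k = 1; k < i; k++)
        offset += sizes[k - 1];
    return offset + p;
}

/* Required length of CoordWeightArray: the sum of all dimension sizes. */
long coord_weight_count(int rank, const long sizes[])
{
    long count = 0;
    for (int k = 0; k < rank; k++)
        count += sizes[k];
    return count;
}
```

For a 2 x 3 processor system the array holds 5 weights: positions 0 and 1 belong to the first dimension, and positions 2 to 4 belong to the second.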