Tue, 24 Mar 1992 17:47:45 GMT

Related articles |
---|

Some proposed parallel extensions to C knighten@SSD.intel.com (1992-03-24) |

Newsgroups: | comp.compilers |

From: | knighten@SSD.intel.com (Bob Knighten) |

Keywords: | C, parallel |

Organization: | Compilers Central |

Date: | Tue, 24 Mar 1992 17:47:45 GMT |

There are a number of efforts underway to define variants of common

programming languages (e.g. C and Fortran) to take advantange of

parallel computers. The issue of compiling these languages has not yet

gotten much attention. Here is one proposal from a Russian group for an

extension to C that was sent to Walt Rudd because of his role in the

X3H5 effort.

Dear Prof. Walter G. Rudd,

We are participants of implementation of programming languages (the

Assembly language, Fortran 77, Pascal and C) for one of the first Russian

supercomputers - "Electronica SSBIS". The work was started in 1985. All

compilers are already working on simulator of the supercomputer on

instrumental computers, but hardware yet is not ready. Answering the

request of our software engineers (they were developing the Operating

system and Scientific subroutine packages) we have developed the extension

of C language for our supercomputer. One of our aims in developing the

extension was to achieve the software portability among different

supercomputers.

We send you the paper describing the main features of our C language

extension and hope that it may be useful for your working group X3H5.

We are interested in discussion on C language extensions for

computers with parallel architecture and should be very grateful to you

for our further contacts.

Sincerely Yours,

Sergey S. Gaissaryan

Alexey L. Lastovetsky

Our address is:

Dr. S.S.Gaissaryan

37 Vavilova street

Cybernetics Problems' Institute

Russian Academy of Sciences

Moscow, 117312,

Russia

e-mail kuz@ivann.delta.msk.su.

Fax 7(095)9382136.

EXTENSION OF C FOR SUPERCOMPUTERS

S.S.Gaissaryan, A.L.Lastovetsky

1. Introduction.

In this paper we consider the language aspects of the problem of

software transfer within the class of supercomputers having shared

storage.

The problem of program transfer from one computing environment to

another consists in keeping of the functional semantics and the

operational properties of this program.

The first step toward solution of the transfer problem is the

development of#_ #.transfer language#_ #.for desired class of computers.

The transfer language is a programming language which satisfies the

following demands:

- the system of language notions includes all basic notions of

assembly languages for the corresponding class of computers;

- language's semantics must be defined with maximal possible degree

of completeness, allowed by considered class of computers;

- the facilities to localize the dependence of program semantics on

the architecture must be provided;

- the language description must specify clearly what exactly in the

language does not depend on the implementation, what and how exactly

depends on it, and what is undefined.

The transfer language effectiveness is ensured on the one hand by

including all basic notions of assembler languages for the corresponding

class of computers in the system of language notions, and on the other

hand by describing these notions such that its semantics is fuzzy enough

to provide the effective mapping in the system of notions of every

computer of this class.

The desired degree of language portability is ensured, first of all,

by the highest possible degree of completeness of its functional semantics

which affects its effectiveness.

For the class which consists of computers of traditional von Neumann

architecture the C language as specified by [1] is used as transfer

language.

In this paper we describe the C[] language proposed as the transfer

language for the supercomputers having shared storage. The s[] language

is the exact extension of C. It means that any correct C-program has the

same semantics in C[] and C languages.

In C[] the conceptions of array and pointer are extended in

comparison with C. The concept of vector is introduced. The new [], [:]

operators are included in C[]. Once more storage-class cache is added.

The standard libraries are also extended. Besides, it is allowable in C[]

to describe a member of a structure as an array of bit-fields.

2. Array.

In the C language an array comprises "a contiguously allocated set of

members of any one type of object"[1,p.21].

In C[] language an array comprises a sequentially allocated (with a

constant positive, negative or zero step) set of members of any one type

of object. A negative step specifies the allocation of members of the

array in the reverse (from right to left) order. The zero step specifies

the allocation of all members of the array in the same element of storage.

Thus in the C[] language an array has at least 3 attributes, namely:

a type of members, a number of members and an allocation step.

In the C[] language the array declarators syntax differs from

standard one (see [1,p.56]) in following way. The rule

<direct-declarator>:

<direct-declarator>[<constant-expression>(opt)] is replaced

with the rules

<direct-declarator>:

<direct-declarator>[<constant-expression>(opt)

<step>(opt)]

<step>:

:<constant-expression>

If <step> is not specified then it is equal to 1.

3. Pointer.

In the C language a pointer has only one attribute, namely: the type

of objects pointed to.

This attribute is necessary, firstly, to interpretate correctly

values of objects in case of an access to objects by means of the pointer,

and secondly, to interpretate correctly the address + and - operators.

In the C language the address + and - operators are correct only if

the pointer operands and the pointer results point to members of the same

array object. The same rule is valid for the C[] language. Therefore, to

support the correct interpretation of the address operators once more

attribute of the pointer is introduced in the C[] language. This

attribute is step.

In the C language, "when an expression that has integral type is

added to or substructed from a pointer, the integral value is first

multiplied by the size of the object pointed to"[1,p.41]. In the C[]

language, the similar multiplier is equal to the product of the pointer

step and the size of the object pointed to. In the C language, "when two

pointers to members of the same array object are substracted, the

difference is divided by the size of a member"[1,p.41]. In the C[]

language, the similar divisor is equal to the product of the pointer step

and the size of a member.

In the C[] language the pointer declarators syntax differs from

standard one (see [1,p.56]) in following way. The rules

<pointer>:

*<type-specifier-list>(opt)

*<type-specifier-list>(opt)<pointer>

is replaced with the rules

<pointer>:

*<step>(opt)<type-specifier-list>(opt)

*<step>(opt)<type-specifier-list>(opt)<pointer>

If <step> is not specified then it is equal to 1.

Example. The declarations

int a[]={0,1,2,3,4};

int *:2 p1=(void*)a, *:-1 p2=(void*)&a[4];

result the address expressions (R1+1) and (p2+2) point to the same member

of array, namely, a[2]. Indeed, the offset from p1 defined by the

expression (p1+1) is equal to (2*sizeof(int))*1 bytes, and the offset from

p2 defined by the expression (p2+2) is equal to ((-1)*sizeof(int))*2

bytes.

4. Access to member of array.

In the C[] language the access to e2-th member of an array object e1

is fulfiled with the help of the expression e1[e2] which is identical to

(*(e1+(e2))). Here, e2 is an integral expression, e1 is an lvalue that

has the type array of type, and this lvalue is converted to an expression

that has the type pointer to type and that points to the initial member of

the array object (the attribute step of this pointer is identical to the

attribute step of the array object).

5. Vector as value of array object. Access to value of array.

In the C[] language a value of an array object is defined as an

ordered sequence of values of its members and is called vector. The i-th

member of an vector is a value of the i-th member of the corresponding

array object.

It follows from this definition that an array object can not be a

member of a vector (since any array object is not a value of an object of

some type), but one vector can be a member of another vector (since a

vector is a value of an array object).

Note, that in contrast to the C language in C[] language the concepts

type of object and type of value of object are different. In particular, a

vector is a value of array object but it has not the attribute step. This

attribute specifies an allocation of an array value in an array object. A

type of a vector is defined by only two attributes, namely: type of

members and number of members. Therefore, two array objects that have a

different types may have a values of the same type.

No access to a member of a vector is defined.

Example. The value of the array defined by the

declaration

int a[3][2];

is the vector consisting from three vectors, each from which consists from

two integers. So, the execution of the iteration statement

for(i=0; i<3; i++)

for(j=0; j<2; j++)

a[i][j]=i+j;

causes the value of the array a to be equal to the vector { {0,1}, {1,2},

{2,3} }. The type of this vector is named by the type name int[3][2].

In the C language, except when used as an operand where an lvalue is

permitted (in particular, as an operand of the sizeof operator), or when a

string literal is used to initialize an array of chars, an expression that

has type array of type is converted to an expression that has type pointer

to type and that points to the initial member of the array object.

To support the access to an array as the whole the postfix []

operator is introduced. The operand of this operator shall have type

array of type. The [] operator blocks (forbids) the convertion of the

operand to a pointer. Such lvalues that designate an array as the whole

are called a blocked lvalues. A rules of the treatment with a blocked

lvalues are fully similar to a rules of the treatment with a lvalues that

have a scalar type. In particular, if a is an array then the expression

&a[] is equal to the lvalue a.

A modifiable blocked lvalue#_ #.is a blocked lvalue that does not

have a type declared with the const type specifier.

The unary & operator deblocks a blocked lvalue.

Example. If the arrays a, b and c are declared as

int a[8], b[8:2], c[8:-1];

then the expression a[]=b[]+c[] assigns the sum of vectors being a values

of the arrays b and c to the array a. Note that here the expression

a[]+b[] has type vector from 8 ints.

6. Access to subarrays.

Any set of objects belonging to some array is called its subarray iff

this set itself forms an array. By definition an object belongs to an

array if either it is a member of this array or it belongs to a member of

this array. Besides, any subarray is also an object belonging to the

array.

In principle, the introduced facilities are sufficient to access to a

subarray. For example, if the array object a is defined by the declaration

int a[5][5];

then the blocked lvalue

(*(int(*)[5:6])a)[] (1)

designates the array of five ints (with the step 6) that contains the main

diagonal of the matrix a, and the blocked lvalues

(*(int(*)[4:6])(a[0]+1))[] (2)

and

(*(int(*)[4:6])(&a[0][1]))[] (3)

designates the array of four ints (with the step 6) that contains the

diagonal of the matrix a which is placed under the main diagonal.

The more compact notation comes out if a variables of type pointer to

array are used. So, if the pointer objects p1 and p2 are defined by the

declaration

int (*p1)[5:6]=(void*)a, (*p2)[4:6]=(void*)(a[0]+1); then the

expression (*p1)[] can be used instead of (1) and the expression (*p2)[]

can be used instead of (2) and (3).

In the C language the array subscripting operator is surplus. This

operator provides a comfortable abbreviation e1[e2] for the expression

(*(e1+(e2))) where the expression e1 shall have type pointer to type and

the expression e2 shall have integral type. Similarly, the ternary [:]

operator is introduced in the C[] language. This operator provides an

abbreviation for the blocked lvalues like (1)-(3). Namely, if e1 is a

lvalue by type t and e2, e3 are an integral constant expressions then the

expression e1[e2:e3] is equal to the expression (*(T(*)[e2:e3])(&(e1)))[]

being within the scope of the type definition of the typedef t T kind. The

expression e3 is optional (the expression e1[e2:] is equal to the

expression e1[e2:1]).

For example, the expression (1) can be expressed as a[0][0][5:6], and

the expressions (2) and (3) can be expressed as a[0][1][4:6].

In common case, both second and third operands of the [:] operator

can be any integral expressions (not only constant ones). It allows to

specify dynamically the required subarray.

7. Segments of arrays.

Not every regular set of objects belonging to an array is a subarray.

To make the access to the similar sets of objects belonging to an

array easy, new derived types segments of arrays#_ #.are introduced in the

C[] language and the [:] operator is extended.

A segment of an array may comprise a sequential members of the array

allocated with a constant step (a unit of stepsize is a member of the

array). Besides, if a members of the array are also an arrays then a

segment of the array may comprise a sequential segments (of the same type)

of the array members allocated with a constant step. In this last case, a

unit of stepsize may be not only a segment of a member of the array but a

member of a segment of a member of the array if a member of a segment of a

member of the array is also an array and so on.

In common case, in the expression e1[e2:e3] the third operand that

specified a step is described as follows

<segment_step>:

<integral_expression>(opt)

:<segment_step>(opt).

The expression e1 is either an lvalue whose type is not array or a blocked

lvalue of type array or an lvalue of type segment of array. If e3 is an

integral expression then a unit of step is equal to the size of the object

designated by e1. If e3 is :e4 where e4 is an integral expression then a

unit of step is a member of the object designated by e1. If e3 is ::e4

then a unit of step is a member of a member of the object designated by e1

(naturally, in this case e1 should be an lvalue of the corresponding

type).

If e3 is :e4 then the expression e1[e2:e3] is an lvalue of type

segment of array and designates the corresponding segment. If e1 is an

lvalue of type segment of array then the expression e1[e2:e3] is also an

lvalue of type segment of array and designates the corresponding segment.

In all rest cases the expression e1[e2:e3] is a blocked lvalue of type

array.

An lvalue of type segment of array is never converted to a pointer

that points to the initial member of the segment and always designates the

segment as the whole.

There are some constraints in a usage of lvalues that have type

segment of array. In particular, the & unary operator is not applicable

to such lvalues.

If e1 is an lvalue of type segment of array and e2 is an integral

expression then the expression e1[e2] designates the e2-th member of the

segment e1 (counting from zero). In difference from the array

subscripting operator the segment subscripting operator is not expressed

by the * and + address operators that is in this case the expression

e1[e2] is not identical to the expression (*(e1+(e2))).

8. Vector construction operators.

If e1,...,eN are expressions of type t then the expression

{e1,...,eN} is of type vector of N members of type t.

If vector or scalar expressions e1, e3 are of type t and e2 is

integral positive expression then the expression e1[e2#e3] is of type

vector of e2 members of type t. The first member of this vector is e1, the

second one is ((e1)+(e3)), the third one is (((e1)+(e3))+(e3)) etc., the

sign + denoting the vector (memberwise) addition in vector case. The

operand e3 may be omitted in e1[e2#e3] expression. In that case the

vector consists of e2 identical members each of which is equal to e1. If

the length of the vector e3 is less than the length of the vector e1 then

missing right members are considered zeroes.

Example. The constant expression 1[5#2] denotes the vector

{1,3,5,7,9}, the constant expression 1[5#1][2#1[5#]] denotes the vector {

{1,2,3,4,5}, {2,3,4,5,6} }, and the constant expression 10[4#-2][3#{1,2}]

denotes the vector { {10,8,6,4}, {11,10,6,4}, {12,12,6,4} }.

If scalar operands of the { } operator are of different arithmetic

types the usual arithmetic conversions are performed. If they are of

pointer type all of them shall have the same type.

9. Lvector.

In addition to notions of lvalue, blocked lvalue and segment of array

which are used for designation of objects and are allowed as left operand

of assignment operators the notion of lvector is introduced.

Just as lvalue is an expression designating some object, lvector is a

vector expression designating the set of objects.

Example. If the objects x, y and pa are described as

int x, y, *pa[10];

then the vector expressions {y, x, *pa[3]} and *pa[] are lvectors but the

vector expression {x+y, y, x} is not.

Blocked lvalues of type array and segments of array can be considered

as special cases of lvectors which designate the sets of objects allocated

in regular way.

Lvector is modifiable if every member of the set of objects

designated by it is modifiable.

10. Vector conversions.

A conversion of the vector of one type to the vector of another type

is composition of two conversions, the first of which converts the vector

length the types of members remaining unchanged and the second one

converts the type of vector members the vector length being unchanged.

We define a vector length as the number of its members. For example,

the vectors {1,2} and {{3,4,5},{6,7,8}} both have the length 2.

There is not defined conversion of vector to non-vector (in

particular, to scalar).

A conversion of the non-vector (in particular, of the scalar) to the

vector of specified length and type of the members is composition of two

conversions, the first of which converts the non-vector to the vector of

specified length all members of which have the same type and value as the

initial non-vector has and the second one converts the type of vector

members to specified type.

Shortening of a vector is performed by dropping its ending members.

Conversion of a vector to the longer one is performed by adding of members

having indefinite values to its tail.

11. Integer vectors packing.

The C[] language provides bit-fields packing facilities for integer

vectors. Namely, an array declarator is allowed in position of optional

<declarator> in the bit-field declarator of

<declarator>(opt):<constant_expression> kind. The bit-field declarator

A[N:L]:M (A is an identifier, N,L,M are constant expressions) specifies N

sequentially (with the step L) allocated bit-fields of the same width M.

The width of bit-fields and the step size are measured in bits. The

assignment of an integer vector of length N to the member A of structure

causes the packing of this vector members in corresponding bit-fields.

Unpacking of integer vector packed in bit-fields is performed during the

access to corresponding member of the structure by means of the .

operator.

12. Scalar operators.

Three groups of scalar operators are added to the C[] language.

The first group consists of two binary ?> and ?< operators of maximum

and minimum calculation together with corresponding compound assignments.

The second group contains three unary bitwise operators, namely: ?

(counting unit bits), % (counting leading zeroes), @ (word reversing).

The third group contains floating operators alowing to specify the

way of rounding in runtime, namely, the modifiers of floating cast

operators are introduced: > (rounding up), < (rounding down),! (rounding

to the nearest) and = (chopping). Syntactically, a modifier is added as

suffix to a name of floating type in corresponding cast operator.

Example. In the fragment

float x;

double y,z;

...........

x=(float!)(y+z);

the value of the sum (y+z) having type double is rounded to nearest value

of type float, and resulting value is assigned to varyable x.

13. Vector operators.

The operand of unary *, +, -, ~, ?, %, @, ! operators and scalar cast

operators may have vector type. In this case the result of such operator

is vector members of which are the results of application of corresponding

operator to members of operand.

If a vector type name is used in a cast operator the vector length

specified by this cast operator may be specified by arbitrary (not only

constant) integral expression.

One or both operands of binary *, /, %, ?<, ?>, +, -, <<, >>, <, >,

<=, >=, ==, !=, &, ^, |, &&, || operators may have vector type.

If both operands are vectors of the same length then the result is

vector members of which are the results of application of corresponding

operator to members of operands.

If one of the operands is scalar it is converted to the type of

vector operand.

If vector operands of binary operator have different length then

behavior is undefined.

The first operand of conditional operator may have vector type. In

that case the second or the third operand but not both of them may be

omitted. If none of the operands is omitted then unlike the C language

all three operands are evaluated.

If all three operands are vectors of the same length then the result

is acheived by memberwise application of the operator. If vector operands

of conditional operator have different length then behavior is undefined.

If the second or the third operand is non-vector they are converted

to the type of the first (vector) operand.

If the first and the second (or the third) operands are vectors, the

third (the second) operand is omitted and the members of the first operand

have scalar type then the result will be the vector of the same type as

the second (the third) operand; the i-th member of the result is equal to

the k(i)-th member of the second (the third) operand where k(i) is index

of the i-th non-zero (zero) member of the first operand. The other

members have indefinite values.

If the first and the second (or the third) operands are vectors, the

third (the second) operand is omitted and the members of the first operand

have vector type then the result is acheived by memberwise application of

the operator.

An assignment operator may have as its left operand a modifiable

blocked lvalue, a segment of modifiable array or any other modifiable

lvector . In that case its right operand may have vector type. In any

case the type of its right operand converts to the type of the left

operand's value.

Example. The execution of following fragment

int a[]={0,1,2,3,4};

int *pa[]={a+1,a+2,a+3,a+4,a};

*(pa[])=a[];

results the vector {1,2,3,4,0} as value of array a.

The unary linear [*], [/], [%], [?<], [?>], [+], [-], [<<], [>>],

[&], [^], [|] operators correspond to binary *, /, %, ?<, ?>, +, -, <<,

*>>, &, ^, | operators. These operators are applicable only to vector*

operands. Let v[0], v[1],...,v[N] denote the members of vector operand v;

then the expression [op] v[] has the same semantics as the expression of

(...((v[0] op v[1]) op v[2]) op ... op v[N]) kind.

Example. The execution of following fragment

int a[]={0,1,2,3,4}, sum;

sum=[+]a[];

results the value of sum equal to value of expression

a[0]+a[1]+a[2]+a[3]+a[4] that is 10.

In the C[] language unlike the C language a formal parameter of a

function may have an array type. The declaration of such formal parameter

must specify all array attributes. Corresponding argument is the

expression of the same vector type as defined by formal parameter. The

function value also may have vector type.

14. Arrays and structures of register storage-class.

An array whose members are of scalar type and the step is equal to 1

may be declared with storage-class specifier register. It causes

corresponding array head allocation on vector register if there is free

one. If there is no free vector register or if array of declared type

should not be allocated on register the storage-class specifier is

ignored.

There are some constraints in usage of register variables of array

type. In particular, the address & operator and the [:] operator should

not be applied to it. The array subscripting operator is not expressed by

* and + address operators for register array.

A structure with members of scalar types may be declared with

storage-class specifier register. It causes corresponding structure

allocation on available scalar registers which allow collective exchange

with main storage.

15. Cache storage-class.

In C[]-machine we define a cache storage as additional address space

which is independent from main storage. A definition of an object with

storage-class specifier cache causes an appropriate amount of cache

storage to be reserved.

Aggregate object shall be allocated in main or cache storage in

indivisible way, that is all its members belong to one address space.

Binary address operators are invalid if one of their operands points to

cache storage and the another points to main storage.

Storage-class specifier cache may be used in pointer declarators. The

declarator

* <type_specifier_list> <opt> <identifier>,

where the type_specifier_list contains specifier cache defines the

identifier as a pointer allocated in cache-storage. The following pair of

declarations demonstrates the difference between a pointer to

cache-storage allocated in main-storage and a pointer to main-storage

allocated in cache-storage:

cache char *ptr_to_cache;

char *cache cache_ptr;

NULL pointer has no attrbute specifying address space.

16. Concurrent calculation facilities.

The C[]- machine includes several processors each connected with its

own cache-storage and shared main-storage. Access to the region of shared

storage by one processor does not block the access to the same region of

storage by other processors (except the semaphore operator lock).

Initially program is executed by one of the processors but during its

execution other processors may be activated. Therefore in common case

program statements are concurrently executed by several processors.

In any time each processor may be in one of following three states:

active, passive or waiting. Initially only one processor is active.

In C[] language is introduced statement label +:. This label relates

to one statement.

When a program-fragment is being executed by one of the processors of

the C[]-machine (we denote it as A-processor) and control achieves

+:-label, one of passive processors (B-processor) is activated by

A-processor and B-processor begins to execute the statement labelled by

+:-label, while the following statement is concurrently executed by A-

processor. When thread of control leaves +:-labelled statement,

B-processor is returned to passive state. Note that cancelling of

automatic objects of compound statement, created by B-processor while

entering this statement is done while leaving it.

Synchronization of the access to shared data is achieved by

semaphores.

New integral type sem is introduced in the C[] language . The set of

values of this type is subset of the set of values of int. The set of

values of sem is divided on two unintersecting subsets one of which

corresponds to the state semaphore is open, and another to state semaphore

is locked. A semaphore variable is implicitly initialized by value from

the set semaphore is open.

The C[] library contains three functions to deal with semaphore

objects:

- the function

void lock(sem* s);

checks to which set the value of object *s belongs. If it belongs to set

semaphore open, then the value of *s is turned to a value from set

semaphore locked, after that function normally terminates. If the initial

value of variable *s belongs to set semaphore locked, then function turns

the processor in waiting state until the value of *s be changed in such

way that it belongs to set semaphore open, after that the processor is

activated and continues its work from the execution of the function lock.

This function is not shared that is no processor can access to object *s

while processor which executes it is active;

- the function

int unlock(sem* s);

assigns to variable *s value from set semaphore open and returns 0 if

initial value of *s belonged to set semaphore open, and 1 otherwise; is

function unlock shared or not, depends on realization;

- the function

int test(sem* s);

returns 0, if *s belongs to set semaphore open, and 1, if *s belongs to

set semaphore locked; is function test shared or not, depends on

realization.

References

1. First working draft on programming language C (draft proposed by

ANSI). Ottawa, ISO/TC97/SC22/WG14, 1986.

Robert L. Knighten | knighten@ssd.intel.com

Intel Supercomputer Systems Division |

15201 N.W. Greenbrier Pkwy. | (503) 629-4315

Beaverton, Oregon 97006 | (503) 629-9147 [FAX]

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.