Related articles |
---|
Some proposed parallel extensions to C knighten@SSD.intel.com (1992-03-24) |
Newsgroups: | comp.compilers |
From: | knighten@SSD.intel.com (Bob Knighten) |
Keywords: | C, parallel |
Organization: | Compilers Central |
Date: | Tue, 24 Mar 1992 17:47:45 GMT |
There are a number of efforts underway to define variants of common
programming languages (e.g. C and Fortran) to take advantange of
parallel computers. The issue of compiling these languages has not yet
gotten much attention. Here is one proposal from a Russian group for an
extension to C that was sent to Walt Rudd because of his role in the
X3H5 effort.
Dear Prof. Walter G. Rudd,
We are participants of implementation of programming languages (the
Assembly language, Fortran 77, Pascal and C) for one of the first Russian
supercomputers - "Electronica SSBIS". The work was started in 1985. All
compilers are already working on simulator of the supercomputer on
instrumental computers, but hardware yet is not ready. Answering the
request of our software engineers (they were developing the Operating
system and Scientific subroutine packages) we have developed the extension
of C language for our supercomputer. One of our aims in developing the
extension was to achieve the software portability among different
supercomputers.
We send you the paper describing the main features of our C language
extension and hope that it may be useful for your working group X3H5.
We are interested in discussion on C language extensions for
computers with parallel architecture and should be very grateful to you
for our further contacts.
Sincerely Yours,
Sergey S. Gaissaryan
Alexey L. Lastovetsky
Our address is:
Dr. S.S.Gaissaryan
37 Vavilova street
Cybernetics Problems' Institute
Russian Academy of Sciences
Moscow, 117312,
Russia
e-mail kuz@ivann.delta.msk.su.
Fax 7(095)9382136.
EXTENSION OF C FOR SUPERCOMPUTERS
S.S.Gaissaryan, A.L.Lastovetsky
1. Introduction.
In this paper we consider the language aspects of the problem of
software transfer within the class of supercomputers having shared
storage.
The problem of program transfer from one computing environment to
another consists in keeping of the functional semantics and the
operational properties of this program.
The first step toward solution of the transfer problem is the
development of#_ #.transfer language#_ #.for desired class of computers.
The transfer language is a programming language which satisfies the
following demands:
- the system of language notions includes all basic notions of
assembly languages for the corresponding class of computers;
- language's semantics must be defined with maximal possible degree
of completeness, allowed by considered class of computers;
- the facilities to localize the dependence of program semantics on
the architecture must be provided;
- the language description must specify clearly what exactly in the
language does not depend on the implementation, what and how exactly
depends on it, and what is undefined.
The transfer language effectiveness is ensured on the one hand by
including all basic notions of assembler languages for the corresponding
class of computers in the system of language notions, and on the other
hand by describing these notions such that its semantics is fuzzy enough
to provide the effective mapping in the system of notions of every
computer of this class.
The desired degree of language portability is ensured, first of all,
by the highest possible degree of completeness of its functional semantics
which affects its effectiveness.
For the class which consists of computers of traditional von Neumann
architecture the C language as specified by [1] is used as transfer
language.
In this paper we describe the C[] language proposed as the transfer
language for the supercomputers having shared storage. The s[] language
is the exact extension of C. It means that any correct C-program has the
same semantics in C[] and C languages.
In C[] the conceptions of array and pointer are extended in
comparison with C. The concept of vector is introduced. The new [], [:]
operators are included in C[]. Once more storage-class cache is added.
The standard libraries are also extended. Besides, it is allowable in C[]
to describe a member of a structure as an array of bit-fields.
2. Array.
In the C language an array comprises "a contiguously allocated set of
members of any one type of object"[1,p.21].
In C[] language an array comprises a sequentially allocated (with a
constant positive, negative or zero step) set of members of any one type
of object. A negative step specifies the allocation of members of the
array in the reverse (from right to left) order. The zero step specifies
the allocation of all members of the array in the same element of storage.
Thus in the C[] language an array has at least 3 attributes, namely:
a type of members, a number of members and an allocation step.
In the C[] language the array declarators syntax differs from
standard one (see [1,p.56]) in following way. The rule
<direct-declarator>:
<direct-declarator>[<constant-expression>(opt)] is replaced
with the rules
<direct-declarator>:
<direct-declarator>[<constant-expression>(opt)
<step>(opt)]
<step>:
:<constant-expression>
If <step> is not specified then it is equal to 1.
3. Pointer.
In the C language a pointer has only one attribute, namely: the type
of objects pointed to.
This attribute is necessary, firstly, to interpretate correctly
values of objects in case of an access to objects by means of the pointer,
and secondly, to interpretate correctly the address + and - operators.
In the C language the address + and - operators are correct only if
the pointer operands and the pointer results point to members of the same
array object. The same rule is valid for the C[] language. Therefore, to
support the correct interpretation of the address operators once more
attribute of the pointer is introduced in the C[] language. This
attribute is step.
In the C language, "when an expression that has integral type is
added to or substructed from a pointer, the integral value is first
multiplied by the size of the object pointed to"[1,p.41]. In the C[]
language, the similar multiplier is equal to the product of the pointer
step and the size of the object pointed to. In the C language, "when two
pointers to members of the same array object are substracted, the
difference is divided by the size of a member"[1,p.41]. In the C[]
language, the similar divisor is equal to the product of the pointer step
and the size of a member.
In the C[] language the pointer declarators syntax differs from
standard one (see [1,p.56]) in following way. The rules
<pointer>:
*<type-specifier-list>(opt)
*<type-specifier-list>(opt)<pointer>
is replaced with the rules
<pointer>:
*<step>(opt)<type-specifier-list>(opt)
*<step>(opt)<type-specifier-list>(opt)<pointer>
If <step> is not specified then it is equal to 1.
Example. The declarations
int a[]={0,1,2,3,4};
int *:2 p1=(void*)a, *:-1 p2=(void*)&a[4];
result the address expressions (R1+1) and (p2+2) point to the same member
of array, namely, a[2]. Indeed, the offset from p1 defined by the
expression (p1+1) is equal to (2*sizeof(int))*1 bytes, and the offset from
p2 defined by the expression (p2+2) is equal to ((-1)*sizeof(int))*2
bytes.
4. Access to member of array.
In the C[] language the access to e2-th member of an array object e1
is fulfiled with the help of the expression e1[e2] which is identical to
(*(e1+(e2))). Here, e2 is an integral expression, e1 is an lvalue that
has the type array of type, and this lvalue is converted to an expression
that has the type pointer to type and that points to the initial member of
the array object (the attribute step of this pointer is identical to the
attribute step of the array object).
5. Vector as value of array object. Access to value of array.
In the C[] language a value of an array object is defined as an
ordered sequence of values of its members and is called vector. The i-th
member of an vector is a value of the i-th member of the corresponding
array object.
It follows from this definition that an array object can not be a
member of a vector (since any array object is not a value of an object of
some type), but one vector can be a member of another vector (since a
vector is a value of an array object).
Note, that in contrast to the C language in C[] language the concepts
type of object and type of value of object are different. In particular, a
vector is a value of array object but it has not the attribute step. This
attribute specifies an allocation of an array value in an array object. A
type of a vector is defined by only two attributes, namely: type of
members and number of members. Therefore, two array objects that have a
different types may have a values of the same type.
No access to a member of a vector is defined.
Example. The value of the array defined by the
declaration
int a[3][2];
is the vector consisting from three vectors, each from which consists from
two integers. So, the execution of the iteration statement
for(i=0; i<3; i++)
for(j=0; j<2; j++)
a[i][j]=i+j;
causes the value of the array a to be equal to the vector { {0,1}, {1,2},
{2,3} }. The type of this vector is named by the type name int[3][2].
In the C language, except when used as an operand where an lvalue is
permitted (in particular, as an operand of the sizeof operator), or when a
string literal is used to initialize an array of chars, an expression that
has type array of type is converted to an expression that has type pointer
to type and that points to the initial member of the array object.
To support the access to an array as the whole the postfix []
operator is introduced. The operand of this operator shall have type
array of type. The [] operator blocks (forbids) the convertion of the
operand to a pointer. Such lvalues that designate an array as the whole
are called a blocked lvalues. A rules of the treatment with a blocked
lvalues are fully similar to a rules of the treatment with a lvalues that
have a scalar type. In particular, if a is an array then the expression
&a[] is equal to the lvalue a.
A modifiable blocked lvalue#_ #.is a blocked lvalue that does not
have a type declared with the const type specifier.
The unary & operator deblocks a blocked lvalue.
Example. If the arrays a, b and c are declared as
int a[8], b[8:2], c[8:-1];
then the expression a[]=b[]+c[] assigns the sum of vectors being a values
of the arrays b and c to the array a. Note that here the expression
a[]+b[] has type vector from 8 ints.
6. Access to subarrays.
Any set of objects belonging to some array is called its subarray iff
this set itself forms an array. By definition an object belongs to an
array if either it is a member of this array or it belongs to a member of
this array. Besides, any subarray is also an object belonging to the
array.
In principle, the introduced facilities are sufficient to access to a
subarray. For example, if the array object a is defined by the declaration
int a[5][5];
then the blocked lvalue
(*(int(*)[5:6])a)[] (1)
designates the array of five ints (with the step 6) that contains the main
diagonal of the matrix a, and the blocked lvalues
(*(int(*)[4:6])(a[0]+1))[] (2)
and
(*(int(*)[4:6])(&a[0][1]))[] (3)
designates the array of four ints (with the step 6) that contains the
diagonal of the matrix a which is placed under the main diagonal.
The more compact notation comes out if a variables of type pointer to
array are used. So, if the pointer objects p1 and p2 are defined by the
declaration
int (*p1)[5:6]=(void*)a, (*p2)[4:6]=(void*)(a[0]+1); then the
expression (*p1)[] can be used instead of (1) and the expression (*p2)[]
can be used instead of (2) and (3).
In the C language the array subscripting operator is surplus. This
operator provides a comfortable abbreviation e1[e2] for the expression
(*(e1+(e2))) where the expression e1 shall have type pointer to type and
the expression e2 shall have integral type. Similarly, the ternary [:]
operator is introduced in the C[] language. This operator provides an
abbreviation for the blocked lvalues like (1)-(3). Namely, if e1 is a
lvalue by type t and e2, e3 are an integral constant expressions then the
expression e1[e2:e3] is equal to the expression (*(T(*)[e2:e3])(&(e1)))[]
being within the scope of the type definition of the typedef t T kind. The
expression e3 is optional (the expression e1[e2:] is equal to the
expression e1[e2:1]).
For example, the expression (1) can be expressed as a[0][0][5:6], and
the expressions (2) and (3) can be expressed as a[0][1][4:6].
In common case, both second and third operands of the [:] operator
can be any integral expressions (not only constant ones). It allows to
specify dynamically the required subarray.
7. Segments of arrays.
Not every regular set of objects belonging to an array is a subarray.
To make the access to the similar sets of objects belonging to an
array easy, new derived types segments of arrays#_ #.are introduced in the
C[] language and the [:] operator is extended.
A segment of an array may comprise a sequential members of the array
allocated with a constant step (a unit of stepsize is a member of the
array). Besides, if a members of the array are also an arrays then a
segment of the array may comprise a sequential segments (of the same type)
of the array members allocated with a constant step. In this last case, a
unit of stepsize may be not only a segment of a member of the array but a
member of a segment of a member of the array if a member of a segment of a
member of the array is also an array and so on.
In common case, in the expression e1[e2:e3] the third operand that
specified a step is described as follows
<segment_step>:
<integral_expression>(opt)
:<segment_step>(opt).
The expression e1 is either an lvalue whose type is not array or a blocked
lvalue of type array or an lvalue of type segment of array. If e3 is an
integral expression then a unit of step is equal to the size of the object
designated by e1. If e3 is :e4 where e4 is an integral expression then a
unit of step is a member of the object designated by e1. If e3 is ::e4
then a unit of step is a member of a member of the object designated by e1
(naturally, in this case e1 should be an lvalue of the corresponding
type).
If e3 is :e4 then the expression e1[e2:e3] is an lvalue of type
segment of array and designates the corresponding segment. If e1 is an
lvalue of type segment of array then the expression e1[e2:e3] is also an
lvalue of type segment of array and designates the corresponding segment.
In all rest cases the expression e1[e2:e3] is a blocked lvalue of type
array.
An lvalue of type segment of array is never converted to a pointer
that points to the initial member of the segment and always designates the
segment as the whole.
There are some constraints in a usage of lvalues that have type
segment of array. In particular, the & unary operator is not applicable
to such lvalues.
If e1 is an lvalue of type segment of array and e2 is an integral
expression then the expression e1[e2] designates the e2-th member of the
segment e1 (counting from zero). In difference from the array
subscripting operator the segment subscripting operator is not expressed
by the * and + address operators that is in this case the expression
e1[e2] is not identical to the expression (*(e1+(e2))).
8. Vector construction operators.
If e1,...,eN are expressions of type t then the expression
{e1,...,eN} is of type vector of N members of type t.
If vector or scalar expressions e1, e3 are of type t and e2 is
integral positive expression then the expression e1[e2#e3] is of type
vector of e2 members of type t. The first member of this vector is e1, the
second one is ((e1)+(e3)), the third one is (((e1)+(e3))+(e3)) etc., the
sign + denoting the vector (memberwise) addition in vector case. The
operand e3 may be omitted in e1[e2#e3] expression. In that case the
vector consists of e2 identical members each of which is equal to e1. If
the length of the vector e3 is less than the length of the vector e1 then
missing right members are considered zeroes.
Example. The constant expression 1[5#2] denotes the vector
{1,3,5,7,9}, the constant expression 1[5#1][2#1[5#]] denotes the vector {
{1,2,3,4,5}, {2,3,4,5,6} }, and the constant expression 10[4#-2][3#{1,2}]
denotes the vector { {10,8,6,4}, {11,10,6,4}, {12,12,6,4} }.
If scalar operands of the { } operator are of different arithmetic
types the usual arithmetic conversions are performed. If they are of
pointer type all of them shall have the same type.
9. Lvector.
In addition to notions of lvalue, blocked lvalue and segment of array
which are used for designation of objects and are allowed as left operand
of assignment operators the notion of lvector is introduced.
Just as lvalue is an expression designating some object, lvector is a
vector expression designating the set of objects.
Example. If the objects x, y and pa are described as
int x, y, *pa[10];
then the vector expressions {y, x, *pa[3]} and *pa[] are lvectors but the
vector expression {x+y, y, x} is not.
Blocked lvalues of type array and segments of array can be considered
as special cases of lvectors which designate the sets of objects allocated
in regular way.
Lvector is modifiable if every member of the set of objects
designated by it is modifiable.
10. Vector conversions.
A conversion of the vector of one type to the vector of another type
is composition of two conversions, the first of which converts the vector
length the types of members remaining unchanged and the second one
converts the type of vector members the vector length being unchanged.
We define a vector length as the number of its members. For example,
the vectors {1,2} and {{3,4,5},{6,7,8}} both have the length 2.
There is not defined conversion of vector to non-vector (in
particular, to scalar).
A conversion of the non-vector (in particular, of the scalar) to the
vector of specified length and type of the members is composition of two
conversions, the first of which converts the non-vector to the vector of
specified length all members of which have the same type and value as the
initial non-vector has and the second one converts the type of vector
members to specified type.
Shortening of a vector is performed by dropping its ending members.
Conversion of a vector to the longer one is performed by adding of members
having indefinite values to its tail.
11. Integer vectors packing.
The C[] language provides bit-fields packing facilities for integer
vectors. Namely, an array declarator is allowed in position of optional
<declarator> in the bit-field declarator of
<declarator>(opt):<constant_expression> kind. The bit-field declarator
A[N:L]:M (A is an identifier, N,L,M are constant expressions) specifies N
sequentially (with the step L) allocated bit-fields of the same width M.
The width of bit-fields and the step size are measured in bits. The
assignment of an integer vector of length N to the member A of structure
causes the packing of this vector members in corresponding bit-fields.
Unpacking of integer vector packed in bit-fields is performed during the
access to corresponding member of the structure by means of the .
operator.
12. Scalar operators.
Three groups of scalar operators are added to the C[] language.
The first group consists of two binary ?> and ?< operators of maximum
and minimum calculation together with corresponding compound assignments.
The second group contains three unary bitwise operators, namely: ?
(counting unit bits), % (counting leading zeroes), @ (word reversing).
The third group contains floating operators alowing to specify the
way of rounding in runtime, namely, the modifiers of floating cast
operators are introduced: > (rounding up), < (rounding down),! (rounding
to the nearest) and = (chopping). Syntactically, a modifier is added as
suffix to a name of floating type in corresponding cast operator.
Example. In the fragment
float x;
double y,z;
...........
x=(float!)(y+z);
the value of the sum (y+z) having type double is rounded to nearest value
of type float, and resulting value is assigned to varyable x.
13. Vector operators.
The operand of unary *, +, -, ~, ?, %, @, ! operators and scalar cast
operators may have vector type. In this case the result of such operator
is vector members of which are the results of application of corresponding
operator to members of operand.
If a vector type name is used in a cast operator the vector length
specified by this cast operator may be specified by arbitrary (not only
constant) integral expression.
One or both operands of binary *, /, %, ?<, ?>, +, -, <<, >>, <, >,
<=, >=, ==, !=, &, ^, |, &&, || operators may have vector type.
If both operands are vectors of the same length then the result is
vector members of which are the results of application of corresponding
operator to members of operands.
If one of the operands is scalar it is converted to the type of
vector operand.
If vector operands of binary operator have different length then
behavior is undefined.
The first operand of conditional operator may have vector type. In
that case the second or the third operand but not both of them may be
omitted. If none of the operands is omitted then unlike the C language
all three operands are evaluated.
If all three operands are vectors of the same length then the result
is acheived by memberwise application of the operator. If vector operands
of conditional operator have different length then behavior is undefined.
If the second or the third operand is non-vector they are converted
to the type of the first (vector) operand.
If the first and the second (or the third) operands are vectors, the
third (the second) operand is omitted and the members of the first operand
have scalar type then the result will be the vector of the same type as
the second (the third) operand; the i-th member of the result is equal to
the k(i)-th member of the second (the third) operand where k(i) is index
of the i-th non-zero (zero) member of the first operand. The other
members have indefinite values.
If the first and the second (or the third) operands are vectors, the
third (the second) operand is omitted and the members of the first operand
have vector type then the result is acheived by memberwise application of
the operator.
An assignment operator may have as its left operand a modifiable
blocked lvalue, a segment of modifiable array or any other modifiable
lvector . In that case its right operand may have vector type. In any
case the type of its right operand converts to the type of the left
operand's value.
Example. The execution of following fragment
int a[]={0,1,2,3,4};
int *pa[]={a+1,a+2,a+3,a+4,a};
*(pa[])=a[];
results the vector {1,2,3,4,0} as value of array a.
The unary linear [*], [/], [%], [?<], [?>], [+], [-], [<<], [>>],
[&], [^], [|] operators correspond to binary *, /, %, ?<, ?>, +, -, <<,
>>, &, ^, | operators. These operators are applicable only to vector
operands. Let v[0], v[1],...,v[N] denote the members of vector operand v;
then the expression [op] v[] has the same semantics as the expression of
(...((v[0] op v[1]) op v[2]) op ... op v[N]) kind.
Example. The execution of following fragment
int a[]={0,1,2,3,4}, sum;
sum=[+]a[];
results the value of sum equal to value of expression
a[0]+a[1]+a[2]+a[3]+a[4] that is 10.
In the C[] language unlike the C language a formal parameter of a
function may have an array type. The declaration of such formal parameter
must specify all array attributes. Corresponding argument is the
expression of the same vector type as defined by formal parameter. The
function value also may have vector type.
14. Arrays and structures of register storage-class.
An array whose members are of scalar type and the step is equal to 1
may be declared with storage-class specifier register. It causes
corresponding array head allocation on vector register if there is free
one. If there is no free vector register or if array of declared type
should not be allocated on register the storage-class specifier is
ignored.
There are some constraints in usage of register variables of array
type. In particular, the address & operator and the [:] operator should
not be applied to it. The array subscripting operator is not expressed by
* and + address operators for register array.
A structure with members of scalar types may be declared with
storage-class specifier register. It causes corresponding structure
allocation on available scalar registers which allow collective exchange
with main storage.
15. Cache storage-class.
In C[]-machine we define a cache storage as additional address space
which is independent from main storage. A definition of an object with
storage-class specifier cache causes an appropriate amount of cache
storage to be reserved.
Aggregate object shall be allocated in main or cache storage in
indivisible way, that is all its members belong to one address space.
Binary address operators are invalid if one of their operands points to
cache storage and the another points to main storage.
Storage-class specifier cache may be used in pointer declarators. The
declarator
* <type_specifier_list> <opt> <identifier>,
where the type_specifier_list contains specifier cache defines the
identifier as a pointer allocated in cache-storage. The following pair of
declarations demonstrates the difference between a pointer to
cache-storage allocated in main-storage and a pointer to main-storage
allocated in cache-storage:
cache char *ptr_to_cache;
char *cache cache_ptr;
NULL pointer has no attrbute specifying address space.
16. Concurrent calculation facilities.
The C[]- machine includes several processors each connected with its
own cache-storage and shared main-storage. Access to the region of shared
storage by one processor does not block the access to the same region of
storage by other processors (except the semaphore operator lock).
Initially program is executed by one of the processors but during its
execution other processors may be activated. Therefore in common case
program statements are concurrently executed by several processors.
In any time each processor may be in one of following three states:
active, passive or waiting. Initially only one processor is active.
In C[] language is introduced statement label +:. This label relates
to one statement.
When a program-fragment is being executed by one of the processors of
the C[]-machine (we denote it as A-processor) and control achieves
+:-label, one of passive processors (B-processor) is activated by
A-processor and B-processor begins to execute the statement labelled by
+:-label, while the following statement is concurrently executed by A-
processor. When thread of control leaves +:-labelled statement,
B-processor is returned to passive state. Note that cancelling of
automatic objects of compound statement, created by B-processor while
entering this statement is done while leaving it.
Synchronization of the access to shared data is achieved by
semaphores.
New integral type sem is introduced in the C[] language . The set of
values of this type is subset of the set of values of int. The set of
values of sem is divided on two unintersecting subsets one of which
corresponds to the state semaphore is open, and another to state semaphore
is locked. A semaphore variable is implicitly initialized by value from
the set semaphore is open.
The C[] library contains three functions to deal with semaphore
objects:
- the function
void lock(sem* s);
checks to which set the value of object *s belongs. If it belongs to set
semaphore open, then the value of *s is turned to a value from set
semaphore locked, after that function normally terminates. If the initial
value of variable *s belongs to set semaphore locked, then function turns
the processor in waiting state until the value of *s be changed in such
way that it belongs to set semaphore open, after that the processor is
activated and continues its work from the execution of the function lock.
This function is not shared that is no processor can access to object *s
while processor which executes it is active;
- the function
int unlock(sem* s);
assigns to variable *s value from set semaphore open and returns 0 if
initial value of *s belonged to set semaphore open, and 1 otherwise; is
function unlock shared or not, depends on realization;
- the function
int test(sem* s);
returns 0, if *s belongs to set semaphore open, and 1, if *s belongs to
set semaphore locked; is function test shared or not, depends on
realization.
References
1. First working draft on programming language C (draft proposed by
ANSI). Ottawa, ISO/TC97/SC22/WG14, 1986.
Robert L. Knighten | knighten@ssd.intel.com
Intel Supercomputer Systems Division |
15201 N.W. Greenbrier Pkwy. | (503) 629-4315
Beaverton, Oregon 97006 | (503) 629-9147 [FAX]
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.