4.2.1 Clauses on the parallel Directive
The parallel directive may contain any of the following clauses:
PRIVATE (list)
SHARED (list)
DEFAULT (PRIVATE | SHARED | NONE)
REDUCTION ({op|intrinsic}:list)
IF (logical expression)
COPYIN (list)
The private, shared, default, reduction, and if clauses were discussed earlier in Chapter 3 and continue to
provide exactly the same behavior for the parallel construct as they did for the parallel do construct. We
briefly review these clauses here.
The private clause is typically used to identify variables that are used as scratch storage in the code
segment within the parallel region. It provides a list of variables and specifies that each thread have a
private copy of those variables for the duration of the parallel region.
The shared clause provides the exact opposite behavior: it specifies that the named variable be shared
among all the threads, so that accesses from any thread reference the same shared instance of that
variable in global memory. This clause is used in several situations. For instance, it is used to identify
variables that are accessed in a read-only fashion by multiple threads, that is, only read and not modified.
It may be used to identify a variable that is updated by multiple threads, but with each thread updating a
distinct location within that variable (e.g., the saxpy example from Chapter 2). It may also be used to
identify variables that are modified by multiple threads and used to communicate values between multiple
threads during the parallel region (e.g., a shared error flag variable that may be used to denote a global
error condition to all the threads).
The default clause is used to change the default data-sharing attributes of variables. Variables are
shared by default; the default(private) clause makes them private by default, and the default(none)
clause removes the default altogether. In the latter case, all variables referenced
within the parallel region must be explicitly named in one of the above data-sharing clauses.
The reduction clause supplies a reduction operator and a list of variables, and is used to identify
variables used in reduction operations within the parallel region.
The if clause dynamically controls whether a parallel region construct executes in parallel or in serial,
based on a runtime test. We will have a bit more to say about this clause in Section 4.9.1.
Before we can discuss the copyin clause, we need to introduce the notion of threadprivate variables. This
is the subject of Section 4.4.
4.2.2 Restrictions on the parallel Directive
The parallel construct consists of a parallel/end parallel directive pair that encloses a block of code. The
code between the parallel and end parallel directives must be a structured block: one or more
statements entered at the top (at the start of the parallel region) and exited at the bottom (at the
end of the parallel region). The block therefore has a single entry point and a single exit point.
Branches that stay entirely within the block are permitted, but branches into the block from outside,
or out of the block from inside, are not.
Example 4.1 is not valid because of the presence of the return statement within the parallel region. The
return statement is a branch out of the parallel region and therefore is not allowed.
Although it is not permitted to branch into or out of a parallel region, Fortran stop statements are allowed
within the parallel region. Similarly, code within a parallel region in C/C++ may call the exit function. If
any thread encounters a stop statement, it will execute the stop statement and signal all the threads to
stop. The other threads are signalled asynchronously, and no guarantees are made about the precise
execution point where the other threads will be interrupted and the program stopped.