2.2.2 Parallel Control Structures
Control structures are constructs that alter the flow of control in a program. We call the basic execution
model for OpenMP a fork/join model, and parallel control structures are those constructs that fork (i.e.,
start) new threads, or give execution control to one or another set of threads.
OpenMP adopts a minimal set of such constructs. Experience has shown that only a few control
structures are truly necessary for writing most parallel applications. OpenMP includes a control structure
only in those instances where a compiler can provide both functionality and performance beyond what a
user could reasonably program.
OpenMP provides two kinds of constructs for controlling parallelism. First, it provides a directive to create
multiple threads of execution that execute concurrently with each other. The only instance of this is the
parallel directive: it encloses a block of code and creates a set of threads that each execute this block of
code concurrently. Second, OpenMP provides constructs to divide work among an existing set of parallel
threads. An instance of this is the do directive, used for exploiting loop-level parallelism. It divides the
iterations of a loop among multiple concurrently executing threads. We present examples of each of these
directives in later sections.
2.2.3 Communication and Data Environment
An OpenMP program always begins with a single thread of control that has associated with it an
execution context or data environment (we will use the two terms interchangeably). This initial thread of
control is referred to as the master thread. The execution context for a thread is the data address space
containing all the variables specified in the program. This includes global variables, automatic variables
within subroutines (i.e., allocated on the stack), and dynamically allocated variables (i.e., allocated on
the heap).
The master thread and its execution context exist for the duration of the entire program. When the master
thread encounters a parallel construct, new threads of execution are created along with an execution
context for each thread. Let us now examine how the execution context for a parallel thread is
determined.
Each thread has its own stack within its execution context. This private stack is used for stack frames for
subroutines invoked by that thread. As a result, multiple threads may individually invoke subroutines and
execute safely without interfering with the stack frames of other threads.
For all other program variables, the OpenMP parallel construct may choose to either share a single copy
between all the threads or provide each thread with its own private copy for the duration of the parallel
construct. This determination is made on a per-variable basis; therefore it is possible for threads to share
a single copy of one variable, yet have a private per-thread copy of another variable, based on the
requirements of the algorithms utilized. Furthermore, this determination of which variables are shared and
which are private is made at each parallel construct, and may vary from one parallel construct to another.
This distinction between shared and private copies of variables during parallel constructs is specified by
the programmer using OpenMP data scoping clauses (…) for individual variables. These clauses are
used to determine the execution context for the parallel threads. A variable may have one of three basic
attributes: shared, private, or reduction. These are discussed at some length in later chapters. At this
early stage it is sufficient to understand that these scope clauses define the sharing attributes of an
object.
A variable that has the shared scope clause on a parallel construct will have a single storage location in
memory for the duration of that parallel construct. All parallel threads that reference the variable will
always access the same memory location. That piece of memory is shared by the parallel threads.
Communication between multiple OpenMP threads is therefore easily expressed through ordinary
read/write operations on such shared variables in the program. Modifications to a variable by one thread
are made available to other threads through the underlying shared memory mechanisms.
In contrast, a variable that has private scope will have multiple storage locations, one within the execution
context of each thread, for the duration of the parallel construct. All read/write operations on that variable
by a thread will refer to the private copy of that variable within that thread. This memory location is