3.6.3 Static and Dynamic Scheduling
One basic characteristic of a loop schedule is whether it is static or dynamic:
In a static schedule, the choice of which thread performs a particular iteration is purely a function of
the iteration number and number of threads. Each thread performs only the iterations assigned to it
at the beginning of the loop.
In a dynamic schedule, the assignment of iterations to threads can vary at runtime from one
execution to another. Not all iterations are assigned to threads at the start of the loop. Instead, each
thread requests more iterations after it has completed the work already assigned to it.
A dynamic schedule is more flexible: if some threads happen to finish their iterations sooner, more
iterations are assigned to them. However, the OpenMP runtime system must coordinate these
assignments to guarantee that every iteration gets executed exactly once. Because of this coordination,
requests for iterations incur some synchronization cost. Static scheduling has lower overhead because it
does not incur this cost, but it cannot compensate for load imbalances by shifting more iterations to less
heavily loaded threads.
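To make this trade-off concrete, here is a minimal sketch in Python that models the two strategies. It is not OpenMP code. The function names, the triangular workload, and the fixed per-request overhead of 0.5 time units are all assumptions chosen for illustration. The static model gives each thread one contiguous block up front. The dynamic model hands the next chunk to whichever thread becomes free first and charges the assumed overhead on every request.

```python
import heapq

def static_makespan(costs, num_threads):
    """Finish time when each thread gets one contiguous block up front."""
    base, extra = divmod(len(costs), num_threads)
    finish, start = [], 0
    for t in range(num_threads):
        # Assumption: leftover iterations go to the lowest-numbered threads.
        size = base + (1 if t < extra else 0)
        finish.append(sum(costs[start:start + size]))
        start += size
    return max(finish)

def dynamic_makespan(costs, num_threads, chunk=1, overhead=0.0):
    """Finish time when idle threads repeatedly request the next chunk."""
    # Min-heap of (time at which the thread becomes free, thread id).
    free_at = [(0.0, t) for t in range(num_threads)]
    heapq.heapify(free_at)
    end = 0.0
    for start in range(0, len(costs), chunk):
        t_free, tid = heapq.heappop(free_at)
        # Each request pays the (hypothetical) synchronization overhead.
        t_done = t_free + overhead + sum(costs[start:start + chunk])
        end = max(end, t_done)
        heapq.heappush(free_at, (t_done, tid))
    return end

# Imbalanced workload: iteration i costs i time units.
skewed = [float(i) for i in range(1, 101)]
print(static_makespan(skewed, 4), dynamic_makespan(skewed, 4, 1, 0.5))

# Balanced workload: every iteration costs 1 time unit.
uniform = [1.0] * 100
print(static_makespan(uniform, 4), dynamic_makespan(uniform, 4, 1, 0.5))
```

Under these assumptions the dynamic model finishes the skewed workload well before the static one, because the thread holding the last block no longer does most of the work. On the uniform workload the static model wins, since the dynamic model pays the request overhead on every iteration and gains nothing in return.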
In both schemes, iterations are assigned to threads in contiguous ranges called chunks. The chunk size
is the number of iterations a chunk contains. For example, if we executed the saxpy loop on an array with
100 elements using four threads and the default schedule, thread 1 (the second thread, since threads are
numbered from 0) might be assigned a single chunk of
25 iterations in which i varies from 26 to 50. When using a dynamic schedule, each time a thread
requests more iterations from the OpenMP runtime system, it receives a new chunk to work on. For
example, if saxpy were executed using a dynamic schedule with a fixed chunk size of 10, then thread 1
might be assigned three chunks with iterations 11 to 20, 41 to 50, and 81 to 90.
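The chunk boundaries in these two examples can be reproduced with a short Python sketch. This models the arithmetic only and is not OpenMP code. The helper names are hypothetical, and the rule used for leftover iterations in the static case is an assumption, since a real runtime may distribute them differently. Iterations are numbered from 1 to match the examples above.

```python
def simple_static_blocks(n_iters, num_threads):
    """One contiguous 1-based (first, last) range per thread.

    Assumption: leftover iterations go to the lowest-numbered threads;
    an actual OpenMP runtime is free to place them elsewhere.
    """
    base, extra = divmod(n_iters, num_threads)
    blocks, first = [], 1
    for t in range(num_threads):
        size = base + (1 if t < extra else 0)
        blocks.append((first, first + size - 1))
        first += size
    return blocks

def fixed_chunks(n_iters, chunk):
    """Split 1..n_iters into consecutive chunks of at most `chunk` iterations."""
    return [(first, min(first + chunk - 1, n_iters))
            for first in range(1, n_iters + 1, chunk)]

blocks = simple_static_blocks(100, 4)
print(blocks[1])                         # range for thread 1: (26, 50)

chunks = fixed_chunks(100, 10)
print(chunks[1], chunks[4], chunks[8])   # (11, 20) (41, 50) (81, 90)
```

The static ranges are fixed once the iteration count and thread count are known. In the dynamic case only the chunk boundaries are fixed. Which thread ends up with the second, fifth, and ninth chunks depends on the order in which threads request work, and so can differ from one run to the next.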
3.6.4 Scheduling Options
Each OpenMP scheduling option assigns chunks to threads either statically or dynamically. The other
characteristic that differentiates the schedules is the way chunk sizes are determined. There is also an
option for the schedule to be determined by an environment variable.
The syntactic form of a schedule clause is
schedule(type[, chunk])
The type is one of static, dynamic, guided, or runtime. If it is present, chunk must be a scalar integer
value. The kind of schedule specified by the clause depends on a combination of the type and whether
chunk is present, according to these rules, which are also summarized in Table 3.7:
If type is static and chunk is not present, each thread is statically assigned one chunk of iterations.
The chunks will be equal or nearly equal in size, but the precise assignment of iterations to threads
depends on the OpenMP implementation. In particular, if the number of iterations is not evenly
divisible by the number of threads, the runtime system is free to divide the remaining iterations
among threads as it wishes. We will call this kind of schedule "simple static."
If type is static and chunk is present, iterations are divided into chunks of size chunk until fewer than
chunk remain. How the remaining iterations are divided into chunks depends on the implementation.
Chunks are statically assigned to threads in a round-robin fashion: the first thread gets the first
chunk, the second thread gets the second chunk, and so on, until no more chunks remain. We will
call this kind of schedule "interleaved."
If type is dynamic, iterations are divided into chunks of size chunk, similarly to an interleaved
schedule. If chunk is not present, the size of all chunks is 1. At runtime, chunks are assigned to
threads dynamically. We will call this kind of schedule "simple dynamic."
If type is guided, the first chunk is of some implementation-dependent size, and the size of each
successive chunk decreases exponentially (it is a certain percentage of the size of the preceding
chunk) down to a minimum size of chunk. (An example that shows what this looks like appears
below.) The value of the exponent depends on the implementation. If fewer than chunk iterations
remain, how the rest are divided into chunks also depends on the implementation. If chunk is not
specified, the minimum chunk size is 1. Chunks are assigned to threads dynamically. Guided
scheduling is sometimes also called "guided selfscheduling," or "GSS."
If type is runtime, chunk must not appear. The schedule type is chosen at runtime based on the
value of the environment variable OMP_SCHEDULE. The environment variable should be set to a string
that matches the parameters that may appear in parentheses in a schedule clause. For example, if
the program is run on a UNIX system, then performing the C shell command
setenv OMP_SCHEDULE "dynamic,3"