3.4.7 Private Variable Initialization and Finalization
Normally, each thread's copy of a variable scoped as private on a parallel do has an undefined initial
value, and after the parallel do the master thread's copy also takes on an undefined value. This behavior
has the advantage that it minimizes data copying for the common case in which we use the private
variable as a temporary within the parallel loop. However, when parallelizing a loop we sometimes need
access to the value that was in the master's copy of the variable just before the loop, and we sometimes
need to copy the "last" value written to a private variable back to the master's copy at the end of the loop.
(The "last" value is the value assigned in the last iteration of a sequential execution of the loop; this
iteration is therefore called "sequentially last.")
For this reason, OpenMP provides the firstprivate and lastprivate variants of the private clause. At the
start of a parallel do, firstprivate initializes each thread's copy of a private variable to the value of the
master's copy. At the end of a parallel do, lastprivate writes back to the master's copy the value contained
in the private copy belonging to the thread that executed the sequentially last iteration of the loop.
The form and usage of firstprivate and lastprivate are the same as those of the private clause: each takes as an
argument a list of variables. The variables in the list are scoped as private within the parallel do on which
the clause appears, and in addition are initialized or finalized as described above. As was mentioned in
Section 3.4.1, variables may appear in at most one scope clause, with the exception that a variable can
appear in both firstprivate and lastprivate, in which case it is both initialized and finalized.
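The initialization and finalization described above can be sketched in C++, where the counterpart of parallel do is the parallel for directive. This is only an illustrative sketch: the names ScaleResult and scale_all are hypothetical and do not come from the text. Here factor is firstprivate, so each thread's copy starts with the value held before the loop, and term is lastprivate, so after the loop the original variable holds the value written by the sequentially last iteration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: the scaled values plus the value the
// temporary held in the sequentially last iteration.
struct ScaleResult {
    std::vector<double> out;
    double last_term;
};

// Hypothetical helper illustrating firstprivate and lastprivate.
ScaleResult scale_all(const std::vector<double>& in, double factor) {
    // With zero iterations no value would be written back to term.
    assert(!in.empty());

    const int n = static_cast<int>(in.size());
    ScaleResult r{std::vector<double>(in.size()), 0.0};
    const double* src = in.data();  // shared, read-only
    double* out = r.out.data();     // shared, each iteration writes its own element
    double term = 0.0;

    // factor: every thread's private copy is initialized from the value above.
    // term:   private temporary; the copy from iteration n-1 is written back.
    #pragma omp parallel for firstprivate(factor) lastprivate(term)
    for (int i = 0; i < n; ++i) {
        term = factor * src[i];
        out[i] = term;
    }

    r.last_term = term;  // value from the sequentially last iteration
    return r;
}
```

If the code is compiled without OpenMP support, the directive is ignored and the loop runs sequentially, producing the same result. With GCC, OpenMP is enabled by the -fopenmp option.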
In Example 3.9, x(1,1) and x(2,1) are assigned before the parallel loop and only read thereafter, while
x(1,2) and x(2,2) are used within the loop as temporaries to store terms of polynomials. Code after the
loop uses the terms of the last polynomial, as well as the last value of the index variable i. Therefore x
appears in a firstprivate clause, and both x and i appear in a lastprivate clause.
Example 3.9: Parallel loop with firstprivate and lastprivate variables.
common /mycom/ x, c, y, z
real x(n, n), c(n, n), y(n), z(n)
...
! compute x(1, 1) and x(2, 1)
!$omp parallel do firstprivate(x) lastprivate(i, x)
do i = 1, n
   x(1, 2) = c(i, 1) * x(1, 1)
   x(2, 2) = c(i, 2) * x(2, 1) ** 2
   y(i) = x(2, 2) + x(1, 2)
   z(i) = x(2, 2) - x(1, 2)
enddo
...
! use x(1, 2), x(2, 2), and i
There are two important caveats about using these clauses. The first is that a firstprivate variable is
initialized only once per thread, rather than once per iteration. In Example 3.9, if any iteration were to
assign to x(1,1) or x(2,1), then no other iteration is guaranteed to get the initial value if it reads these
elements. For this reason firstprivate is useful mostly in cases like Example 3.9, where part of a privatized
array is read-only. The second caveat is that if a lastprivate variable is a compound object (such as an
array or structure), and only some of its elements or fields are assigned in the last iteration, then after the
parallel loop the elements or fields that were not assigned in the final iteration have an undefined value.
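One way to stay clear of both caveats, sketched here in C++ under assumptions of our own, is to keep the read-only values and the temporaries in separate variables. The names PolyResult, eval_terms, base, and terms are hypothetical; the computation loosely mirrors Example 3.9 with 0-based indexing. The firstprivate array base is never assigned inside the loop, so every iteration sees its initial value, and every element of the lastprivate array terms is assigned in every iteration, so no element is left undefined after the loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type for this sketch.
struct PolyResult {
    std::vector<double> y, z;
    double terms[2];  // terms computed by the sequentially last iteration
};

// Hypothetical C++ analogue of Example 3.9.
PolyResult eval_terms(const std::vector<double>& c1,
                      const std::vector<double>& c2,
                      double b0, double b1) {
    assert(!c1.empty() && c1.size() == c2.size());

    const int n = static_cast<int>(c1.size());
    PolyResult r;
    r.y.resize(c1.size());
    r.z.resize(c1.size());
    const double* pc1 = c1.data();
    const double* pc2 = c2.data();
    double* y = r.y.data();
    double* z = r.z.data();

    double base[2] = {b0, b1};     // only read inside the loop
    double terms[2] = {0.0, 0.0};  // fully rewritten by every iteration

    #pragma omp parallel for firstprivate(base) lastprivate(terms)
    for (int i = 0; i < n; ++i) {
        terms[0] = pc1[i] * base[0];
        terms[1] = pc2[i] * base[1] * base[1];
        y[i] = terms[1] + terms[0];
        z[i] = terms[1] - terms[0];
    }

    // Both elements were assigned in iteration n-1, so both are defined here.
    r.terms[0] = terms[0];
    r.terms[1] = terms[1];
    return r;
}
```

Splitting the array this way also means that base itself is untouched after the loop, whereas a variable listed in both clauses would have its unassigned elements left undefined.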
In C++, if an object is scoped as firstprivate or lastprivate, the initialization and finalization are performed
using appropriate member functions of the object. In particular, a firstprivate object is constructed by
calling its copy constructor with the master thread's copy of the variable as its argument, while if an object
is lastprivate, at the end of the loop the copy assignment operator is invoked on the master thread's copy,
with the sequentially last value of the variable as an argument. (It is an error if a firstprivate object has no
publicly accessible copy constructor, or a lastprivate object has no publicly accessible copy assignment
operator.) Example 3.10 shows how this works. Inside the parallel loop, each private copy of c1 is copy-