fairly straightforward: we loop for m points in the i direction and n points in the j direction, generate an x
and y coordinate for each point (here restricted to the range (0,1]), and then compute the Mandelbrot
equation for the given coordinates. The result is assigned to the two-dimensional array depth.
Presumably this array will next be passed to a graphics routine for drawing the image.
Our strategy for parallelizing this loop remains the same as for the saxpy loop—we would like to use the
parallel do directive. However, we must first convince ourselves that different iterations of the loop are
actually independent and can execute in parallel, that is, that there are no data dependences in the loop
from one iteration to another. Our brief description of a Mandelbrot generator would indicate that there are
no dependences. In other words, the result of computing the Mandelbrot equation at a particular point
does not depend on the result at any other point. This is evident in the code itself. The function
mandel_val only takes x, y, and maxiter as arguments, so it can only know about the one point it is
working on. If mandel_val included i or j as arguments, then there might be reason to suspect a
dependence because the function could conceivably reference values for some point other than the one it
is currently working on. Of course, in practice we would want to look at the source code for mandel_val to
be absolutely sure there are no dependences. There is always the possibility that the function modifies
global structures not passed through the argument list, but that is not the case in this example. As a
matter of jargon, a function such as this one that can safely be executed in parallel is referred to as being
thread-safe. More interesting from a programming point of view, of course, are those functions that are
not inherently thread-safe but must be made so through the use of synchronization.
Having convinced ourselves that there are no dependences, let us look more closely at what the loop is
doing and how it differs from the saxpy loop of Example 2.3. The two loops differ in some fundamental
ways. Additional complexities in this loop include a nested loop, three more scalar variables being
assigned (j, x, and y), and a function call in the innermost loop. Let us consider each of these added
complexities in terms of our runtime execution model.
Our understanding of the parallel do/end parallel do directive pair is that it will take the iterations of the
enclosed loop, divide them among some number of parallel threads, and let each parallel thread execute
its set of iterations. This does not change in the presence of a nested loop or a called function. Each
thread will simply execute the nested loop and call the function in parallel with the other threads. So as far
as control constructs are concerned, given that the called function is thread-safe, there is no additional
difficulty on account of the added complexity of the loop.
As one may suspect, things are not so simple with the data environment. Any time we have variables
being assigned values inside a parallel loop, we need to concern ourselves with the data environment.
We know from the saxpy example that by default the loop index variable i will be private and everything
else in the loop will be shared. This is appropriate for m and n since they are only read inside the loop
and never written. Looking at the other variables, though, the default rules are not adequate. We have a
nested loop index variable j, and as a rule loop index
variables should be private. The reason of course is that we want each thread to work on its own set of
iterations. If j were shared, we would have the problem that there would be just one "global" value
(meaning the same for all threads) for j. Consequently, each time a thread incremented j in its nested
loop, it would inadvertently modify the index variable of every other thread. So j must be a
private variable within each thread. We do this with the private clause by simply specifying private(j) on
the parallel do directive. Since this is almost always the desired behavior, loop index variables in Fortran
are treated as having private scope by default, unless specified otherwise.
What about the other variables, x and y? The same reasoning as above leads us to conclude that these
variables also need to be private. Consider that i and j are private, and x and y are calculated based on
the values of i and j; therefore it follows that x and y should be private. Alternatively, remember that x and
y store the coordinates for the point in the plane for which we will compute the Mandelbrot equation.
Since we wish each thread to work concurrently on a set of points in the plane, we need to have "parallel"
(meaning here "multiple") storage for describing a point such that each thread can work independently.
Therefore we must specify x and y to be private.
Finally we must consider synchronization in this loop. Recall that the main use of synchronization is to
control access to shared objects. In the saxpy example, the only synchronization requirement was an
implicit barrier at the close of the parallel do. This example is no different. There are two shared objects,
maxiter and depth. The variable maxiter is only read and never written (we would need to check the
mandel_val function to confirm this); consequently there is no reason or need to synchronize references
to this shared variable. On the other hand, the array depth is modified in the loop. However, as with the
saxpy example, the elements of the array are modified independently by each parallel thread, so the only