of two components, which is dictated by the count parameter in mpi_scatter()
in line 19. The two real numbers are stored in the first two components of the
local arrays, a_loc. The components a_loc(2:7) are not defined, and the print
commands in line 20 verify this.
MPI/Fortran 9x Code scatmpi.f
1. program scatmpi
2.! Illustrates mpi_scatter.
3. implicit none
4. include ’mpif.h’
5. real, dimension(0:7):: a_list,a_loc
6. integer:: my_rank,p,n,source,dest,tag,ierr,loc_n
7. integer:: i,status(mpi_status_size)
8. data n,dest,tag/1024,0,50/
9. call mpi_init(ierr)
10. call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
11. call mpi_comm_size(mpi_comm_world,p,ierr)
12. if (my_rank.eq.0) then
13. do i = 0,7
14. a_list(i) = i
15. end do
16. end if
17.! The array, a_list, is sent and received in groups of
18.! two to the other processors and stored in a_loc.
19. call mpi_scatter(a_list,2,mpi_real,a_loc,2,mpi_real,0,&
mpi_comm_world,ierr)
20. print*, ’my_rank =’,my_rank,’a_loc = ’, a_loc
21. call mpi_finalize(ierr)
22. end program scatmpi
my_rank = 0 a_loc = 0.0000000000E+00 1.000000000
!
0.2347455187E-40 0.1010193260E-38 -0.8896380928E+10
-0.2938472521E+30 0.3083417141E-40 0.1102030158E-38
!
my_rank = 1 a_loc = 2.000000000 3.000000000
!
0.2347455187E-40 0.1010193260E-38 -0.8896380928E+10
-0.2947757071E+30 0.3083417141E-40 0.1102030158E-38
!
my_rank = 2 a_loc = 4.000000000 5.000000000
!
0.2347455187E-40 0.1010193260E-38 -0.8896380928E+10
-0.2949304496E+30 0.3083417141E-40 0.1102030158E-38
!
my_rank = 3 a_loc = 6.000000000 7.000000000
!
0.2347455187E-40 0.1010193260E-38 -0.8896380928E+10
-0.3097083589E+30 0.3083417141E-40 0.1102030158E-38
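The undefined components can be avoided by dimensioning the local array to match the scattered chunk. The following is a minimal sketch of such a variant (the program name scatfit and the assumption of p = 4 processors, so that the eight numbers are scattered two per processor, are for illustration only; it is not one of the book's codes):

program scatfit
! Sketch only: scatter eight numbers, two per processor, into a local
! array whose size equals the chunk so that every component is defined.
! Assumes p = 4 processors.
implicit none
include 'mpif.h'
real, dimension(0:7):: a_list
real, dimension(0:1):: a_loc
integer:: my_rank,p,ierr,i
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
call mpi_comm_size(mpi_comm_world,p,ierr)
if (my_rank.eq.0) then
   do i = 0,7
      a_list(i) = i
   end do
end if
call mpi_scatter(a_list,2,mpi_real,a_loc,2,mpi_real,0,&
                 mpi_comm_world,ierr)
print*,'my_rank =',my_rank,'a_loc = ',a_loc
call mpi_finalize(ierr)
end program scatfit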
7.3.4 Illustrations of mpi_gather()
The second code gathmpi.f collects some of the data, loc_n, loc_a and loc_b,
which are computed in lines 15-17 on each processor. In particular, all the values
of loc_a are sent and stored in the array a_list on processor 0. This is done by
mpi_gather() on line 23, where count is equal to one and the root processor is
zero. This is verified by the print commands in lines 18-20 and 25-29.
MPI/Fortran 9x Code gathmpi.f
1. program gathmpi
2.! Illustrates mpi_gather.
3. implicit none
4. include ’mpif.h’
5. real:: a,b,h,loc_a,loc_b,total
6. real, dimension(0:31):: a_list
7. integer:: my_rank,p,n,source,dest,tag,ierr,loc_n
8. integer:: i,status(mpi_status_size)
9. data a,b,n,dest,tag/0.0,100.0,1024,0,50/
10. call mpi_init(ierr)
11. call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
12. call mpi_comm_size(mpi_comm_world,p,ierr)
13. h = (b-a)/n
14.! Each processor has a unique loc_n, loc_a and loc_b
15. loc_n = n/p
16. loc_a = a+my_rank*loc_n*h
17. loc_b = loc_a + loc_n*h
18. print*,’my_rank =’,my_rank, ’loc_a = ’,loc_a
19. print*,’my_rank =’,my_rank, ’loc_b = ’,loc_b
20. print*,’my_rank =’,my_rank, ’loc_n = ’,loc_n
21.! The loc_a are sent and received into an array, a_list, on
22.! processor 0.
23. call mpi_gather(loc_a,1,mpi_real,a_list,1,mpi_real,0,&
mpi_comm_world,ierr)
24. call mpi_barrier(mpi_comm_world,ierr)
25. if (my_rank.eq.0) then
26. do i = 0,p-1
27. print*, ’a_list(’,i,’) = ’,a_list(i)
28. end do
29. end if
30. call mpi_finalize(ierr)
31. end program gathmpi
my_rank = 0 loc_a = 0.0000000000E+00
my_rank = 0 loc_b = 25.00000000
my_rank = 0 loc_n = 256
my_rank = 1 loc_a = 25.00000000
my_rank = 1 loc_b = 50.00000000
my_rank = 1 loc_n = 256
my_rank = 2 loc_a = 50.00000000
my_rank = 2 loc_b = 75.00000000
my_rank = 2 loc_n = 256
my_rank = 3 loc_a = 75.00000000
my_rank = 3 loc_b = 100.0000000
my_rank = 3 loc_n = 256
!
a_list( 0 ) = 0.0000000000E+00
a_list( 1 ) = 25.00000000
a_list( 2 ) = 50.00000000
a_list( 3 ) = 75.00000000
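As a check on this output, note that with a = 0, b = 100, n = 1024 and p = 4 we have h = (b - a)/n = 100/1024 and loc_n = n/p = 256, so that loc_a = a + my_rank*loc_n*h = 25*my_rank and loc_b = loc_a + loc_n*h = loc_a + 25. These values agree with the printed loc_a and loc_b and with the gathered a_list(0:3).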
The third version of a parallel dot product, in dot3mpi.f, uses mpi_gather()
to collect the local dot products that have been computed concurrently in
lines 25-27. The local dot products, loc_dot, are sent and stored in the
array loc_dots(0:31) on processor 0. This is done by the call to mpi_gather()
on line 31, where the count parameter is equal to one and the root processor is
zero. Lines 33-36 sum the local dot products, and the print commands in lines
21-23 and 33-36 confirm this.
MPI/Fortran 9x Code dot3mpi.f
1. program dot3mpi
2.! Illustrates dot product via mpi_gather.
3. implicit none
4. include ’mpif.h’
5. real:: loc_dot,dot
6. real, dimension(0:31):: a,b, loc_dots
7. integer:: my_rank,p,n,source,dest,tag,ierr,loc_n
8. integer:: i,status(mpi_status_size),en,bn
9. data n,dest,tag/8,0,50/
10. do i = 1,n
11. a(i) = i
12. b(i) = i+1
13. end do
14. call mpi_init(ierr)
15. call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
16. call mpi_comm_size(mpi_comm_world,p,ierr)
17.! Each processor computes a local dot product
18. loc_n = n/p
19. bn = 1+(my_rank)*loc_n
20. en = bn + loc_n-1
21. print*,’my_rank =’,my_rank, ’loc_n = ’,loc_n
22. print*,’my_rank =’,my_rank, ’bn = ’,bn
23. print*,’my_rank =’,my_rank, ’en = ’,en
24. loc_dot = 0.0
25. do i = bn,en
26. loc_dot = loc_dot + a(i)*b(i)
27. end do
28. print*,’my_rank =’,my_rank, ’loc_dot = ’,loc_dot
29.! mpi_gather sends and receives all local dot products
30.! to the array loc_dots in processor 0.
31. call mpi_gather(loc_dot,1,mpi_real,loc_dots,1,mpi_real,0,&
mpi_comm_world,ierr)
32.! Processor 0 sums the local dot products.
33. if (my_rank.eq.0) then
34. dot = loc_dot + sum(loc_dots(1:p-1))
35. print*, ’dot product = ’,dot
36. end if
37. call mpi_finalize(ierr)
38. end program dot3mpi
my_rank = 0 loc_n = 2
my_rank = 0 bn = 1
my_rank = 0 en = 2
my_rank = 1 loc_n = 2
my_rank = 1 bn = 3
my_rank = 1 en = 4
my_rank = 2 loc_n = 2
my_rank = 2 bn = 5
my_rank = 2 en = 6
my_rank = 3 loc_n = 2
my_rank = 3 bn = 7
my_rank = 3 en = 8
!
my_rank = 0 loc_dot = 8.000000000
my_rank = 1 loc_dot = 32.00000000
my_rank = 2 loc_dot = 72.00000000
my_rank = 3 loc_dot = 128.0000000
dot product = 240.0000000
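As a check on the gathered result, the vectors are a(i) = i and b(i) = i + 1 for i = 1,...,8, so the dot product is 1*2 + 2*3 + ... + 8*9 = 2 + 6 + 12 + 20 + 30 + 42 + 56 + 72 = 240, which agrees with the printed value.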
Another application of mpi_gather() is in the matrix-matrix product code
mmmpi.f, which was presented in Section 6.5. Here the product BC was formed
by computing in parallel BC(bn:en), and these partial products were communicated
via mpi_gather() to the root processor.
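A minimal sketch of one way to arrange such a gather is given below; it is not the book's mmmpi.f. To keep the gathered blocks contiguous in Fortran's column-major storage, the sketch partitions the product by columns, with bn:en now indexing columns, so that a single mpi_gather() with count = n*loc_n assembles the full product on processor 0. The program name mmgather and the simple test matrices are assumptions for illustration.

program mmgather
! Sketch only: each processor forms a column block of c = a*b and
! mpi_gather() assembles the blocks on processor 0.  Assumes p divides n.
implicit none
include 'mpif.h'
integer, parameter:: n = 8
real, dimension(n,n):: a,b,c
real, dimension(:,:), allocatable:: loc_c
integer:: my_rank,p,ierr,loc_n,bn,en
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
call mpi_comm_size(mpi_comm_world,p,ierr)
a = 1.0                              ! simple test data
b = 2.0
loc_n = n/p                          ! columns per processor
bn = 1 + my_rank*loc_n
en = bn + loc_n - 1
allocate(loc_c(n,loc_n))
loc_c = matmul(a,b(:,bn:en))         ! local column block of the product
call mpi_gather(loc_c,n*loc_n,mpi_real,c,n*loc_n,mpi_real,0,&
                mpi_comm_world,ierr)
if (my_rank.eq.0) print*,'c(1,1) = ',c(1,1)   ! equals 2*n = 16
deallocate(loc_c)
call mpi_finalize(ierr)
end program mmgather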
7.3.5 Exercises
1. Duplicate the calculations for scatmpi.f and experiment with different
numbers of processors.
2. Duplicate the calculations for gathmpi.f and experiment with different
numbers of processors.
3. Duplicate the calculations for dot3mpi.f and experiment with different
numbers of processors and different size vectors.
4. Use mpi_gather() to compute in parallel a linear combination of the two
vectors, αx + βy.
5. Use mpi_gather() to modify trapmpi.f to execute Simpson’s rule in
parallel.
7.4 Grouped Data Types
7.4.1 Introduction
There is some startup time associated with each MPI subroutine. So if a large
number of calls to mpi_send() and mpi_recv() are made, then the communication
portion of the code may be significant. By collecting data into groups, a
single communication subroutine may be used for large amounts of data. Here
we present three methods for grouping data: count, derived types
and packed.
7.4.2 Count Type
The count parameter has already been used in some of the previous codes. The
parameter count refers to the number of mpi_datatypes to be communicated.
The most common data types are mpi_real or mpi_int, and these are usually
stored in arrays whose components are addressed sequentially. In Fortran the
components of two-dimensional arrays are stored by columns starting with the
leftmost column. For example, if the array is b(1:2,1:3), then the list for b is
b(1,1), b(2,1), b(1,2), b(2,2), b(1,3) and b(2,3). Starting at b(1,1) with count
= 4 gives the first four components, and starting at b(1,2) with count = 4 gives
the last four components.
The code countmpi.f illustrates the count parameter method when it is used
in the subroutine mpi_bcast(). Lines 14-24 initialize, on processor 0, the two arrays
a(1:4) and b(1:2,1:3). All of the array a is broadcast, in line 29, to the other
processors, and just the first four components of the two-dimensional array b
are broadcast, in line 30, to the other processors. This is confirmed by the print
commands in lines 26, 32 and 33.
MPI/Fortran 9x Code countmpi.f
1. program countmpi
2.! Illustrates count for arrays.
3. implicit none
4. include ’mpif.h’
5. real, dimension(1:4):: a
6. integer, dimension(1:2,1:3):: b
7. integer:: my_rank,p,n,source,dest,tag,ierr,loc_n
8. integer:: i,j,status(mpi_status_size)
9. data n,dest,tag/4,0,50/
10. call mpi_init(ierr)
11. call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
12. call mpi_comm_size(mpi_comm_world,p,ierr)
13.! Define the arrays.
14. if (my_rank.eq.0) then
15. a(1) = 1.
16. a(2) = exp(1.)
17. a(3) = 4*atan(1.)
18. a(4) = 186000.
19. do j = 1,3
20. do i = 1,2
21. b(i,j) = i+j
22. end do
23. end do
24. end if
25.! Each processor attempts to print the array.
26. print*,’my_rank =’,my_rank, ’a = ’,a
27. call mpi_barrier(mpi_comm_world,ierr)
28.! The arrays are broadcast via count equal to four.
29. call mpi_bcast(a,4,mpi_real,0,&
mpi_comm_world,ierr)
30. call mpi_bcast(b,4,mpi_int,0,&
mpi_comm_world,ierr)
31.! Each processor prints the arrays.
32. print*,’my_rank =’,my_rank, ’a = ’,a
33. print*,’my_rank =’,my_rank, ’b = ’,b
34. call mpi_finalize(ierr)
35. end program countmpi
my_rank = 0 a = 1.000000000 2.718281746
3.141592741 186000.0000
my_rank = 1 a = -0.1527172301E+11 -0.1775718601E+30
0.8887595380E-40 0.7346867719E-39
my_rank = 2 a = -0.1527172301E+11 -0.1775718601E+30
0.8887595380E-40 0.7346867719E-39
my_rank = 3 a = -0.1527172301E+11 -0.1775718601E+30
0.8887595380E-40 0.7346867719E-39
!
my_rank = 0 a = 1.000000000 2.718281746
3.141592741 186000.0000
my_rank = 0 b = 2 3 3 4 4 5
my_rank = 1 a = 1.000000000 2.718281746
3.141592741 186000.0000
my_rank = 1 b = 2 3 3 4 -803901184 -266622208
my_rank = 2 a = 1.000000000 2.718281746
3.141592741 186000.0000
my_rank = 2 b = 2 3 3 4 -804478720 -266622208
my_rank = 3 a = 1.000000000 2.718281746
3.141592741 186000.0000
my_rank = 3 b = 2 3 3 4 -803901184 -266622208
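The remark above about starting the count at b(1,2) can be checked directly. The following one-line fragment is a sketch (it assumes the same arrays and setup as countmpi.f and uses the standard Fortran datatype name mpi_integer); with count = 4 it broadcasts exactly b(1,2), b(2,2), b(1,3) and b(2,3), the last four components in column order.

! Sketch: broadcast only the last four components of b by starting at b(1,2).
call mpi_bcast(b(1,2),4,mpi_integer,0,mpi_comm_world,ierr)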
7.4.3 Derived Type
If the data to be communicated is either of mixed type or is not adjacent in
memory, then one can create a user-defined mpi_type. For example, the data
to be grouped may have some mpi_real, mpi_int and mpi_char entries and
be in nonadjacent locations in memory. The derived type must have four items
for each entry: blocks or count of each mpi_type, type list, address in memory
and displacement. The address in memory can be obtained by the MPI subroutine
call mpi_address(a,addresses(1),ierr), where a is one of the entries in the new
data type.
The following code dertypempi.f creates a new data type, which is called
data_mpi_type. It consists of four entries with one mpi_real, a, one mpi_real,
b, one mpi_integer, c, and one mpi_integer, d. These entries are initialized on processor
0 by lines 19-24. In order to communicate them as a single new data type via
mpi_bcast(), the new data type is created in lines 26-43. The four arrays
blocks, typelist, addresses and displacements are initialized. The call in line 42
to mpi_type_struct(4, blocks, displacements, typelist, data_mpi_type, ierr)
enters this structure and identifies it with the name data_mpi_type. Finally
the call in line 43 to mpi_type_commit(data_mpi_type,ierr) finalizes this user-
defined data type. The call to mpi_bcast() in line 52 addresses the first entry
of the data_mpi_type and uses count = 1 so that the data a, b, c and d will
be broadcast to the other processors. This is verified by the print commands
in lines 46-49 and 54-57.
MPI/Fortran 9x Code dertypempi.f
1. program dertypempi
2.! Illustrates a derived type.
3. implicit none
4. include ’mpif.h’
5. real:: a,b
6. integer::c,d
7. integer::data_mpi_type
8. integer::ierr
9. integer, dimension(1:4)::blocks
10. integer, dimension(1:4)::displacements
11. integer, dimension(1:4)::addresses
12. integer, dimension(1:4)::typelist
13. integer:: my_rank,p,n,source,dest,tag,loc_n
14. integer:: i,status(mpi_status_size)
15. data n,dest,tag/4,0,50/
16. call mpi_init(ierr)
17. call mpi_comm_rank(mpi_comm_world,my_rank,ierr)
18. call mpi_comm_size(mpi_comm_world,p,ierr)
19. if (my_rank.eq.0) then
20. a = exp(1.)
21. b = 4*atan(1.)
22. c = 1
23. d = 186000
24. end if
25.! Dene the new derived type, data_mpi_type.
26. typelist(1) = mpi_real
27. typelist(2) = mpi_real
28. typelist(3) = mpi_integer
29. typelist(4) = mpi_integer
30. blocks(1) = 1
31. blocks(2) = 1
32. blocks(3) = 1
33. blocks(4) = 1
34. call mpi_address(a,addresses(1),ierr)
35. call mpi_address(b,addresses(2),ierr)
36. call mpi_address(c,addresses(3),ierr)
37. call mpi_address(d,addresses(4),ierr)
38. displacements(1) = addresses(1) - addresses(1)
39. displacements(2) = addresses(2) - addresses(1)
40. displacements(3) = addresses(3) - addresses(1)
41. displacements(4) = addresses(4) - addresses(1)
42. call mpi_type_struct(4,blocks,displacements,&
typelist,data_mpi_type,ierr)
43. call mpi_type_commit(data_mpi_type,ierr)
44.! Before the broadcast of the new type data_mpi_type
45.! try to print the data.
46. print*,’my_rank =’,my_rank, ’a = ’,a
47. print*,’my_rank =’,my_rank, ’b = ’,b
48. print*,’my_rank =’,my_rank, ’c = ’,c
49. print*,’my_rank =’,my_rank, ’d = ’,d
50. call mpi_barrier(mpi_comm_world,ierr)
51.! Broadcast data_mpi_type.
52. call mpi_bcast(a,1,data_mpi_type,0,&
mpi_comm_world,ierr)
53.! Each processor prints the data.
54. print*,’my_rank =’,my_rank, ’a = ’,a
55. print*,’my_rank =’,my_rank, ’b = ’,b
56. print*,’my_rank =’,my_rank, ’c = ’,c
57. print*,’my_rank =’,my_rank, ’d = ’,d
58. call mpi_finalize(ierr)
59. end program dertypempi
my_rank = 0 a = 2.718281746
my_rank = 0 b = 3.141592741
my_rank = 0 c = 1
my_rank = 0 d = 186000
my_rank = 1 a = 0.2524354897E-28
my_rank = 1 b = 0.1084320046E-18
my_rank = 1 c = 20108
my_rank = 1 d = 3
my_rank = 2 a = 0.2524354897E-28
my_rank = 2 b = 0.1084320046E-18
my_rank = 2 c = 20108
my_rank = 2 d = 3
my_rank = 3 a = 0.2524354897E-28
my_rank = 3 b = 0.1084320046E-18
my_rank = 3 c = 20108
my_rank = 3 d = 3
!
my_rank = 0 a = 2.718281746
my_rank = 0 b = 3.141592741
my_rank = 0 c = 1
my_rank = 0 d = 186000
my_rank = 1 a = 2.718281746
my_rank = 1 b = 3.141592741
my_rank = 1 c = 1
my_rank = 1 d = 186000
my_rank = 2 a = 2.718281746
my_rank = 2 b = 3.141592741
my_rank = 2 c = 1
my_rank = 2 d = 186000
my_rank = 3 a = 2.718281746
my_rank = 3 b = 3.141592741
my_rank = 3 c = 1
my_rank = 3 d = 186000
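A remark on names: mpi_address() and mpi_type_struct() are the original MPI-1 routines; later versions of the MPI standard replace them with mpi_get_address() and mpi_type_create_struct(), whose address and displacement arguments have kind mpi_address_kind. A sketch of the corresponding calls for the same derived type (assuming addresses and displacements are redeclared with that kind, and the same blocks, typelist and data_mpi_type as in dertypempi.f) is:

! Sketch: MPI-2 style construction of data_mpi_type.
integer(kind=mpi_address_kind), dimension(1:4):: addresses,displacements
call mpi_get_address(a,addresses(1),ierr)
call mpi_get_address(b,addresses(2),ierr)
call mpi_get_address(c,addresses(3),ierr)
call mpi_get_address(d,addresses(4),ierr)
displacements = addresses - addresses(1)
call mpi_type_create_struct(4,blocks,displacements,&
     typelist,data_mpi_type,ierr)
call mpi_type_commit(data_mpi_type,ierr)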
7.4.4 Packed Type
The subroutine mpi_pack() relocates data to a new array, which is addressed
sequentially. Communication subroutines such as mpi_bcast() can be used with
the count parameter to send the data to other processors. The data is then
unpacked from this array by mpi_unpack().
mpi_pack(locdata, count, mpi_datatype,
packarray, packcount, position, mpi_comm, ierr)
locdata array(*)
count integer
mpi_datatype integer
packarray array(*)
packcount integer
position integer
mpi_comm integer
ierr integer
mpi_unpack(packarray, packcount, position,
locdata, count, mpi_datatype, mpi_comm, ierr)
packarray array(*)
packcount integer
position integer
locdata array(*)
count integer
mpi_datatype integer
mpi_comm integer
ierr integer
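As a sketch of how these two subroutines are typically paired (assumed variable names, not the book's packmpi.f): processor 0 packs a real a and an integer c into a character buffer, the buffer is broadcast, and the other processors unpack it in the same order.

! Sketch: pack on processor 0, broadcast, unpack elsewhere.
character, dimension(1:100):: buffer
integer:: position
position = 0
if (my_rank.eq.0) then
   call mpi_pack(a,1,mpi_real,buffer,100,position,&
                 mpi_comm_world,ierr)
   call mpi_pack(c,1,mpi_integer,buffer,100,position,&
                 mpi_comm_world,ierr)
end if
call mpi_bcast(buffer,100,mpi_packed,0,mpi_comm_world,ierr)
if (my_rank.ne.0) then
   position = 0
   call mpi_unpack(buffer,100,position,a,1,mpi_real,&
                   mpi_comm_world,ierr)
   call mpi_unpack(buffer,100,position,c,1,mpi_integer,&
                   mpi_comm_world,ierr)
end if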
In packmpi.f four variables on processor 0 are initialized in lines 17-18 and
packed into the array numbers in lines 21-25. Then in lines 26 and 28 the
array numbers is broadcast to the other processors. In lines 30-34 this data is
unpacked to the original local variables, which are duplicated on each of the
other processors. The print commands in lines 37-40 verify this.
MPI/Fortran 9x Code packmpi.f
1. program packmpi
2.! Illustrates mpi_pack and mpi_unpack.
3. implicit none
4. include ’mpif.h’
5. real:: a,b
6. integer::c,d,location
7. integer::ierr
8. character, dimension(1:100)::numbers
9. integer:: my_rank,p,n,source,dest,tag,loc_n
10. integer:: i,status(mpi_status_size)
11. data n,dest,tag/4,0,50/