Fortran OpenMP program shows no speedup of CPU_TIME()
Using parallelism should reduce the run time of a program, but that did not happen for me. When I parallelized my code with OpenMP, the run time increased, i.e. parallel time > serial time.
My code:
    program main
      use omp_lib
      implicit none

      real*8 times1,times2
      integer i,j
      real, allocatable, dimension(:) :: a

      allocate(a(1000))

      do j = 1, 1000
        a(j)=j
      enddo

    ! ***************no parallel code ************************************
      call cpu_time(times1)
      write(*,*) 'cpu no parallel started:',times1

      do i = 1, 1000
        do j = 1, 500000
          a(i)=a(i)+0.0001
        end do
        a(i)=a(i)+a(i)+a(i)
      enddo

      call cpu_time(times2)
      write(*,*) 'cpu cpu no parallel finished:',times2
      write(*,*) 'no parallel times:',times2-times1
      write(*,*) '---------------------------------------------------'

    ! ***************parallel code ************************************
      call cpu_time(times1)
      write(*,*) 'cpu parallel started:',times1

    !$omp parallel default(shared), private(i,j)
    !$omp do
      do i = 1, 1000
        do j = 1, 500000
          a(i)=a(i)+0.0001
        end do
        a(i)=a(i)+a(i)+a(i)
      enddo
    !$omp end do
    !$omp end parallel

      call cpu_time(times2)
      write(*,*) 'cpu parallel finished:',times2
      write(*,*) 'parallel times:',times2-times1

      deallocate(a)
      stop
    end
And the result:
    cpu no parallel started:       1.560010000000000e-002
    cpu cpu no parallel finished:  4.86723120000000
    no parallel times:             4.85163110000000
    cpu parallel started:          4.86723120000000
    cpu parallel finished:         9.89046340000000
    parallel times:                5.02323220000000
Why does the time measured by cpu_time() increase with OpenMP?
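cpu_time() measures the time spent on the CPU, not the wall time. In parallel applications these are not the same: what cpu_time() reports is processor-dependent, and with many compilers it is the CPU time summed over all threads, so it can stay the same or even grow while the elapsed time shrinks. See here for details.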
Using system_clock() solves the problem:
    program main
      use omp_lib
      implicit none

      real*8 times1,times2
      integer i,j, itimes1,itimes2, rate
      real, allocatable, dimension(:) :: a

      allocate(a(1000))

      ! number of system_clock counts per second
      call system_clock(count_rate=rate)

      do j = 1, 1000
        a(j)=j
      enddo

    ! ***************no parallel code ************************************
      call cpu_time(times1)
      call system_clock(itimes1)
      write(*,*) 'cpu no parallel started:',times1

      do i = 1, 1000
        do j = 1, 500000
          a(i)=a(i)+0.0001
        end do
        a(i)=a(i)+a(i)+a(i)
      enddo

      call cpu_time(times2)
      call system_clock(itimes2)
      write(*,*) 'cpu cpu no parallel finished:',times2
      write(*,*) 'no parallel times:',times2-times1, real(itimes2-itimes1)/real(rate)
      write(*,*) '---------------------------------------------------'

    ! ***************parallel code ************************************
      call cpu_time(times1)
      call system_clock(itimes1)
      write(*,*) 'cpu parallel started:',times1

    !$omp parallel default(shared), private(i,j)
    !$omp do
      do i = 1, 1000
        do j = 1, 500000
          a(i)=a(i)+0.0001
        end do
        a(i)=a(i)+a(i)+a(i)
      enddo
    !$omp end do
    !$omp end parallel

      call cpu_time(times2)
      call system_clock(itimes2)
      write(*,*) 'cpu parallel finished:',times2
      write(*,*) 'parallel times:',times2-times1, real(itimes2-itimes1)/real(rate)

      deallocate(a)
      stop
    end
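Then you can see that the parallel program is indeed faster. In each "times" line the first number is the CPU time and the second is the wall time from system_clock():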
    cpu no parallel started:       4.0000000000000001e-003
    cpu cpu no parallel finished:  1.4600000000000000
    no parallel times:             1.4560000000000000   1.45400000
    ---------------------------------------------------
    cpu parallel started:          1.4600000000000000
    cpu parallel finished:         5.1040000000000001
    parallel times:                3.6440000000000001   0.920000017
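As an alternative to system_clock(), OpenMP provides its own wall-clock timer, omp_get_wtime(), which returns elapsed seconds as a double precision value. The following is only a minimal sketch of the same loop timed that way, not the poster's code: the program name wtime_sketch, the parameters n and m, and the variables wall0, wall1, cpu0 and cpu1 are illustrative names chosen here. It assumes the compiler has OpenMP enabled (for example the -fopenmp flag with gfortran).

```fortran
program wtime_sketch
  use omp_lib
  implicit none

  ! Illustrative sizes, matching the loop bounds used in the question.
  integer, parameter :: n = 1000, m = 500000
  real, allocatable :: a(:)
  double precision :: wall0, wall1, cpu0, cpu1
  integer :: i, j

  allocate(a(n))
  do i = 1, n
    a(i) = real(i)
  end do

  ! Take both clocks just before the parallel region.
  call cpu_time(cpu0)
  wall0 = omp_get_wtime()

  !$omp parallel do default(shared) private(i, j)
  do i = 1, n
    do j = 1, m
      a(i) = a(i) + 0.0001
    end do
    a(i) = a(i) + a(i) + a(i)
  end do
  !$omp end parallel do

  ! Take both clocks again just after the parallel region.
  wall1 = omp_get_wtime()
  call cpu_time(cpu1)

  write(*,*) 'max threads   :', omp_get_max_threads()
  write(*,*) 'wall time (s) :', wall1 - wall0
  write(*,*) 'cpu time  (s) :', cpu1 - cpu0
  ! Guard against a zero interval on very fast runs.
  if (wall1 - wall0 > 0.0d0) then
    write(*,*) 'cpu/wall ratio:', (cpu1 - cpu0) / (wall1 - wall0)
  end if
  ! Print a value derived from a so the loop cannot be optimized away.
  write(*,*) 'checksum      :', sum(a)

  deallocate(a)
end program wtime_sketch
```

The cpu/wall ratio gives a rough idea of how many threads were busy. For the numbers posted above, 3.644 s of CPU time over 0.92 s of wall time is about 3.96, which would be consistent with roughly four threads, although the post does not state the thread count.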