Fortran OpenMP program shows no speedup of CPU_TIME()


Using parallelism should reduce a program's run time, but that did not happen for me. When I parallelized my code with OpenMP, the run time increased, i.e. parallel time > serial time.

My code:

    program main
    use omp_lib
    implicit none
    real*8 times1,times2
    integer i,j
    real, allocatable, dimension(:) :: a
    allocate(a(1000))
    do j = 1, 1000
      a(j)=j
    enddo
    !    ***************no parallel code ************************************
    call cpu_time(times1)
    write(*,*) 'cpu no parallel started:',times1
    do i = 1, 1000
      do j = 1, 500000
        a(i)=a(i)+0.0001
      end do
      a(i)=a(i)+a(i)+a(i)
    enddo
    call cpu_time(times2)
    write(*,*) 'cpu cpu no parallel finished:',times2
    write(*,*) 'no parallel times:',times2-times1
    write(*,*) '---------------------------------------------------'
    !    ***************parallel code ************************************
    call cpu_time(times1)
    write(*,*) 'cpu parallel started:',times1
    !$omp parallel default(shared), private(i,j)
    !$omp do
    do i = 1, 1000
      do j = 1, 500000
        a(i)=a(i)+0.0001
      end do
      a(i)=a(i)+a(i)+a(i)
    enddo
    !$omp end do
    !$omp end parallel
    call cpu_time(times2)
    write(*,*) 'cpu parallel finished:',times2
    write(*,*) 'parallel times:',times2-times1
    deallocate(a)
    stop
    end

And the result:

     cpu no parallel started:  1.560010000000000e-002
     cpu cpu no parallel finished:   4.86723120000000
     no parallel times:   4.85163110000000

     cpu parallel started:   4.86723120000000
     cpu parallel finished:   9.89046340000000
     parallel times:   5.02323220000000

Why does the time measured by cpu_time() increase with OpenMP?

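cpu_time() measures the time spent on the CPU, not wall-clock time. In parallel applications these are not the same: the CPU time is typically accumulated over all threads, so it does not drop when the work is spread across cores. See here for details.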

Using system_clock() solves the problem:

    program main
    use omp_lib
    implicit none
    real*8 times1,times2
    integer i,j, itimes1,itimes2, rate
    real, allocatable, dimension(:) :: a
    allocate(a(1000))
    ! clock ticks per second, used to convert counts to seconds
    call system_clock(count_rate=rate)
    do j = 1, 1000
      a(j)=j
    enddo
    !    ***************no parallel code ************************************
    call cpu_time(times1)
    call system_clock(itimes1)
    write(*,*) 'cpu no parallel started:',times1
    do i = 1, 1000
      do j = 1, 500000
        a(i)=a(i)+0.0001
      end do
      a(i)=a(i)+a(i)+a(i)
    enddo
    call cpu_time(times2)
    call system_clock(itimes2)
    write(*,*) 'cpu cpu no parallel finished:',times2
    write(*,*) 'no parallel times:',times2-times1, real(itimes2-itimes1)/real(rate)
    write(*,*) '---------------------------------------------------'
    !    ***************parallel code ************************************
    call cpu_time(times1)
    call system_clock(itimes1)
    write(*,*) 'cpu parallel started:',times1
    !$omp parallel default(shared), private(i,j)
    !$omp do
    do i = 1, 1000
      do j = 1, 500000
        a(i)=a(i)+0.0001
      end do
      a(i)=a(i)+a(i)+a(i)
    enddo
    !$omp end do
    !$omp end parallel
    call cpu_time(times2)
    call system_clock(itimes2)
    write(*,*) 'cpu parallel finished:',times2
    write(*,*) 'parallel times:',times2-times1, real(itimes2-itimes1)/real(rate)
    deallocate(a)
    stop
    end

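Then you can see that the parallel program is indeed faster in wall-clock time (the last value on each "times" line), even though its CPU time is larger: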

     cpu no parallel started:   4.0000000000000001e-003
     cpu cpu no parallel finished:   1.4600000000000000
     no parallel times:   1.4560000000000000        1.45400000
     ---------------------------------------------------
     cpu parallel started:   1.4600000000000000
     cpu parallel finished:   5.1040000000000001
     parallel times:   3.6440000000000001       0.920000017
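As an alternative to system_clock(), the OpenMP runtime library provides omp_get_wtime(), which returns elapsed wall-clock time in seconds as a double precision value. The following is only a minimal sketch of the same loop timed that way, not part of the original answer. The program name wtime_demo, the variable names, and the checksum line are illustrative assumptions, and it assumes the compiler's OpenMP support is enabled (for example with gfortran -fopenmp).

```fortran
program wtime_demo
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000, m = 500000
  real(dp) :: cpu0, cpu1, wall0, wall1
  real, allocatable :: a(:)
  integer :: i, j

  allocate(a(n))
  do i = 1, n
     a(i) = real(i)
  end do

  call cpu_time(cpu0)        ! processor time; may be accumulated over all threads
  wall0 = omp_get_wtime()    ! elapsed wall-clock time in seconds

  !$omp parallel do default(shared) private(i, j)
  do i = 1, n
     do j = 1, m
        a(i) = a(i) + 0.0001
     end do
     a(i) = a(i) + a(i) + a(i)
  end do
  !$omp end parallel do

  wall1 = omp_get_wtime()
  call cpu_time(cpu1)

  write(*,*) 'max threads  :', omp_get_max_threads()
  write(*,*) 'cpu time  (s):', cpu1 - cpu0
  write(*,*) 'wall time (s):', wall1 - wall0
  ! print something derived from a so the loop cannot be optimized away
  write(*,*) 'checksum     :', sum(a)

  deallocate(a)
end program wtime_demo
```

With more than one thread, the wall time should be the smaller of the two numbers, while the CPU time stays roughly the same as in the serial run or grows slightly.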
