This assignment is to be done on the UTK SP2.
Run the cpu command (complete path /export/usr/bin/cpu)
on both a high node and a thin node on the UTK SP2 and answer the following
questions for each:
xlf man page. Based on your answer to the third
part of the preceding question, what would you specify for the
-qcache option for
do i = 1,99
   do j = 2,100
      a(i,j) = a(i+1,j-1)*2.0
   enddo
enddo

do i=1,n
   do k=1,n
      do j=1,n
         c(j,i)=c(j,i)+a(j,k)*b(k,i)
      enddo
   enddo
enddo

do i=1,n
   do j=1,n
      do k=1,n
         c(j,i)=c(j,i)+a(j,k)*b(k,i)
      enddo
   enddo
enddo

Determine the largest value of n for which
all the arrays will fit into cache (be sure to specify whether this
is for a high node or a thin node).
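
The cache-fit question comes down to simple arithmetic, which the following minimal sketch illustrates. It rests on assumptions that are not part of the assignment text: the three n-by-n arrays a, b and c hold 8-byte (double precision) elements, and the data cache size is known in bytes. The value of cache_bytes below is only a placeholder, not the size for either node type; substitute the figure you obtained for the high node or the thin node. The code is free-form Fortran 90, so with xlf it would need to be compiled as free-form source.

```fortran
program cache_fit
  implicit none
  ! ASSUMPTION: placeholder cache size in bytes, NOT an SP2 value.
  ! Replace it with the data cache size found for your node type.
  integer, parameter :: cache_bytes = 65536
  ! ASSUMPTION: a, b and c are double precision (8 bytes per element).
  integer, parameter :: elem_bytes = 8
  ! The matrix multiply touches three n-by-n arrays: a, b and c.
  integer, parameter :: num_arrays = 3
  integer :: n

  ! Solve num_arrays * elem_bytes * n**2 <= cache_bytes for n.
  n = int(sqrt(dble(cache_bytes) / dble(num_arrays * elem_bytes)))

  ! Correct for any floating-point rounding in sqrt.
  do while (num_arrays * elem_bytes * (n + 1) * (n + 1) <= cache_bytes)
     n = n + 1
  end do
  do while (num_arrays * elem_bytes * n * n > cache_bytes)
     n = n - 1
  end do

  print *, 'largest n for which all arrays fit:', n
end program cache_fit
```

This estimate ignores cache associativity, line size, and any other data competing for the cache, so treat the result as an upper bound on n rather than a guarantee.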
n and gives random values to the
a and n from fitting in cache to
overflowing the cache.
Compile for gprof without any optimization,
and use gprof to
report the times taken for each of the two routines and DGEMM for
the different values of n.
Recompile with -O3 -qhot and -qreport=hotlist
and rerun. Report the times taken by the routines. Look at the
compiler report and report what blocking factors it chose for the
arrays.