Given the following code:

    typedef int array[4][4];

    void transpose2(array dst, array src)
    {
        int i, j;
        for (i = 0; i < 4; i++) {
            for (j = 0; j < 4; j++) {
                dst[i][j] = src[j][i];
            }
        }
    }
Assumptions:
int is 4 bytes
the src array starts at address 0, and dst starts at address 64
the size of the cache is 32 bytes, and the cache is empty at the beginning
there is an L1 cache that is direct-mapped, write-through, and write-allocate
the size of the block is 16 bytes
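To make these assumptions concrete, here is a minimal sketch of how an element's address maps to a cache line. The helper names elem_addr and cache_line are hypothetical (they are not part of the exercise), and the sketch assumes the arrays are stored row by row, as C does. With a 32-byte cache and 16-byte blocks there are only 2 lines, and each row of either array fills exactly one block.

```c
#include <assert.h>

enum { N = 4, INT_SIZE = 4, BLOCK_SIZE = 16, NUM_LINES = 2,
       SRC_BASE = 0, DST_BASE = 64 };

/* Byte address of element [row][col] of a 4x4 int array that starts at base
   (row-major layout, int assumed to be 4 bytes). */
unsigned elem_addr(unsigned base, unsigned row, unsigned col)
{
    assert(row < N && col < N);
    return base + (row * N + col) * INT_SIZE;
}

/* Direct mapping: the block number modulo the number of cache lines. */
unsigned cache_line(unsigned addr)
{
    return (addr / BLOCK_SIZE) % NUM_LINES;
}
```

Under these assumptions, row r of src and row r of dst both map to line r % 2, so the two arrays compete for the same two lines.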
I'm trying to figure out the cache misses and cache hits for dst and src.
The question is to fill in the tables for the src and dst arrays, which are empty at the beginning:
[image: Before the run]
First, I'll present my professor's solution:
[image: After the run]
Here is my solution, but I'm making a mistake somewhere.
Assume that i runs from 1 to 4 rather than from 0 to 3.
First iteration:
src dst
1,1 -> 1,1
2,1 -> 1,2
3,1 -> 1,3
4,1 -> 1,4
Second iteration:
src dst
1,2 -> 2,1
2,2 -> 2,2
3,2 -> 2,3
4,2 -> 2,4
Third iteration:
src dst
1,3 -> 3,1
2,3 -> 3,2
3,3 -> 3,3
4,3 -> 3,4
Fourth iteration:
src dst
1,4 -> 4,1
2,4 -> 4,2
3,4 -> 4,3
4,4 -> 4,4
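A hand trace like the one above can be checked by replaying the accesses against a tiny model of the cache. The following is only a sketch under the stated assumptions, plus one more that the exercise does not spell out: in each inner-loop step, src[j][i] is read before dst[i][j] is written. The names cache_access and simulate are hypothetical. Because the cache is write-allocate, a write miss loads the block just as a read miss does, so a later write to the same dst row can hit if no src read has evicted that line in between.

```c
#include <assert.h>
#include <stdio.h>

enum { N = 4, INT_SIZE = 4, BLOCK_SIZE = 16, NUM_LINES = 2,
       SRC_BASE = 0, DST_BASE = 64 };

/* One access to a direct-mapped cache; returns 1 on a hit, 0 on a miss.
   tags[] holds the block number stored in each line, or -1 if empty. */
int cache_access(int tags[NUM_LINES], int addr)
{
    int block = addr / BLOCK_SIZE;
    int line = block % NUM_LINES;

    if (tags[line] == block)
        return 1;
    tags[line] = block; /* a miss loads the block, for reads and writes alike */
    return 0;
}

/* Replays transpose2 (0-based indices) and records, per element,
   whether its access was a hit (1) or a miss (0). */
void simulate(int src_hit[N][N], int dst_hit[N][N])
{
    int tags[NUM_LINES] = { -1, -1 }; /* the cache starts empty */

    assert(src_hit != NULL && dst_hit != NULL);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            /* Assumption: the read of src[j][i] precedes the write of dst[i][j]. */
            src_hit[j][i] = cache_access(tags, SRC_BASE + (j * N + i) * INT_SIZE);
            dst_hit[i][j] = cache_access(tags, DST_BASE + (i * N + j) * INT_SIZE);
        }
    }
}
```

Calling simulate from a main function and printing the two tables gives a per-element hit/miss record that can be compared with the tables of the exercise.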
I don't understand why there are any hits at all in the dst table.
I know that I'm wrong. Can someone please explain why there are indeed hits in the solution above?
Regards, Ron