I'm trying to do a 4x4 matrix multiplication in assembly on a MIPS simulator (QtMips).
QtMips gives me Exception 4: Unaligned Address in inst/data fetch: 0x100100bb
This is where I get the error when I single step.
[00400070] c52b0000 lwc1 $f11, 0($9) ; 80: lwc1 $f11 0($t1) #load float from array1
The error happens when counter k = 2, i.e. on the third iteration of the loop. I assume something is wrong with the 32-bit alignment of my third load, lwc1.
Here's what I tried/read but didn't work:
- This suggests that I put .align 2 or .align 4 before my array (matrix) declaration in .data. Didn't work.
- This suggests that the size value (defined after array3) could be the issue. But I load it into $s1 with
lw $s1 size
so I don't see this being a real issue for me.
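For reference, here is a small Python sketch (not MIPS, just the arithmetic) of what the alignment values above mean. It assumes that .align n rounds the next datum up to a 2^n-byte boundary; the helper names is_aligned and align_up are made up for illustration, and 0x10010041 is an arbitrary example address.

```python
def is_aligned(addr: int, size: int = 4) -> bool:
    """Return True if addr is a multiple of size (size must be a power of two)."""
    return addr & (size - 1) == 0


def align_up(addr: int, exponent: int) -> int:
    """Round addr up to the next 2**exponent boundary.

    Assumption: this mirrors '.align <exponent>' when the assembler
    treats the operand as an exponent rather than a byte count.
    """
    boundary = 1 << exponent
    return (addr + boundary - 1) & ~(boundary - 1)


fault = 0x100100bb  # address from the exception message
print(hex(fault), "word-aligned:", is_aligned(fault))

# Hypothetical unaligned location, rounded up as .align 2 and .align 5 would
print(hex(align_up(0x10010041, 2)))  # next 4-byte boundary
print(hex(align_up(0x10010041, 5)))  # next 32-byte boundary
```

Note that the faulting address is odd, so no directive in front of the arrays can make it aligned; a directive only moves where each array starts.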
I'm very lost on what to do. Please impart some wisdom.
Below is my whole code:
# here's our array data, two args and a result
.data
.globl array1
.globl array2
.globl array3
.align 5 #align the data set
array1: .float 1.00, 0.00, 3.14, 2.72, 2.72, 1.00, 0.00, 3.14, 1.00, 1.00, 1.00, 1.00, 1.00, 2.00, 3.00, 4.00
.align 5 #align the data set
array2: .float 1.00, 1.00, 0.00, 3.14, 0.00, 1.00, 3.14, 2.72, 0.00, 1.00, 1.00, 0.00, 4.00, 3.00, 2.00, 1.00
.align 5 #align the data set
array3: .float 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00
size: .word 4 #store float in s2
.text
.globl main
main:
sw $31 saved_ret_pc
.data
lb_: .asciiz "Vector Multiplication\n"
lbd_: .byte 1, -1, 0, 128
lbd1_: .word 0x76543210, 0xfedcba98
.text
li $v0 4 # syscall 4 (print_str)
la $a0 lb_
syscall
# main program: multiply matrix 1 and 2, store in array3
la $t1 array1
la $t2 array2
la $t3 array3 ###load array addresses into registers
li $t4 4 # i loop counter -> I changed addi to li
li $t5 4 # j loop counter
li $t6 4 # k loop counter
lw $s1 size # load matrix(array) size
i_loop:
j j_loop
j_loop:
j k_loop
k_loop:
#f0 and f1 - float func return values
#f10 - multiplication return values
#f4, f5 - register to store addr offset
lwc1 $f11 0($t1) #load float from array1
lwc1 $f12 0($t2) #load float from array2
lwc1 $f13 0($t3) #load float from result array3
nop
mul.s $f10 $f11 $f12 #multiply floats, store result as temp in $f10
nop
add.s $f13 $f13 $f10 #add the multiplication result to the value from array3
swc1 $f13 0($t3) #store the resulting float in array3
#call index_of_A
move $s0 $ra #save return address into s0
nop
jal index_of_A #get addr offset for array1
nop
move $ra $s0 #restore return address that was saved into s0
#call index_of_B
move $s0 $ra #save return address into s0
nop
jal index_of_B #get addr offset for array2
nop
move $ra $s0 #restore return address that was saved into s0
add $t1 $t1 $s2 # next address in the array1
add $t2 $t2 $s3 # next address in the array2
addi $t3 $t3 4 # next address in the array3
addi $t6 $t6 -1 #decrease k counter
bne $t6 $0 k_loop #repeat k_loop
addi $t5 $t5 -1 #decrease j counter
bne $t5 $0 j_loop #repeat j_loop
addi $t4 $t4 -1 #decrease i counter
bne $t4 $0 i_loop #repeat i_loop
#used regs: f0-f5, f10-13
index_of_A: #function for array1 addr offset #may need to convert all to float first
#size*i + k #$f20*i + k
mul $s2 $s1 $t4 # 4*i,
add $s2 $s2 $t6 # + k, store in $s2
jr $ra #jump back to the caller
index_of_B: #function for array2 addr offset
#4*k + j
mul $s3 $s1 $t6 # 4*k,
add $s3 $s3 $t5 # + j, store in $s3
jr $ra #jump back to the caller
# Done multiplying...
.data
sm: .asciiz "Done multiplying\n"
.text
print_and_end:
li $v0 4 # syscall 4 (print_str)
la $a0 sm
syscall
# Done with the program!
lw $31 saved_ret_pc
jr $31 # Return from main
#Terminate the program
li $v0, 10
syscall
.end main
But I don't understand what's wrong, since the exact same code works in another example of mine here:
[...] .align 2 would be insufficient. Also, 0x100100bb is an odd number (literally, the 1 bit is set). That can't be a good thing for a system that requires aligned reads. I'd check the math that you use to increment this pointer to your array. – David Wohlferd
[...] .align 4. However, that isn't your only problem. If your array is located at, say, 0x100, then you can load that value into a register and use it to read a value from your array, since it's nicely aligned. But if you increment it by 1 (0x101), that's not aligned anymore. You need to be incrementing your register by the size of the elements in your array (which I suspect is 4). – David Wohlferd
.align in GAS-like syntax either takes a power of 2, or an exponent. If your assembler didn't complain about 5, then it treats .align as a synonym for .p2align, so you were aligning to a 2^5 = 32 byte boundary. – Peter Cordes
.align 4 should be fine. 1<<4 = 2^4 = 16. 32 bits is 4 bytes. More alignment than necessary isn't going to hurt. – Peter Cordes
[...] add $t1 $t1 $s2 # next address in the array1 (which I interpret to mean t1 = t1 + s2) to go to the next element. What does t1 contain before and after this instruction? Are they both evenly divisible by 4? And what's in s2? I don't see anything that explicitly assigns a value to it. Does this somehow happen implicitly on MIPS? – David Wohlferd
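To make the last question concrete, here is a rough Python replay of how the posted k_loop updates $t1, as a sketch rather than a simulator run. The base address 0x10010000 for array1 is an assumption, so the absolute addresses will not match the 0x100100bb in the report; only the pattern matters. The function names trace_array1_pointer and element_address are hypothetical.

```python
WORD = 4  # size in bytes of one .float element


def trace_array1_pointer(base: int, n: int = 4, loads: int = 3) -> list:
    """Replay the first few k_loop iterations; return the address of each lwc1."""
    t1, i, k = base, n, n      # $t1, $t4 (i counter), $t6 (k counter)
    addresses = []
    for _ in range(loads):
        addresses.append(t1)   # lwc1 $f11 0($t1)
        s2 = n * i + k         # index_of_A: mul $s2 $s1 $t4 ; add $s2 $s2 $t6
        t1 += s2               # add $t1 $t1 $s2 (an element index added as bytes)
        k -= 1                 # addi $t6 $t6 -1
    return addresses


def element_address(base: int, row: int, col: int, n: int = 4) -> int:
    """Byte address of element [row][col] in a row-major n x n float matrix."""
    return base + (row * n + col) * WORD


BASE = 0x10010000  # assumed address of array1
for step, addr in enumerate(trace_array1_pointer(BASE), start=1):
    print(f"load {step}: {addr:#x} aligned={addr % WORD == 0}")
```

Under these assumptions $s2 is 20 on the first pass and 19 on the second, so the third load lands on an odd address, which is consistent with the fault appearing on the third iteration. By contrast, element_address scales the index by the element size, so every address it returns stays a multiple of 4 whenever the base is.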