3
votes

I am trying to translate this for-loop from C to assembly using AT&T/GAS syntax:

for(int j = i; i*j < N; j++) {
    A[i*j] = 0;
}

I have i stored in %eax and j stored in %ebx. The problem I am having, is to actually multiply i and j, as the instruction imul "reg32", "reg32" stores the result in the second register, which I obviously don't want. What I do want, is the ability to store the result in another register, say %ecx for example, and then use this to access the value in the array at index i*j.

When I look up the usage for the instruction imul, there seems to be no way to actually multiply two registers and have the result stored in a third register. Off course, I could make a loop and do some addition and so on, but that seems ineffective and not the way to do go about this. Note that I am completely new to assembly (only used it for a couple of days) as we are just starting out with learning the basics at my CS course.

TL;DR

What is the best way to multiply the values stored in two registers like this: %eax * %ebx = %ecx ?

3
the instruction imul "reg32", "reg32" stores the result in the second register, which I obviously don't want. Why is it obvious you don't want this? The right way to do this would be imul %eax, %ebx then, if you need the result in %ecx, you could do mov %eax, %ecx. If you need to preserve the value in %eax you can save it (e.g., on the stack). When I look up the usage for the instruction imul, there seems to be no way to actually multiply two registers and have the result stored in a third register. This is true. But it's not cumbersome to work around.lurker

3 Answers

3
votes

x86 is a two operand architecture where most instructions take two operands, overwriting one of them. If you want to write the result to a third operand instead of overwriting one of the source operands, the standard solution is to first move one of the operands to the target and then use the target with a two operand instruction. For example, to multiply eax with ebx, placing the result in ecx, you'd do

mov %ebx, %ecx
imul %eax, %ecx

Though as others noted, for your loop it's best to forego multiplication entirely and instead recognise that you can make do with additions. Your loop

for (int j = i; i*j < N; j++) {
    A[i*j] = 0;
}

can be rewritten as

A_ = A + i * i;
N_ = N - i * i;
for (j = 0; j < N_; j += i)
    A_[j] = 0;

requiring no multiplications inside the loop.

2
votes

When I look up the usage for the instruction imul, there seems to be no way to actually multiply two registers and have the result stored in a third register.

This is true of most x86 instructions -- most arithmetic and logical operations take two operands and store the result back into one of the source registers. If you need to save one of the original values, copy it to another register.

imul is a particularly strange x86 instruction in that it has a one-argument form which multiplies the source register by eax, and writes the result to edx:eax. These register mappings are not flexible; if you need the full product, you will need to allocate your registers around this.

Off course, I could make a loop and do some addition and so on, but that seems ineffective and not the way to do go about this.

This is actually a great approach -- addition is faster than multiplication. A good optimizing compiler would probably do something along these lines.

2
votes

What you want to observe is the way that i*j changes as you increment j.  So, let's assume that i is 50, then initially j = 50, so i*j is 50*50.  The next iteration through the loop, j is 51, so i*j is 50*51, or, 50*(50+1), or, 50*50+50.  And the following iteration, i*j is 50*50+50+50, and so on. 

By keeping an accumulator, initialized outside/before the j loop with i*i, and maintained with one simple add instruction per loop iteration, you can have the value of i*j without multiplication.

See also induction variable.

I strongly suspect that if you look at the outer i loop (not shown in the question), you will be able to eliminate the initial multiplication (here the first i*i to initialize the accumulator).