Is it possible to rewrite or improve this function to not require volatile or a generic memory clobber in its inline assembly?

// do stuff with the input Foo structure and write the result to the 
// output Bar structure.
static inline void MemFrob(const struct Foo* input, struct Bar* output) {
    register const struct Foo* r0 asm("r0") = input;
    register struct Bar* r1 asm("r1") = output;

    __asm__ __volatile__(
        "svc #0x0f0000 \n\t"
        : "+r" (r0), "+r" (r1)
        :
        : "r2", "r3", "cc", "memory"
        );
}

For this specific situation, the target platform is an ARM7 system, and the code is being compiled with GCC 5.3.0. The system call being performed has the same calling convention as a C function call. After some trial and error, I've arrived at the above which "works" but I am not yet confident that it is correct and will always work, subject to the whims and fancies of the optimizing compiler.

I'd like to be able to remove the "memory" clobber and tell GCC exactly which memory will be modified. The GCC Extended Asm documentation discusses how to assign values to specific registers, and separately how to use memory constraints, but not whether the two can be combined. As of now, removing the "memory" clobber from the above example can cause GCC to not use the output in subsequent code.

I'd also like to be able to remove the volatile in the cases where the output is not used. But as of now, removing volatile from the above example causes GCC to not emit the assembly at all.

Adding additional inline assembly to move the system call parameters into r0/r1 manually or un-inlining by moving the code to an external compilation unit are wasteful workarounds that I'd much rather avoid.

1) Yes, one constraint can be a register and another can be a memory operand, and yes, two constraints can overlap. 2) Without the volatile, gcc sees the asm as the way to compute the new values for r0 and r1. But since the optimizers can see that the variables r0 and r1 are not used again after the asm before they go out of scope, the asm is discarded. - David Wohlferd

1 Answer

Long story short: this is what the "m" constraint is for. If you find yourself reaching for volatile or __volatile__ with asm, it is usually because the constraints do not fully describe what the asm does. One of the compiler's main jobs is flow analysis, so as long as you give it enough information to analyze the flow correctly, everything will work.

Here is a fixed version:

void MemFrob(const struct Foo* input, struct Bar* output) {
    register const struct Foo* r0 asm("r0") = input;
    register struct Bar* r1 asm("r1") = output;
    __asm__ (
        "svc #0x0f0000"
        : "=m"(*r1) // writes data to *output (but does not read)
        : "m"(*r0), // reads data in *input
          "l"(r0), "l"(r1) // This is necessary to ensure the correct registers are used
        : "r2", "r3", "cc"
        );
}

You can test it on https://gcc.godbolt.org/ (-O2 compiler options recommended). The output is as follows:

svc #0x0f0000
bx lr

Obviously, when inlined, it should reduce to just the one instruction.

Unfortunately, I couldn't figure out how to pin operands to specific registers in inline ARM assembly, other than the method above, which is a bit clumsy.
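One possible variation, given as a sketch rather than a verified fix: the question says the system call follows the C calling convention, which lets the callee overwrite r0 and r1, so it may be safer to keep them as read-write ("+r") operands, as in the question, while still describing the memory with "m" constraints. The layouts of struct Foo and struct Bar, the helper FrobOnce, and the non-ARM branch are assumptions added only so the sketch compiles and runs off-target; none of them come from the original code.

```c
/* Hypothetical layouts: the original post does not show Foo or Bar. */
struct Foo { int value; };
struct Bar { int value; };

static inline void MemFrob(const struct Foo *input, struct Bar *output) {
#if defined(__arm__)
    /* Pin the arguments to r0/r1, as in the code above. */
    register const struct Foo *r0 __asm__("r0") = input;
    register struct Bar *r1 __asm__("r1") = output;

    __asm__(
        "svc #0x0f0000"
        : "+r"(r0), "+r"(r1),  /* read by the call, possibly overwritten */
          "=m"(*output)        /* the call writes *output */
        : "m"(*input)          /* the call reads *input */
        : "r2", "r3", "cc");
#else
    /* Host stand-in (an assumption): lets the sketch run without the
       real system call. It is not what the svc does. */
    output->value = input->value + 1;
#endif
}

/* Hypothetical usage helper: the result is read after the call, so the
   compiler has to keep the asm even though it is not volatile. */
int FrobOnce(int seed) {
    struct Foo in = { seed };
    struct Bar out = { 0 };
    MemFrob(&in, &out);
    return out.value;
}
```

With these constraints, the asm has no volatile and no "memory" clobber. If a caller never reads *output, GCC is free to drop the asm entirely, which is the behaviour the question asks for. On the real target the value returned by FrobOnce depends on what the system call writes.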