I'm trying to craft some inline assembly to test performance of rotate on ARM. The code is part of a C++ code base, so the rotates are template specializations. The code is below, but its producing messages that don't make a lot of sense to me.
According to ARM Assembly Language, the instructions are roughly:
# rotate - rotate instruction
# dst - output operand
# lhs - value to be rotated
# rhs - rotate amount (immediate or register)
<rotate> <dst>, <lhs>, <rhs>
They don't make a lot of sense because (to me), for example, I use g
to constrain the output register, and that's just a general purpose register per Simple Contraints. ARM is supposed to have a lot of them, and Machine Specific Constraints does not appear to change behavior of the constraint.
I'm not sure the best way to approach this, so I'm going to ask three questions:
- How do I encode the rotate when using a constant or immediate value?
- How do I encode the rotate when using a value passed through a register?
- How would thumb mode change the inline assembly
arm-linux-androideabi-g++ -DNDEBUG -g2 -Os -pipe -fPIC -mfloat-abi=softfp
-mfpu=vfpv3-d16 -mthumb --sysroot=/opt/android-ndk-r10e/platforms/android-21/arch-arm
-I/opt/android-ndk-r10e/sources/cxx-stl/stlport/stlport/ -c camellia.cpp
In file included from seckey.h:9:0,
from camellia.h:9,
from camellia.cpp:14:
misc.h: In function 'T CryptoPP::rotlFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1121:71: error: matching constraint not valid in output operand
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1121:71: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotrFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1129:71: error: matching constraint not valid in output operand
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1129:71: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotlVariable(T, unsigned int) [with T = unsigned int]':
misc.h:1137:72: error: matching constraint not valid in output operand
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
^
misc.h:1137:72: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotrVariable(T, unsigned int) [with T = unsigned int]':
misc.h:1145:72: error: matching constraint not valid in output operand
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
^
misc.h:1145:72: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotrFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1129:71: error: matching constraint not valid in output operand
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1129:71: error: invalid lvalue in asm output 0
misc.h:1129:71: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotlFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1121:71: error: matching constraint not valid in output operand
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1121:71: error: invalid lvalue in asm output 0
misc.h:1121:71: error: matching constraint references invalid operand number
// ROL #n Rotate left immediate
template<> inline word32 rotlFixed<word32>(word32 x, unsigned int y)
{
int z;
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
return static_cast<word32>(z);
}
// ROR #n Rotate right immediate
template<> inline word32 rotrFixed<word32>(word32 x, unsigned int y)
{
int z;
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
return static_cast<word32>(z);
}
// ROR rn Rotate left by a register
template<> inline word32 rotlVariable<word32>(word32 x, unsigned int y)
{
int z;
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
return static_cast<word32>(z);
}
// ROR rn Rotate right by a register
template<> inline word32 rotrVariable<word32>(word32 x, unsigned int y)
{
int z;
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
return static_cast<word32>(z);
}
template<> inline word32 rotlMod<word32>(word32 x, unsigned int y)
{
return rotlVariable<word32>(x, y);
}
template<> inline word32 rotrMod<word32>(word32 x, unsigned int y)
{
return rotrVariable<word32>(x, y);
}
g2
andM1
? The2
and the1
are the matching constraints that don't seem to make sense, and the compiler doesn't like them either. – Jester2
is the output operand numer. It needs to be in a register, hence theg2
. For1
, that's therhs
orshift amount
. For immediate, it needs to be constrained to immediate values, hence theM1
. – jwwx << y | x >> (32 - y)
idiom and emit a singleror
instruction, provided the arguments are unsigned. – Notlikethat2
and the1
? Those mean, put in the same place as the given other operand and you don't need that here. – Jesterx << y | x >> (32 - y)
- that's undefined behavior wheny=0
. That code should not show up anywhere in production. And GCC does not offer a rotate intrinsic that would lay waste to these questions I have. If they provided it, then I would have been done a long time ago. Related: Near constant time rotate that does not violate the standards. – jww