LLVM AMDGPU alloca address spaces

Question

I've been trying to get .NET CIL to run on multiple platforms (I'm particularly interested in GPUs) by means of LLVM. I've used Mono.Compiler for translating CIL to LLVM.

I'm having trouble getting AMDGCN to work. For a simple add function, I'm getting the following translated IR:

; ModuleID = 'bitout.bc'
source_filename = "llvmmodule_1"

define i32 @llvmmodule_1_AddMethod(i32, i32) {
entry:
  %A0 = alloca i32
  store i32 %0, i32* %A0
  %A1 = alloca i32
  store i32 %1, i32* %A1
  %T0 = load i32, i32* %A0
  %T1 = load i32, i32* %A1
  %T2 = add i32 %T0, %T1
  ret i32 %T2
}

I've tried emitting it directly through libLLVM's TargetMachineEmitTo{File, MemoryBuffer} as well as indirectly, via llc.

Emitting directly results in a SIGSEGV:

Thread 1 "mono" received signal SIGSEGV, Segmentation fault.
0x00007fffeddadb1a in llvm::AMDGPUInstPrinter::getRegisterName(unsigned int) () from /usr/lib/libLLVM.so

This seems to be caused by a buffer underflow (an access at a negative index) in the above function, as far as I could tell from gdb.

llc fails on both amdgcn and r600 with:

Allocation instruction pointer not in the stack address space!
  %A0 = alloca i32
Allocation instruction pointer not in the stack address space!
  %A1 = alloca i32

Otherwise, llc compiles fine for all other platforms (except wasm64).

After some digging, I wondered whether this could come from not specifying the address space in alloca (the LLVM guide for AMDGPU doesn't really explain this), so I took a copy of the translated IR and changed the address space. It turned out that llc compiles it if the allocations are in the Private address space, which I guess works as the stack space. But I find it weird that neither the Global nor the Region address space works. Shouldn't I be able to allocate space in Global memory? What am I missing here?

On a related note, I can't find a way to create an alloca instruction that takes an address space (BuildAlloca doesn't take an address space as an argument, and I couldn't find any documentation or examples that mention alternatives).

If it matters, I'm using the default libLLVM on ArchLinux (at this time, llvm 8.0.1).

Nuoji Nuoji · Accepted Answer · 2019-11-11T10:21:58

For AMDGPU, Clang uses the following code to get the alloca address space:

  LangAS getASTAllocaAddressSpace() const override {
    return getLangASFromTargetAS(
        getABIInfo().getDataLayout().getAllocaAddrSpace());
  }

Ignore getLangASFromTargetAS here; it is just a way to make the result work with the LangAS enum.

The takeaway is that you need to get the address space from the DataLayout instead of setting it to zero – but only for AMDGPU.
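
To make the takeaway concrete, here is a minimal sketch of where that value comes from. In an LLVM data layout string the `A<n>` component names the address space used by alloca, and it defaults to 0 when absent (the IR in the question has no `target datalayout` line at all, so 0 is what it gets). The helpers `allocaAddrSpace` and `allocaInst` below are hypothetical names made up for illustration, not LLVM API; in real code you would ask `DataLayout::getAllocaAddrSpace()` as Clang does above. The layout strings used with them are abbreviated examples, not the exact string of any target.

```cpp
#include <cassert>  // only needed for the usage asserts shown below
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not LLVM API): returns the alloca address space
// encoded in a data layout string. Components are separated by '-', and
// the "A<n>" component names the alloca address space; the default is 0.
unsigned allocaAddrSpace(const std::string& layout) {
    std::size_t pos = 0;
    while (pos <= layout.size()) {
        // Cut out the next '-'-separated component.
        std::size_t dash = layout.find('-', pos);
        if (dash == std::string::npos) dash = layout.size();
        const std::string spec = layout.substr(pos, dash - pos);
        pos = dash + 1;

        // Upper-case 'A' only; lower-case 'a' is aggregate alignment.
        if (spec.size() > 1 && spec[0] == 'A') {
            unsigned addrSpace = 0;
            bool allDigits = true;
            for (std::size_t i = 1; i < spec.size(); ++i) {
                if (!std::isdigit(static_cast<unsigned char>(spec[i]))) {
                    allDigits = false;
                    break;
                }
                addrSpace = addrSpace * 10 + static_cast<unsigned>(spec[i] - '0');
            }
            if (allDigits) return addrSpace;
        }
    }
    return 0;  // no "A<n>" component: allocas live in address space 0
}

// Hypothetical helper: renders the textual alloca instruction for a given
// address space, omitting the suffix for the default address space 0.
std::string allocaInst(const std::string& name, const std::string& type,
                       unsigned addrSpace) {
    std::string inst = "%" + name + " = alloca " + type;
    if (addrSpace != 0) inst += ", addrspace(" + std::to_string(addrSpace) + ")";
    return inst;
}
```

For example, `assert(allocaAddrSpace("e-p:64:64-p5:32:32-A5") == 5);` holds, and `allocaInst("A0", "i32", 5)` yields `%A0 = alloca i32, addrspace(5)`, which is the form llc accepted in the question once the allocations were moved to the Private address space. A layout without an `A` component, as on most CPU targets, gives 0 and the plain `%A0 = alloca i32`.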
