I am trying to use Metal argument buffers to access data in a Metal compute kernel. The buffer has an entry when I print out the value CPU-side, but the Xcode debugger shows my argument buffer as empty on the GPU.

I can see my buffer with the sentinel value as an indirect resource in the debugger but no pointer to it in the argument buffer.

Here is the Swift code:

import MetalKit

do {
    let device = MTLCreateSystemDefaultDevice()!
    let capture_manager = MTLCaptureManager.shared()
    let capture_desc = MTLCaptureDescriptor()
    capture_desc.captureObject = device
    try capture_manager.startCapture(with: capture_desc)
    
    let argument_desc = MTLArgumentDescriptor()
    argument_desc.dataType = MTLDataType.pointer
    argument_desc.index = 0
    argument_desc.arrayLength = 1024
    
    let argument_encoder = device.makeArgumentEncoder(arguments: [argument_desc])!
    
    let argument_buffer = device.makeBuffer(length: argument_encoder.encodedLength, options: MTLResourceOptions())
    argument_encoder.setArgumentBuffer(argument_buffer, offset: 0)
    
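    var sentinel: UInt32 = 12345
    
    // Pass `&sentinel` directly so the pointer is only used while it is valid;
    // UnsafeRawPointer.init(&sentinel) would produce a dangling pointer.
    let buffer = device.makeBuffer(bytes: &sentinel, length: MemoryLayout<UInt32>.size, options: MTLResourceOptions.storageModeShared)!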
    argument_encoder.setBuffer(buffer, offset: 0, index: 0)
    
    let source = try String(contentsOf: URL.init(fileURLWithPath: "/path/to/kernel.metal"))
    let library = try device.makeLibrary(source: source, options: MTLCompileOptions())
    let function = library.makeFunction(name: "main0")!
    
    let pipeline = try device.makeComputePipelineState(function: function)
    
    let queue = device.makeCommandQueue()!
    let encoder = queue.makeCommandBuffer()!
    
    let compute_encoder = encoder.makeComputeCommandEncoder()!
    
    compute_encoder.setComputePipelineState(pipeline)
    compute_encoder.setBuffer(argument_buffer, offset: 0, index: 0)
    compute_encoder.useResource(buffer, usage: MTLResourceUsage.read)
    compute_encoder.dispatchThreads(MTLSize.init(width: 1, height: 1, depth: 1), threadsPerThreadgroup: MTLSize.init(width: 1, height: 1, depth: 1))
    
    compute_encoder.endEncoding()
    
    encoder.commit()
    encoder.waitUntilCompleted()

    capture_manager.stopCapture()
} catch {
    print(error)
    exit(1)
}

And the compute kernel:

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct Argument {
    constant uint32_t *ptr [[id(0)]];
};

kernel void main0(
    constant Argument *bufferArray [[buffer(0)]]
) {
    constant uint32_t *ptr = bufferArray[0].ptr;
    uint32_t y = *ptr;
}

If anyone has any ideas, I'd greatly appreciate it!

Can you provide this as a question? - DaveTheAl

1 Answer

It seems that Metal optimizes away the kernel, or something along those lines, since it only performs read operations and never uses the value it reads.

Changing the kernel to write to the buffer makes everything work and show up properly in the Xcode debugger.
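
For illustration, here is a minimal sketch of what "write to the buffer" could look like, based on the kernel from the question. It is not necessarily the exact change I made. The switch from `constant` to `device` for the pointer is an assumption needed so the kernel can store through it. On the Swift side you would presumably also need to declare write access, for example `argument_desc.access = .readWrite` and `useResource(buffer, usage: [.read, .write])`.

```metal
#include <metal_stdlib>

using namespace metal;

struct Argument {
    // `device` instead of `constant` so the kernel may store through the pointer
    // (assumption: the host declares the resource as read-write).
    device uint32_t *ptr [[id(0)]];
};

kernel void main0(
    constant Argument *bufferArray [[buffer(0)]]
) {
    device uint32_t *ptr = bufferArray[0].ptr;
    // The read-modify-write gives the kernel an observable side effect,
    // so the compiler cannot discard the load as dead code.
    *ptr = *ptr + 1;
}
```

After `waitUntilCompleted()`, reading the value back through `buffer.contents()` on the CPU is a quick way to check that the kernel actually ran.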