(Bit late but oh well)
There is no way you could make it work with only one texture, because the GPU is a highly parallel processor: Your kernel that you wrote for a single pixel gets called in parallel on all pixels, you can't tell which one goes first.
So you definitely need 2 textures. The way you probably should do it is by using 2 textures where one is the "old" one and the other the "new" one. Between passes, you switch the role of the textures, now old is new and new is old. Here is some pseudoswift:
var currentText = MTLTexture()
var nextText = MTLTexture()
let semaphore = dispatch_semaphore_create(1)
func update() {
dispatch_semaphore_wait(semaphore) // Wait for updating done signal
let commands = commandQueue.commandBuffer()
let encoder = commands.computeCommandEncoder()
encoder.setTexture(currentText, atIndex: 0)
encoder.setTexture(nextText, atIndex: 1)
encoder.dispatchThreadgroups(...)
encoder.endEncoding()
// When updating done, swap the textures and signal that it's done updating
commands.addCompletionHandler {
swap(¤tText, &nextText)
dispatch_semaphore_signal(semaphore)
}
commands.commit()
}
BlitCommandEncoderwas making call to another, time-consuming function, which caused the delay, the Blit operations weren't the primary source of delay – it was my fault. However, looking into GPU Frame Capture, I found that there isn't mentioned the duration of Blit operations (as opposed to Compute operations) – is there no way of measuring these (or am I missing something)? - sarasvati