11
votes

This is an absolute beginner question.

Background: I’m not really a game developer, but I’m trying to learn the basics of low-level 3D programming, because it’s a fun and interesting topic. I’ve picked Apple’s Metal as the graphics framework. I know about SceneKit and other higher-level frameworks, but I’m intentionally trying to learn the low-level bits. Unfortunately I’m way out of my depth, and there seem to be very few beginner-oriented Metal resources on the web.

By reading the Apple documentation and following the tutorials I could find, I’ve managed to implement a simple vertex shader and a fragment shader and draw an actual 3D model on the screen. Now I’m trying to draw a second model, but I’m stuck, because I’m not sure what the best way to go about it is.

Do I…

  • Use a single vertex buffer and index buffer for all of my models, and tell the MTLRenderCommandEncoder the offsets when rendering the individual models?
  • Have a separate vertex buffer / index buffer for each model? Would such an approach scale?
  • Something else?

TL;DR: What is the recommended way to store the vertex data of multiple models in Metal (or any other 3D framework)?


2 Answers

7
votes

There is no one recommended way. When you're working at such a low level as Metal, there are many possibilities, and the one you pick depends heavily on the situation and what performance characteristics you want/need to optimize for. If you're just playing around with intro projects, most of these decisions are irrelevant, because the performance issues won't bite until you scale up to a "real" project.

Typically, game engines use one buffer (or one set of vertex/index buffers) per model, especially if each model requires different render states (e.g. shaders, bound textures). This means that when new models are introduced to the scene or old ones are no longer needed, the requisite resources can be loaded into or removed from GPU memory (by creating or destroying MTL objects).
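
As a rough illustration of the one-set-of-buffers-per-model idea, here is a minimal sketch. MeshData and ModelBuffers are hypothetical names that do not come from any framework, and the vertex format (one float4 position per vertex) is an assumption. The Metal part is wrapped in canImport so the CPU-side byte arithmetic can still be checked on a machine without Metal.

```swift
#if canImport(Metal)
import Metal
#endif

// CPU-side geometry for one model (hypothetical type, assumed vertex format).
struct MeshData {
    var positions: [SIMD4<Float>]   // one float4 position per vertex
    var indices: [UInt16]

    // Byte lengths needed when creating the GPU buffers.
    var vertexByteCount: Int { positions.count * MemoryLayout<SIMD4<Float>>.stride }
    var indexByteCount: Int { indices.count * MemoryLayout<UInt16>.stride }
}

#if canImport(Metal)
// GPU-side resources owned by exactly one model (hypothetical type).
final class ModelBuffers {
    let vertexBuffer: MTLBuffer
    let indexBuffer: MTLBuffer
    let indexCount: Int

    // Returns nil for empty meshes or if the device cannot allocate the buffers.
    init?(device: MTLDevice, mesh: MeshData) {
        guard !mesh.positions.isEmpty, !mesh.indices.isEmpty,
              let vb = device.makeBuffer(bytes: mesh.positions,
                                         length: mesh.vertexByteCount,
                                         options: []),
              let ib = device.makeBuffer(bytes: mesh.indices,
                                         length: mesh.indexByteCount,
                                         options: [])
        else { return nil }
        vertexBuffer = vb
        indexBuffer = ib
        indexCount = mesh.indices.count
    }
}
#endif

// A single triangle as sample data.
let triangle = MeshData(
    positions: [SIMD4<Float>(0, 1, 0, 1),
                SIMD4<Float>(-1, -1, 0, 1),
                SIMD4<Float>(1, -1, 0, 1)],
    indices: [0, 1, 2])
print(triangle.vertexByteCount, triangle.indexByteCount)
```

With this shape, adding a model to the scene means creating one ModelBuffers value, and dropping the last reference to it lets ARC release both buffers.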

The main use case for doing multiple draws out of (different parts of) the same buffer is when you're mutating the buffer. For example, on frame n you're using the first 1KB of a buffer to draw with, while at the same time you're computing / streaming in new vertex data and writing it to the second 1KB of the buffer... then for frame n + 1 you switch which parts of the buffer are being used for what.
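
The region swapping described above comes down to a little offset arithmetic. The sketch below is only that arithmetic, in plain Swift; BufferRegions is a hypothetical helper, and the 1 KB region size simply mirrors the example in the text. The resulting offsets would be the values passed as the offset when binding the buffer.

```swift
// Splits one buffer of regionSize * regionCount bytes into equally sized
// regions and rotates through them frame by frame (hypothetical helper).
struct BufferRegions {
    let regionSize: Int
    let regionCount: Int

    var totalLength: Int { regionSize * regionCount }

    // Byte offset of the region the GPU draws from on the given frame.
    func drawOffset(frame: Int) -> Int {
        (frame % regionCount) * regionSize
    }

    // Byte offset of the region the CPU fills during the given frame;
    // it becomes the draw region on the following frame.
    func writeOffset(frame: Int) -> Int {
        ((frame + 1) % regionCount) * regionSize
    }
}

let regions = BufferRegions(regionSize: 1024, regionCount: 2)
for frame in 0..<3 {
    print("frame \(frame): draw at \(regions.drawOffset(frame: frame)), write at \(regions.writeOffset(frame: frame))")
}
```

This does not include any synchronization. In a real renderer you would still need to make sure the GPU has finished reading a region before the CPU overwrites it, for example by waiting on a semaphore that is signaled when the command buffer completes.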

6
votes

To add a bit to rickster's answer, I would encapsulate your model in a class that contains one buffer (or two, if you count the index buffer) per model, and pass an optional parameter with the number of instances of that model you want to create.

Then, keep an additional buffer where you store whatever variations you want to introduce per instance. Usually, it's just the transform and a different material. For instance,

struct PerInstanceUniforms {
  var transform : Transform
  var material : Material
}

In my case, the material contains a UV transform, but the texture has to be the same for all the instances.

Then your model class would look something like this,

class Model {
  fileprivate var indexBuffer : MTLBuffer!
  fileprivate var vertexBuffer : MTLBuffer!
  var perInstanceUniforms : [PerInstanceUniforms]
  // sized to hold one copy of perInstanceUniforms per sync buffer
  let uniformBuffer : MTLBuffer!

  // ... constructors, etc. (numIndices and numInstances are assumed to be defined here)

  func draw(_ encoder: MTLRenderCommandEncoder) {
    // buffer(0): vertices, buffer(1): shared uniforms, buffer(2): per-instance data
    encoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)
    RenderManager.sharedInstance.setUniformBuffer(encoder, atIndex: 1)
    encoder.setVertexBuffer(self.uniformBuffer, offset: 0, index: 2)
    encoder.drawIndexedPrimitives(type: .triangle, indexCount: numIndices, indexType: .uint16, indexBuffer: indexBuffer, indexBufferOffset: 0, instanceCount: self.numInstances)
  }

  // this gets called when we need to update the buffers used by the GPU
  func updateBuffers(_ syncBufferIndex: Int) {
    // use stride rather than size: stride includes the padding between array elements
    let byteCount = MemoryLayout<PerInstanceUniforms>.stride * perInstanceUniforms.count
    let uniformData = uniformBuffer.contents().advanced(by: byteCount * syncBufferIndex)
    memcpy(uniformData, &perInstanceUniforms, byteCount)
  }
}
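
One detail worth checking in the byte arithmetic of updateBuffers: elements of a Swift array are spaced MemoryLayout.stride bytes apart, and stride can be larger than size because of trailing padding. Here is a minimal sketch of the difference. InstanceData is a hypothetical stand-in, since the real Transform and Material types are not shown here, and slotOffset is just an illustrative helper.

```swift
// Hypothetical per-instance layout; the fields are assumptions chosen to
// make the padding visible, not the layout used by the engine above.
struct InstanceData {
    var rotation: SIMD4<Float>   // 16 bytes, 16-byte aligned
    var scale: Float             // 4 bytes, followed by trailing padding in arrays
}

let size = MemoryLayout<InstanceData>.size           // bytes actually used by the fields
let stride = MemoryLayout<InstanceData>.stride       // distance between array elements
let alignment = MemoryLayout<InstanceData>.alignment

// Byte offset of the copy belonging to a given sync buffer, when every copy
// holds instanceCount elements laid out back to back.
func slotOffset(syncBufferIndex: Int, instanceCount: Int) -> Int {
    MemoryLayout<InstanceData>.stride * instanceCount * syncBufferIndex
}

print("size \(size), stride \(stride), alignment \(alignment)")
print("offset of copy 2 for 10 instances:", slotOffset(syncBufferIndex: 2, instanceCount: 10))
```

Using size for the copy length or the offset would under-count whenever the struct has trailing padding, so stride is the safer choice for anything that describes an array of elements.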

Your vertex shader with instances will look something like this,

vertex VertexInOut passGeometry(uint vid [[ vertex_id ]],
                            uint iid [[ instance_id ]],
                            constant TexturedVertex* vdata [[ buffer(0) ]],
                            constant Uniforms& uniforms  [[ buffer(1) ]],
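                            // assumes a PerInstanceUniforms struct in the shader that mirrors the Swift one
                            constant PerInstanceUniforms* perInstanceUniforms [[ buffer(2) ]])
{
  VertexInOut outVertex;
  Transform t = perInstanceUniforms[iid].transform;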
  float4x4 m = uniforms.projectionMatrix * uniforms.viewMatrix;
  TexturedVertex v = vdata[vid];
  outVertex.position = m * float4(t * v.position, 1.0);
  outVertex.uv = float2(0,0);
  outVertex.color = float4(0.5 * v.normal + 0.5, 1);
  return outVertex;
}

Here's an example I wrote of using instancing, with a performance analysis: http://tech.metail.com/performance-quaternions-gpu/

You can find the full code for reference here: https://github.com/endavid/VidEngine

I hope that helps.