1
votes

I am trying to implement sprite batching but I am not quite sure how I should do it.

Texture batching is not very hard: I just group everything by texture id. What I am not sure about is how I should handle the vertex data.

I could do it like this

texture.bind();
gl_quad.bind();
for(auto& quad: quads){
  send(quad.matrix);
  draw();
}

I would upload one quad to the GPU, send each sprite's matrix as a uniform variable, and draw the quad. But then I would have one draw call for every sprite that I want to draw, which is probably not very clever.

Alternatively, I could let every sprite have 4 vertices and update them on the CPU. I would then gather all sprites, upload all their vertices into one big buffer, and bind it.

texture.bind();
auto big_buffer = create_vertex_buffers(quads).bind();
draw();
big_buffer.delete();
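A rough sketch of the CPU side of this second option, to make the idea concrete. The `Vertex` and `Sprite` types and the `buildVertices` function are made up for illustration and assume sprites positioned by their centre, with two triangles (6 vertices) per sprite:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical types for illustration; they are not part of any library.
struct Vertex { float x, y, u, v; };

struct Sprite {
    float x, y;       // centre position
    float w, h;       // size
    float rotation;   // radians
};

// Transforms every sprite on the CPU and appends two triangles (6 vertices) per sprite.
std::vector<Vertex> buildVertices(const std::vector<Sprite>& sprites) {
    std::vector<Vertex> out;
    out.reserve(sprites.size() * 6);

    // Corners of a centred unit quad and the matching texture coordinates.
    const float corner[4][2] = { {-0.5f, -0.5f}, {0.5f, -0.5f}, {0.5f, 0.5f}, {-0.5f, 0.5f} };
    const float uv[4][2]     = { {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f} };
    const int order[6]       = { 0, 1, 2, 0, 2, 3 };

    for (const Sprite& s : sprites) {
        const float c  = std::cos(s.rotation);
        const float sn = std::sin(s.rotation);
        for (int i : order) {
            const float lx = corner[i][0] * s.w;   // scale first
            const float ly = corner[i][1] * s.h;
            out.push_back({ s.x + lx * c - ly * sn,   // then rotate and translate
                            s.y + lx * sn + ly * c,
                            uv[i][0], uv[i][1] });
        }
    }
    assert(out.size() == sprites.size() * 6);
    return out;
}
```

The resulting array could then be uploaded with a single glBufferData call and drawn with one glDrawArrays( GL_TRIANGLES, ... ) per texture. Reusing one buffer across frames is likely cheaper than creating and deleting a buffer every frame as in the pseudocode above.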

I could also use instanced rendering: upload only one quad, give every sprite a matrix, upload all matrices into one buffer, and call drawIndirect. I would have to send 9 floats per sprite instead of 8 (as with the big_buffer version), and I think that drawIndirect is much more expensive than a simple draw command.
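To put numbers on the per-sprite payload, here is a small sketch. It assumes position-only data and uses a made-up `InstanceTransform` record. Because the bottom row of a 2D affine 3x3 matrix is always (0, 0, 1), only 6 floats per sprite actually need to be uploaded rather than 9:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical per-instance record: the top two rows of a 3x3 affine matrix.
struct InstanceTransform {
    float m00, m01, tx;
    float m10, m11, ty;
};

// Builds rotation * scale plus translation for a centred unit quad.
InstanceTransform makeTransform(float x, float y, float w, float h, float radians) {
    assert(std::isfinite(radians));
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return { c * w, -s * h, x,
             s * w,  c * h, y };
}

// Applies the transform to one quad corner, as a vertex shader would per instance.
void apply(const InstanceTransform& t, float px, float py, float& outX, float& outY) {
    outX = t.m00 * px + t.m01 * py + t.tx;
    outY = t.m10 * px + t.m11 * py + t.ty;
}

// Bytes of position data uploaded per frame for n sprites with each approach.
std::size_t bytesCpuQuads(std::size_t n)  { return n * 4 * 2 * sizeof(float); }    // 4 corners * (x, y)
std::size_t bytesInstanced(std::size_t n) { return n * sizeof(InstanceTransform); } // 6 floats
```

With this layout, 100 sprites need 2400 bytes of instance data against 3200 bytes of pre-transformed corner positions. Note also that instancing does not require an indirect draw: a plain glDrawArraysInstanced call, with the per-instance attributes set up through glVertexAttribDivisor, is enough.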

Are there any other ways that I have missed? What would you recommend?

2 Answers

4
votes

I can show you a few classes that work with batches, along with their implementations, but they do rely on other classes. This work is protected by the copyright notice found in the header section of each file.

CommonStructs.h

// Version: 1.0
// Copyright (c) 2012 by Marek A. Krzeminski, MASc
// http://www.MarekKnows.com

#ifndef COMMON_STRUCTS_H
#define COMMON_STRUCTS_H

namespace vmk {

// GuiVertex ------------------------------------------------------------------
struct GuiVertex {
    glm::vec2 position;
    glm::vec4 color;
    glm::vec2 texture;

    GuiVertex( glm::vec2 positionIn, glm::vec4 colorIn, glm::vec2 textureIn = glm::vec2() ) :
        position( positionIn ),
        color( colorIn ),
        texture( textureIn )
    {}
}; // GuiVertex

// BatchConfig ----------------------------------------------------------------
struct BatchConfig {
    unsigned    uRenderType;
    int         iPriority;
    unsigned    uTextureId;
    float       fAlpha;

    BatchConfig( unsigned uRenderTypeIn, int iPriorityIn, unsigned uTextureIdIn, float fAlphaIn ) :
        uRenderType( uRenderTypeIn ),
        iPriority( iPriorityIn ),
        uTextureId( uTextureIdIn ),
        fAlpha( fAlphaIn )
    {}

    bool operator==( const BatchConfig& other ) const {
        if ( uRenderType    != other.uRenderType ||
             iPriority      != other.iPriority   ||
             uTextureId     != other.uTextureId  ||
             glm::abs( fAlpha - other.fAlpha ) > 0.004f )
        {
            return false;
        }
        return true;
    }

    bool operator!=( const BatchConfig& other ) const {
        return !( *this == other );
    }
}; // BatchConfig

} // namespace vmk

#endif // COMMON_STRUCTS_H

Batch.h

// Version: 1.0
// Copyright (c) 2012 by Marek A. Krzeminski, MASc
// http://www.MarekKnows.com

#ifndef BATCH_H
#define BATCH_H

#include "CommonStructs.h"

namespace vmk {

class ShaderManager;
class Settings;

class Batch final {
private:
    static Settings*        m_pSettings;
    static ShaderManager*   m_pShaderManager;

    unsigned    m_uMaxNumVertices;
    unsigned    m_uNumUsedVertices;
    unsigned    m_vao;
    unsigned    m_vbo;
    BatchConfig m_config;
    GuiVertex   m_lastVertex;

    // For Debugging Only
    unsigned                 m_uId; // Batch Id
    std::vector<std::string> m_vIds; // Id's Of What Is Contained In This Batch

public:
    Batch( unsigned uId, unsigned uMaxNumVertices );
    ~Batch();

    bool    isBatchConfig( const BatchConfig& config ) const;
    bool    isEmpty() const;
    bool    isEnoughRoom( unsigned uNumVertices ) const;
    Batch*  getFullest( Batch* pBatch );
    int     getPriority() const;

    void    add( const std::vector<GuiVertex>& vVertices, const BatchConfig& config );
    void    add( const std::vector<GuiVertex>& vVertices );
    void    addId( const std::string& strId );
    void    render();

private:
    Batch( const Batch& c ); // Not Implemented
    Batch& operator=( const Batch& c ); // Not Implemented

    void    cleanUp();

}; // Batch

} // namespace vmk

#endif // BATCH_H

Batch.cpp

// Version: 1.0
// Copyright (c) 2012 by Marek A. Krzeminski, MASc
// http://www.MarekKnows.com

#include "stdafx.h"
#include "Batch.h"

#include "Logger.h"
#include "Property.h"
#include "Settings.h"
#include "ShaderManager.h"

namespace vmk {

Settings*       Batch::m_pSettings      = nullptr;
ShaderManager*  Batch::m_pShaderManager = nullptr;

// ----------------------------------------------------------------------------
// Batch()
Batch::Batch( unsigned uId, unsigned uMaxNumVertices ) :
m_uMaxNumVertices( uMaxNumVertices ),
m_uNumUsedVertices( 0 ),
m_vao( 0 ),
m_vbo( 0 ),
m_config(GL_TRIANGLE_STRIP, 0, 0, 1.0f ),
m_lastVertex( glm::vec2(), glm::vec4() ),
m_uId( uId ) {

    if ( nullptr == m_pSettings ) {
        m_pSettings = Settings::get();
    }
    if ( nullptr == m_pShaderManager ) {
        m_pShaderManager = ShaderManager::get();
    }

    // Optimal Size For A Batch Is Between 1-4MB In Size. Number Of Elements That Can Be Stored In A
    // Batch Is Determined By Calculating #Bytes Used By Each Vertex
    if ( uMaxNumVertices < 1000 ) {
        std::ostringstream strStream;
        strStream << __FUNCTION__ << " uMaxNumVertices{" << uMaxNumVertices << "} is too small. Choose a number >= 1000 ";
        throw ExceptionHandler( strStream );
    }

    // Clear Error Codes
    glGetError();

    if ( m_pSettings->getOpenglVersion().x >= 3 ) {
        glGenVertexArrays( 1, &m_vao );
        glBindVertexArray( m_vao );
    }

    // Create Batch Buffer
    glGenBuffers( 1, &m_vbo );
    glBindBuffer( GL_ARRAY_BUFFER, m_vbo );
    glBufferData( GL_ARRAY_BUFFER, uMaxNumVertices * sizeof( GuiVertex ), nullptr, GL_STREAM_DRAW );

    if ( m_pSettings->getOpenglVersion().x >= 3 ) {
        unsigned uOffset = 0;
        m_pShaderManager->enableAttribute( A_POSITION, sizeof( GuiVertex ), uOffset );
        uOffset += sizeof( glm::vec2 );
        m_pShaderManager->enableAttribute( A_COLOR, sizeof( GuiVertex ), uOffset );
        uOffset += sizeof( glm::vec4 );
        m_pShaderManager->enableAttribute( A_TEXTURE_COORD0, sizeof( GuiVertex ), uOffset );

        glBindVertexArray( 0 );

        m_pShaderManager->disableAttribute( A_POSITION );
        m_pShaderManager->disableAttribute( A_COLOR );
        m_pShaderManager->disableAttribute( A_TEXTURE_COORD0 );
    }

    glBindBuffer( GL_ARRAY_BUFFER, 0 );

    if ( GL_NO_ERROR != glGetError() ) {
        cleanUp();
        throw ExceptionHandler( __FUNCTION__ + std::string( " failed to create batch" ) );
    }
} // Batch

// ----------------------------------------------------------------------------
// ~Batch()
Batch::~Batch() {
    cleanUp();
} // ~Batch

// ----------------------------------------------------------------------------
// cleanUp()
void Batch::cleanUp() {
    if ( m_vbo != 0 ) {
        glBindBuffer( GL_ARRAY_BUFFER, 0 );
        glDeleteBuffers( 1, &m_vbo );
        m_vbo = 0;
    }
    if ( m_vao != 0 ) {
        glBindVertexArray( 0 );
        glDeleteVertexArrays( 1, &m_vao );
        m_vao = 0;
    }
} // cleanUp

// ----------------------------------------------------------------------------
// isBatchConfig()
bool Batch::isBatchConfig( const BatchConfig& config ) const {
    return ( config == m_config );
} // isBatchConfig

// ----------------------------------------------------------------------------
// isEmpty()
bool Batch::isEmpty() const {
    return ( 0 == m_uNumUsedVertices );
} // isEmpty

// ----------------------------------------------------------------------------
// isEnoughRoom()
// Returns True If The Number Of Vertices Passed In Can Be Stored In This Batch
// Without Reaching The Limit Of How Many Vertices Can Fit In The Batch
bool Batch::isEnoughRoom( unsigned uNumVertices ) const {
    // 2 Extra Vertices Are Needed For Degenerate Triangles Between Each Strip
    unsigned uNumExtraVertices = ( GL_TRIANGLE_STRIP == m_config.uRenderType && m_uNumUsedVertices > 0 ? 2 : 0 );

    return ( m_uNumUsedVertices + uNumExtraVertices + uNumVertices <= m_uMaxNumVertices );
} // isEnoughRoom

// ----------------------------------------------------------------------------
// getFullest()
// Returns The Batch That Contains The Most Number Of Stored Vertices Between
// This Batch And The One Passed In
Batch* Batch::getFullest( Batch* pBatch ) {
    return ( m_uNumUsedVertices > pBatch->m_uNumUsedVertices ? this : pBatch );
} // getFullest

// ----------------------------------------------------------------------------
// getPriority()
int Batch::getPriority() const {
    return m_config.iPriority;
} // getPriority

// ----------------------------------------------------------------------------
// add()
// Adds Vertices To Batch And Also Sets The Batch Config Options
void Batch::add( const std::vector<GuiVertex>& vVertices, const BatchConfig& config ) {
    m_config = config;
    add( vVertices );
} // add

// ----------------------------------------------------------------------------
// add()
void Batch::add( const std::vector<GuiVertex>& vVertices ) {
    // 2 Extra Vertices Are Needed For Degenerate Triangles Between Each Strip
    unsigned uNumExtraVertices = ( GL_TRIANGLE_STRIP == m_config.uRenderType && m_uNumUsedVertices > 0 ? 2 : 0 );
    // Check The Simple Cases First So That Each Error Message Can Actually Be Reached
    if ( vVertices.empty() ) {
        std::ostringstream strStream;
        strStream << __FUNCTION__ << " can not add {" << vVertices.size() << "} vertices to batch.";
        throw ExceptionHandler( strStream );
    }
    if ( vVertices.size() > m_uMaxNumVertices ) {
        std::ostringstream strStream;
        strStream << __FUNCTION__ << " can not add {" << vVertices.size() << "} vertices to batch. Maximum number of vertices allowed in a batch is {" << m_uMaxNumVertices << "}";
        throw ExceptionHandler( strStream );
    }
    if ( uNumExtraVertices + vVertices.size() > m_uMaxNumVertices - m_uNumUsedVertices ) {
        std::ostringstream strStream;
        strStream << __FUNCTION__ << " not enough room for {" << vVertices.size() << "} vertices in this batch. Maximum number of vertices allowed in a batch is {" << m_uMaxNumVertices << "} and {" << m_uNumUsedVertices << "} are already used";
        if ( uNumExtraVertices > 0 ) {
            strStream << " plus you need room for {" << uNumExtraVertices << "} extra vertices too";
        }
        throw ExceptionHandler( strStream );
    }

    // Add Vertices To Buffer
    if ( m_pSettings->getOpenglVersion().x >= 3 ) {
        glBindVertexArray( m_vao );
    }
    glBindBuffer( GL_ARRAY_BUFFER, m_vbo );

    if ( uNumExtraVertices > 0 ) {
        // Need To Add 2 Vertex Copies To Create Degenerate Triangles Between This Strip
        // And The Last Strip That Was Stored In The Batch
        glBufferSubData( GL_ARRAY_BUFFER,         m_uNumUsedVertices * sizeof( GuiVertex ), sizeof( GuiVertex ), &m_lastVertex );
        glBufferSubData( GL_ARRAY_BUFFER, ( m_uNumUsedVertices + 1 ) * sizeof( GuiVertex ), sizeof( GuiVertex ), &vVertices[0] );
    }

    // TODO: Use glMapBuffer If Moving Large Chunks Of Data > 1MB
    glBufferSubData( GL_ARRAY_BUFFER, ( m_uNumUsedVertices + uNumExtraVertices ) * sizeof( GuiVertex ), vVertices.size() * sizeof( GuiVertex ), &vVertices[0] );

    if ( m_pSettings->getOpenglVersion().x >= 3 ) {
        glBindVertexArray( 0 );
    }
    glBindBuffer( GL_ARRAY_BUFFER, 0 );

    m_uNumUsedVertices += vVertices.size() + uNumExtraVertices;

    m_lastVertex = vVertices[vVertices.size() - 1];
} // add

// ----------------------------------------------------------------------------
// addId()
void Batch::addId( const std::string& strId ) {
    m_vIds.push_back( strId );
} // addId

// ----------------------------------------------------------------------------
// render()
void Batch::render() {
    if ( m_uNumUsedVertices == 0 ) {
        // Nothing In This Buffer To Render
        return;
    }

    bool usingTexture = INVALID_UNSIGNED != m_config.uTextureId;
    m_pShaderManager->setUniform( U_USING_TEXTURE, usingTexture );
    if ( usingTexture ) {
        m_pShaderManager->setTexture( 0, U_TEXTURE0_SAMPLER_2D, m_config.uTextureId );
    }

    m_pShaderManager->setUniform( U_ALPHA, m_config.fAlpha );

    // Draw Contents To Buffer
    if ( m_pSettings->getOpenglVersion().x >= 3 ) {
        glBindVertexArray( m_vao );
        glDrawArrays( m_config.uRenderType, 0, m_uNumUsedVertices );
        glBindVertexArray( 0 );

    } else { // OpenGL v2.x
        glBindBuffer( GL_ARRAY_BUFFER, m_vbo );

        unsigned uOffset = 0;
        m_pShaderManager->enableAttribute( A_POSITION, sizeof( GuiVertex ), uOffset );
        uOffset += sizeof( glm::vec2 );
        m_pShaderManager->enableAttribute( A_COLOR, sizeof( GuiVertex ), uOffset );
        uOffset += sizeof( glm::vec4 );
        m_pShaderManager->enableAttribute( A_TEXTURE_COORD0, sizeof( GuiVertex ), uOffset );

        glDrawArrays( m_config.uRenderType, 0, m_uNumUsedVertices );

        m_pShaderManager->disableAttribute( A_POSITION );
        m_pShaderManager->disableAttribute( A_COLOR );
        m_pShaderManager->disableAttribute( A_TEXTURE_COORD0 );

        glBindBuffer( GL_ARRAY_BUFFER, 0 );
    }

    if ( m_pSettings->isDebugLoggingEnabled( Settings::DEBUG_RENDER ) ) {
        std::ostringstream strStream;

        strStream << std::setw( 2 ) << m_uId << " | "
            << std::left << std::setw( 10 );

        if ( GL_LINES == m_config.uRenderType ) {
            strStream << "Lines";
        } else if ( GL_TRIANGLES == m_config.uRenderType ) {
            strStream << "Triangles";
        } else if ( GL_TRIANGLE_STRIP == m_config.uRenderType ) {
            strStream << "Tri Strips";
        } else if ( GL_TRIANGLE_FAN == m_config.uRenderType ) {
            strStream << "Tri Fan";
        } else {
            strStream << "Unknown";
        }

        strStream << " | " << std::right
            << std::setw( 6 ) << m_config.iPriority << " | "
            << std::setw( 7 ) << m_uNumUsedVertices << " | "
            << std::setw( 5 );

        if ( INVALID_UNSIGNED != m_config.uTextureId ) {
            strStream << m_config.uTextureId;
        } else {
            strStream << "None";
        }
        strStream << " |";

        for ( const std::string& strId : m_vIds ) {
            strStream << " " << strId;
        }
        m_vIds.clear();

        Logger::log( strStream );
    }

    // Reset Buffer
    m_uNumUsedVertices = 0;
    m_config.iPriority = 0;
} // render

} // namespace vmk

BatchManager.h

// Version: 1.0
// Copyright (c) 2012 by Marek A. Krzeminski, MASc
// http://www.MarekKnows.com
#ifndef BATCH_MANAGER_H
#define BATCH_MANAGER_H

#include "Singleton.h"
#include "CommonStructs.h"

namespace vmk {

class Batch;

class BatchManager final : public Singleton {
private:
    std::vector<std::shared_ptr<Batch>> m_vBatches;

    unsigned    m_uNumBatches;
    unsigned    m_maxNumVerticesPerBatch;

public:
    BatchManager( unsigned uNumBatches, unsigned numVerticesPerBatch );
    virtual ~BatchManager();

    static  BatchManager* const get();

    void    render( const std::vector<GuiVertex>& vVertices, const BatchConfig& config, const std::string& strId );
    void    emptyAll();

protected:
private:
    BatchManager( const BatchManager& c ); // Not Implemented
    BatchManager& operator=( const BatchManager& c); // Not Implemented

    void    emptyBatch( bool emptyAll, Batch* pBatchToEmpty );
    //void  renderBatch( const std::vector<GuiVertex>& vVertices, const BatchConfig& config );

}; // BatchManager

} // namespace vmk

#endif // BATCH_MANAGER_H

BatchManager.cpp

// Version: 1.0
// Copyright (c) 2012 by Marek A. Krzeminski, MASc
// http://www.MarekKnows.com

#include "stdafx.h"
#include "BatchManager.h"

#include "Batch.h"
#include "Logger.h"
#include "Settings.h"

namespace vmk {

static BatchManager* s_pBatchManager = nullptr;
static Settings*     s_pSettings     = nullptr;

// ----------------------------------------------------------------------------
// BatchManager()
BatchManager::BatchManager( unsigned uNumBatches, unsigned numVerticesPerBatch ) : 
Singleton( TYPE_BATCH_MANAGER ),
m_uNumBatches( uNumBatches ),
m_maxNumVerticesPerBatch( numVerticesPerBatch ) {

    // Test Input Parameters
    if ( uNumBatches < 10 ) {
        std::ostringstream strStream;
        strStream << __FUNCTION__ << " uNumBatches{" << uNumBatches << "} is too small.  Choose a number >= 10 ";
        throw ExceptionHandler( strStream );
    }

    // A Good Size For Each Batch Is Between 1-4MB In Size. Number Of Elements That Can Be Stored In A
    // Batch Is Determined By Calculating #Bytes Used By Each Vertex
    if ( numVerticesPerBatch < 1000 ) {
        std::ostringstream strStream;
        strStream << __FUNCTION__ << " numVerticesPerBatch{" << numVerticesPerBatch << "} is too small. Choose A Number >= 1000 ";
        throw ExceptionHandler( strStream );
    }

    // Create Desired Number Of Batches
    m_vBatches.reserve( uNumBatches );
    for ( unsigned u = 0; u < uNumBatches; ++u ) {
        m_vBatches.push_back( std::shared_ptr<Batch>( new Batch( u, numVerticesPerBatch ) ) );
    }

    s_pSettings     = Settings::get();
    s_pBatchManager = this;
} // BatchManager

// ----------------------------------------------------------------------------
// ~BatchManager()
BatchManager::~BatchManager() {
    s_pBatchManager = nullptr;

    m_vBatches.clear();
} // ~BatchManager

// ----------------------------------------------------------------------------
// get()
BatchManager* const BatchManager::get() {
    if ( nullptr == s_pBatchManager ) {
        throw ExceptionHandler( __FUNCTION__ + std::string( " failed, BatchManager has not been constructed yet" ) );
    }
    return s_pBatchManager;
} // get

// ----------------------------------------------------------------------------
// render()
void BatchManager::render( const std::vector<GuiVertex>& vVertices, const BatchConfig& config, const std::string& strId ) {

    Batch* pEmptyBatch   = nullptr;
    Batch* pFullestBatch = m_vBatches[0].get();

    // Determine Which Batch To Put The Vertices Into
    for ( unsigned u = 0; u < m_uNumBatches; ++u ) {
        Batch* pBatch = m_vBatches[u].get();

        if ( pBatch->isBatchConfig( config ) ) {
            if ( !pBatch->isEnoughRoom( vVertices.size() ) ) {
                // First Need To Empty This Batch Before Adding Anything To It
                emptyBatch( false, pBatch );
                if ( s_pSettings->isDebugLoggingEnabled( Settings::DEBUG_RENDER ) ) {
                    Logger::log( "Forced batch to empty to make room for vertices" );
                }
            }
            if ( s_pSettings->isDebugLoggingEnabled( Settings::DEBUG_RENDER ) ) {
                pBatch->addId( strId );
            }
            // Pass The Config Again: render() Resets The Priority When A Batch Is Emptied
            pBatch->add( vVertices, config );

            return;
        }

        // Store Pointer To First Empty Batch
        if ( nullptr == pEmptyBatch && pBatch->isEmpty() ) {
            pEmptyBatch = pBatch;
        }

        // Store Pointer To Fullest Batch
        pFullestBatch = pBatch->getFullest( pFullestBatch );
    }

    // If We Get Here Then We Didn't Find An Appropriate Batch To Put The Vertices Into
    // If We Have An Empty Batch, Put Vertices There
    if ( nullptr != pEmptyBatch ) {
        if ( s_pSettings->isDebugLoggingEnabled( Settings::DEBUG_RENDER ) ) {
            pEmptyBatch->addId( strId );
        }
        pEmptyBatch->add( vVertices, config );
        return;
    }

    // No Empty Batches Were Found Therefore We Must Empty One First And Then We Can Use It
    emptyBatch( false, pFullestBatch );
    if ( s_pSettings->isDebugLoggingEnabled( Settings::DEBUG_RENDER ) ) {
        Logger::log( "Forced fullest batch to empty to make room for vertices" );

        pFullestBatch->addId( strId );
    }
    pFullestBatch->add( vVertices, config );
} // render

// ----------------------------------------------------------------------------
// emptyAll()
void BatchManager::emptyAll() {
    emptyBatch( true, m_vBatches[0].get() );

    if ( s_pSettings->isDebugLoggingEnabled( Settings::DEBUG_RENDER ) ) {
        Logger::log( "Forced all batches to empty" );
    }
} // emptyAll

// ----------------------------------------------------------------------------
// CompareBatch
struct CompareBatch {
    bool operator()( const Batch* pBatchA, const Batch* pBatchB ) const {
        return ( pBatchA->getPriority() > pBatchB->getPriority() );
    } // operator()
}; // CompareBatch

// ----------------------------------------------------------------------------
// emptyBatch()
// Empties The Batches According To Priority. If emptyAll Is False Then
// Only Empty The Batches That Are Lower Priority Than The One Specified
// AND Also Empty The One That Is Passed In
void BatchManager::emptyBatch( bool emptyAll, Batch* pBatchToEmpty ) {
    // Sort Batches By Priority
    std::priority_queue<Batch*, std::vector<Batch*>, CompareBatch> queue;

    for ( unsigned u = 0; u < m_uNumBatches; ++u ) {
        // Add All Non-Empty Batches To Queue Which Will Be Sorted By Order
        // From Lowest To Highest Priority
        if ( !m_vBatches[u]->isEmpty() ) {
            if ( emptyAll ) {
                queue.push( m_vBatches[u].get() );
            } else if ( m_vBatches[u]->getPriority() < pBatchToEmpty->getPriority() ) {
                // Only Add Batches That Are Lower In Priority
                queue.push( m_vBatches[u].get() );
            }
        }
    }

    // Render All Desired Batches
    while ( !queue.empty() ) {
        Batch* pBatch = queue.top();
        pBatch->render();
        queue.pop();
    }

    if ( !emptyAll ) {
        // When Not Emptying All The Batches, We Still Want To Empty
        // The Batch That Is Passed In, In Addition To All Batches
        // That Have Lower Priority Than It
        pBatchToEmpty->render();
    }
} // emptyBatch

} // namespace vmk

These classes will not compile on their own, because they depend on other classes (Settings, Properties, ShaderManager, Logger), and those depend on other objects as well. The code comes from a large, working OpenGL graphics rendering and game engine built on OpenGL shaders. It is working source code and, ideally, bug free.

This may serve as a guide to how one would design a batching process, and may give insight into the things to consider, for example: the type of primitives being rendered (lines, triangles, triangle strips, triangle fans, etc.), the priority that decides when an object is drawn based on whether it has transparency, and the handling of degenerate triangles when vertices are added to a batch.
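The degenerate-triangle handling can be shown in isolation. The following is a minimal sketch, not the engine's code: `Vec2` and `appendStrip` are made-up names, and it assumes every strip has an even number of vertices (as a 4-vertex quad does), so that the winding order of later strips is not flipped.

```cpp
#include <cassert>
#include <vector>

struct Vec2 { float x, y; };

// Appends `strip` to `batch` so that both can be drawn as a single triangle strip.
// Repeating the last vertex of the previous strip and the first vertex of the new
// strip creates zero-area (degenerate) triangles that produce no pixels.
void appendStrip(std::vector<Vec2>& batch, const std::vector<Vec2>& strip) {
    assert(!strip.empty());
    if (!batch.empty()) {
        const Vec2 last = batch.back();   // copy before the vector grows
        batch.push_back(last);            // duplicate of the previous strip's last vertex
        batch.push_back(strip.front());   // duplicate of the new strip's first vertex
    }
    batch.insert(batch.end(), strip.begin(), strip.end());
}
```

This mirrors what Batch::add() does with m_lastVertex and the two extra glBufferSubData calls: two quads of 4 vertices each end up as 10 vertices in the batch.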

The design works like this: only vertices with a matching batch configuration (render type, priority, texture and alpha) go into the same bucket, and each bucket tries to fill itself. If the matching bucket does not have enough room for the new vertices, it is emptied (rendered) first and then reused. If no bucket matches, the vertices go into the first empty bucket. If no bucket is empty either, the manager picks the fullest bucket, sends it and every lower-priority bucket to the video card in priority order, and then reuses it.
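The selection logic is easier to see without the OpenGL and logging code around it. Below is a simplified sketch under stated assumptions: `Bucket` and `submit` are hypothetical stand-ins, buckets are matched on texture id only, priorities are left out, and "drawing" is reduced to a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a batch: no OpenGL, no priorities.
struct Bucket {
    unsigned    textureId = 0;
    std::size_t used      = 0;   // vertices currently stored
    std::size_t capacity  = 0;
    std::size_t flushes   = 0;   // how many times this bucket was "drawn"

    void flush() { if (used > 0) { ++flushes; used = 0; } }
};

// Stores `count` vertices for `textureId` and returns the index of the bucket used.
std::size_t submit(std::vector<Bucket>& buckets, unsigned textureId, std::size_t count) {
    assert(!buckets.empty());
    const std::size_t none = buckets.size();
    std::size_t firstEmpty = none;
    std::size_t fullest    = 0;

    for (std::size_t i = 0; i < buckets.size(); ++i) {
        Bucket& b = buckets[i];
        if (b.used > 0 && b.textureId == textureId) {
            if (b.used + count > b.capacity) b.flush();   // matching but full: draw it first
            assert(count <= b.capacity);
            b.used += count;
            return i;
        }
        if (firstEmpty == none && b.used == 0) firstEmpty = i;
        if (b.used > buckets[fullest].used) fullest = i;
    }

    // No matching bucket: prefer an empty one, otherwise draw the fullest and reuse it.
    const std::size_t target = (firstEmpty != none) ? firstEmpty : fullest;
    Bucket& t = buckets[target];
    t.flush();
    assert(count <= t.capacity);
    t.textureId = textureId;
    t.used      = count;
    return target;
}
```

The real BatchManager additionally renders all lower-priority batches before the one being emptied, which is what keeps transparent geometry in the right order.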

This is tied into a ShaderManager, which manages how shaders are defined, set up, and linked into shader programs. It is also tied into an AssetStorage class, which is not shown here but is referenced from the ShaderManager. The system handles a complete custom GUI, sprites, fonts, textures, etc.

If you would like to learn more, I would highly suggest visiting www.MarekKnows.com and checking out his video tutorial series on OpenGL. For this specific application you would need to follow his Shader Engine series.

2
votes

It's worth noting that sprite rendering is only really expensive because of the context switches (state changes such as texture binds) between each sprite rendered. Rendering a quad for each sprite tends to be a trivial expense in comparison.

Instancing the geometry data here is likely to hinder more than help, since the cost of using a separate transformation matrix per quad tends to outweigh the expense of just uploading a fresh set of vertex attributes per quad. Instancing works best when you have at least moderately complex geometry, on the order of hundreds to thousands of vertices.

Typically, if speed is your primary goal, the top priority is to coalesce your texture data into "sprite sheet"-style texture atlases. The first goal is to have as few texture context switches as possible, which means getting well away from a separate texture image per sprite or frame. This also makes instancing even less practical, because each quad or pair of triangles you render then tends to have its own, widely varying texture coordinates.
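With an atlas, each sprite frame is selected through its texture coordinates instead of a texture bind. A minimal sketch, assuming a hypothetical atlas laid out as a uniform grid with frames numbered left to right and row by row (`UvRect` and `atlasFrame` are made-up names):

```cpp
#include <cassert>

struct UvRect { float u0, v0, u1, v1; };

// Texture coordinates of frame `index` in an atlas of `columns` x `rows` equal cells.
UvRect atlasFrame(int index, int columns, int rows) {
    assert(columns > 0 && rows > 0);
    assert(index >= 0 && index < columns * rows);
    const float w = 1.0f / static_cast<float>(columns);   // width of one cell in UV space
    const float h = 1.0f / static_cast<float>(rows);      // height of one cell in UV space
    const int col = index % columns;
    const int row = index / columns;
    return { col * w, row * h, (col + 1) * w, (row + 1) * h };
}
```

For example, frame 5 of a 4 x 2 atlas covers u in [0.25, 0.5] and v in [0.5, 1.0]. Whether v = 0 is the top or the bottom row depends on how the image was loaded, so the row order may need to be flipped.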

If you actually reach the point where you have as few texture context switches as possible and still want more speed for a bunch of dynamic sprites, then the next practical step (but with diminishing returns) might be to use streaming VBOs. You can fill a streaming VBO with the vertex attributes required to render the tris/quads for the current frame (with different vertex positions and texture coordinates) and then draw the VBO. For the best performance, it might help to chunk the VBOs rather than fill them with all the geometry of your entire scene in one go: you fill and draw, fill and draw, fill and draw, multiple times per frame.
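The fill-and-draw pattern can be sketched without any OpenGL calls. In the sketch below, `ChunkedStream` is a made-up class and the `draw` callback stands in for "upload this chunk to the streaming VBO and issue a draw call":

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct Vertex { float x, y, u, v; };

// Collects vertices into a fixed-size chunk and hands it to `draw` whenever it is full.
class ChunkedStream {
public:
    using DrawFn = std::function<void(const std::vector<Vertex>&)>;

    ChunkedStream(std::size_t chunkVertices, DrawFn draw)
        : m_capacity(chunkVertices), m_draw(std::move(draw)) {
        assert(m_capacity > 0);
        m_chunk.reserve(m_capacity);
    }

    // Adds `count` vertices, drawing the current chunk first if they would not fit.
    void push(const Vertex* vertices, std::size_t count) {
        assert(count <= m_capacity);
        if (m_chunk.size() + count > m_capacity) flush();
        m_chunk.insert(m_chunk.end(), vertices, vertices + count);
    }

    // Draws whatever is left; call once at the end of the frame.
    void flush() {
        if (m_chunk.empty()) return;
        m_draw(m_chunk);
        m_chunk.clear();
    }

private:
    std::size_t         m_capacity;
    DrawFn              m_draw;
    std::vector<Vertex> m_chunk;
};
```

In a real renderer the callback might orphan the buffer with glBufferData( GL_ARRAY_BUFFER, size, nullptr, GL_STREAM_DRAW ), copy the chunk in with glBufferSubData, and then call glDrawArrays. That is one common streaming approach, not the only one.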

Nevertheless, since you asked about instancing (which implies to me that you're using a separate image per sprite), your first and biggest gain will probably come from using texture atlases and reducing the texture context switches even further. The geometry-side optimization is a totally separate process, and you might do fine for quite a while even using immediate mode here. Moving the geometry to streaming VBOs is more of a finishing touch.

This all assumes dynamic sprites that either move around on the screen or change images. For static tile-style images that never change, you can store their vertex attributes in a static VBO and potentially benefit from instancing (but there we're instancing a boatload of tiles per static VBO, so each static VBO might hold hundreds to thousands of vertices).