VAO/VBO indices [re]ordering


I am starting to play a little with “advanced” things like Vertex Buffer Objects, normal/light/shadow mapping, and other methods such as instancing or render to texture.

I found a tutorial repository with tutorials on these topics, and in its common/vboindexer.cpp file a function, indexVBO(), caught my interest.

This function takes the vertex, texture and normal coordinates and the “multi-indexed” triangles found in .OBJ files and converts them into a VBO holding shared arrays of vertices / textures / normals. These arrays can then be used directly with a single index array containing a list of “optimised” indices to render in OpenGL the object stored in the .obj file.

#include <cmath>
#include <vector>
#include <glm/glm.hpp>

// Returns true iff v1 can be considered equal to v2
bool is_near(float v1, float v2){
    return fabs( v1-v2 ) < 0.01f;
}

// Searches through all already-exported vertices
// for a similar one.
// Similar = same position + same UVs + same normal
bool getSimilarVertexIndex(
    glm::vec3 & in_vertex,
    glm::vec2 & in_uv,
    glm::vec3 & in_normal,
    std::vector<glm::vec3> & out_vertices,
    std::vector<glm::vec2> & out_uvs,
    std::vector<glm::vec3> & out_normals,
    unsigned short & result
){
    // Lame linear search
    for ( unsigned int i=0; i<out_vertices.size(); i++ ){
        if (
            is_near( in_vertex.x , out_vertices[i].x ) &&
            is_near( in_vertex.y , out_vertices[i].y ) &&
            is_near( in_vertex.z , out_vertices[i].z ) &&
            is_near( in_uv.x     , out_uvs     [i].x ) &&
            is_near( in_uv.y     , out_uvs     [i].y ) &&
            is_near( in_normal.x , out_normals [i].x ) &&
            is_near( in_normal.y , out_normals [i].y ) &&
            is_near( in_normal.z , out_normals [i].z )
        ){
            result = i;
            return true;
        }
    }
    // No other vertex could be used instead.
    // Looks like we'll have to add it to the VBO.
    return false;
}

void indexVBO_slow(
    std::vector<glm::vec3> & in_vertices,
    std::vector<glm::vec2> & in_uvs,
    std::vector<glm::vec3> & in_normals,
    std::vector<unsigned short> & out_indices,
    std::vector<glm::vec3> & out_vertices,
    std::vector<glm::vec2> & out_uvs,
    std::vector<glm::vec3> & out_normals
){
    // For each input vertex
    for ( unsigned int i=0; i<in_vertices.size(); i++ ){
        // Try to find a similar vertex in out_XXXX
        unsigned short index;
        bool found = getSimilarVertexIndex(in_vertices[i], in_uvs[i], in_normals[i], out_vertices, out_uvs, out_normals, index);
        if ( found ){ // A similar vertex is already in the VBO, use it instead !
            out_indices.push_back( index );
        }else{ // If not, it needs to be added in the output data.
            out_vertices.push_back( in_vertices[i]);
            out_uvs     .push_back( in_uvs[i]);
            out_normals .push_back( in_normals[i]);
            // The new vertex is the last one in the output arrays
            out_indices .push_back( (unsigned short)(out_vertices.size() - 1) );
        }
    }
}

(the fast version is identical but uses a std::map and its iterator to find the similar vertex index more quickly)
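For illustration, here is a minimal sketch of what a std::map-based variant could look like. It is not the tutorial's actual code: the names `Vec3`, `Vec2`, `PackedVertex` and `indexVBO_map` are hypothetical, the two small structs stand in for `glm::vec3` / `glm::vec2` so the example compiles on its own, and the lookup matches vertices exactly (byte for byte) rather than with the 0.01 tolerance of `is_near()`.

```cpp
#include <cstring>
#include <map>
#include <vector>

// Hypothetical stand-ins for glm::vec3 / glm::vec2 (assumption, not the tutorial's types).
struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// One complete vertex: position + UV + normal.
struct PackedVertex {
    Vec3 position;
    Vec2 uv;
    Vec3 normal;
    // Bytewise ordering so the struct can be used as a std::map key.
    // The struct holds only floats, so it contains no padding bytes.
    bool operator<(const PackedVertex& other) const {
        return std::memcmp(this, &other, sizeof(PackedVertex)) < 0;
    }
};

void indexVBO_map(
    const std::vector<Vec3>& in_vertices,
    const std::vector<Vec2>& in_uvs,
    const std::vector<Vec3>& in_normals,
    std::vector<unsigned short>& out_indices,
    std::vector<Vec3>& out_vertices,
    std::vector<Vec2>& out_uvs,
    std::vector<Vec3>& out_normals)
{
    // Maps each vertex already written to the output to its index there.
    std::map<PackedVertex, unsigned short> vertexToIndex;

    for (std::size_t i = 0; i < in_vertices.size(); ++i) {
        PackedVertex packed = { in_vertices[i], in_uvs[i], in_normals[i] };
        auto it = vertexToIndex.find(packed);
        if (it != vertexToIndex.end()) {
            // Identical vertex already exported: reuse its index.
            out_indices.push_back(it->second);
        } else {
            // New vertex: append it and remember where it went.
            unsigned short newIndex = static_cast<unsigned short>(out_vertices.size());
            out_vertices.push_back(in_vertices[i]);
            out_uvs.push_back(in_uvs[i]);
            out_normals.push_back(in_normals[i]);
            out_indices.push_back(newIndex);
            vertexToIndex[packed] = newIndex;
        }
    }
}
```

Each lookup costs O(log n) instead of a scan over every exported vertex, at the price of losing the "nearly equal" matching.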

This reminds me of a discussion about how to optimally organize vertex/texture/normal/color coordinates into vertex arrays, and about why using separate indices for each set of vertex, texture, normal and color coordinates isn’t a good thing because of the way vertex caching works on the GPU.

But can the order in which the final indices (“deindexed before but reindexed after”) are generated have a big impact on performance?

I think so, because for example one triangle can use the first vertex generated, one from the middle and the last one, and the GPU cache is almost certainly far too small to hold all the different vertices used by a big and/or detailed object :frowning:
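One way to put a number on this concern is to simulate a vertex cache over the index list and count the misses per triangle (often called the average cache miss ratio, ACMR). The sketch below assumes a simple FIFO cache of a chosen size; both the FIFO model and the function names `countCacheMisses` / `averageCacheMissRatio` are assumptions for illustration, and real hardware may behave differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Counts how many indices miss a simulated FIFO vertex cache.
// cacheSize is an assumed value, not something read from the GPU.
std::size_t countCacheMisses(const std::vector<unsigned short>& indices,
                             std::size_t cacheSize)
{
    std::deque<unsigned short> cache;
    std::size_t misses = 0;
    for (unsigned short index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end())
            continue;                 // hit: a FIFO cache is left unchanged
        ++misses;                     // miss: the vertex must be processed again
        cache.push_back(index);
        if (cache.size() > cacheSize)
            cache.pop_front();        // evict the oldest entry
    }
    return misses;
}

// Average number of cache misses per triangle (lower is better;
// 3.0 means no reuse at all).
double averageCacheMissRatio(const std::vector<unsigned short>& indices,
                             std::size_t cacheSize)
{
    if (indices.size() < 3) return 0.0;
    return static_cast<double>(countCacheMisses(indices, cacheSize))
         / static_cast<double>(indices.size() / 3);
}
```

With a 4-entry cache, the strip-like order `0 1 2, 1 2 3, 2 3 4, 3 4 5` gives 6 misses for 4 triangles, while alternating between two distant triangles (`0 1 2, 3 4 5, 0 1 2, 3 4 5`) misses on every single index.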

Can we query the size of the GPU’s vertex cache from OpenGL?

If not, what are the minimal and maximal vertex cache sizes found on various GPU cards?

Can we query the size of the GPU’s vertex cache from OpenGL?

As far as I know, no. From what I have read, the cache is usually around 20-30 vertices.

Here is an article on some algorithms

Thanks, Tonyo_au

So I think I have to implement something like a mix of Forsyth’s or Tipsify’s algorithms (to handle vertex reordering) and an extended vboindexer (to transform v/t/n[/c] multi-indexed coordinates into single unique vertices).

But first I have to implement the Forsyth and Tipsify algorithms on my current Linux/Android platforms (+ a Xilinx Spartan FPGA dev board in a few days) to start benchmarking the performance difference between them :slight_smile:
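Whichever triangle reordering algorithm ends up being used, one small step can be sketched independently of it: once the index list has its final order, the vertex arrays can be renumbered so that vertices are stored in order of first use. The sketch below is only an illustration of that step, not part of Forsyth's or Tipsify's algorithm; the names `remapByFirstUse` and `applyRemap` are hypothetical, and it assumes fewer than 65535 vertices so that 0xFFFF can serve as a "not yet seen" marker.

```cpp
#include <cstddef>
#include <vector>

// Rewrites `indices` in place so that vertices are numbered in order of
// first use, and returns the permutation newToOld[newIndex] = oldIndex.
// Assumes every index is < vertexCount and vertexCount < 65535.
std::vector<unsigned short> remapByFirstUse(std::vector<unsigned short>& indices,
                                            std::size_t vertexCount)
{
    const unsigned short UNUSED = 0xFFFF;  // sentinel: vertex not seen yet
    std::vector<unsigned short> oldToNew(vertexCount, UNUSED);
    std::vector<unsigned short> newToOld;

    for (unsigned short& index : indices) {
        if (oldToNew[index] == UNUSED) {
            // First time this vertex is referenced: give it the next slot.
            oldToNew[index] = static_cast<unsigned short>(newToOld.size());
            newToOld.push_back(index);
        }
        index = oldToNew[index];
    }
    return newToOld;
}

// Builds a reordered copy of one attribute array (positions, UVs, normals...)
// using the permutation returned by remapByFirstUse().
template <typename T>
std::vector<T> applyRemap(const std::vector<T>& attribute,
                          const std::vector<unsigned short>& newToOld)
{
    std::vector<T> reordered;
    reordered.reserve(newToOld.size());
    for (unsigned short oldIndex : newToOld)
        reordered.push_back(attribute[oldIndex]);
    return reordered;
}
```

The same permutation has to be applied to every attribute array so that positions, UVs and normals stay in step; vertices that no index references are dropped as a side effect.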