Thank you so much @GClements, @Alfonse_Reinheart and @Dark_Photon for your very helpful suggestions.
I’d like to summarise what I’ve done, both for myself and others reading this topic in the future:
- Instead of allocating a PBO each time, I now allocate one larger PBO at the start of my program. I keep this PBO around at all times and stage every texture upload through it. This has sped things up and also simplified my code. I created and mapped it with the flags “GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT”. As far as I understand, I should still implement some synchronisation, so that I don’t overwrite a region of the buffer while the GPU is still reading from it.
- I’ve changed the pixel format from GL_RGBA to GL_BGRA. This has sped things up noticeably. For now I’m swapping R and B in my shaders where I need to, but I think it would probably be better/faster to read the data from the image directly as BGRA?
- Instead of creating texture names each time, I create 128 at the start of my program and hand them out whenever I need a new texture. As some of you have helpfully pointed out, the large stutter on glGenTextures() was simply due to the fact that this call was blocking until all the texture copying issued before it had finished. As a result, the stuttering has moved to my glfwSwapBuffers() call.
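On the synchronisation mentioned in the first point: a rough sketch of how slots in the big PBO could be tracked so that the CPU never overwrites a region the GPU may still be reading. This is not my actual code. PboSlots, acquire(), mark_in_flight() and offset() are hypothetical names, and the gpu_done callback is a stand-in for a fence query (in a GL program it could be a lambda that calls glClientWaitSync with a zero timeout on a fence created by glFenceSync right after the glTexSubImage3D call).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for fixed-size slots inside one persistently
// mapped PBO. It contains no GL calls; the caller supplies a callback that
// reports whether the GPU has finished reading a slot.
class PboSlots {
public:
    PboSlots(std::size_t slot_size, std::size_t n_slots)
        : slot_size_(slot_size), slots_(n_slots) {}

    // Returns the index of a slot that is safe to overwrite, or -1 if every
    // slot is either being written by the CPU or still read by the GPU.
    int acquire() {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot &s = slots_[i];
            // Release the slot if its recorded upload has completed.
            if (s.busy && s.gpu_done && s.gpu_done()) {
                s.busy = false;
                s.gpu_done = nullptr;
            }
            if (!s.busy) {
                s.busy = true;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Call after issuing the upload that reads from this slot.
    void mark_in_flight(int idx, std::function<bool()> gpu_done) {
        assert(idx >= 0 && static_cast<std::size_t>(idx) < slots_.size());
        slots_[static_cast<std::size_t>(idx)].gpu_done = std::move(gpu_done);
    }

    // Byte offset of the slot inside the PBO.
    std::size_t offset(int idx) const {
        return static_cast<std::size_t>(idx) * slot_size_;
    }

private:
    struct Slot {
        bool busy = false;               // held by the CPU or the GPU
        std::function<bool()> gpu_done;  // empty while the CPU is writing
    };
    std::size_t slot_size_;
    std::vector<Slot> slots_;
};
```

The idea is that a slot only becomes reusable once its callback reports completion, so even with a coherent mapping the memcpy for the next texture cannot race with a pending upload.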
Overall, the stutter has reduced to around 700ms. As this now happens in the glfwSwapBuffers() call, I assume this is simply how long it takes to copy and prepare ~100MB of image data from the PBO to the textures. Annoyingly, this stutter still happens even if I don’t use the textures afterwards at all.
I’m loading 15 textures, so even if I were to load a single texture each frame, it would presumably still stutter for ~46ms, which would be visible. Therefore, I’m going to keep looking for ways to speed up this upload. I hope I don’t have to resort to creating another OpenGL context on a second thread!
If anyone has other suggestions, I’d naturally love to hear them!
Edit: I’ve looked it up, and it seems like a normal texture upload speed should be around 5+GB/s, so I must still be doing something very wrong!
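To put the numbers above side by side, here is a small sketch of the arithmetic. The function names are made up for illustration, and I am assuming 1 GB = 1000 MB.

```cpp
#include <cassert>
#include <cmath>

// Throughput in MB/s when `megabytes` are transferred in `milliseconds`.
double throughput_mb_per_s(double megabytes, double milliseconds) {
    assert(milliseconds > 0.0);
    return megabytes / (milliseconds / 1000.0);
}

// Time in ms needed to move `megabytes` at `gb_per_s` (1 GB = 1000 MB).
double transfer_time_ms(double megabytes, double gb_per_s) {
    assert(gb_per_s > 0.0);
    return megabytes / (gb_per_s * 1000.0) * 1000.0;
}

// Share of a stall per item if it is spread evenly over `n_items`.
double per_item_ms(double total_ms, int n_items) {
    assert(n_items > 0);
    return total_ms / n_items;
}
```

With the figures from this post, throughput_mb_per_s(100, 700) is about 143 MB/s, per_item_ms(700, 15) is about 46.7 ms, and transfer_time_ms(100, 5) is about 20 ms. So my upload is roughly 35 times slower than the 5 GB/s figure would suggest.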
Edit: Some of my current code:
Creating a persistent PBO:
glGenBuffers(1, &this->pbo);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, this->pbo);
GLbitfield flags = GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;
glBufferStorage(GL_PIXEL_UNPACK_BUFFER, this->total_size, nullptr, flags);
this->memory = glMapBufferRange(
    GL_PIXEL_UNPACK_BUFFER, 0, this->total_size, flags
);
Copying image data to the PBO:
// Repeated for each of the 5 textures; shown here for the albedo texture.
unsigned char *image_data = ResourceManager::load_image(
    this->albedo_texture_path, &this->albedo_data_width,
    &this->albedo_data_height, &this->albedo_data_n_components, true
);
this->albedo_pbo_idx = persistent_pbo->get_new_idx();
// Copy the decoded pixels into this texture's region of the mapped PBO.
memcpy(
    persistent_pbo->get_memory_for_idx(this->albedo_pbo_idx),
    image_data,
    persistent_pbo->texture_size
);
ResourceManager::free_image(image_data);
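Regarding the R/B swap from the list above: if the image loader can only hand back RGBA, one option would be to swizzle while copying into the mapped PBO instead of in the shader. A minimal sketch, assuming tightly packed 8-bit pixels with 4 components. copy_rgba_as_bgra is a hypothetical helper and not part of my code, and I haven’t measured whether it beats a plain memcpy plus a shader swap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies `n_pixels` tightly packed 8-bit RGBA pixels from `src` to `dst`,
// swapping red and blue so that the destination holds BGRA.
// `dst` could be a region of the mapped PBO; the buffers must not overlap.
void copy_rgba_as_bgra(std::uint8_t *dst, const std::uint8_t *src,
                       std::size_t n_pixels) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n_pixels; ++i) {
        const std::uint8_t *p = src + 4 * i;
        std::uint8_t *q = dst + 4 * i;
        q[0] = p[2];  // blue
        q[1] = p[1];  // green
        q[2] = p[0];  // red
        q[3] = p[3];  // alpha
    }
}
```

It would take the place of the memcpy above, with n_pixels being width * height of the image.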
Copying from the PBO to the texture:
this->material_texture = global_texture_pool[global_texture_pool_next_idx++];
glBindTexture(GL_TEXTURE_2D_ARRAY, this->material_texture);
// Allocate storage for all 5 layers without supplying pixel data.
// (With a buffer bound to GL_PIXEL_UNPACK_BUFFER, the last argument would
// be read as an offset into that buffer instead.)
glTexImage3D(
    GL_TEXTURE_2D_ARRAY, 0, GL_RGBA,
    persistent_pbo->width, persistent_pbo->height,
    5, 0, GL_BGRA, GL_UNSIGNED_BYTE, nullptr
);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, persistent_pbo->pbo);
// This happens 5 times (once for each sub-texture); shown here for albedo.
// The last argument is a byte offset into the bound PBO.
glTexSubImage3D(
    GL_TEXTURE_2D_ARRAY, 0, 0, 0, 0,
    this->albedo_data_width, this->albedo_data_height,
    1, GL_BGRA, GL_UNSIGNED_BYTE,
    persistent_pbo->get_offset_for_idx(this->albedo_pbo_idx)
);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
glGenerateMipmap(GL_TEXTURE_2D_ARRAY);