I am trying to make a custom tile map editor in C++ with OpenGL and GLFW (version 3.3.8), but I am having major issues with generation speed, even though my machine can run games with large tile maps such as Terraria and large worlds such as Minecraft. My GPU is an NVIDIA GT 610 and my CPU is an Intel i3-3220 (2 cores). I do not think the GPU is the issue, since once a map is loaded the frame rate is not choppy (in the game engines).
When I write a tile generator that covers a certain width and height in a game engine such as Unity or Godot, it takes around 8 minutes to generate a 400x400 tile map using basic nested for loops with draw("set tile") calls inside. This was with Rule Tiles and Terrain Tiles, respectively.
I decided to set the game engines aside and learned the basics of OpenGL to build a tile map editor myself. The results were strikingly similar, though: it takes around 8 minutes to produce a 400x400 tile map in my custom renderer as well. That surprised me, because I expected far less work to be involved on my part, since I am not even managing game object data at this point.
The much stranger part came when I researched MonoGame and how it renders 2D, which is through OpenGL, or more specifically MonoGame.GL. I set up a SpriteBatch and drew a 1000x1000 map of 16x16 tiles (the same tile size I used before) almost instantly (less than 2 seconds). I looked into their code base and did not see anything out of the norm or vastly different from what I was doing in OpenGL. Looking further, it seems they even make more function calls per draw than I do. So far MonoGame has rendered the most tiles at the fastest speed out of my custom renderer and the two game engines.
If you follow the Draw call's code definitions, you can see the execution flow of the entire draw call, and it is pretty loaded in my opinion. There are while loops used for iteration and arrays used for collections, and it ends in an OpenGL call:
SpriteBatch.Draw(…) → FlushIfNeeded() → SpriteBatcher.DrawBatch() → FlushVertexArray() → GraphicsDevice.DrawUserIndexedPrimitives() → GraphicsDevice.PlatformDrawUserIndexedPrimitives() → GL.DrawElements(…)
Those are the primary function calls for every sprite, and it is still the fastest approach I have tested.
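As far as I can tell, that whole chain boils down to accumulating quad vertices and indices in one CPU-side array and then pushing everything to the GPU with a single indexed draw per flush. Here is my own rough simplification of what the last few calls amount to (this is just my reading of it, not MonoGame's actual code):

#include <vector>
#include <glad/glad.h> // assumes a GL context is already current

struct Vertex { float x, y, u, v; };

// My simplification of FlushVertexArray()/DrawUserIndexedPrimitives():
// whatever has been batched so far goes up in one upload and one draw call.
void FlushBatch(GLuint vbo, GLuint ebo,
                const std::vector<Vertex>& verts,
                const std::vector<GLuint>& indices)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(Vertex),
                 verts.data(), GL_STREAM_DRAW);

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, indices.size() * sizeof(GLuint),
                 indices.data(), GL_STREAM_DRAW);

    glDrawElements(GL_TRIANGLES, (GLsizei)indices.size(), GL_UNSIGNED_INT, nullptr);
}

If I am reading it right, the per-sprite work is just filling a few array slots on the CPU, and the actual GL calls only happen once per flush.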
So in OpenGL I set up my shader, VBO, VAO, and EBO classes, got a triangle rendering, and moved on to tiles.
Here is the main function where I declare the tile map and generate it:
int main()
{
    glfwInit();

    TileMap tiles;
    TileMapRenderer tmRenderer;
    Camera camera(Vector2f(0.0f, 0.0f), 1.0f);
    int windowHeight;
    int windowWidth;
    Batch2D batch;

    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);

    GLFWwindow* window = glfwCreateWindow(800, 600, "Game Engine", NULL, NULL);
    if (window == NULL)
    {
        std::cout << "Failed to create window" << std::endl;
        glfwTerminate();
        return -1;
    }
    glfwMakeContextCurrent(window);
    gladLoadGL();
    glfwGetWindowSize(window, &windowWidth, &windowHeight);
    glViewport(0, 0, 800, 800);

    tiles.height = 400;
    tiles.width = 400;
    tiles.tileSetTexWidth = GetTileSetTextureWidth("C:/Users/Michael/source/repos/GameEngine/Graphics/grassTilesTest.png");
    tiles.tileSetTexHeight = GetTileSetTextureHeight("C:/Users/Michael/source/repos/GameEngine/Graphics/grassTilesTest.png");
    tiles.tileSize = 16;

    // Start generation (this is the part being timed)
    auto start = std::chrono::system_clock::now();
    tiles.GenerateMap();
    auto end = std::chrono::system_clock::now();
    std::chrono::duration<double> elapsed = end - start;
    std::cout << "Time : " << elapsed.count() << " seconds" << std::endl;

    tiles.tileSetHandle = LoadTileSetTexture();
    tmRenderer.Initialize(tiles);

    GLint tex0Uni = glGetUniformLocation(tmRenderer.rendererShader.Id, "texture0");
    assert(tex0Uni != -1);
    glUniform1i(tex0Uni, 0); // sampler uniforms are integers, so glUniform1i rather than glUniform1f

    while (!glfwWindowShouldClose(window))
    {
        glClearColor(0.4f, .04f, .35f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);

        if (glfwGetKey(window, GLFW_KEY_UP))
        {
            camera.position.y += .025f;
        }
        else if (glfwGetKey(window, GLFW_KEY_DOWN))
        {
            camera.position.y -= .025f;
        }
        if (glfwGetKey(window, GLFW_KEY_RIGHT))
        {
            camera.position.x += .025f;
        }
        else if (glfwGetKey(window, GLFW_KEY_LEFT))
        {
            camera.position.x -= .025f;
        }

        // Render map
        tmRenderer.Render(camera, windowWidth, windowHeight);
        glfwSwapBuffers(window);
        glfwPollEvents();
    }

    tmRenderer.Dispose();
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}
I have a breakpoint after the timestamp message. The tiles render just fine, so I do not need help with that part; the problem is that GenerateMap() takes around 5 seconds for a 40x40 map and 408.106 seconds (8.1 minutes) for 400x400, the same as Unity and Godot, even though fewer functions should be involved here since those engines are also running autotiles.
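Rough math on those numbers: 40x40 is 1,600 tiles in about 5 seconds, which is roughly 3 ms per tile, and 400x400 is 160,000 tiles, so the per-tile cost stays in the same millisecond range. That seems enormous to me for a handful of multiplications and push_back calls per tile.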
Here is the GenerateMap() function:
void TileMap::GenerateMap()
{
    for (float x = 0; x < width; x++)
    {
        for (float y = 0; y < height; y++)
        {
            int tileId = GenerateTileId();

            // Wrap the tile's texel offset into the tile set's bounds
            int texX = tileId * tileSize;
            int texY = 0;
            while (true)
            {
                if (texX > tileSetTexWidth)
                {
                    texX -= tileSetTexWidth;
                    texY += tileSize;
                }
                else
                {
                    break;
                }
            }

            // Arithmetic computations (normalized UVs)
            float tX = (float)texX / tileSetTexWidth;
            float tY = (float)texY / tileSetTexHeight;
            float tXSpan = (float)tileSize / tileSetTexWidth;
            float tYSpan = (float)tileSize / tileSetTexHeight;
            float xPadding = (float)1 / tileSetTexWidth;
            float yPadding = (float)1 / tileSetTexHeight;

            // Debug logging (runs for every tile)
            std::cout << "X: " << x << std::endl;
            std::cout << "Y: " << y << std::endl;
            std::cout << "TexX: " << texX << std::endl;
            std::cout << "TexY: " << (float)texY << std::endl;
            std::cout << "TX: " << tX << std::endl;
            std::cout << "TY: " << (float)tY << std::endl;
            std::cout << "X Span: " << tXSpan << std::endl;
            std::cout << "Y Span: " << (float)tYSpan << std::endl;

            // Four vertices per tile: position (x, y) then UV (u, v)
            tiles.push_back(x / 10);
            tiles.push_back(y / 10);
            tiles.push_back(tX);
            tiles.push_back(tY);

            tiles.push_back((x + 1) / 10);
            tiles.push_back(y / 10);
            tiles.push_back(tX + tXSpan);
            tiles.push_back(tY);

            tiles.push_back(x / 10);
            tiles.push_back((y + 1) / 10);
            tiles.push_back(tX);
            tiles.push_back(tY + tYSpan);

            tiles.push_back((x + 1) / 10);
            tiles.push_back((y + 1) / 10);
            tiles.push_back(tX + tXSpan);
            tiles.push_back(tY + tYSpan);

            std::cout << "Generating... : " << tiles.size() << std::endl;
        }
    }
}
int TileMap::GenerateTileId()
{
    // Insert tileId generation code here eventually
    return 3;
}
I am using for loops to create and store the VBO data in a vector, which is then uploaded to the VAO. Some notes from my experiments:

- Removing the while loop only takes off 1.3 seconds at 40x40.
- Removing the vector push_back calls takes off 40 seconds at 400x400 (7 minutes instead of 8, yay).
- Removing the arithmetic computations along with the push_back calls takes off 450 seconds at 400x400 (30 seconds instead of 8 minutes... still slower than MonoGame's 2 seconds at 1000x1000).
- I use for loops while MonoGame uses while loops (I do not think that accounts for an 8 minute difference, though).
- MonoGame uses arrays instead of vectors (I do not think std::vector::push_back() vs array[i] = val is much different), and it uses pointers to arrays.
- MonoGame sprite drawing and basic for loop tests show that my CPU has no problem iterating over 1000x1000 items.
- I have tried setting up my project like their code base, even expanding my functions the way they do with custom batch classes and buffers, but I got the same result (8 minutes).
- I have not tried a plain array yet, or replacing the for loops with a while loop.
I guess I am going to try arrays next, though I do not think that is the real issue.
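For reference, this is roughly what I have in mind for the array-style attempt: size the vector once up front and write by index instead of calling push_back, with everything else kept the same. It is an untested sketch; GenerateMapPreallocated is just a placeholder name and tiles is still my std::vector<float> member.

void TileMap::GenerateMapPreallocated()
{
    const int floatsPerTile = 16; // 4 vertices * (x, y, u, v)
    tiles.clear();
    tiles.resize((size_t)width * (size_t)height * floatsPerTile);

    size_t i = 0;
    for (int x = 0; x < width; x++)
    {
        for (int y = 0; y < height; y++)
        {
            int tileId = GenerateTileId();

            // Same texel wrap logic as GenerateMap(), just without while(true)/break
            int texX = tileId * tileSize;
            int texY = 0;
            while (texX > tileSetTexWidth)
            {
                texX -= tileSetTexWidth;
                texY += tileSize;
            }

            float tX = (float)texX / tileSetTexWidth;
            float tY = (float)texY / tileSetTexHeight;
            float tXSpan = (float)tileSize / tileSetTexWidth;
            float tYSpan = (float)tileSize / tileSetTexHeight;

            float fx = (float)x;
            float fy = (float)y;
            float quad[floatsPerTile] = {
                fx / 10,       fy / 10,       tX,          tY,
                (fx + 1) / 10, fy / 10,       tX + tXSpan, tY,
                fx / 10,       (fy + 1) / 10, tX,          tY + tYSpan,
                (fx + 1) / 10, (fy + 1) / 10, tX + tXSpan, tY + tYSpan,
            };
            for (float v : quad)
                tiles[i++] = v;
        }
    }
}

If push_back and reallocation were really the cost, I would expect this version to be dramatically faster; if it still takes minutes, the time must be going somewhere else in the loop body.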
I am not sure what in my generation code makes 400x400 take 8 minutes when MonoGame's GL draw path generates the data for 1000x1000 in under 2 seconds.
Eventually I want chunk loading, and chunks that take more than a few seconds to load would feel laggy and annoying.
Any optimizations, suggestions and questions are more than welcome. Go in on it! Thank you.