Using gl_VertexID as an array index?

I came across some unexpected behavior when trying to use gl_VertexID as an array index, and was hoping somebody could tell me whether the problem is mine or OpenGL’s.

I was originally trying to write a GLSL program to draw a procedural full-screen quad with no vertex input; all the application would do is bind the program and call glDrawArrays(GL_TRIANGLE_STRIP, 0, 4) and the shaders would use gl_VertexID to look up the correct positions/UVs for each vertex. Apparently that doesn’t work, because GL requires at least a one-element VBO to be bound (at least, that’s what other threads on this forum tell me). However, I kept modifying and simplifying my program until eventually it reached the following state.

At this point, I’m feeding the input positions and colors for the full-screen quad in through a VBO. The vertex shader can either pass them through as-is, or use gl_VertexID to look up the values in local constant arrays. The results should be identical in either case.

#version 150 core

in vec4 in_Position;
in vec4 in_Color;
out vec4 vert_Color;
void main(void)
{
	const vec4 positions[4] = vec4[4](
		vec4(-1, -1, 0, 1),
		vec4( 1, -1, 0, 1),
		vec4(-1,  1, 0, 1),
		vec4( 1,  1, 0, 1)
	);

	const vec4 colors[4] = vec4[4](
		vec4(1,0,0,1),
		vec4(0,1,0,1),
		vec4(1,0,1,1),
		vec4(1,1,0,1)
	);

	// Uncomment one line from each group.

	gl_Position = in_Position; // Read from VBO
	//gl_Position = positions[gl_VertexID]; // read from array

	//vert_Color = in_Color; // Read from VBO
	//vert_Color = colors[gl_VertexID]; // Read from array
	vert_Color = vec4(0.25*gl_VertexID); // Generate procedurally
}

The fragment shader just writes out the interpolated vertex color:

#version 150 core
in vec4 vert_Color;
out vec4 out_Color;
void main(void)
{
    out_Color = vert_Color;
}

If I read all the vertex data from the VBOs, I get the expected multi-colored full-screen quad. If I read the position from the VBO and the color using gl_VertexID, I get a solid red quad (more generally, all four vertices take the color from the first element of the colors[] array). If I read the vertex position from the positions[] array using gl_VertexID, then nothing is drawn at all. My guess is that in this case, all four vertices are rendering to the same point, giving a degenerate quad.

As a third test, I tried using gl_VertexID to calculate the vertex color directly (not going through an array). This gives a smooth gradient as expected, so gl_VertexID does seem to be taking non-zero values. It just seems to have trouble when used as an index to an array.
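For anyone who hits the same problem: since plain arithmetic on gl_VertexID behaves correctly in the third test, one possible workaround is to derive each corner from the bits of the index instead of indexing a constant array. The sketch below shows that mapping in C so it can be checked on the CPU. Vec4 and corner_from_vertex_id are made-up names, and whether this avoids the problem on the affected driver is untested. In GLSL 1.50 the same expressions would be (gl_VertexID & 1) * 2 - 1 for x and ((gl_VertexID >> 1) & 1) * 2 - 1 for y.

```c
#include <assert.h>

/* Hypothetical type standing in for a GLSL vec4. */
typedef struct { float x, y, z, w; } Vec4;

/* Hypothetical helper: maps a triangle-strip vertex index (0..3) to the
   clip-space corner stored at the same index in the positions[] array. */
static Vec4 corner_from_vertex_id(int vertex_id)
{
    Vec4 p;
    p.x = (float)((vertex_id & 1) * 2 - 1);        /* bit 0: left (-1) or right (+1) */
    p.y = (float)(((vertex_id >> 1) & 1) * 2 - 1); /* bit 1: bottom (-1) or top (+1) */
    p.z = 0.0f;
    p.w = 1.0f;
    return p;
}
```

Indices 0 through 3 give (-1,-1), (1,-1), (-1,1) and (1,1), the same order as the positions[] array in the vertex shader.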

Am I missing something obvious? Here’s the full program source, with as many exterior dependencies as possible removed (except for GLUT):

#include "GL/openglut.h"
#include "GL/glext.h"

#include <assert.h>
#include <stdio.h>

PFNGLUSEPROGRAMPROC					glUseProgram = 0;
PFNGLGENVERTEXARRAYSPROC			glGenVertexArrays = 0;
PFNGLBINDVERTEXARRAYPROC			glBindVertexArray = 0;
PFNGLGENBUFFERSPROC					glGenBuffers = 0;
PFNGLBINDBUFFERPROC					glBindBuffer = 0;
PFNGLBUFFERDATAPROC					glBufferData = 0;
PFNGLVERTEXATTRIBPOINTERPROC		glVertexAttribPointer = 0;
PFNGLENABLEVERTEXATTRIBARRAYPROC	glEnableVertexAttribArray = 0;
PFNGLCREATEPROGRAMPROC				glCreateProgram = 0;
PFNGLCREATESHADERPROC				glCreateShader = 0;
PFNGLSHADERSOURCEPROC				glShaderSource = 0;
PFNGLCOMPILESHADERPROC				glCompileShader = 0;
PFNGLGETSHADERIVPROC				glGetShaderiv = 0;
PFNGLGETSHADERINFOLOGPROC			glGetShaderInfoLog = 0;
PFNGLATTACHSHADERPROC				glAttachShader = 0;
PFNGLDELETESHADERPROC				glDeleteShader = 0;
PFNGLBINDATTRIBLOCATIONPROC			glBindAttribLocation = 0;
PFNGLLINKPROGRAMPROC				glLinkProgram = 0;
PFNGLGETPROGRAMIVPROC				glGetProgramiv = 0;
PFNGLGETPROGRAMINFOLOGPROC			glGetProgramInfoLog = 0;
PFNGLVALIDATEPROGRAMPROC			glValidateProgram = 0;

const char *vpSource =
"#version 150 core\n"
"\n"
"in vec4 in_Position;\n"
"in vec4 in_Color;\n"
"out vec4 vert_Color;\n"
"void main(void)\n"
"{\n"
"	const vec4 positions[4] = vec4[4](\n"
"		vec4(-1, -1, 0, 1),\n"
"		vec4( 1, -1, 0, 1),\n"
"		vec4(-1,  1, 0, 1),\n"
"		vec4( 1,  1, 0, 1)\n"
"	);\n"
"\n"
"	const vec4 colors[4] = vec4[4](\n"
"		vec4(1,0,0,1),\n"
"		vec4(0,1,0,1),\n"
"		vec4(1,0,1,1),\n"
"		vec4(1,1,0,1)\n"
"	);\n"
"\n"
"	// Uncomment one line from each group.\n"
"\n"
"	gl_Position = in_Position; // Read from VBO\n"
"	//gl_Position = positions[gl_VertexID]; // read from array\n"
"\n"
"	vert_Color = in_Color; // Read from VBO\n"
"	//vert_Color = colors[gl_VertexID]; // Read from array\n"
"	//vert_Color = vec4(0.25*gl_VertexID); // Generate procedurally\n"
"}\n";
const char *fpSource =
"#version 150 core\n"
"in vec4 vert_Color;\n"
"out vec4 out_Color;\n"
"void main(void)\n"
"{\n"
"    out_Color = vert_Color;\n"
"}\n";

GLuint prog_id = 0;
GLuint vao = 0;
GLuint vbo = 0;
static void init(void)
{
	// Create vertex array object
	glGenVertexArrays(1, &vao);
	glBindVertexArray(vao);

	// Create vertex buffer
	float vertices[] = 
	{
		-1, -1, 0, 1,		1, 0, 0, 1,
		 1, -1, 0, 1,		0, 1, 0, 1,
		-1,  1, 0, 1,		1, 0, 1, 1,
		 1,  1, 0, 1,		1, 1, 0, 1
	};
	glGenBuffers(1, &vbo);
	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, 4*8*sizeof(float), vertices, GL_STATIC_DRAW);
	const GLuint positionAttribIndex = 0;
	const GLuint colorAttribIndex = 1;
	glVertexAttribPointer(positionAttribIndex, 4, GL_FLOAT, GL_FALSE, 8*sizeof(float), 0);
	glVertexAttribPointer(colorAttribIndex, 4, GL_FLOAT, GL_FALSE, 8*sizeof(float), (void*)(4*sizeof(float)));
	glEnableVertexAttribArray(positionAttribIndex);
	glEnableVertexAttribArray(colorAttribIndex);

	glBindVertexArray(0);

	prog_id = glCreateProgram();
	// Vertex program
	GLuint vpShader = glCreateShader(GL_VERTEX_SHADER);
	assert(vpShader != 0);
	glShaderSource(vpShader, 1, &vpSource, NULL);
	glCompileShader(vpShader);
	int vpStatus = 0;
	glGetShaderiv(vpShader, GL_COMPILE_STATUS, &vpStatus);
	{
		char infoLog[8192];
		glGetShaderInfoLog(vpShader, 8192, NULL, infoLog);
		printf("vertex shader compile log:\n%s\n", infoLog);
	}
	assert(vpStatus == 1);
	glAttachShader(prog_id, vpShader);
	glDeleteShader(vpShader);
	// Fragment program
	GLuint fpShader = glCreateShader(GL_FRAGMENT_SHADER);
	assert(fpShader != 0);
	glShaderSource(fpShader, 1, &fpSource, NULL);
	glCompileShader(fpShader);
	int fpStatus = 0;
	glGetShaderiv(fpShader, GL_COMPILE_STATUS, &fpStatus);
	{
		char infoLog[8192];
		glGetShaderInfoLog(fpShader, 8192, NULL, infoLog);
		printf("fragment shader compile log:\n%s\n", infoLog);
	}
	assert(fpStatus == 1);
	glAttachShader(prog_id, fpShader);
	glDeleteShader(fpShader);
	// Attribute indices
	glBindAttribLocation(prog_id, positionAttribIndex, "in_Position");
	glBindAttribLocation(prog_id, colorAttribIndex, "in_Color");
	// Link the program object
	glLinkProgram(prog_id);
	int status = 0;
	glGetProgramiv(prog_id, GL_LINK_STATUS, &status);
	{
		char infoLog[8192];
		glGetProgramInfoLog(prog_id, 8192, NULL, infoLog);
		printf("program link log:\n%s\n", infoLog);
	}
	assert(status == 1);

	// cleanup
	glUseProgram(0);
	{
		GLenum errorCode = glGetError();
		assert(errorCode == GL_NO_ERROR);
	}
}


static void renderScene(void)
{
	glClearColor(1.0f, 0.0f, 1.0f, 1.0f);
	glClear(GL_COLOR_BUFFER_BIT);
	glDisable(GL_DEPTH_TEST);

	// set state
	glUseProgram(prog_id);
	glBindVertexArray(vao);

	// Validate before use
	static bool firstTime = true;
	if (firstTime)
	{
		firstTime = false;
		glValidateProgram(prog_id);
		GLint status = 0;
		glGetProgramiv(prog_id, GL_VALIDATE_STATUS, &status);
		{
			char infoLog[8192];
			glGetProgramInfoLog(prog_id, 8192, NULL, infoLog);
			printf("program validation log:\n%s\n", infoLog);
		}
		assert(status == 1);
	}

	// Draw full-screen quad, pre-projected into clip space.
	glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

	// cleanup
	glBindVertexArray(0);
	glUseProgram(0);
	{
		GLenum errorCode = glGetError();
		assert(errorCode == GL_NO_ERROR);
	}
	glFlush();
}


int main(int argc, char **argv) {
	glutInit(&argc, argv);
	glutInitDisplayMode(GLUT_DEPTH | GLUT_SINGLE | GLUT_RGBA);
	glutInitWindowPosition(100,100);
	glutInitWindowSize(320,320);
	glutCreateWindow("Basic GLUT Project");
	glutDisplayFunc(renderScene);

	// Get extension pointers
#if defined(_MSC_VER)
	glUseProgram = (PFNGLUSEPROGRAMPROC)wglGetProcAddress("glUseProgram");
	glGenVertexArrays = (PFNGLGENVERTEXARRAYSPROC)wglGetProcAddress("glGenVertexArrays");
	glBindVertexArray = (PFNGLBINDVERTEXARRAYPROC)wglGetProcAddress("glBindVertexArray");
	glGenBuffers = (PFNGLGENBUFFERSPROC)wglGetProcAddress("glGenBuffers");
	glBindBuffer = (PFNGLBINDBUFFERPROC)wglGetProcAddress("glBindBuffer");
	glBufferData = (PFNGLBUFFERDATAPROC)wglGetProcAddress("glBufferData");
	glVertexAttribPointer = (PFNGLVERTEXATTRIBPOINTERPROC)wglGetProcAddress("glVertexAttribPointer");	
	glEnableVertexAttribArray = (PFNGLENABLEVERTEXATTRIBARRAYPROC)wglGetProcAddress("glEnableVertexAttribArray");
	glCreateProgram = (PFNGLCREATEPROGRAMPROC)wglGetProcAddress("glCreateProgram");
	glCreateShader = (PFNGLCREATESHADERPROC)wglGetProcAddress("glCreateShader");
	glShaderSource = (PFNGLSHADERSOURCEPROC)wglGetProcAddress("glShaderSource");
	glCompileShader = (PFNGLCOMPILESHADERPROC)wglGetProcAddress("glCompileShader");
	glGetShaderiv = (PFNGLGETSHADERIVPROC)wglGetProcAddress("glGetShaderiv");
	glGetShaderInfoLog = (PFNGLGETSHADERINFOLOGPROC)wglGetProcAddress("glGetShaderInfoLog");
	glAttachShader = (PFNGLATTACHSHADERPROC)wglGetProcAddress("glAttachShader");
	glDeleteShader = (PFNGLDELETESHADERPROC)wglGetProcAddress("glDeleteShader");				
	glBindAttribLocation = (PFNGLBINDATTRIBLOCATIONPROC)wglGetProcAddress("glBindAttribLocation");
	glLinkProgram = (PFNGLLINKPROGRAMPROC)wglGetProcAddress("glLinkProgram");
	glGetProgramiv = (PFNGLGETPROGRAMIVPROC)wglGetProcAddress("glGetProgramiv");
	glGetProgramInfoLog = (PFNGLGETPROGRAMINFOLOGPROC)wglGetProcAddress("glGetProgramInfoLog");
	glValidateProgram = (PFNGLVALIDATEPROGRAMPROC)wglGetProcAddress("glValidateProgram");
#endif

	init();

	glutMainLoop();
	return 0;
}
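
As a side note on the buffer setup in init(): the stride and offsets handed to glVertexAttribPointer follow from the interleaved layout of vertices[]. A minimal sketch, assuming a hypothetical QuadVertex struct (not part of the program above) that mirrors one row of that array:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct mirroring one row of the vertices[] array in init(). */
typedef struct {
    float position[4]; /* clip-space x, y, z, w */
    float color[4];    /* r, g, b, a */
} QuadVertex;

/* Stride and byte offsets in the form glVertexAttribPointer expects. */
enum {
    QUAD_VERTEX_STRIDE   = sizeof(QuadVertex),
    QUAD_POSITION_OFFSET = offsetof(QuadVertex, position),
    QUAD_COLOR_OFFSET    = offsetof(QuadVertex, color)
};
```

With such a struct, the hard-coded 8*sizeof(float) and 4*sizeof(float) in init() could be written as sizeof(QuadVertex) and offsetof(QuadVertex, color).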

I forgot to mention the obligatory platform specs: this is under Windows, ATI Radeon 3450 card, Catalyst 10.4 drivers.

I was originally trying to write a GLSL program to draw a procedural full-screen quad with no vertex input; all the application would do is bind the program and call glDrawArrays(GL_TRIANGLE_STRIP, 0, 4) and the shaders would use gl_VertexID to look up the correct positions/UVs for each vertex.

I’m curious; why do you think this is a good idea?

From a performance standpoint, the time the system spends loading the 24 (at most) floats from your buffer objects is inconsequential. And since this is a full-screen quad, you can even upload these vertices in clip-space coordinates (-1 to 1, with a W of 1), thus making your vertex shader entirely pass-through. So I can’t imagine how you think that this index method will be faster than pulling one 32-byte cache line from memory, especially since the cost will be dominated by the fragment shader: you’re using 4 vertices to generate millions of fragments.

So if there is no performance to be gained, what is it that you expect to be gained, exactly?

Yes, I understand it doesn’t make a huge difference from a performance standpoint; the vertex processing workload is inconsequential either way. It was purely a matter of convenience; if all the data is stored in shader constants, there’s less code to write on the CPU side to set up the necessary state for the draw call.

But ultimately that’s not the point I wanted to make; regardless of the overall merits of the shader, it doesn’t seem to be working the way it should. Am I doing something wrong, or is this a driver bug?

Also FWIW, this code seems to all work correctly on NVIDIA hardware. So, the options seem to be a) my code is correct, and ATI has a bug in their drivers, or b) my code is invalid OpenGL, and the results are undefined, but NVIDIA happens to handle it more gracefully.

Hrm.

This is a bug in the 10.4 driver, but it has now been fixed. You will have to wait one or two months for the new driver. Thanks for your feedback.

Frank

Aha, the fix is in Catalyst 10.5. Thanks Frank!