ARBvbo

Will newer builds also reduce the cost of ARB_VBO binding? I’m using 43.30, and reducing binding calls increased my framerate by 15%. I still can’t get it to run faster than 70% of VAR’s speed.

PK, yes, a better, perhaps custom, doc tool is needed. That’s why I think SGI/the IHVs should head it. The nice thing about Microsoft’s compressed HTML format is that it offers search, indexing and bookmarking abilities, which I find nifty. Though MS has abandoned HTML Help for its next XML format in the Longhorn OS, there are tools out there that can convert .chm into .pdf, but they lose some functionality in the conversion. The .chm would be a temporary bandage; the help files are written in HTML, so they are not hard to port to some other doc formats/tools.

JD, I really like the format of your docs. If all extensions were set up in a format like this, it would be much easier to get to the particular info I want to look at right now, instead of scrolling through a huge document of pure text. Plus the bolding, coloring and all that is a nice touch. Maybe this could be a project we all contribute to, to get all the extensions in there, because that would take a bit of work. I’d be willing to help though.

-SirKnight

Originally posted by DarkWIng:
Will newer builds also reduce the cost of ARB_VBO binding? I’m using 43.30, and reducing binding calls increased my framerate by 15%. I still can’t get it to run faster than 70% of VAR’s speed.

I would like to hear how fast your VBO is compared to regular non-extended vertex arrays. As I mentioned, I get the exact same results. But if you are getting significantly faster results, you must be getting AGP or video memory.

Thanks.

And will I be guaranteed to get the fastest possible video memory if there is sufficient on board?

I imagine that either nVidia’s VBO uses (or, will use when properly optimized) VAR internally, or it will use its own direct access. If it is the latter, understand that most GL high-performance development will shift to the accepted VBO. As such, nVidia will have no choice but to make VBO as fast as possible.

As for the debate on .chms… I have yet to see a better format for on-line programming documentation (which this qualifies as) than a good .chm file. If .chms can’t be used on non-Windows systems, maybe somebody ought to write a .chm viewer for them. In fact, I thought that there already was a .chm viewer for Linux.

.pdfs are good for printing, not for the reading/searching/etc. that an on-line document needs.

[This message has been edited by Korval (edited 03-23-2003).]

fritzlang: VBO is much faster than plain VA. I can’t give you exact numbers, but I would say about 2x.

Thanks DarkWIng,
I must be doing something wrong. VAR is as fast as always (wglAllocateMemoryNV(…, 0, 0, 1)), but I cannot get VBO to improve at all over standard GL arrays.

This code uses buffered vertices and unbuffered indices, like I do it with VAR.

// Init
glBindBufferARB(GL_ARRAY_BUFFER_ARB, 1);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, m_uiNumVertices * 2 * sizeof(float), m_pfVertices, GL_STATIC_DRAW_ARB);

// Draw
glBindBufferARB(GL_ARRAY_BUFFER_ARB, 1);
glVertexPointer(2, GL_FLOAT, 0, BUFFER_OFFSET(0));
glEnableClientState(GL_VERTEX_ARRAY);
glDrawElements(GL_TRIANGLE_STRIP, m_uiNumIndices, GL_UNSIGNED_SHORT, m_pusIndices);
glDisableClientState(GL_VERTEX_ARRAY);

Thanks.

[This message has been edited by fritzlang (edited 03-23-2003).]

I can’t see any direct problems with your code, but I would recommend storing the indices in an ELEMENT_ARRAY buffer too. It will help performance.

glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, vboIndexBuffer);
glBufferDataARB(GL_ELEMENT_ARRAY_BUFFER_ARB, nIndices * sizeof(short), indices, GL_STATIC_DRAW_ARB);

// draw
glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, vboIndexBuffer);
glDrawElements(GL_TRIANGLE_STRIP, nIndices, GL_UNSIGNED_SHORT, BUFFER_OFFSET(0));

Is it OK, in terms of performance, to create about 200 buffers? I can’t think of any reason why it should be slower than just a few buffers. Or is binding very expensive?

Originally posted by KRONOS:
Is it OK, in terms of performance, to create about 200 buffers? I can’t think of any reason why it should be slower than just a few buffers. Or is binding very expensive?

Considering that binding could do anything from moving a pointer to uploading that memory from system RAM to the video card… I would consider binding a buffer to be approximately as painful as binding a texture. In short: the fewer, the better.
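If binds really are as costly as texture binds, the usual texture-state remedy applies: sort queued draws by buffer object so each distinct buffer is bound once, not once per draw. A minimal sketch of the idea (DrawCmd and count_binds are made-up names for illustration, not GL entry points):

```c
#include <assert.h>
#include <stdlib.h>

/* Each queued draw remembers which buffer object it sources from. */
typedef struct { unsigned buffer; int firstVertex; } DrawCmd;

static int cmp_by_buffer(const void *a, const void *b) {
    const DrawCmd *da = a, *db = b;
    return (da->buffer > db->buffer) - (da->buffer < db->buffer);
}

/* Count how many glBindBufferARB calls a command list would issue:
   a bind happens only when the next draw uses a different buffer. */
static int count_binds(const DrawCmd *cmds, int n) {
    int binds = 0;
    unsigned bound = 0;                 /* 0 = nothing bound yet */
    for (int i = 0; i < n; i++) {
        if (cmds[i].buffer != bound) {  /* glBindBufferARB(...) would go here */
            bound = cmds[i].buffer;
            binds++;
        }
    }
    return binds;
}
```

Six draws alternating between two buffers cost six binds as-is, but only two after a qsort by buffer name.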

I think I’ve encountered a driver bug while changing my code over to VBO from VAR. The following code repeatedly draws the vertices set by the first calls to gl*Pointer within the loop; the offsets do not update correctly. If I change the if( b > 0 ) test to something like if( b == 45 ), it renders that block of vertices correctly. Moving the various gl calls around and adding glFlush or glFinish after rendering every block has no effect. This happens with NVIDIA’s 43.03, 43.30 and 43.45 drivers.

	glBindBufferARB( GL_ARRAY_BUFFER_ARB, svbo_blockvertices );

	glEnableClientState(GL_VERTEX_ARRAY);
	glEnableClientState(GL_NORMAL_ARRAY);
	glEnableClientState(GL_TEXTURE_COORD_ARRAY);

	int br = 0;	// blocksrendered

	for( int x = 0; x < 32; x++ )
	{
		for( int z = 0; z < 32; z++ )
		{
			byte b = s_header->mIndexBlock[x*32+z];
			if( b > 0 )
			{
				LandBlock& lb = s_blocks[b-1];

			
				int off = sizeof(BlockVertices) * (b-1);
			
				glVertexPointer(3, GL_FLOAT, sizeof(TerrainVertex), BUFFER_OFFSET(off+20) );
				glNormalPointer(GL_FLOAT, sizeof(TerrainVertex), BUFFER_OFFSET(off+8) );
				glTexCoordPointer(2, GL_FLOAT, sizeof(TerrainVertex), BUFFER_OFFSET(off+0) );

				for( int strip = 0; strip < 32; strip++ )
				{
					glDrawElements(GL_TRIANGLE_STRIP, 18, GL_UNSIGNED_SHORT, &(s_blockelements[strip*18]));
				}

				br++;
			}
			// done drawing block
		}
	}
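For reference, the strides and offsets in the snippet above imply a 32-byte interleaved vertex with texcoords at +0, the normal at +8 and the position at +20. A sketch of that layout (the real TerrainVertex definition isn’t shown in the post, so this is a reconstruction, not the poster’s actual struct):

```c
#include <assert.h>
#include <stddef.h>

/* Layout assumed from the gl*Pointer offsets used above. */
typedef struct TerrainVertex {
    float tex[2];     /* glTexCoordPointer(2, GL_FLOAT, stride, off+0)  */
    float normal[3];  /* glNormalPointer(GL_FLOAT, stride, off+8)       */
    float pos[3];     /* glVertexPointer(3, GL_FLOAT, stride, off+20)   */
} TerrainVertex;      /* sizeof == 32 bytes, matching the stride */
```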

Originally posted by DarkWIng:
Will newer builds also reduce the cost of ARB_VBO binding? I’m using 43.30, and reducing binding calls increased my framerate by 15%. I still can’t get it to run faster than 70% of VAR’s speed.

well…

the extension specs:
Buffer changes (glBindBufferARB) are generally expected to be very lightweight, rather than extremely heavyweight (glVertexArrayRangeNV).

so I guess yes.

My guess is that the performance difference between the two extensions depends on the underlying link to the graphics hardware. NV_vertex_array_range is designed for and by NVIDIA, so I would expect NV’s extension to be a little bit faster than the ARB’s, at least on current cards. The implementation of the ARB spec will probably improve in later drivers, so DarkWIng’s 70% could reach 80-90%, but I doubt it will ever exceed 100% (on GF1-4).

Originally posted by Korval:
Considering that binding could do anything from moving a pointer to uploading that memory from system RAM to the video card… I would consider binding a buffer to be approximately as painful as binding a texture. In short: the fewer, the better.

No, binding a buffer is a cheap operation. The reason we chose this API was that we get to re-use all of the glPointer() calls and glDraw() calls and (when PBO arrives) glReadPixels().

We considered adding VBO-style entry points for every call that takes a pointer, but in the end we decided that this way was nice and orthogonal, and thus will be more easily integrated into OpenGL implementations and application code.

Thanks -
Cass
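That re-use of the existing entry points is visible in the BUFFER_OFFSET macro the ARB_vertex_buffer_object spec suggests: with a buffer bound, the old "pointer" parameter of glVertexPointer and friends is reinterpreted as a byte offset into the buffer, smuggled through the pointer type:

```c
#include <assert.h>
#include <stdint.h>

/* Macro suggested by the ARB_vertex_buffer_object spec: encode a byte
   offset as a pointer value so the unchanged gl*Pointer signatures can
   carry it. (Pointer arithmetic on NULL is technically undefined in
   ISO C, but this is the spec's own idiom.) */
#define BUFFER_OFFSET(i) ((char *)NULL + (i))
```

So glVertexPointer(2, GL_FLOAT, 0, BUFFER_OFFSET(0)) with a buffer bound means "start at byte 0 of the buffer", while the same call with buffer 0 bound keeps its old client-memory-pointer meaning.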

I’m using a GeForce FX 5800 Ultra and driver 43.40, WinXP.

I can’t get within about 10% of VAR performance. I guess it will be fixed.

My game uses about 2000 buffers (vertex & index). It’s a landscape.
When I deform landscape blocks, I delete the buffer and create a new one.

And after about 500 deformations it crashes in the driver; maybe a bug in the buffer manager.

All the blocks together are about 2 MB of video memory. My VAR manager works perfectly and uses fences (maybe the bug is in the synchronization).

Originally posted by cass:
The reason we chose this API was that we get to re-use all of the glPointer() calls and glDraw() calls and (when PBO arrives) glReadPixels().

We considered adding VBO-style entry points for every call that takes a pointer, but in the end we decided that this way was nice and orthogonal, and thus will be more easily integrated into OpenGL implementations and application code.

I don’t think you quite understood what I was suggesting. I was simply suggesting that the data a buffer stores could be in system memory and, upon being bound, would have to be uploaded to either AGP or video RAM. However, if a bind is going to be lightweight, I presume the driver won’t be using regular RAM for VBOs.

I think data will conceptually be put where it belongs the first time it’s used. It might get kicked out of there if it hasn’t been used in a long while, somewhat like texture data. I would guess that almost all implementations will put the data in AGP memory, as they’ll need video memory for texture and framebuffer traffic.
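Where the data "belongs" is steered by the usage hint passed to glBufferDataARB. A hedged sketch of how one might pick it (pick_usage is a hypothetical helper, not part of GL; the token values are the ones defined by the ARB_vertex_buffer_object spec):

```c
#include <assert.h>

/* Usage tokens from the ARB_vertex_buffer_object spec. */
#define GL_STREAM_DRAW_ARB  0x88E0
#define GL_STATIC_DRAW_ARB  0x88E4
#define GL_DYNAMIC_DRAW_ARB 0x88E8

/* Hypothetical helper: specify once and draw many times -> STATIC;
   respecify every frame -> STREAM; modify repeatedly between draws
   -> DYNAMIC. The driver uses the hint when choosing between video,
   AGP and system memory for the buffer's storage. */
static unsigned pick_usage(int respecified_every_frame, int modified_often) {
    if (respecified_every_frame) return GL_STREAM_DRAW_ARB;
    if (modified_often)          return GL_DYNAMIC_DRAW_ARB;
    return GL_STATIC_DRAW_ARB;
}
```

For the static terrain blocks discussed above, this would land on GL_STATIC_DRAW_ARB, which is what the earlier snippets already pass.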

For those of you interested in NVIDIA drivers, new ones are available today. I haven’t checked the support for ARBvbo, but I guess something has been done. Maybe someone from NVIDIA can confirm whether ARBvbo support has been enhanced in these drivers?

43.45? VBO is absent from the extension string. I didn’t check the function pointers.

I’m using 43.45, and it still only allows me to set the gl*Pointers (offsets) a single time after binding a buffer object. Still a ways to go, I suspect.