Geometry Instancing

How can I use geometry instancing in OpenGL? I checked the NVIDIA extension specs, and it should be supported through extensions such as EXT_gpu_shader4 and EXT_draw_instanced. I then checked whether my hardware supports them, and it does not. However, the geometry instancing samples that come with the DX SDK work, which means the hardware supports it. So how come the functionality is not available in OpenGL?
Do GL and DX differ in how they “interpret” instancing?

Geometry instancing is available in GL with the extensions… basically, you use glDrawArraysInstancedEXT or glDrawElementsInstancedEXT and it makes gl_InstanceID available to your vertex shader (with the proper #version and #extension lines as documented in the specs).

If your card does not support the newer extensions, perhaps you were looking at the pseudo-instancing examples and not the true hardware instancing examples?

The only performance gain in instancing is pre-caching the geometry information at a pre-vertex processing level, which makes no big difference, except maybe in D3D, where the overhead is cumbersome.
Correct me if I’m wrong: we send the geometry once, but we still send the transformation for each instance.

For true hardware instancing, you basically send all of the transformation data over in one step (via (bindable) uniforms, texture buffer objects, etc.), and use a single draw call to draw all of the instances. Then you use gl_InstanceID in the shader as an index into the transformation data (or you can use it to procedurally transform the data).

I believe that with pseudo-instancing, you need one draw call per instance (as well as an upload to the shader for each one).

You may also use shader instancing, which needs only Shader Model 2.0. You make multiple copies of the same mesh in a vertex buffer, and for each copy you store its index in its vertices (e.g. in the position.w coordinate). This becomes your instance ID, which you use as an index into your constant (uniform) buffer in the vertex shader. The major performance gain comes from lowering the number of draw calls when you are CPU-bound, not from pre-caching the geometry information.
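
To make the shader instancing idea above a bit more concrete, here is a minimal CPU-side sketch in C++ of how the batched vertex buffer could be built. The Vertex struct and buildBatchedMesh are hypothetical names made up for illustration, not taken from any SDK sample; the only thing assumed from the post is that the instance index goes into position.w.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: xyz is the position, w carries the instance index.
struct Vertex {
    float x, y, z, w;
};

// Build one vertex buffer holding `copies` duplicates of `mesh`.
// Every vertex of duplicate i gets w = i, so a vertex shader could use
// that value to pick a transform out of a uniform (constant) array.
std::vector<Vertex> buildBatchedMesh(const std::vector<Vertex>& mesh,
                                     std::size_t copies) {
    assert(!mesh.empty());
    std::vector<Vertex> batch;
    batch.reserve(mesh.size() * copies);
    for (std::size_t i = 0; i < copies; ++i) {
        for (Vertex v : mesh) {
            v.w = static_cast<float>(i);  // instance ID baked into the vertex
            batch.push_back(v);
        }
    }
    return batch;
}
```

The vertex shader would then have to set w back to 1.0 after reading the index. If the mesh is indexed, the index buffer needs duplicating as well, with each copy's indices offset by the mesh's vertex count.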

Pseudo-instancing is just the duplication of vertex attributes in immediate mode, isn’t it? (Like in the NVIDIA sample.)

Instancing can give a huge speed increase. It is one of the greatest extensions ever. Unfortunately I think ATI does not support it at all.

Why that?

Please tell me which method of instancing is best:
1) pseudo-instancing
2) uniform instancing (non-pseudo-instancing)
3) instancing with a geometry shader

P.S. Use simple English words, please.

It depends.
From what I read in these forums, instancing is rarely useful.

Do not bother unless you have concrete needs for it, like super simple geometry to be repeated a very very large number of times.

I have experimented with instancing for OBJ models.
I can tell you there is no point doing it unless you are CPU bound AND you are drawing 10,000+ instances. I also noted that frustum culling each instance gave a better frame rate than just drawing all instances in one batch. This means that you must ‘repackage’ some data per frame to draw only the visible instances. I found this rather disappointing - I had hoped we could do away with that. Perhaps that’s where geometry shaders could help?

Anyway, Instanced Arrays are the most obvious choice and require little alteration to an existing framework. However, there is overhead in working out the visible instances and then repopulating the per-instance vertex buffer data. In my case that means a modelview matrix per visible instance must be inserted into a list in memory and re-uploaded to a VBO. This was something I never bothered with, as it was just a benchmarking exercise and the performance of instancing sucked anyway.
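
For anyone who does want to try the repackaging step described above, here is a rough C++ sketch of the CPU side only. Everything in it is an assumption for illustration: the Plane and Instance structs, the bounding-sphere test, and the convention that a point is inside when n.p + d >= 0. The actual upload of the packed array to the VBO is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types. A point p is inside a plane when n.p + d >= 0.
struct Plane { float nx, ny, nz, d; };

struct Instance {
    std::array<float, 16> modelView;  // column-major 4x4 matrix
    float cx, cy, cz, radius;         // bounding sphere, same space as the planes
};

// True unless the bounding sphere lies completely behind some plane.
bool sphereVisible(const Instance& in, const std::vector<Plane>& frustum) {
    for (const Plane& p : frustum) {
        float dist = p.nx * in.cx + p.ny * in.cy + p.nz * in.cz + p.d;
        if (dist < -in.radius) return false;
    }
    return true;
}

// Pack the matrices of the visible instances back to back. The result is
// what would be re-uploaded to the per-instance VBO each frame, and
// packed.size() / 16 is the instance count for the draw call.
std::vector<float> packVisible(const std::vector<Instance>& all,
                               const std::vector<Plane>& frustum) {
    std::vector<float> packed;
    packed.reserve(all.size() * 16);
    for (const Instance& in : all) {
        if (sphereVisible(in, frustum)) {
            packed.insert(packed.end(), in.modelView.begin(), in.modelView.end());
        }
    }
    assert(packed.size() % 16 == 0);
    return packed;
}
```

This loop runs once per instance per frame, which is the per-frame cost mentioned above.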

Depending upon the nature of the data you’re trying to instance, Texture Buffer Objects and Uniform Buffer Objects are very flexible. Of the two, Texture Buffer Objects offer a huge data store, and adding this to an engine is fairly straightforward. Repackaging the visible instance data into the TBO is easy too. In fact you use two TBOs - one to store the data and the other as an indirect index list of integers.
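
The two-TBO arrangement above can be mimicked on the CPU to show the indirection. This is only a sketch under assumptions: Vec4, buildVisibleIndexList and fetchInstanceData are hypothetical names, the two std::vectors stand in for the two buffer textures, and the instanceID parameter plays the role of gl_InstanceID in the shader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec4 { float x, y, z, w; };

// Stand-in for the second TBO: the IDs of the visible instances, in draw order.
// Only this small list has to be rebuilt per frame; the data store stays put.
std::vector<std::int32_t> buildVisibleIndexList(const std::vector<bool>& visible) {
    std::vector<std::int32_t> indices;
    for (std::size_t i = 0; i < visible.size(); ++i) {
        if (visible[i]) indices.push_back(static_cast<std::int32_t>(i));
    }
    return indices;
}

// What the vertex shader would do with two buffer-texture fetches:
// first look up the real instance index, then fetch that instance's data.
Vec4 fetchInstanceData(const std::vector<Vec4>& dataStore,
                       const std::vector<std::int32_t>& indexList,
                       std::size_t instanceID) {
    assert(instanceID < indexList.size());
    std::size_t real = static_cast<std::size_t>(indexList[instanceID]);
    assert(real < dataStore.size());
    return dataStore[real];
}
```

With this layout the draw call would be issued with indexList.size() instances, so culled objects cost nothing on the GPU side.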

Uniform Buffer Objects - these hold a much smaller array of data (the limit is implementation specific), about 64K bytes. Access to this memory is very slightly quicker, but the limited memory size restricts its usage. I found altering my entire shader code to support UBOs and uniform blocks a very painful exercise, and quite fruitless too!
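
To put a number on the size restriction, here is a small C++ sketch of the arithmetic, assuming one 4x4 float matrix per instance and taking the 64K figure above at face value. The real limit would come from querying GL_MAX_UNIFORM_BLOCK_SIZE on the target hardware; the function names here are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// One 4x4 float matrix per instance (64 bytes when float is 4 bytes).
constexpr std::size_t kMat4Bytes = 16 * sizeof(float);

// How many per-instance matrices fit in one uniform block of the given size.
std::size_t instancesPerBlock(std::size_t blockBytes) {
    return blockBytes / kMat4Bytes;
}

// How many instanced draw calls are needed if every call is limited
// to the instances that fit in a single uniform block.
std::size_t drawCallsNeeded(std::size_t instanceCount, std::size_t blockBytes) {
    std::size_t perBlock = instancesPerBlock(blockBytes);
    assert(perBlock > 0);
    return (instanceCount + perBlock - 1) / perBlock;  // round up
}
```

With a 64K block that works out to 1024 matrices per block, so 10,000 instances would need 10 draw calls instead of 1.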

I have benchmarked TBO, UBO, Uniform Arrays and Instanced Arrays. The results were inconclusive due to patchy driver support and bugs. Also, the coding effort vs. speed benefit is not worth it - unless you’re heavily CPU bound and drawing many thousands of simple objects. In the end, the old way (draw each object one at a time) is the quickest!

BTW, Instanced Arrays_EXT is not a technique - it’s just a means to an end. It extends the vertex shader language so you can get a handle on the vertex instance, and you use a new drawElements API to send the geometry.

“Do not bother unless you have concrete needs for it, like super simple geometry to be repeated a very very large number of times.”
I watched the pseudo-instancing demo by NVIDIA. Objects were not ~ enough. When I tried the mode without instancing, the FPS went down very much!!!

I want to know the best way to do instancing because another person and I were arguing about which method is better: ARB_instanced_arrays or instancing with a geometry shader. He said that if we use ARB_instanced_arrays, we cannot use frustum culling (and the FPS would go down in big scenes), but I think the Khronos Group would not have created ARB_instanced_arrays if that were true.

P.S. Do you get my English?

The advantage of the geometry shader for instancing is that it can complement any instancing method. It can be used instead of a software CPU loop for view frustum culling - and as such it would be a really good choice. I don’t think it’s a technique for instancing in its own right - because there is no way to send the geometry in the first place. I’m no expert at geometry shaders!