asynchronous compilation

While I understand that it is currently not possible to compile GLSL shaders offline, is there any way to perform compilation asynchronously? My application has very high performance requirements (i.e. I need to maintain 60Hz rendering performance), but our databases are also large, so I need to page data in dynamically. Currently the geometry is loaded in an asynchronous thread to avoid impacting rendering. Compiling the shaders introduced by freshly paged geometry the first time they’re rendered works, but it can impact the frame rate.

Is it possible to create a dummy OpenGL context for the asynchronous thread and use it to compile / link the shaders that will be used in the main draw thread?

Don’t have a solution for you, but we have precisely the same need as you do.

If OpenGL/GLSL doesn’t provide this by the time we need it, we will switch over to NVidia Cg to get it. We have already started moving our shader infrastructure in that direction.

Besides that, even compilation of our startup shader permutations is crazy-expensive, and it’s senseless to eat that time re-compiling the same shaders on every run, on every single PC in the cluster. I mean seriously, how many of your performance-critical apps invoke the C++ compiler on an end-user’s machine at run-time, and in the foreground thread?

Would prefer an EXT/ARB solution of course, but if we have to go NV-specific to get it, we will. Breaking the 60Hz frame and visually distracting artifacts are the two guaranteed priority-#1 DRs in our world.

At the very least, we need support for compile/optimization/link in a background thread. But ideally we’d also want the ability to fetch the compiled result (vendor assembly or binary blob), store it on disk, and then re-use it later. So after DB publishing, we don’t even need to mess with compiling at run-time while the end user is sitting there watching.
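The cache half of that can be sketched independently of whatever mechanism the driver eventually exposes: key the stored blob by a hash of the shader source plus a driver/GPU identifier, since a vendor blob is only reusable on the exact stack that produced it. The sketch below is illustrative only — `compile_fn` stands in for whatever actually produces the vendor assembly or binary, and the `"NV190.42"` driver string is a made-up example:

```python
import hashlib
import os
import tempfile

CACHE_DIR = tempfile.mkdtemp()  # in practice, a persistent per-driver cache dir

def cache_key(source, driver_id):
    # A vendor blob is only valid on the exact driver/GPU that made it,
    # so the driver identity must be part of the cache key.
    return hashlib.sha1((driver_id + "\0" + source).encode()).hexdigest()

def load_or_compile(source, driver_id, compile_fn):
    path = os.path.join(CACHE_DIR, cache_key(source, driver_id))
    if os.path.exists(path):          # compiled at publish time: just load it
        with open(path, "rb") as f:
            return f.read()
    blob = compile_fn(source)         # the expensive step, done at most once
    with open(path, "wb") as f:       # store for every later run
        f.write(blob)
    return blob

calls = []
def slow_compile(src):
    calls.append(src)                 # track how often we really compile
    return b"BLOB:" + src.encode()

src = "void main(){}"
first  = load_or_compile(src, "NV190.42", slow_compile)
second = load_or_compile(src, "NV190.42", slow_compile)  # served from disk
```

Run once at DB publish time, every later start-up on the same driver hits only the disk path.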

+1 to that.

Cg can do this? Great!
Can somebody tell me how? Just tell me the name of the key function, please :slight_smile:

If OpenGL/GLSL doesn’t provide this by the time we need it, we will switch over to NVidia Cg to get it. We have already started moving our shader infrastructure in that direction.

Here’s a question. All of the GLSL functions take object names, so none of the compiling/linking works through context binds. Can’t you just create a shared context that you use on another thread solely for compiling/linking?

I’ve never tried this, but it might actually work.
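For what it’s worth, the pattern would look roughly like the sketch below (in Python for brevity, with the GL-specific calls left as comments). The worker thread owns a second context created to share objects with the render context (e.g. via wglShareLists, or the share argument of glXCreateContext); it pulls shader sources off a queue, compiles and links them, and hands the resulting program handles back. The `compile_fn` hook stands in for the actual glCompileShader/glLinkProgram calls and is an assumption of this sketch, not a real API:

```python
import queue
import threading

def compile_worker(jobs, results, compile_fn):
    # In a real app this thread would first make its own GL context
    # current -- one created to share objects with the render context
    # (wglShareLists / glXCreateContext's share argument).
    while True:
        name, source = jobs.get()
        if source is None:            # sentinel: shut the worker down
            break
        # Here you would call glCreateShader / glShaderSource /
        # glCompileShader / glLinkProgram; compile_fn stands in for that.
        program = compile_fn(source)
        # With shared objects the program handle is valid in the render
        # thread too; hand it back without blocking the frame loop.
        results.put((name, program))

jobs, results = queue.Queue(), queue.Queue()

def fake_compile(src):
    # placeholder for the driver's compile; returns a "program handle"
    return hash(src) & 0xFFFF

t = threading.Thread(target=compile_worker,
                     args=(jobs, results, fake_compile))
t.start()
jobs.put(("terrain", "void main(){ gl_Position = vec4(0.0); }"))
name, prog = results.get()            # render thread polls this per frame
jobs.put((None, None))                # stop the worker
t.join()
```

The render thread only ever does a non-blocking poll of `results`, so the 60Hz loop never waits on the driver.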

Just tell me the name of the key function, please

There’s no “key function”. Cg isn’t compiled by the OpenGL driver; it’s a stand-alone library. So you can use it as regular old code. Spawn a thread and have it do whatever. Then you feed the results to GL in the main thread (these results going through NV assembly, as compiling Cg to GLSL gains you nothing).

http://nvidia.fullviewmedia.com/GPU2009/0930-gold-1407.html
At 7:30, shader binaries are mentioned as being on the roadmap. Though it may only come after the GTX300 becomes mass-market. All while ATi still doesn’t have 3.2 support.

If Cg is viable for you, why not go nvAsm (+arbAsm if non-nV is among the targets), for fastest shader-load, uniform-upload and everything? (GLSL with Cg semantics, compiled by cgc). Going back to pure GLSL from there isn’t problematic, ime.

Thank you!

Though it may only come after the GTX300 becomes mass-market.

It should be a core extension, just like UBOs and ARB_sync. There’s no reason to limit it by hardware.

I didn’t mean it as being bound to hw; it’s just that it seems like a good time for a new bundle of features.

http://nvidia.fullviewmedia.com/GPU2009/0930-gold-1407.html

That is a very long and boring presentation. But 7:30 is the good stuff. If they can get EXT_separate_shader_objects without the crappy parts, and image/sampler object distinctions, then I’m happy. The other stuff is good too, but those are the features that I really want to see.

They mention the introduction of semantics in GLSL, so maybe it’ll be like Cg-style GLSL (varying vec4 var1 : SLOT7).

That’s what I’m thinking. Cg->NV Asm.

(GLSL with Cg semantics, compiled by cgc). Going back to pure GLSL from there isn’t problematic, ime.

Thanks for that. So GLSL->NV Asm? That’d be convenient. Do you know of an example you could point me to that uses Cg semantics in GLSL?

I do test-compile our ubershaders using cgc -oglsl now, but haven’t loaded GLSL into the runtime to compile and drive.

I’m unclear on what else is needed to do this and have the runtime provide access to all uniforms and vertex attributes, with appropriate vertex-attribute name-to-number mappings. I haven’t seen a doc or tutorial on this.

Something fairly clean would be:


#include "system.h"

varying0(vec3 varN);
varying1(vec2 varCoord);
varying2(vec3 varPos);


#if IS_VERTEX
attribute1(vec3 inN);
attribute2(vec2 inCoord);

uniform mat4 mvpx : C0;
uniform mat4 mvx : C4;

void main(){
	varN = inN;
	varCoord = inCoord;
	varPos = glVertex;
	gl_Position = mvpx * glVertex;
}

#endif

#if IS_FRAGMENT
texunit0(sampler2D tex);

uniform vec4 AllLights[8] : C0;
uniform vec4 SomethingElse : C8;


void main(){
	...
	
	glFragColor = ...;
}
#endif

<--------- cut ----------------><----------- cut ----------------------------->

//==============[ example system.h if Cg-style ]=====================[

#if IS_VERTEX
	#define attribute0(what) attribute what : ATTR0;
	#define attribute1(what) attribute what : ATTR1;
	#define attribute2(what) attribute what : ATTR2;
	...
	
	attribute0(vec4 glVertex);
#endif
#if IS_FRAGMENT
	#define glFragData  gl_FragData
	#define glFragColor gl_FragColor
#endif

#define varying0(what) varying what : TEX0
#define varying1(what) varying what : TEX1
#define varying2(what) varying what : TEX2
...

#define texunit0(what) uniform what : TEXUNIT0
#define texunit1(what) uniform what : TEXUNIT1
#define texunit2(what) uniform what : TEXUNIT2
...
//===================================================================/

//==============[ example system.h if pure GLSL ]=====================[
// your shader-loader has to bind attribs and samplers
#if IS_VERTEX
	#define attribute0(what) in what
	#define attribute1(what) in what
	#define attribute2(what) in what
	...
	
	
	#define varying0(what) out what
	#define varying1(what) out what
	#define varying2(what) out what
	...
	
	attribute0(vec4 glVertex);
#endif
#if IS_FRAGMENT
	#define varying0(what) in what
	#define varying1(what) in what
	#define varying2(what) in what
	...
	
	#ifndef FRAG_OUT_SIZE
		out vec4 glFragColor;
	#else
		out vec4 glFragData[FRAG_OUT_SIZE];
		#define glFragColor glFragData[0]
	#endif
#endif

#define texunit0(what) uniform what
#define texunit1(what) uniform what
#define texunit2(what) uniform what
...
//====================================================================/


It’s all about setting those :TEXn, :ATTRn, :TEXUNITn semantics. The pure-GLSL way would require some nasty uniforms-handling, though. I prefer to pack uniforms into a mat4[] array. I’ll probably make a code-preprocessor to do this automatically for me at some point (decoding a C-style struct into those :C0 semantics for Cg-style, or #define’s for GLSL-style).
It’s not exactly a rosy road, but at least it’s not too hard, either.
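For the pure-GLSL flavour, the loader-side bookkeeping boils down to a fixed name-to-slot table applied with glBindAttribLocation before linking and glUniform1i after, mirroring the :ATTRn / :TEXUNITn semantics of the Cg-style system.h. A minimal sketch of that table (the GL calls appear only as comments; the table contents are assumptions matching the example above):

```python
# Fixed slot assignments mirroring the attribute0/texunit0 macros above.
ATTRIB_SLOTS  = {"glVertex": 0, "inN": 1, "inCoord": 2}
SAMPLER_UNITS = {"tex": 0}

def bind_locations(program, bind_attrib, set_sampler):
    # bind_attrib(program, slot, name) stands in for
    # glBindAttribLocation (must run before glLinkProgram);
    # set_sampler(program, name, unit) stands in for
    # glGetUniformLocation + glUniform1i (after linking).
    for name, slot in sorted(ATTRIB_SLOTS.items(), key=lambda kv: kv[1]):
        bind_attrib(program, slot, name)
    for name, unit in SAMPLER_UNITS.items():
        set_sampler(program, name, unit)

# Exercise the sketch with recording callbacks instead of real GL calls.
bound = []
bind_locations(7,
               lambda p, s, n: bound.append(("attr", s, n)),
               lambda p, n, u: bound.append(("tex", n, u)))
```

The same table can then drive both shader flavours: emitted as :ATTRn/:TEXUNITn semantics for the Cg-style header, or fed to the loader for the pure-GLSL one.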

They mention the introduction of semantics in GLSL, so maybe it’ll be like Cg-style GLSL (varying vec4 var1 : SLOT7).

Wow, that’s hideous. Why not make it be “in vec4 var1: 7;”? The “semantic” doesn’t need to be a string or anything special; it’s just a resource number.

Thank you, Ilian! Good stuff.

Yeah, the same mention was made at the SIGGRAPH '09 BOF (pg 12/45 of the 4th presentation here). However, there it was specifically caveated that these were things they might do (were under consideration), not things they would do.

Speaking of which, these streaming audio presentations NVidia’s posting (like the above) … has anybody found whether/where NVidia is posting just the PPTs or PDFs for the presentations? Generally I’d prefer a PPT/PDF, with only an occasional desire for the streaming audio. Much faster to skim/absorb.