NVIDIA releases OpenGL 4.0 drivers

Are there any plans to support the GL_ARB_shading_language_include extension in the near future?

Seems to me it’s pretty trivial to implement such functionality in the text reading portion of your code. Why do you need an OpenGL extension for it?

It is easy to implement if your source doesn’t nest includes, and especially if your source doesn’t bracket #includes with #ifs based on expressions. Bring those idioms into the mix and the problem grows in difficulty…
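To make the point about nesting concrete, here is a minimal sketch of a text-level include expander in C++. Everything in it is an assumption for illustration: the names `SourceMap` and `expand_includes` are hypothetical, and the sources live in an in-memory map rather than on disk. It handles nested includes and detects cycles, but it deliberately does not evaluate #if/#ifdef, which is exactly the part that makes a complete solution hard.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical in-memory "file system": include name -> source text.
using SourceMap = std::map<std::string, std::string>;

// Expands lines of the form  #include "name"  by textual substitution.
// `active` holds the names currently being expanded so cycles are detected.
// #if/#ifdef are NOT evaluated: an #include inside a false branch is still
// expanded. Handling that correctly needs a real preprocessor.
inline std::string expand_includes(const std::string& source,
                                   const SourceMap& files,
                                   std::set<std::string>& active)
{
    const std::string directive = "#include \"";
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos ||
            line.compare(start, directive.size(), directive) != 0) {
            out << line << '\n';  // ordinary line, copied through unchanged
            continue;
        }
        const std::size_t name_begin = start + directive.size();
        const std::size_t name_end = line.find('"', name_begin);
        if (name_end == std::string::npos)
            throw std::runtime_error("malformed #include: " + line);
        const std::string name = line.substr(name_begin, name_end - name_begin);
        const SourceMap::const_iterator it = files.find(name);
        if (it == files.end())
            throw std::runtime_error("include not found: " + name);
        if (!active.insert(name).second)
            throw std::runtime_error("recursive include: " + name);
        out << expand_includes(it->second, files, active);  // nested expansion
        active.erase(name);
    }
    return out.str();
}

// Convenience overload that starts with an empty set of active includes.
inline std::string expand_includes(const std::string& source,
                                   const SourceMap& files)
{
    std::set<std::string> active;
    const std::string result = expand_includes(source, files, active);
    assert(active.empty());  // every nested expansion has unwound
    return result;
}
```

Calling `expand_includes(shader_text, files)` returns the flattened source. Note that line numbers in compiler error messages no longer match the original files after flattening, which is another thing a driver-side implementation can handle better.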

GLSL already has a preprocessor, the include feature just makes it more complete.

I have a custom preprocessor based on boost::wave, but that approach gets really complicated when GLSL built-in macros are used. I am asking because I want to get rid of the custom preprocessor.

What I think is still missing is a way to pass custom macros to the shader preprocessor (e.g. MY_CUSTOM_NUM_LIGHTS). I want to get away from fiddling with the shader source based on certain assumptions.

Can’t you just declare:

#include "internal.h"

at the top of your shaders, then use glNamedStringARB to pass in your parameters? (i.e. generate an include file at runtime)

Regards
elFarto
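A rough sketch of the runtime-generated include idea in C++ follows. The helper `make_define_block` and the include name "/internal.h" are hypothetical. The GL calls appear only in a comment because they need a current context and a driver that exposes GL_ARB_shading_language_include; as far as I understand the extension, the shader also has to enable it with `#extension GL_ARB_shading_language_include : require`.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Builds the text of a generated include file from macro name -> value pairs.
// The result contains one "#define NAME VALUE" line per entry.
inline std::string make_define_block(const std::map<std::string, int>& defines)
{
    std::ostringstream out;
    for (const auto& entry : defines) {
        assert(!entry.first.empty());  // a macro needs a name
        out << "#define " << entry.first << ' ' << entry.second << '\n';
    }
    return out.str();
}

// With a current GL context and the extension available, the text could be
// registered and used roughly like this (not compiled here; "/internal.h" is
// a made-up name, and named-string paths must start with '/'):
//
//   std::map<std::string, int> defines;
//   defines["MY_CUSTOM_NUM_LIGHTS"] = 4;
//   const std::string text = make_define_block(defines);
//   glNamedStringARB(GL_SHADER_INCLUDE_ARB, -1, "/internal.h",
//                    static_cast<GLint>(text.size()), text.c_str());
//   const char* search_path = "/";
//   glCompileShaderIncludeARB(shader, 1, &search_path, nullptr);
```

The point of this design is that the shader source itself never changes: only the named string is regenerated when a parameter such as the light count changes, followed by a recompile.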

http://blogs.nvidia.com/ntersect/2010/05/introducing-the-new-256-driver-release.html

OpenGL 4.0 – While we currently support OpenGL 4.0 in developer drivers, Release 256, brings full OpenGL 4.0 support to our unified consumer drivers. GeForce GTX 400 series customers can immediately take advantage of the tessellation support in OpenGL 4.0 by downloading Unigine’s latest release of their Heaven benchmark, version 2.1, which adds support for OpenGL 4.0 tessellation and 3D Vision technology. GeForce GTX 400 series GPUs are tessellation monsters, feed them highly tessellated objects and they’ll chew them up at an incredible speed.

very nice drivers.

but, as it always is after new releases, the requests: please support the GL_ARB_shading_language_include extension in the near future.

hi,
i found that the following calls crash on the latest OpenGL 4.0 drivers (257.15) using windows 7.


int                 num_comp_routines = 0;
scoped_array<int>   comp_routines;
glGetActiveSubroutineUniformiv(_gl_program_obj, GL_FRAGMENT_SHADER, i, GL_NUM_COMPATIBLE_SUBROUTINES, &num_comp_routines);

if (0 < num_comp_routines) {
    comp_routines.reset(new int[num_comp_routines]);
    glGetActiveSubroutineUniformiv(_gl_program_obj, GL_FRAGMENT_SHADER, i, GL_COMPATIBLE_SUBROUTINES, comp_routines.get());
}

both calls to glGetActiveSubroutineUniformiv crash with an access violation in nvoglv64.dll, i tried a very large fixed number for the number of compatible routines to get around the first crash but the second also crashed…

-chris

ok,
after some trying to work around this issue i think subroutines are just broken in current nvidia drivers.

shader snippet:


subroutine vec4 generate_output_color(in vec2 cp_tc, in vec2 pp_tc, in vec4 cp_col, in vec4 pp_col, in float b);

subroutine uniform generate_output_color output_generator;

subroutine (generate_output_color)
vec4 output_blended_coordinate(in vec2 cp_tc, in vec2 pp_tc, in vec4 cp_col, in vec4 pp_col, in float b)
{
    return (mix(vec4(cp_tc, 0.0, 1.0), vec4(pp_tc, 0.0, 1.0), b));
}

subroutine (generate_output_color)
vec4 output_blended_color(in vec2 cp_tc, in vec2 pp_tc, in vec4 cp_col, in vec4 pp_col, in float b)
{
    return (mix(cp_col, pp_col, b));
}
...
    return (output_generator(a, b, c, d, e));

as i said in my last post, i am unable to use the reflection api to retrieve all the information i need.

so i tried the direct way:


unsigned  rl = glGetSubroutineIndex(_gl_program_obj, GL_FRAGMENT_SHADER, "output_blended_color");
rl = glGetSubroutineIndex(_gl_program_obj, GL_FRAGMENT_SHADER, "output_blended_coordinate");

in every case glGetSubroutineIndex returns complete garbage. and when trying to force an index (0 or 1) on the subroutine uniform (for which i easily get the location using glGetSubroutineUniformLocation), i get an invalid value gl error.

if someone got subroutines to work, please let me know if i did something wrong.

regards
-chris

Subroutines work just fine with 257.15 drivers on WinXP x32!

Indices returned by glGetSubroutineIndex are not 0 and 1. glGetSubroutineUniformLocation returns 0 if there is one subroutine uniform, and the glGetSubroutineIndex calls return 2 and 1 respectively (the reverse of the order in which the subroutines are defined in the shader). Don’t ask me why; I still have to figure that out myself. Your problem makes me curious to see what values are returned.
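Since the indices should not be assumed, one way to stay independent of what a driver hands out is to look every subroutine up by name and use whatever comes back. The C++ sketch below is only an illustration: `IndexLookup` is a hypothetical stand-in for glGetSubroutineIndex so the code compiles without a GL context, and `resolve_subroutines` and `make_selection` are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Same value as GL_INVALID_INDEX in the OpenGL headers.
const std::uint32_t kInvalidIndex = 0xFFFFFFFFu;

// Stand-in for glGetSubroutineIndex(program, shadertype, name).
using IndexLookup = std::function<std::uint32_t(const std::string&)>;

// Resolves each subroutine name to the index the driver reports and skips
// names the driver does not know. No assumption is made about the values.
inline std::map<std::string, std::uint32_t>
resolve_subroutines(const std::vector<std::string>& names,
                    const IndexLookup& lookup)
{
    std::map<std::string, std::uint32_t> table;
    for (const std::string& name : names) {
        const std::uint32_t index = lookup(name);
        if (index != kInvalidIndex)
            table[name] = index;
    }
    return table;
}

// Builds the array passed to glUniformSubroutinesuiv: one entry per
// subroutine uniform location (GL_ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS),
// where entry [location] holds the chosen subroutine index.
// Simplification: all other entries are left at 0; a real program has to
// fill in a valid index for every location.
inline std::vector<std::uint32_t>
make_selection(std::size_t num_locations, std::size_t location,
               std::uint32_t index)
{
    assert(location < num_locations);
    std::vector<std::uint32_t> selection(num_locations, 0u);
    selection[location] = index;
    return selection;
}
```

In a real program the lookup would wrap glGetSubroutineIndex, and the resulting vector would go to glUniformSubroutinesuiv(GL_FRAGMENT_SHADER, count, data) while the program is current. As far as I know, passing a count that differs from GL_ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS is one way to get an invalid value error.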

I’ll install Win7 x64 on the same machine and tell you if there is any problem with x64 implementation of the drivers.

Chris, I think I have discovered what is your problem!

A few hours ago I installed Win7 x64, and surprisingly shader subroutines … work perfectly. :)

I tried to reproduce symptoms like yours by making intentional errors in the shader code. After a while it happened. Then I checked your code again. In the code fragment you posted there is no definition of the subroutine uniform variable. This kind of error should be reported by the GLSL compiler. You have probably overlooked the error messages reported by the compiler.

If a shader does not compile correctly, then the program cannot be linked either. In that case the values returned by functions like glGetSubroutineIndex are undefined.

The only thing that does not match the spec is the indexing in the NV drivers: it is 1-based instead of 0-based. Furthermore, it seems that some kind of stack is used for storing the function names, because the order of the indices is inverted.

hi,
thanks for the help. i am in contact with nvidia about this. the shader compiles and links without any errors or warnings.

I discovered that the functions crash when the program is not bound to the current state. According to the spec this should not be necessary for the reflection API. But I am still not able to use the GetActiveSubroutineName and -Index functions. I am currently not at my workstation. I will post a complete simple shader that shows these errors every time.

-chris

there is a subroutine uniform variable declared (line 3 in my above post).

you can find my test code in this posting. i am able to retrieve all subroutine uniform names and the count of subroutines in the shader stages. but when trying to retrieve the function names and indices the x64 driver does not work correctly at the moment (error for indices confirmed by nvidia).

in [1] you can find my code to retrieve the active subroutines in the fragment shader. glGetActiveSubroutineName does not return anything useful and raises an invalid value error. as you said the indices currently are 1-based instead of 0-based. so i tried passing i+1 to glGetActiveSubroutineName but the problem persists on my end.

in [2] i attempt to retrieve the compatible subroutines for a specific subroutine uniform. the indices returned from glGetActiveSubroutineUniformiv with GL_COMPATIBLE_SUBROUTINES are 1 and 2 in my test case. but again when passing these indices to glGetActiveSubroutineName i get the same behavior as before.

[1] retrieve active subroutines


int act_routines = 0;
int act_routine_max_len = 0;
char*  temp_name = 0;
glapi.glGetProgramStageiv(_gl_program_obj, GL_FRAGMENT_SHADER,
                          GL_ACTIVE_SUBROUTINES, &act_routines);
glapi.glGetProgramStageiv(_gl_program_obj, GL_FRAGMENT_SHADER,
                          GL_ACTIVE_SUBROUTINE_MAX_LENGTH, &act_routine_max_len);
if (act_routine_max_len > 0) {
    temp_name = new char[act_routine_max_len + 1]; // reserve for null termination
}
for (int i = 0; i < act_routines; ++i) {
    std::string         actual_routine_name;
    unsigned            actual_routine_index = 0;

    int ret_size = 0;

    glapi.glGetActiveSubroutineName(_gl_program_obj, GL_FRAGMENT_SHADER,
                                    i, act_routine_max_len, &ret_size, temp_name);
    gl_assert(glapi, program::retrieve_uniform_information() after retrieving subroutine info);

    actual_routine_index =
        glapi.glGetSubroutineIndex(_gl_program_obj, GL_FRAGMENT_SHADER,
                                   temp_name);
    gl_assert(glapi, program::retrieve_uniform_information() after retrieving subroutine info);
}
delete [] temp_name;

[2] retrieve compatible subroutines


// i is the index of the current subroutine uniform
// compatible routines
glapi.glGetActiveSubroutineUniformiv(_gl_program_obj, GL_FRAGMENT_SHADER,
                                     i, GL_NUM_COMPATIBLE_SUBROUTINES, &num_comp_routines);

if (0 < num_comp_routines) {
    comp_routines.reset(new int[num_comp_routines]);
    glapi.glGetActiveSubroutineUniformiv(_gl_program_obj, GL_FRAGMENT_SHADER,
                                         i, GL_COMPATIBLE_SUBROUTINES, comp_routines.get());
}

for (int r = 0; r < num_comp_routines; ++r) {
    // here comp_routines contains 1 and 2
    glapi.glGetActiveSubroutineName(_gl_program_obj, GL_FRAGMENT_SHADER,
                                    comp_routines[r], max_act_routine_len, 0, temp_name.get());
}

vertex shader


#version 400 core

out vec3 normal;
out vec2 texture_coord;
out vec3 view_dir;

uniform mat4 projection_matrix;
uniform mat4 model_view_matrix;
uniform mat4 model_view_matrix_inverse_transpose;

layout(location = 0) in vec3 in_position;
layout(location = 1) in vec3 in_normal;
layout(location = 2) in vec2 in_texture_coord;

void main()
{
    normal        =  normalize(model_view_matrix_inverse_transpose * vec4(in_normal, 0.0)).xyz;
    view_dir      = -normalize(model_view_matrix * vec4(in_position, 1.0)).xyz;
    texture_coord = in_texture_coord;

    gl_Position = projection_matrix * model_view_matrix * vec4(in_position, 1.0);
}

fragment shader


#version 400 core

in vec3 normal;
in vec2 texture_coord;
in vec3 view_dir;

uniform vec3    light_ambient;
uniform vec3    light_diffuse;
uniform vec3    light_specular;
uniform vec3    light_position;

uniform vec3    material_ambient;
uniform vec3    material_diffuse;
uniform vec3    material_specular;
uniform float   material_shininess;
uniform float   material_opacity;

layout(location = 0) out vec4        out_color;

subroutine vec3 color_me(in vec3 col);
subroutine uniform color_me color_sample;

subroutine (color_me)
vec3 phong_light(in vec3 col)
{
    vec4 res;
    vec3 n = normalize(normal);
    vec3 l = normalize(light_position); // assume parallel light!
    vec3 v = normalize(view_dir);
    vec3 h = normalize(l + v);

    return (  light_ambient * material_ambient
            + light_diffuse * col * max(0.0, dot(n, l))
            + light_specular * material_specular * pow(max(0.0, dot(n, h)), material_shininess));
}

subroutine (color_me)
vec3 const_color(in vec3 col)
{
    return (col);
}

void main()
{
    vec4 res;

    res.rgb = color_sample(material_diffuse);
    res.a = material_opacity;

    out_color = res;
}

I’m sorry, I overlooked the declaration.
Yes, it looks fine.
You should replace this line:

for (int i = 0; i < act_routines; ++i) {

with

for (int i = 1; i <= act_routines; ++i) {

because currently on NV the range is [1…GL_ACTIVE_SUBROUTINES].

I’ll try to recompile my code (and yours too) with VS2008/2010 x64 to see if it works. I did try it on Win7 x64, but the application was compiled as a 32-bit binary.

that is the point ;), the x64 part of the ICD has some bugs.

as I wrote, I tried i+1 as index with the same behavior…

-chris

Hi, I get an “undefined variable” error when using memoryBarrier() (from EXT_shader_image_load_store) in a fragment program with NVIDIA 257.29/258.49 drivers. Is it a known unimplemented feature in R256?
Other features (image load/store, atomics) work correctly.

Thanks

I notice that on an 8600M with ForceWare 257.21 on 64-bit Windows 7, EXT_shader_image_load_store is not a supported extension. Is this correct?

We could sure use it.

Yes, EXT_shader_image_load_store is Fermi only, you need a GTX4xx to use it (or an ATI HD 5xxx, but not sure if it is supported in current drivers).
