OpenGL + parallelism shader

Today you can read everywhere about new mobile GPUs like Tegra and the number of vertex and fragment shaders they have.

For example, Tegra can only use OpenGL, with no CUDA and no OpenCL. I ask myself how to use more than one vertex
and fragment shader. In OpenGL the programmer only has to write code for the vertex and fragment shader, but
he does not get to choose any options for using more shaders. On the other hand, CUDA is made for parallel programming
and it is easy to use.

But my main question is: which part of a modern OpenGL system is suited to parallel execution on the shaders?

You have posted this question in the drivers section.
Please don’t cross-post! Have you read the posting guidelines?

What do you mean by more shaders? The other kinds of shaders like geometry and tessellation? Those aren’t available on OpenGL ES. As for GPGPU on mobile devices, you can use the ‘old’ way and use the fragment shader for this. Some mobile GPUs (e.g. ImgTec SGX) would be capable of OpenCL (embedded), but there are no drivers available right now.

You write your vertex and fragment shaders with one vertex/fragment in mind, and the driver will run them on as many cores as it has available, without you needing to know how much is done in parallel and how much sequentially.
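To make the "one fragment in mind" idea concrete, here is a minimal Python sketch. It is not OpenGL: `fragment_shader`, `draw`, and the `cores` parameter are hypothetical names, and a thread pool merely stands in for whatever scheduling the driver and hardware really do. The point it illustrates is only that the shader code stays the same regardless of how many processing units run it.

```python
from concurrent.futures import ThreadPoolExecutor


def fragment_shader(coord):
    """Written with ONE fragment in mind: maps a pixel coordinate to a colour."""
    x, y = coord
    return (x / 4.0, y / 4.0, 0.0)  # simple red/green gradient


def draw(shader, width, height, cores):
    """Stand-in for the driver: runs `shader` once per fragment on `cores` workers."""
    coords = [(x, y) for y in range(height) for x in range(width)]
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # map() returns results in input order, whichever worker computed them
        return list(pool.map(shader, coords))


# Same shader, different "hardware": only the worker count changes.
image_1_core = draw(fragment_shader, 4, 4, cores=1)
image_4_cores = draw(fragment_shader, 4, 4, cores=4)
```

In this sketch both calls produce the same image; the number of workers affects only how the work is spread out, which is the part the graphics API hides from you.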

And yes, please no cross posts.

Thank you for the answer, and sorry for the cross-post. I’ll never do that again.

To my question: I read that Tegra 2 has 4 pixel shaders and 4 vertex shaders, and I am trying to understand how to use more than one pixel shader and vertex shader. When I write OpenGL code I can only program one vertex shader and one pixel shader. So how can I use the 4 pixel shaders? Or does the compiler solve this problem depending on the hardware used?
Thank you for your answers…

These hardware specs describe features used behind the scenes to compute faster; you don’t have to do anything to take advantage of them.

You realise that a single vertex shader is often applied to a large number of vertices at the same time, right? The same goes for the fragment shader.

Ari, you have to understand what shader means.

A shader is a program that executes on a shader processing unit. When you say the GPU has 4 shaders, you actually mean 4 processing units (PUs). On each PU you can execute an arbitrary number of shaders, the same way you can execute numerous procedures on the CPU, but one at a time.

A shader executes on all “elements” (vertices, patch control points, drawing primitives, fragments) being drawn by the current draw command. The driver and the accompanying (controlling) code allocate as many PUs as possible in order to process all “elements” as fast as possible (and in parallel). In a graphics API you cannot control the parallelism of the execution.

So, write your fragment shader (since transform feedback is not available on OpenGL ES, AFAIK) and capture the results in a texture or a framebuffer without worrying about the parallelism. The drivers will do their best. :)
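As a rough illustration of that fragment-shader approach, the following Python sketch emulates it on the CPU for a small matrix product. Everything here is an assumption made for illustration: `matmul_fragment` plays the role of the fragment shader, the nested lists play the role of input and output textures, and `render_to_texture` stands in for drawing a full-screen quad into a framebuffer. On a real GPU the per-fragment function would be written in GLSL and the loop over output elements would be done by the hardware.

```python
def matmul_fragment(row, col, tex_a, tex_b):
    """One 'fragment' invocation: computes a single element of A x B."""
    return sum(tex_a[row][k] * tex_b[k][col] for k in range(len(tex_b)))


def render_to_texture(tex_a, tex_b):
    """Emulates rendering into an output texture, one fragment per result element."""
    rows, cols = len(tex_a), len(tex_b[0])
    # Each output element depends only on the inputs, never on another output,
    # so the invocations could run in any order or all at once.
    return [[matmul_fragment(r, c, tex_a, tex_b) for c in range(cols)]
            for r in range(rows)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
result = render_to_texture(a, b)  # [[19.0, 22.0], [43.0, 50.0]]
```

The design point is that no fragment reads what another fragment writes, which is what lets the driver distribute the work freely.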

Thank you for the explanatory notes.
I think I misunderstood shaders, and the German literature helped me to misunderstand them :)

Is there a way for me to control the PUs without a graphics API?

I think you need to explain what it is you really want to do.

Isn’t the Tegra for mobile systems? And mobile systems do not come with OpenGL.
They have OpenGL ES 1.0, 1.1, or 2.0.

This is the Tegra 3 technical page

and they say that it supports OpenGL ES 2.0.
In that case, you are probably better off asking questions in the forums at (since your question is about GL ES).

Yes, the Tegra is a mobile system, and yes, I know I have to use OpenGL ES, but this forum has more traffic than the OpenGL ES ones. That is why I asked here, and it has succeeded in expanding my basic knowledge. Thank you very much.

@ Kopelrativ: I would like to use a modern GPU for matrix calculations, and I would like to exploit as much parallelism as possible.

You want to use an API that you don’t even understand, and there are pitfalls even for experts. I have to warn you about the precision problems. Graphics hardware and APIs are designed for speed, not for precision. Although the precision can be squeezed out, it requires skill and mathematical knowledge. :)
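To give a feeling for the precision issue, here is a small Python sketch that adds 0.1 a thousand times while rounding every intermediate result to 16-bit, 32-bit, or 64-bit floating point. It assumes, purely for illustration, a shader that computes in something like half precision, which the reduced-precision qualifiers of OpenGL ES 2.0 shaders permit; what a particular GPU actually uses is not claimed here. The helper names `to_half`, `to_single`, and `accumulate` are hypothetical.

```python
import struct


def to_half(x):
    """Round a Python float to IEEE 754 half precision (16 bit) and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to IEEE 754 single precision (32 bit) and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, rounder):
    """Sum `values`, rounding every input and every partial sum with `rounder`."""
    total = 0.0
    for v in values:
        total = rounder(total + rounder(v))
    return total


values = [0.1] * 1000  # the mathematically exact sum is 100

half_error = abs(accumulate(values, to_half) - 100.0)
single_error = abs(accumulate(values, to_single) - 100.0)
double_error = abs(accumulate(values, float) - 100.0)
```

The error grows sharply as the format gets narrower, because once the running total is large, a small addend falls below the spacing between representable values and gets rounded up or down by a large relative amount. Matrix products accumulate sums in exactly this way.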

Those PUs are part of the GPU, so if CUDA or OpenCL is not supported by the drivers, the graphics API is the only alternative.