Additional calculation messes up result

Originally posted by Korval:
That is, in fact, the point of a high level language.
Not if you want to achieve good performance. Do you ignore the hardware when you write C/C++ code? I don’t, therefore my code runs fast. 30 years after the language was introduced you still cannot ignore the hardware if you want to achieve good performance. Code close to the hardware will still run much faster. You can still download optimization guides from Intel’s website telling you everything from how to declare your arrays to how to use intrinsics to get closer to the hardware.

but isn’t the direct analogy more like having to worry about register allocation in compiled c++ code? this is a scenario i think most would agree is preposterous in all but the most critical areas, areas that would probably see some low level, hand crafted asm anyway.

i’d be the first to cut the compiler writers some slack, though. it’s no easy feat. and there’s no question that coders should keep the hardware in mind, good practices, even if it’s only subliminal, as long as it’s documented and predictable (as it is for intel).

Do you ignore the hardware when you write C/C++ code? I don’t, therefore my code runs fast. 30 years after the language was introduced you still cannot ignore the hardware if you want to achieve good performance.
Yes, I frequently “ignore the hardware”. I’m not interested in register counts or structuring my code for best in-order execution. I don’t shirk from C++ features that degrade performance, and only occasionally do I even concern myself with overuse of those features. That’s why I don’t code in assembly. It isn’t for cross-platform stuff, it’s to get away from needing to care about low-level internals.

Oh, and I’ve shipped 3 games. On consoles. 2 of them, by design, ran at 60fps.

I expect compilers to be competent. And if they’re not, I don’t use them. It’s one of the reasons that I avoid glslang like the plague: it’s just not trustworthy. And it never will be until IHVs start making real compilers for it.

This isn’t hard stuff we’re talking about here. We’re not talking about trying to recognize a sin-approximation in software or something; we’re talking about basic compiler optimizations here. I could understand if it were something that was truly difficult or required a week of developer time. But if your compiler writers are at all competent, it shouldn’t take more than a day (tops) for one guy to hook this in.

For God’s sake, the ARB gave ATi everything they could give in terms of the architecture of glslang. Had 3D Labs had their way, everything would be counted in floats, and these exposed float limitations would be required (i.e., if your hardware organizes attributes or uniforms into vec4’s rather than floats, you have to pretend that it doesn’t). They only asked for 1 instance of this: varyings. Where it matters the most because the hardware limits are so significant. It can’t be that hard to count up the number of floats for varyings and assign them as needed.
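To make "count up the floats and assign them as needed" concrete, here is a minimal sketch of packing varyings into vec4 interpolator slots with a first-fit-decreasing pass. It is only an illustration: the function name `pack_varyings`, the varying names, and the slot counts are all hypothetical, and this is not how any vendor's driver actually does it. Real packing rules are stricter than this.

```python
# Component counts for the GLSL varying types used in this sketch.
COMPONENTS = {"float": 1, "vec2": 2, "vec3": 3, "vec4": 4}


def pack_varyings(varyings, num_slots):
    """Assign (name, type) varyings to vec4 slots, first-fit decreasing.

    Returns {name: (slot_index, first_component)}, or None if the
    varyings do not fit in num_slots vec4 interpolators.
    Hypothetical helper for illustration, not a real driver algorithm.
    """
    free = [4] * num_slots  # free components left in each vec4 slot
    layout = {}
    # Place the widest varyings first so the narrow ones can fill gaps.
    for name, gl_type in sorted(varyings, key=lambda v: -COMPONENTS[v[1]]):
        size = COMPONENTS[gl_type]
        for slot in range(num_slots):
            if free[slot] >= size:
                layout[name] = (slot, 4 - free[slot])
                free[slot] -= size
                break
        else:
            return None  # out of interpolators
    return layout


# Six varyings, 12 floats in total (made-up example shader interface).
varyings = [
    ("fog", "float"),
    ("normal", "vec3"),
    ("uv", "vec2"),
    ("light_dir", "vec3"),
    ("spec", "float"),
    ("uv2", "vec2"),
]

packed = pack_varyings(varyings, 3)   # fits: 12 floats in 3 vec4 slots
too_small = pack_varyings(varyings, 2)  # None: 12 floats cannot fit in 8
```

Giving each varying its own vec4 would need six interpolators here, while counting floats needs only three. That gap is the whole complaint.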

Basically, I’m just accusing ATi of incompetence. But, then again, that’s nothing new; I do it every time they do something stupid :smiley:

Though if nVidia compilers can’t do any better, then they too are incompetent…

No vendor’s compiler is optimal in all cases, and none will ever be.

You are talking about GLSL compilers, but I guess you are not aware that the asm compilers can have issues as well. I wrote some ARB_vp/fp for a pet project a long time ago. It ran beautifully on ATI and then one day, it performed really badly after a driver update. I didn’t bother with it because I had other things to do.

Secondly, I have encountered the case where my ARB_vp/fp was performing better than the GLSL equivalent.
It’s disappointing. Even after multiple driver updates, it happens.

Giving us the ability to hand feed the GPU low level shaders (that the driver won’t attempt to screw up) might not be a bad idea.
Having the ability to give compiler flags would be nice.

Originally posted by bonehead:
but isn’t the direct analogy more like having to worry about register allocation in compiled c++ code?
In a way yes, but the difference is that a GPU is much more of a fixed platform than the CPU. If the compiler does a poor job on register allocation on the CPU for some piece of code it will still be able to run it, it just means it will resort to system memory more often and thus run slower. Today’s GPU doesn’t have that ability. If you’re out of interpolators, temporaries, samplers, attributes, constants, instructions or any other resource, there’s no option other than going to software. The good news is that compilers do get better and hardware gets better capabilities as well, so this will be less of a problem in the future.
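A toy model of that difference, as a rough sketch only: the function names, resource names, and limit values below are invented for illustration and do not correspond to any real chip or driver. On the CPU side an over-budget allocation degrades gracefully by spilling; on the GPU side, exceeding any single limit means no hardware path at all.

```python
def allocate_cpu(values_needed, registers):
    """CPU-style allocation: whatever does not fit is spilled to memory.

    The code always runs; spills only make it slower.
    """
    in_registers = min(values_needed, registers)
    return {"runs": True, "spilled": values_needed - in_registers}


def allocate_gpu(usage, limits):
    """GPU-style allocation as described above: hard limits, no spilling.

    usage and limits map a resource name to a count. Exceeding any
    limit means the shader cannot run in hardware.
    """
    exceeded = sorted(name for name, n in usage.items() if n > limits.get(name, 0))
    return {"runs_in_hardware": not exceeded, "exceeded": exceeded}


# Hypothetical limits and shader usage, chosen only for the example.
limits = {"interpolators": 8, "temporaries": 32, "samplers": 16}
shader = {"interpolators": 9, "temporaries": 20, "samplers": 4}

cpu_result = allocate_cpu(values_needed=12, registers=8)
gpu_result = allocate_gpu(shader, limits)
```

In this made-up case the CPU result still runs with four spilled values, while the GPU result is rejected because of one extra interpolator, even though every other resource is well under its limit.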

Originally posted by bonehead:
i’d be the first to cut the compiler writers some slack, though. it’s no easy feat. and there’s no question that coders should keep the hardware in mind, good practices, even if it’s only subliminal, as long as it’s documented and predictable (as it is for intel).
Absolutely, I agree. I’m not saying you should be thinking in assembly, but at least having some high level understanding of how the hardware works will help you write fast code.

I would have to agree with Humus, although I find ATi guilty of writing crappy GL drivers (and sometimes nVidia as well, but they are much better than ATi). GPU compilers are still in their infancy. Consider yourself programming in C/C++ some 15 (or more) years back! You can’t expect them to magically produce a compiler that generates optimal code. Since C++ compilers have matured so much over the past many years, we have started taking them for granted. But, as some of you may know, the VC compiler still doesn’t support 100% of C++, AND up to VC 6.0 their template support was so ****ty that one was better off not using them!

As for ATi, they really need to get their act together and come up with better drivers. It’s not just a matter of the GLSL compiler, their GL drivers in general suck crap!
