Ok, most of my graphics programming in the past has been straight to hardware. You know, using the programmer's guides to EGA, VGA, SVGA…
Ok, so I am trying to rewrite or improve an old project I wrote almost 15 years back. You would think, wow, that should be easy with newer technology, OpenGL and so on. All I will say is: I wish. So far, massive disappointment. The Z-buffer implementation in OpenGL is, in my opinion, atrocious at best. Great for something written 10+ years back, but not what we could or should be at today.
Anyway, I need to display a massive scene. To give you an idea of scale, say an OpenGL unit of 1 equals 3 feet. A person in game would be height 2, while the earth, for instance, would have a radius of 6,700,000. With the Z-buffer as implemented this is a massive issue, because it does not have the precision and wastes a lot of what it does have. A plain float *zbuffer; would be better than what it is currently.
So the solutions I have come up with so far are based around two possible areas: either a loose octree or a sphere-tree system, along with a cone-shaped view frustum. Whichever tree system I use would also have to handle relative distance versus size, not just whether the object is potentially visible. If you are sitting on Mars looking at Earth, it is highly unlikely you would see the shuttle orbiting Earth even if it was in front of Earth in your view volume.
Basically, because the Z-buffer loses so much precision in the distance, I need to offload that work to the CPU and create a fairly fast way of ensuring no surface will be drawn behind another surface, which would even include back-face culling.
That kind of defeats the point of having a GPU in the first place, if you have to offload so much work to the CPU. Simply making a better Z-buffer would eliminate 90% of the issues and problems with complex and large scenes.
At least the good part is that with multiple processor cores we can farm that work out to them.
So my first question is: what workarounds have others come up with? No, scaling does not fix the problem; what I am describing would require a znear of around 0.1 and a zfar of a billion or more.
Drawing the scene in multiple passes seems pointless, since we already need multipass rendering for a host of other issues. The work just to do the multipass seems to outweigh the work of figuring out what to display and what not to, because you would still have to sort all the objects near to far. And what about objects that straddle both ranges? Then you have to split them up.
My second question: why on earth do we still have such an antiquated Z-buffer? I spent the last several years working in the processor industry, and with the advancements we have gone through it makes no sense to me. The workload it would remove is reason enough on its own, and it could only improve the visual quality of what we have. So what if the frame rate drops from 200+ down to 100 or even 60; you can't see it. You shouldn't be trying to force the entire Z-buffer to fit one scale either; leave the values as pure floats, or preferably 64- to 128-bit doubles.
Well, if you know better, why not design such a chip yourself, with a 128-bit float Z-buffer? Just joking.
To be more precise, it is not OpenGL that locks you to a 24-bit Z-buffer. It is only the hardware.
It has been almost 10 years of 24-bit Z-buffers on consumer cards; I'm not sure why no progress has been made on this front. I guess the depth-test optimisations such as compression, early-Z and hierarchical culling do not scale well with higher precision.
Some remarks:
- back-face culling is done independently from depth testing, so don't worry about it
- doing multipass is not so pointless. You can simply submit the same scene multiple times, with disjoint znear/zfar pairs. Using VBOs will help reduce the performance cost (as the geometry will not change between these passes)
Can you describe roughly this old project ? Maybe more specific hacks are possible ?
I don’t think the Z-buffer is antiquated in any way. It’s done the way it is for a reason. We’re trading non-optimal Z-distribution for simpler and faster hardware. Z is linear in screen-space (rather than eye-space), which makes it cheaper to interpolate the Z values, and very importantly for modern hardware, makes the Z values very compressible, which allows for bandwidth savings optimizations, which would be very hard and costly to implement for W-buffering. For most normal scenes the precision is good enough anyway.
Now if you’re trying to render everything from 0.1 to a billion, then you’re clearly falling outside of what’s normal and what OpenGL was designed for. Maybe using a floating point buffer will help. Don’t forget to reverse Z to get the Z distribution go hand in hand with float value distribution. If you can’t use that, you could implement your own Z-buffer-like behaviour in the shader.
What is "normal" is changing. Look at the vast number of projects where people are attempting planet rendering, or even entire universes, for gameplay.
As I said, it isn't like we need 200fps anyway, and sure as heck not 400fps; 99% of people cannot see flicker above 30fps.
I think we should be doing object culling on the CPU, then surface or face culling and so on on the GPU.
Just look at the large number of first-person shooters trying to go to larger and larger terrains. That alone, or any other game needing large terrain for that matter, makes the point.
Anyway I wasn’t trying to create an argument just stating an opinion.
But I am more interested in what solutions people have tried besides the few bad choices I have seen offered.
One idea I had for the sphere tree was to make it an orbital sphere tree. Objects have one of three primary states: attached (such as attached to a surface), in orbit, or free. I figure most objects in the universe fall into those categories, if not all.
The trick, I think, is to create a scene management / render manager that is intuitive about which objects may or may not be visible, regardless of line of sight. Some stars cannot be seen because they are dim, objects that are small or beyond X distance may not be seen, objects behind other objects or facing away may not be seen, and so on.
Even with such a system I am going to have to use a lot of false movement and scaling to get stuff done, such as moving from one galaxy to another. Even making one galaxy visible from another will require something like a projected background. Once an object comes into range, it is removed from the background and actually displayed. The problem with that is every face on the projected background has to be treated as an object, so that if it sits behind another object it can be culled and modified to display the portion that isn't hidden.
All of which becomes massive, work-wise, but would be extremely simple with a proper Z-buffer.
Anyway, I'm interested in what workarounds people have come up with for this issue.
Well, apparently back-face culling utterly fails when your znear and zfar are too far apart from each other. Otherwise the culled surfaces should never bleed through, and they do.
Why I say multipass is a bad idea: first, it requires you to divide or order your scene from far to near. That is CPU-intensive and time-consuming, especially if you have things moving in the scene, which most games do.
You may already be forced into multipass for things like textures, shading, lighting and so on, depending on the person's card. Adding more passes could take something that would be extremely playable and make it not feasible at all.
The old project was a simple display system. It used a pure float for the Z-buffer. It ran in 256-color mode, so it only used a byte per pixel for the color buffer. Back-face culling was done using float vectors. It didn't use a sphere tree, quadtree or any of that, not even a frustum to cull objects or surfaces. Very simple. It had one purpose: to display the largest objects I could build, from micro detail to massive detail. It wasn't designed to be fast.
I figured that with the advancements we have had in CPU tech I would see what current hardware and APIs are capable of. Kind of disappointing, to be honest. Some things I find awesome: shader tech is great. However, I still see issues with it.
Say you want to use shaders to build some really nice terrain. Great. But then the CPU has to read the height of the land back from the video card, or it can't place objects on it properly; either that, or it has to compute the height itself as well, which may not be as accurate.
Reading from the video card should be a no-no unless copying for video or something.
Personally, I think that fixing the Z-buffer is the single biggest improvement they could make for next-generation games. It would bring the level of detail to an entirely new place and make entire universes feasible with relative ease.
To the OP. I guess you’re a young 'un. You seem to believe that a GRAPHICS API (ie putting pixels on the screen) is responsible for your physics engine. Well, that ain’t gonna happen for a while.
The whole world currently manages quite well with the current limitations. And from what you describe, what you want is extremely easy. You just don't seem to have the ability to ask politely.
I think your inability to ask properly is hindering you more than anything else. You're obviously not English/American, because of your limited ability with the language; that's fine. But simple courtesy will take you a long way. I've already done what you're asking for, and I'll happily share my knowledge if you can ask in a civilised manner.
George, the solution to your problem is to render only the visible objects, recomputing the coordinates on the fly. I fail to see the problem with terrain rendering and the Z-buffer. Of course you don't just feed the whole terrain into the renderer. But if you render only the visible part with appropriate coordinates, even 24-bit Z-buffer precision should be enough.
Many of us write projects that struggle with real-time performance on high-end hardware. I wouldn't want the API to change so that molecule-to-universe-scale applications can render effortlessly at the expense of rendering performance. We do have floating-point Z-buffers available in hardware these days; that, plus a reversed-Z mapping, actually gives better precision than a linear distribution would. We don't have doubles though, if that's what you require. The first hardware to support doubles in any shape or form was released very recently, but that's in the shaders only. I doubt we'll see double Z-buffers anytime soon.
Actually, I don’t think you’ve defined what a “proper z-buffer” is in your opinion.
Backface culling is done before rasterization, so Z-buffering has nothing to do with any artifacts you may see.
The only reason I can see why that would happen is because your projection matrix might get messed up if you go really extreme, but then I’d be surprised if your rendering looks right otherwise. Have you tried reversing your depth range and using GL_GREATER for your depth test? Even if you use a fixed point z-buffer this helps with precision (at the expense of somewhat worse z-culling performance).
My instinct says it would be faster to just add the z components of the triangle's three normal vectors together and compare the result against 0.
Calculating clockwise or counter-clockwise winding is more computationally intensive than that.
My idea assumes that back-face culling (skipping the rasterization of triangles based on whether they face the camera) happens after all vertices and vectors have been transformed into screen space.
It may be faster, but it would also be unreliable as there’s no guarantee that the provided normals really match the input geometry. Any form of smoothed normals would be prone to artifacts. Besides, in the age of vertex shaders, the shader may not be outputting any normal at all.
sqrt[-1]: There is no project in existence on a PC that can traverse from one planet to another down to 1 meter, or a yard, or less.
I am familiar with those few works; none of them render detail at, or move down to, less than 1 meter.
The new extension may just work for this project. But it isn't standard; in fact, it is written against the standards, which I can't blame them for. All of which means large compatibility problems, and it can't be used on other manufacturers' cards.
Most people's speed problems come from trying to get the work done on the CPU and then getting the correct data to the display. In all likelihood a lot of projects would become simpler, and therefore faster.
As to the Z-buffer and back-face culling: I thought, as I guess most people do, that when an object's back faces were culled, they were removed. Apparently the back-face culling is tied into the Z-buffer system on both Nvidia and ATI cards.
The so-called artifacts you claim I saw are, as I said, the reverse-facing surfaces blended into the front-facing surfaces.
I made sure of this by reducing the scale of the object and rendering it normally, then turning blending on; the two results were identical.
When the object is scaled down it displays perfectly. When enlarged to the proper scale, back-face culling fails and the front and back surfaces are blended.
The solution was to remove the back faces before sending the object to the display routine.
To make the point: the current standard Z-buffer is fixed-point and perspective-oriented, meaning more bits are used up in the near field of view than the far, which is why the errors I am talking about happen.
Xmas: I tried pretty much everything to localize the failure and ensure it was the Z-buffer, and not my own code or coordinates, that was the problem.
Screen-space winding. For those of you who didn't read the name of the term: it says screen-space winding, not world-coordinate winding. My point: yes, the geometry has been transformed to screen space at that point, and with the lack of depth precision at far distances, the test fails.