I studied the book “3D Engine Design for Virtual Globes” and learned a lot about rendering a planet with the RTE (relative-to-eye) method, which eliminates the jitter caused by 32-bit float imprecision. However, the book says that the RTE method only works with static vertices (no rotation). I tried it with planet rotation and it did not work: the jitter is still there. I think I have now found a solution: each tick, on the CPU, apply the rotation matrix to each tile's vertices in double precision and split each result into two single-precision values. Then, on the GPU, subtract the camera position from the vertices and apply the final RTE matrix. I picked up the idea of a per-tile world matrix call for eliminating jitter from the Orbiter OVP client source code. I have not implemented any of this in my program yet.

Does anyone have alternative methods (other than the book’s RTE methods) for large-world rotation? What technique does KSP use for rendering its large worlds?

OpenGL 4.x now supports double precision on the GPU, but GPU vendors do not provide full-speed FP64 hardware on consumer cards; FP64 performance is nerfed. If they implemented full-rate FP64 hardware, it would improve performance a lot and eliminate both the jitter issues and the complicated calculations.

It’s unlikely that anyone else here has that book. However, the general approach for dealing with large scenes is always the same: calculate the position of the chunk origin relative to the viewpoint using double-precision arithmetic and use single-precision for everything else.

You aren’t going to see a “graphics” card which has no speed penalty for double precision any time soon. A double-precision multiply requires either four times as many cycles or four times as many gates as a single-precision multiply. Using four times as many gates for the ALU basically means “wasting” gates that could have been used for more cores, or a larger cache, or something else which is going to have a greater impact upon graphics than double-precision multiplies. The only GPUs with fast double-precision arithmetic are those designed for numerical computation rather than graphics.

Represent your MODELING and VIEWING transforms with double-precision.

During your cull traversal, for each object…
a. Combine MODELING and VIEWING matrices into a MODELVIEW matrix using double-precision.
b. Convert this matrix to single-precision for uploading to the GPU.
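Steps a and b could look roughly like the sketch below. It assumes column-major 4x4 matrices stored in `std::array` (the layout OpenGL expects by default); `Mat4d`, `Mat4f`, `translation`, `multiply`, `toFloat` and `modelViewForUpload` are hypothetical helper names, not part of any particular library.

```cpp
#include <array>

using Mat4d = std::array<double, 16>;  // column-major, double precision
using Mat4f = std::array<float, 16>;   // column-major, single precision

// Translation matrix in column-major layout (translation in elements 12..14).
inline Mat4d translation(double x, double y, double z) {
    return Mat4d{1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  x, y, z, 1};
}

// c = a * b, computed entirely in double precision.
inline Mat4d multiply(const Mat4d& a, const Mat4d& b) {
    Mat4d c{};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k) {
                sum += a[k * 4 + row] * b[col * 4 + k];
            }
            c[col * 4 + row] = sum;
        }
    }
    return c;
}

// Narrow to float only after the large translations have cancelled.
inline Mat4f toFloat(const Mat4d& m) {
    Mat4f out{};
    for (int i = 0; i < 16; ++i) {
        out[i] = static_cast<float>(m[i]);
    }
    return out;
}

// Step a (combine in double) followed by step b (convert for upload).
inline Mat4f modelViewForUpload(const Mat4d& viewing, const Mat4d& modeling) {
    return toFloat(multiply(viewing, modeling));
}
```

The resulting array can be passed to `glUniformMatrix4fv` with `transpose` set to `GL_FALSE`. With a modeling translation of 6378137.25 and a viewing translation of -6378130.0 along x, the uploaded matrix carries a translation of exactly 7.25, which single precision represents without loss.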

With this, you don’t need/care about fp64 support/performance on the GPU, and you get the precision you need when combining the MODELING and VIEWING transforms.

Rotation isn’t the problem. Large translations (in MODELING and VIEWING transforms) are the problem.

Specifically, large translations which cancel out to give a small translation (i.e. both object and viewpoint are far from the origin but close to each other).

The other problem is if vertex positions are much farther (orders of magnitude farther) from the origin of their coordinate system than they are from the viewpoint, which is why you need to split the data into chunks with the origin within the chunk. This is essentially the same as having a modelling transformation with a large translation “baked” into the vertex positions.
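One way to do that split is sketched below. It assumes the source data is available in double precision and picks the centroid as the chunk origin; the centre of the bounding box would work just as well. `Vec3d`, `Vec3f`, `Chunk` and `makeChunk` are hypothetical names for illustration.

```cpp
#include <vector>

struct Vec3d { double x, y, z; };
struct Vec3f { float x, y, z; };

// A chunk keeps its origin in double precision and its vertices as
// small single-precision offsets from that origin.
struct Chunk {
    Vec3d origin;
    std::vector<Vec3f> localVertices;
};

inline Chunk makeChunk(const std::vector<Vec3d>& worldVertices) {
    Chunk chunk{Vec3d{0.0, 0.0, 0.0}, {}};
    if (worldVertices.empty()) {
        return chunk;
    }

    // Use the centroid as the origin so it lies within the chunk.
    for (const Vec3d& v : worldVertices) {
        chunk.origin.x += v.x;
        chunk.origin.y += v.y;
        chunk.origin.z += v.z;
    }
    const double count = static_cast<double>(worldVertices.size());
    chunk.origin.x /= count;
    chunk.origin.y /= count;
    chunk.origin.z /= count;

    // Subtract in double, then narrow: the offsets are small, so the
    // conversion to float loses very little.
    chunk.localVertices.reserve(worldVertices.size());
    for (const Vec3d& v : worldVertices) {
        chunk.localVertices.push_back(Vec3f{
            static_cast<float>(v.x - chunk.origin.x),
            static_cast<float>(v.y - chunk.origin.y),
            static_cast<float>(v.z - chunk.origin.z)});
    }
    return chunk;
}
```

The double-precision `origin` then becomes the translation of the chunk's modeling transform, and only the float offsets go into the vertex buffer.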