I am implementing FFT convolution on the GPU, so the first step is to zero-pad the 3D image texture to the nearest power-of-two dimensions. What is the fastest way to do so, and is it better to do the padding on the CPU and then pass the result to the GPU?
PS: The original image texture is not used in the rest of the algorithm, so if there is a way to pad with zeros in the same texture, that would be better.
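As a starting point, the padded dimensions themselves can be computed on the CPU. The following is only a minimal sketch; the names nextPowerOfTwo and paddedDims are hypothetical helpers, not part of OpenGL, and the code assumes each dimension is at most 2^31.

```cpp
#include <array>
#include <cstdint>

// Smallest power of two that is >= n (returns 1 for n == 0).
// Assumption: n <= 2^31, otherwise the shift would overflow.
constexpr std::uint32_t nextPowerOfTwo(std::uint32_t n) {
    std::uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

// Rounds each of the three texture dimensions up to a power of two.
constexpr std::array<std::uint32_t, 3> paddedDims(const std::array<std::uint32_t, 3>& dims) {
    return {nextPowerOfTwo(dims[0]), nextPowerOfTwo(dims[1]), nextPowerOfTwo(dims[2])};
}
```

For example, a 100 x 60 x 30 volume would be padded to 128 x 64 x 32 under this rule.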
Do you mean filling the texture with a ‘color’?
I think it would be relevant to state the internal color format of the texture you have in mind. The GPU/CPU will understand, read, and write the data according to the format you use. An image texture typically uses a 3- or 4-byte format. An RGB (red, green, blue) data point mixes the basic colors, with an RGB of (0,0,0) being black.
When filling the texture with data, you’ll need a byte data pointer on the CPU side.
std::vector<byte> v(500,0), or something in the neighborhood, will create a vector of 500 bytes of value 0. The pointer will be v.data().
It could be quicker to
and transfer this inner GPU data to the texture.
Maybe an image buffer will have a default value of 0,0,0. I don’t know.
Or am I completely off track here?
For a 3D texture, it’s hard to say which would be better. For 2D, you can use render-to-texture (glFramebufferTexture*) to efficiently initialise portions of a texture, but for 3D you’d have to do it one layer at a time and the added cost may outweigh the benefits of using the GPU.
If you’re starting with (non-power-of-two) data in CPU memory, it would be better to create the texture with power-of-two dimensions and then initialise portions using glTexSubImage3D. Note that you have to provide data for each call; glTexSubImage3D itself has no way to fill a region with a constant value. If you’re starting with a texture in GPU memory, keeping the data in GPU memory using GL_PIXEL_PACK_BUFFER for reads and GL_PIXEL_UNPACK_BUFFER for writes would be better than going through CPU memory.
For filling the unused portions with zeros, one option is to use glTexSubImage3D with a sufficiently large block of zeroes which is already on the GPU. A compute shader (with imageStore) could be used, but I don’t know whether a trivial compute shader will be efficient. If you aren’t doing the FFT in-place, accepting a non-power-of-two source texture and forcing out-of-texture values to zero in the first pass may be more efficient than padding.
To explain my case better, I want my compute shader to:
1- receive a 3d image which will have one channel (real value of the pixel).
2- create a new 3d image with 2 channels (one for the real value and the other for the imaginary value), the new 3d image will have new dimensions which will be larger than the old 3d image.
4- clear the second channel with zeros.
5- clear the padded part with zeros in both channels.
6- delete the old texture, as I don’t need it anymore.
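The steps above can be sketched as a CPU reference, which may help when checking the compute shader's output. This is only an illustration under assumptions: padToComplex is a hypothetical name, the data is assumed to be float with x varying fastest, then y, then z, and the two channels are stored interleaved as (real, imaginary).

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Copies a single-channel volume of size `s` into a two-channel
// (real, imaginary) volume of size `d`. The output starts out as all
// zeros, so the imaginary channel and the padded region need no extra pass.
// Assumption: s[i] <= d[i] for every axis, and src.size() == s[0]*s[1]*s[2].
std::vector<float> padToComplex(const std::vector<float>& src,
                                const std::array<std::size_t, 3>& s,
                                const std::array<std::size_t, 3>& d) {
    std::vector<float> out(d[0] * d[1] * d[2] * 2, 0.0f);
    for (std::size_t z = 0; z < s[2]; ++z) {
        for (std::size_t y = 0; y < s[1]; ++y) {
            for (std::size_t x = 0; x < s[0]; ++x) {
                const std::size_t srcIndex = (z * s[1] + y) * s[0] + x;
                const std::size_t dstIndex = (z * d[1] + y) * d[0] + x;
                out[2 * dstIndex] = src[srcIndex];  // real part; imaginary stays 0
            }
        }
    }
    return out;
}
```

In a compute shader, the same index mapping would be done per invocation with imageLoad and imageStore, and the old texture could be released afterwards with glDeleteTextures.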