OK, there are two fundamental misunderstandings at play here. Part of it is word choice, and part of it is not realizing what the different pieces actually are. But I think I understand where both come from.
Issue One
The first misunderstanding is in the nature of GPUs and GLSL. You seem to think of GPUs as near-exact analogs of CPUs. CPUs have memory; GPUs have memory. CPUs run programs; GPUs run programs. You code CPUs in C or C++; you code GPUs in GLSL.
Therefore, you have come to the conclusion that all GPU operations are governed by GLSL, that every GPU operation is just some GLSL function executing. On CPUs, C and C++ code allocates CPU-accessible memory. Therefore, on GPUs, GLSL allocates GPU-accessible memory. OpenGL thus acts as some kind of “blending” agent between C/C++ and GLSL.
That is not how it works.
GLSL does not control what gets rendered. GLSL does not control when rendering happens. The only thing GLSL is used for is to define shaders. And shaders are very limited in what they do. They operate at certain stages of the rendering pipeline, and they cannot (directly) affect any other part of that pipeline, except through the specific values they write as outputs.
They don’t manage the GPU’s memory. They do not schedule GPU operations. All they do happens in certain discrete, fixed locations.
GLSL’s control over GPU operations is in no way similar to C or C++'s control over CPU operations. GLSL is much more like a scripting language than a regular programming language. Like most scripts, shaders are compiled while the application runs, not when the application itself is built. Like most scripts, they operate in a sandboxed environment, which has very limited access to the outside world. And like most embedded scripts, they execute exactly and only when the execution environment decides they get to execute.
So if GLSL doesn’t manage GPU resources, who does? The CPU, through the OpenGL implementation, manages them. CPU operations allocate GPU memory. CPU operations decide what gets rendered. CPU commands determine when resources are released. And so forth.
Issue Two
Memory architecture. Let’s say I have a file-static variable in C/C++:
static float myValue = 28.321f;
I can expose a function in that file that allows you to fetch the value of that variable. I can expose a function that allows you to set the value of that variable. But unless I expose a function that returns a pointer or C++ reference to it, your code cannot directly access that memory.
‘myValue’ is in CPU-accessible memory. It is in your process’s address space. But by the rules of C/C++, there is no way to get a valid pointer or reference to it (without breaking those rules).
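To make that concrete, here is a minimal sketch of such a file. The file name and the two accessor names (get_my_value, set_my_value) are made up for illustration; only the variable itself comes from the example above.

```c
/* my_value.c: hypothetical module; the accessor names are illustrative. */

/* File-static: has internal linkage, so no other file can name it. */
static float myValue = 28.321f;

/* Read access: the caller receives a copy, never the address. */
float get_my_value(void)
{
    return myValue;
}

/* Write access: the caller hands over a value; this file stores it. */
void set_my_value(float newValue)
{
    myValue = newValue;
}
```

Code in another file can call get_my_value() and set_my_value(1.5f) all day, but it never learns where myValue is stored. The file could move the value somewhere else entirely and no caller would notice.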
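Memory owned by OpenGL works the same way: you get at it only through the functions the API exposes, and where it actually lives is up to the implementation. OpenGL can allocate CPU memory or GPU memory, if a GPU even allows such a distinction. Therefore, “OpenGL memory” could be either one. Or some of both.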
Similarly, OpenGL is capable of undertaking “CPU operations”. OpenGL implementations are, at the end of the day, just libraries that you (dynamically) link to. They execute code on the CPU. Some of that code kicks off GPU operations. But other code is purely a matter of CPU work.
Shader compilation is one of those CPU-only things. OpenGL implements compilation, certainly. But it all happens on the CPU, not the GPU.