You may get better answers to this in the OpenGL section (https://community.khronos.org/c/opengl-general/34), since the technique is not specific to glTF or any particular file format. But in general terms:
The “keyframes” in the glTF animation data are sparse, usually < 30 FPS, and may be even sparser if the author has made efforts to optimize the file for size. Most viewers want to render at >= 60 FPS, i.e. to render more “frames” than you have “keyframes.” This is accomplished with interpolation, and glTF defines three interpolation modes: STEP, LINEAR, and CUBICSPLINE. The glTF tutorials include an explanation of the animation data.
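As a rough illustration of that interpolation step, here is a minimal Python sketch. The function name `sample_channel` and its arguments are hypothetical, not part of glTF or any library. It assumes the keyframe times are sorted and the values are plain tuples of floats (e.g. a translation), and it only covers STEP and LINEAR. Note that for rotation channels, LINEAR in glTF means spherical linear interpolation of quaternions, which this sketch does not do, and CUBICSPLINE is left out entirely.

```python
from bisect import bisect_right

def sample_channel(times, values, t, mode="LINEAR"):
    """Sample a keyframed channel at time t (in seconds).

    times  -- sorted list of keyframe times
    values -- list of tuples, one per keyframe (e.g. translations)
    mode   -- "STEP" or "LINEAR" (CUBICSPLINE is not handled here)
    """
    # Clamp to the first/last keyframe outside the animated range.
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]

    # Index of the last keyframe whose time is <= t.
    i = bisect_right(times, t) - 1

    if mode == "STEP":
        # Hold the previous keyframe's value until the next keyframe.
        return values[i]
    if mode == "LINEAR":
        # Normalized position of t between keyframes i and i + 1.
        u = (t - times[i]) / (times[i + 1] - times[i])
        return tuple(a + u * (b - a) for a, b in zip(values[i], values[i + 1]))
    raise NotImplementedError(mode)

# Three keyframes, sampled between them:
times = [0.0, 1.0, 2.0]
values = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 4.0, 0.0)]
print(sample_channel(times, values, 0.5))          # halfway between keyframes 0 and 1
print(sample_channel(times, values, 1.5, "STEP"))  # holds keyframe 1
```

You would call something like this once per rendered frame, per animated channel, with the current animation time.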
Once you’ve interpolated the “keyframe” values to get a value for the current “frame”, it’s time to render. There are three basic types of animation: Translation/Rotation/Scale (TRS), Morph Targets, and Skinning. TRS is the simplest: it just requires updating the object’s matrix, however you are sending that to the vertex shader.
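For the TRS case, a sketch of rebuilding the object matrix from the interpolated values might look like the following. The function name `trs_to_matrix` is hypothetical. It assumes the glTF conventions: the local matrix is T * R * S, the rotation is a unit quaternion in (x, y, z, w) order, and the result is a flat list of 16 floats in column-major order, which is the layout OpenGL expects when you upload a matrix uniform without transposing (e.g. `glUniformMatrix4fv` with `transpose` set to `GL_FALSE`).

```python
def trs_to_matrix(translation, rotation, scale):
    """Compose a column-major 4x4 matrix M = T * R * S.

    translation -- (tx, ty, tz)
    rotation    -- unit quaternion (x, y, z, w)
    scale       -- (sx, sy, sz)
    """
    tx, ty, tz = translation
    x, y, z, w = rotation
    sx, sy, sz = scale

    # Rotation matrix entries (row, column) from the unit quaternion.
    r00 = 1.0 - 2.0 * (y * y + z * z)
    r01 = 2.0 * (x * y - z * w)
    r02 = 2.0 * (x * z + y * w)
    r10 = 2.0 * (x * y + z * w)
    r11 = 1.0 - 2.0 * (x * x + z * z)
    r12 = 2.0 * (y * z - x * w)
    r20 = 2.0 * (x * z - y * w)
    r21 = 2.0 * (y * z + x * w)
    r22 = 1.0 - 2.0 * (x * x + y * y)

    # Scale multiplies each rotation column; translation is the last column.
    return [
        r00 * sx, r10 * sx, r20 * sx, 0.0,  # column 0
        r01 * sy, r11 * sy, r21 * sy, 0.0,  # column 1
        r02 * sz, r12 * sz, r22 * sz, 0.0,  # column 2
        tx,       ty,       tz,       1.0,  # column 3
    ]

# Identity rotation, uniform scale of 2, moved to (1, 2, 3):
m = trs_to_matrix((1.0, 2.0, 3.0), (0.0, 0.0, 0.0, 1.0), (2.0, 2.0, 2.0))
print(m)
```

If the node has a parent, this local matrix still has to be multiplied by the parent's world matrix before it goes to the shader.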