Object space (also called model space) is usually the space your model's vertices sit in before you do anything like translate the object, move it in front of the “camera”, etc. Say you have a model that has a point (1,1,1) when you load it. That point is in object space. When you move the whole model 10 units along the “world” y-axis, that point (1,1,1) in object space becomes (1,11,1) in world space. In OpenGL you usually keep the model and view (camera) matrices combined as one modelview matrix, so multiplying that point by the modelview matrix moves it straight into “camera” or “eye” space. The “camera” in OpenGL always sits at the origin looking down the negative z-axis, so things like gluLookAt are really moving and rotating the world so that it ends up in front of the camera. Clip space is where the point lands after the projection matrix; once it is divided by w, everything inside the frustum is squeezed into the [-1,1] box.
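To make the chain of spaces concrete, here is a rough sketch in plain Python that pushes the (1,1,1) point through each step. The helper names, the camera position (0,11,5) and the projection settings (90 degree field of view, near 1, far 10) are made up for illustration; the projection matrix follows the usual gluPerspective layout.

```python
import math

def mat_mul_vec(m, v):
    # Multiply a 4x4 matrix (list of rows) by a 4-component column vector.
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def mat_mul(a, b):
    # Multiply two 4x4 matrices: result = a * b.
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

def perspective(fovy_deg, aspect, near, far):
    # Same layout as the matrix gluPerspective builds.
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

object_pos = [1.0, 1.0, 1.0, 1.0]        # the point as loaded (object space)

model = translation(0, 10, 0)            # move the model 10 units along world y
view = translation(0, -11, -5)           # assumed camera at (0,11,5), no rotation
projection = perspective(90, 1, 1, 10)   # assumed projection settings

world_pos = mat_mul_vec(model, object_pos)           # world space
modelview = mat_mul(view, model)                     # combined, as OpenGL keeps it
eye_pos = mat_mul_vec(modelview, object_pos)         # eye ("camera") space
clip_pos = mat_mul_vec(projection, eye_pos)          # clip space
ndc_pos = [c / clip_pos[3] for c in clip_pos[:3]]    # after the divide by w

print(world_pos, eye_pos, clip_pos, ndc_pos)
```

Note that the eye-space z comes out negative, which is just the point sitting in front of a camera that looks down the negative z-axis.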
a2v is just a structure that gets declared on page 13 (“struct a2v”). Its members are filled in automatically based on the semantics attached to them. So with a2v IN, IN.pos or IN.normal just refers to the position or normal that the vertex came in with. Again, if you look at the POSITION member of a2v, it is in object space, which means it has the value you sent down with glVertex, or whatever is in your vertex buffer, before any transformations.
The normal vector you probably know. The tangent and binormal, together with the normal, form a basis for orienting a normal that you read from a normal map. A normal read from a normal map is usually pointing mostly in the Z direction (which is why most normal maps look blue when you see them). You need the tangent and binormal to rotate it into a useful space. If you want more on that, you should look up a full tutorial on normal mapping, which is its own deal.
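As a rough illustration of that “spinning around”, the sketch below decodes a normal-map texel and rotates it out of tangent space using a tangent/binormal/normal basis. The function names and the example basis vectors are hypothetical, and it assumes the common encoding where each 0..255 colour channel maps to -1..1.

```python
def decode_normal(r, g, b):
    # Map 0..255 colour channels to -1..1 vector components (assumed encoding).
    return tuple(c / 255.0 * 2.0 - 1.0 for c in (r, g, b))

def tangent_to_object(n, tangent, binormal, normal):
    # Weight each basis vector by the matching tangent-space component:
    # result = tangent * n.x + binormal * n.y + normal * n.z
    return tuple(tangent[i] * n[0] + binormal[i] * n[1] + normal[i] * n[2]
                 for i in range(3))

# Example basis for a surface facing up the y-axis (made-up values).
tangent = (1.0, 0.0, 0.0)
binormal = (0.0, 0.0, -1.0)
normal = (0.0, 1.0, 0.0)

# The typical "blue" texel is almost exactly (0, 0, 1) in tangent space.
texel_normal = decode_normal(128, 128, 255)

# A tangent-space normal pointing straight along Z lands on the surface normal.
flat = tangent_to_object((0.0, 0.0, 1.0), tangent, binormal, normal)
perturbed = tangent_to_object(texel_normal, tangent, binormal, normal)

print(texel_normal, flat, perturbed)
```

A real shader would usually normalize the result and often works the other way round (moving the light vector into tangent space), but the basis does the same job either way.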