No - you should duplicate the vertex if the texture coord is different (note that your surface normal vector will also be different - which gives you two reasons to duplicate it).
The way to think about this is that each “vertex” is a data packet that you’re sending to the vertex shader. The vertex shader processes that data packet and produces some kind of output. One vertex goes in - and one comes out the other end, over and over until the mesh is rendered.
Because the shader works like this - and because it has no “memory” from one vertex to the next, there is no way to share some part of the per-vertex data from one run of the shader to the next.
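To make the duplication concrete, here is a minimal sketch in Python (not tied to any particular graphics API) that builds a cube by treating each vertex as the full packet of position, normal and texture coordinate. The function name build_cube and the face parameterisation are made up for illustration, and triangle winding order is ignored for brevity.

```python
def build_cube():
    """Build an indexed unit cube where a vertex is (position, normal, uv)."""
    vertices = []   # unique vertex packets, in the order they are first seen
    indices = []    # two triangles per face, indexing into `vertices`
    lookup = {}     # full vertex packet -> index, used to share identical packets

    for axis in range(3):              # faces perpendicular to X, Y, Z
        for sign in (-1.0, 1.0):       # the negative and positive face
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            u_axis, v_axis = [a for a in range(3) if a != axis]

            corners = []
            for u, v in ((0, 0), (1, 0), (1, 1), (0, 1)):
                pos = [0.0, 0.0, 0.0]
                pos[axis] = sign * 0.5
                pos[u_axis] = u - 0.5
                pos[v_axis] = v - 0.5
                vertex = (tuple(pos), tuple(normal), (float(u), float(v)))
                # A packet is shared only if *every* attribute matches.
                if vertex not in lookup:
                    lookup[vertex] = len(vertices)
                    vertices.append(vertex)
                corners.append(lookup[vertex])

            a, b, c, d = corners
            indices += [a, b, c, a, c, d]   # winding not made consistent here

    return vertices, indices


if __name__ == "__main__":
    vertices, indices = build_cube()
    print(len({v[0] for v in vertices}), "distinct positions")
    print(len(vertices), "vertices sent to the shader")
    print(len(indices), "indices")
```

The cube has only 8 distinct corner positions, but because each corner has a different normal on each of the three faces that meet there, no two packets are identical and the mesh ends up with 24 vertices.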
This sounds like a horrible waste - but in practice it usually isn’t. For a tiny number of vertices, as in a cube, the time it takes to set up to draw the mesh totally dwarfs the processing time inside the GPU. In fact, with most GPUs, any mesh with fewer than around 256 vertices results in the GPU stalling to wait for the next object to be set up. An object with 256 vertices takes essentially no more time to draw than an object with 8 vertices.
The problem you describe might become an issue on a gigantically complex model which is all hard angles and has more than 256 verts. But in many applications such things tend to be fairly rare - most large vertex count objects are smooth, rounded things like humans that share normal and texcoord data very well.
Also, in modern applications, you very often want to do advanced per-pixel lighting using normal mapping. That entails sending (per vertex):
position, one or two sets of texture coordinates, normal, binormal, tangent
So you’re sending 14 to 16 numbers per vertex…saving three by somehow sharing the vertex coordinate isn’t such a big deal.
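The arithmetic behind that count can be checked with a few lines of Python. This is only a sketch: it assumes 3-component positions, normals, binormals and tangents, 2-component texture coordinates, and 32-bit floats throughout, and the names ATTRIBUTES, floats_per_vertex and bytes_per_vertex are invented for the example.

```python
import struct

# Component counts for the fixed attributes (assumed 3 floats each).
ATTRIBUTES = {"position": 3, "normal": 3, "binormal": 3, "tangent": 3}
TEXCOORD_COMPONENTS = 2   # assumed (u, v) per texture coordinate set


def floats_per_vertex(texcoord_sets):
    """Number of floats in one vertex with the given number of texcoord sets."""
    return sum(ATTRIBUTES.values()) + TEXCOORD_COMPONENTS * texcoord_sets


def bytes_per_vertex(texcoord_sets):
    """Size of one tightly packed vertex made of 32-bit floats."""
    return struct.calcsize("<%df" % floats_per_vertex(texcoord_sets))


if __name__ == "__main__":
    for sets in (1, 2):
        n = floats_per_vertex(sets)
        print(sets, "texcoord set(s):", n, "floats,", bytes_per_vertex(sets), "bytes")
```

With one texture coordinate set that is 14 floats (56 bytes), with two it is 16 floats (64 bytes), and the position accounts for 3 of them either way.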
If performance with large meshes is a big deal, there are other tricks that can save much of the cost of per-vertex data if you know what your application is. One is to ask whether you really need a full 32-bit “float” for your vertices. In the game I’m writing now, the action all takes place inside a single building - which is a 50 meter cube. I don’t need more than about a millimeter of positional accuracy in my source models. Hence, there are only 50,000 distinct positions along each of the X, Y and Z axes - and I can store my coordinates in a “short”. This halves the amount of position data I need to send compared to using floating point coordinates. I also send normals and binormals as signed bytes and omit the tangent vector altogether (recomputing it with a cross-product in the shader).
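A rough sketch of that kind of packing, in Python, might look like the following. Several details are assumptions made for the example and not taken from the game described above: the origin is placed at the centre of the building so coordinates run from -25 m to +25 m and fit a signed 16-bit short, normals are scaled by 127 into signed bytes, and the tangent is rebuilt as binormal × normal (the operand order depends on the handedness convention you use). All function names are hypothetical.

```python
import struct

MM_PER_METER = 1000   # 1 mm resolution
NORMAL_SCALE = 127    # maps [-1.0, 1.0] onto a signed byte


def quantize_position(p):
    """Metres -> integer millimetres; +/-25 m becomes +/-25000, inside int16 range."""
    return tuple(int(round(c * MM_PER_METER)) for c in p)


def quantize_normal(n):
    """Unit vector components in [-1, 1] -> signed bytes in [-127, 127]."""
    return tuple(int(round(c * NORMAL_SCALE)) for c in n)


def pack_vertex(position, normal, binormal):
    """Pack one vertex as 3 shorts + 3 bytes + 3 bytes (no tangent stored)."""
    return struct.pack(
        "<3h3b3b",
        *quantize_position(position),
        *quantize_normal(normal),
        *quantize_normal(binormal),
    )


def cross(a, b):
    """Cross product a x b, the operation the shader would use for the tangent."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


if __name__ == "__main__":
    packed = pack_vertex((12.345, -25.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0))
    full = struct.calcsize("<3f3f3f3f")   # float position, normal, binormal, tangent
    print(len(packed), "bytes packed vs", full, "bytes as floats")
    print("tangent:", cross((0, 1, 0), (0, 0, 1)))
```

In this layout the position shrinks from 12 bytes to 6, and the normal and binormal from 12 bytes each to 3 each. A real vertex format would probably also pad each attribute to a 4-byte boundary, which this sketch does not do.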