I’ve been struggling to wrap my brain around the right way to make a device-independent, future-proof “pose + boolean” action, like a “teleport” action that needs to know when a user presses a button, and also needs to know the pose of the action source at the time the button was pressed.
It seems like the recommended way to do this is to have two XrActions, a Boolean and a Pose, that are queried at the same time. The problem here is that users can bind any number of action sources onto the Boolean XrAction, and I need some way to associate a Pose with the Boolean action. This is where subaction paths come in.
If this is an XrSystem composed of a current-day VR HMD with two 6-DOF tracked controllers, this isn’t really a problem. The suggested way to support this is to create a Boolean “teleport” XrAction, as well as a Pose “teleport_pose” XrAction. Bind “teleport” to a button on the left and right controllers, and bind “teleport_pose” to a Pose source on the left and right controllers. Then, in the application loop, use the subaction paths “/user/hand/left” and “/user/hand/right” whenever querying the state of the “teleport” and “teleport_pose” actions. This tells you which “teleport_pose” to use for the ray cast whenever you see that the “teleport” action is active.
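In code, the per-frame pattern looks roughly like this. These structs and the query function are simplified stand-ins for the real OpenXR calls (xrGetActionStateBoolean per subaction path, xrLocateSpace on an action space, etc.), just to show the shape of the per-subaction-path loop, not compilable against the actual API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the OpenXR types and calls; hypothetical,
 * just to show the shape of the per-subaction-path query loop. */
typedef struct { float x, y, z; } Pose;
typedef struct { bool isActive; bool currentState; } BoolState;

enum { SUBACTION_LEFT, SUBACTION_RIGHT, SUBACTION_COUNT };

typedef struct {
    BoolState teleport[SUBACTION_COUNT];      /* "teleport" state per subaction path */
    Pose      teleport_pose[SUBACTION_COUNT]; /* "teleport_pose" per subaction path */
} FrameInput;

/* Returns the index of the subaction path whose "teleport" boolean is
 * currently pressed, writing the matching "teleport_pose" to *out;
 * returns -1 if no subaction path fired this frame. */
static int query_teleport(const FrameInput *in, Pose *out) {
    for (int i = 0; i < SUBACTION_COUNT; ++i) {
        if (in->teleport[i].isActive && in->teleport[i].currentState) {
            *out = in->teleport_pose[i];
            return i;
        }
    }
    return -1;
}
```

The hard-coded `SUBACTION_LEFT`/`SUBACTION_RIGHT` enum is exactly the assumption I’m worried about below.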
This works fine, but it requires the application to hard-code the assumption that the only subaction paths it will ever care about are “/user/hand/right” and “/user/hand/left”. This might not be true! Consider the following hypothetical scenarios:
- A user wants to bind the “teleport_pose” to the head pose, so they teleport to where their head is pointing. The application can’t detect this situation since it’s testing the “teleport_pose” action against “/user/hand/left” and “/user/hand/right” subaction paths, which would exclude the head pose!
- A user has an eye tracking system and wants to bind the “teleport_pose” to their eye gaze pose, so that they teleport where they’re looking. Again, the application can’t detect this situation since it’s testing the “teleport_pose” action against “/user/hand/left” and “/user/hand/right”, which would exclude the eye gaze pose!
You could argue that the application should also test “/user” as one of its subaction paths, but this just kicks the can down the road a bit longer: it buckets every pose source that isn’t one of the user’s hands into a single subaction. You’re now effectively limiting the user to binding at most three pose sources for the “teleport_pose” action, since the application only tests three subaction paths, and if you end up with two Pose sources on your head (head pose and eye gaze) you’re out of luck: you can only bind one of them to the action, because the system can’t tell them apart with subaction paths alone.
This is a somewhat contrived example, but I’m using it to illustrate that problems arise when using subaction paths to correlate a Pose action with any of the other action types.
A potential way to solve this would be to have a way to ask the XrSystem which subaction paths it should be checking for a particular action. The binding UI might need to provide a way for the user to specify these subaction paths for each action, and the XrSystem would need to provide enough distinct subaction paths to differentiate between devices in similar body locations. For example, you might need both a “/user/head” and a “/user/head/eye” subaction path to be able to tell the head and eye poses apart.
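To make that concrete, here’s a rough sketch of what such an enumeration query could look like. Nothing here exists in OpenXR today; the name, the types, and the two-call idiom (borrowed from the existing xrEnumerate* functions) are all assumptions of mine:

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical API sketch -- none of this exists in OpenXR today.
 * The idea: the runtime reports which subaction paths the user's
 * bindings actually touch for a given action, so the application
 * can stop guessing "/user/hand/left" and "/user/hand/right". */
#define MAX_PATHS 8

typedef struct {
    const char *boundPaths[MAX_PATHS]; /* filled in by the binding UI */
    size_t      boundCount;
} Action;

/* Two-call idiom, like the existing xrEnumerate* functions: pass
 * capacity 0 / NULL to query the required count, then call again
 * with a buffer to receive the paths. */
static size_t enumerate_subaction_paths(const Action *action,
                                        size_t capacity,
                                        const char **paths) {
    if (capacity == 0 || paths == NULL)
        return action->boundCount;
    size_t n = action->boundCount < capacity ? action->boundCount : capacity;
    for (size_t i = 0; i < n; ++i)
        paths[i] = action->boundPaths[i];
    return n;
}
```

With this, the application’s per-frame loop would iterate over whatever paths the runtime reports, rather than a hard-coded pair.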
An alternate solution would be to have combined PoseAndXXX action types that provide a combined Pose + action value as a single action. The user could still bind multiple Pose sources to this action, and could bind multiple value sources as well. The binding UI would provide a way for the user to map (value source) -> (pose source) for the action, and would allow the user to bind multiple (value source) -> (pose source) combinations to the same action. For example, the user might bind the same pose source multiple times with different value sources if they wanted multiple buttons on the same controller to trigger the “teleport” action. The binding UI would not be allowed to bind multiple pose sources to the same value source on a single action: an action cannot simultaneously be occurring at two different locations!
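Here’s a toy model of the binding rules such a combined action might enforce. The PoseAndBoolean-style type and both functions are hypothetical, not part of OpenXR; the path strings are just illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical combined "Pose + Boolean" action sketch -- not a real
 * OpenXR type. Each binding pairs one value (button) source with
 * exactly one pose source; the same pose source may appear in
 * several pairs (multiple buttons on one controller). */
#define MAX_BINDINGS 8

typedef struct {
    const char *valueSource; /* e.g. "/user/hand/right/input/a/click" */
    const char *poseSource;  /* e.g. "/user/hand/right/input/aim/pose" */
} Binding;

typedef struct {
    Binding bindings[MAX_BINDINGS];
    size_t  count;
} PoseBoolAction;

/* Rejects a second pose source on an already-bound value source:
 * an action cannot be occurring at two different locations at once. */
static bool bind(PoseBoolAction *a, const char *value, const char *pose) {
    for (size_t i = 0; i < a->count; ++i)
        if (strcmp(a->bindings[i].valueSource, value) == 0)
            return false;
    if (a->count == MAX_BINDINGS)
        return false;
    a->bindings[a->count++] = (Binding){ value, pose };
    return true;
}

/* Given the value source that fired this frame, returns the pose
 * source the teleport ray should be cast from (NULL if unbound). */
static const char *resolve_pose(const PoseBoolAction *a, const char *firedValue) {
    for (size_t i = 0; i < a->count; ++i)
        if (strcmp(a->bindings[i].valueSource, firedValue) == 0)
            return a->bindings[i].poseSource;
    return NULL;
}
```

The point is that the (value source) → (pose source) mapping lives in the binding, so the application never needs to enumerate subaction paths at all.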
Has anyone else thought about this problem? Anyone have any thoughts? Am I way off base with my thinking about how the OpenXR Input system works?