Semantic path for head pose

This is essentially an opinion question.

We have OpenXR working in our application, running on Oculus Rifts and Microsoft HoloLenses, so this is not a question about how to get and use head poses from OpenXR.

Our application is very modular, with an open-ended collection of optional subsystems that may need to share pose information for the head, hands, gamepad, and so on, and we don’t want to propagate OpenXR code dependencies throughout those subsystems.

We maintain our own database of named transforms, and for the most part we use the OpenXR specified semantic path strings as the names. But it seems that there is no specified semantic path string for the head pose, which might come from a head-mounted display like the Rift or HoloLens, or a Vicon, Polhemus, or other mocap system, or even a Microsoft Kinect. (Our application also supports “VR cave” environments, and I’m specifically working on a Vicon subsystem for tracking and reporting the positions of the shutter glasses and gamepad to the rendering and manipulation subsystems.) We can use any string we want, of course, but I like to conform to standards as much as possible, since the string may appear in disparate places.

We’ve been using “/user/head/pose”.

Looking at some BlenderXR source code on GitHub, I see that we’re not the only ones pondering this. I see:

xrStringToPath(m_instance, "/user/head/input/pose" /*"/user/head/pose"*/, &headPosePath);

The author evidently started with “/user/head/pose” as we have, but then commented that out and replaced it with “/user/head/input/pose”. But that still isn’t strictly conformant, since there should be an <identifier> between “input” and “pose”, and none of the standard identifiers apply.

Thoughts? Opinions? Thanks for your time.

I guess maybe “/user/head/input/aim/pose”. Although the spec defines “aim” only for “tracked hands” and “handheld motion controllers”, I suppose that doesn’t necessarily exclude the aim of the head.

Using “input” in the path would, at least to me, indicate that it refers to something in an interaction profile.

In OpenXR, the head pose is not considered an “action pose”, since it’s also a fundamental part of many rendering methods. It does not participate in the action binding mechanism and doesn’t follow the XR input focus rules. To get the forward-looking pose that’s attached to the user’s head, with its origin at the midpoint between the two eyes, you should use XR_REFERENCE_SPACE_VIEW.

But it could participate. In fact, it’s a critical input to the first-generation HoloLens “air tap” action, is it not? An application might also use it, e.g., to display metadata for whatever is being gazed at, or along with an untracked clicker (another option with the gen-1 HoloLens) to trigger various actions.

Our application isn’t limited to things explicitly described by the OpenXR specification, but there are a lot of good organizing concepts in there. Having a semantic path string to identify and communicate the head pose between modules, in a fashion that’s consistent with other poses – e.g., gamepad – without propagating OpenXR code dependencies, is useful to us.

Again, we have a working OpenXR module. We’re passing semantic path strings between modules to communicate pose and other input information. These modules – including even our renderer – are not constrained to working only with OpenXR. I’m currently working on “VR cave” support that will use Vicon DataStreamSDK input to acquire the head pose. I don’t need to implement the whole OpenXR API around this; I need only to get the pose and give it a consistent name that other modules can reference, and not care whether it came through OpenXR, or Vicon DataStream, or any other particular device API.

We’d like to treat the head pose consistently. “/user/head/input/aim/pose” seems to fit the prescribed form and doesn’t seem to violate anything, even though it’s not explicitly mentioned in the spec.

So you are somewhat describing eye gaze interaction, though without eye tracking - see that extension for details. You would need an interaction profile for which the path you want would be applicable: you can’t just write an app that says “tell me /user/head/input/aim/pose”, that’s not how OpenXR works. That said, I think you probably want to just use XR_REFERENCE_SPACE_VIEW for your needs.

Are you writing your own runtime from scratch or building on Monado? (Actually it’s hard to tell whether you’re writing an application or a runtime… probably both, I assume, since I suspect this is academic research.) I know Bill Sherman (NIST, previously of Indiana University) has done some plumbing to run Monado and OpenXR apps on CAVEs, and was involved in getting ParaView support for OpenXR. (And this in particular sounds a lot like your use case in terms of the modularity.) Personally (as co-maintainer of Monado and a PhD graduate of Iowa State University, which at least until the latest generation of HMDs was a huge center for CAVE-based research) I’m also quite interested in making CAVE support more of a thing in OpenXR.

“/user/head/pose” is not a thing and has not been a thing in OpenXR for a very long time. (We might have discussed it before release, but I think it was already gone by the 0.9 provisional release.) However, I think it’s a thing in SteamVR/OpenVR, and “/me/head” (same idea) is a thing in OSVR, both of which strongly influenced the OpenXR input system. In addition, lots of apps were ported from SteamVR/OpenVR to OpenXR, so there are often some leftover SteamVR idioms in code bases.

If you’re working on a runtime, please use the open source OpenXR conformance test suite regularly, even if you don’t plan to apply to be formally conformant adopters of the spec. (Though note that you can’t say your runtime is OpenXR without that.) Everybody benefits when we keep developer expectations consistent across OpenXR (and OpenXR-like) environments. If there’s functionality not present that you want to see, get in touch and propose an extension; that’s how the spec grows to encompass new uses.

Writing from scratch, not building on any other framework. We have a core EXE and an open-ended set of DLLs that may or may not be dynamically loaded at run-time, depending on how an individual project use-case is configured.

In this scheme, OpenXR is itself one of those modules. We have another module for the Oculus SDK, which is obsolescent now that Oculus has gone all-in with OpenXR. (And I applaud that.)

Our OpenXR module follows the spec closely. It uses XR_REFERENCE_SPACE_TYPE_* (depending on configuration options) to get the parameters for rendering.

Our renderers are separate modules, with alternatives for DirectX or Vulkan, gamer PC GPU or lightweight mobile.

We have our own “display” abstraction that serves as the glue between the renderers and the underlying graphics I/O API, whether it’s OpenXR, Oculus, or something else (as with frame-sequential stereo in a “cave”, or old-fashioned red-cyan anaglyph in a plain old desktop window). We don’t need to expose all of the complex details of those APIs to the other modules.

It’s sometimes useful for other modules besides the renderers to know how things are posed, including the gamepad and the head. We assign names to objects, including transforms, as a way to specify connections. The OpenXR semantic paths are a nice standard paradigm for naming things. But there’s no standard name for the head pose, evidently.

The standard name for the head pose is REFERENCE_SPACE_VIEW. However, you shouldn’t be rendering using that; you should be rendering with xrLocateViews. I would definitely suggest looking at how OpenXR was incorporated into VTK/ParaView, because it sounds like a very similar architecture.

[Sorry if this appears twice. I replied once, tried to edit my reply, and the whole thing disappeared. So here we go again …]

Yes, we are using xrLocateViews, XR_REFERENCE_SPACE_TYPE_*, and so on, to acquire the view information and render the scene.

REFERENCE_SPACE_VIEW is not a semantic path.

I appreciate all of the replies, but I think that some of them are over-thinking the original question, or have lost track of it. Maybe I haven’t stated it clearly enough, or it’s just tl;dr. If so, I apologize.

Again, this is not a question about the fundamentals of interacting with OpenXR, how to acquire the view, or how to render it. We have all of that working very well.

We have modules that could use the head pose for purposes other than rendering. They don’t care how the pose was obtained, they care only about the result. Also, our system is not designed exclusively around OpenXR.

We have named transforms for sharing such information, and OpenXR semantic paths seem like a nice scheme to name things. There is no defined semantic path string for the head pose, but “/user/head/input/aim/pose” conforms to all of the rules for composing a semantic path – OpenXR/specs/1.0/html/xrspec #path-atom-type – OpenXR/specs/1.0/html/xrspec #semantic-path-input – it doesn’t seem to break any rules, so I’ve pretty much settled on using that.

Looks like it got caught in the spam filter. Let me know if you’d prefer it to your repost and I’ll swap them.


Thanks – no need. I saw the robo-mail about the moderator intervention after I re-posted. I had originally attempted to include links to sections of the OpenXR spec, which evidently triggered the “moderation” alarm.

Ah, I understand now. Yes, do feel free to use semantic paths like that, as long as they never make it into an OpenXR call other than xrPathToString/xrStringToPath. Sounds good to me!

I agree with @ryliepavlik that however you want to use semantic paths to communicate the head pose is the app’s choice. You can use this path system as you wish.

Though I think it’s debatable whether to add the head pose to OpenXR’s standard action paths. But if this is the app’s own behavior, I have no concerns.

This topic was automatically closed 183 days after the last reply. New replies are no longer allowed.