OpenCL floating-point rounding specification oddities

I was trying to learn how floating-point rounding modes are handled in OpenCL, and I came across some things in the spec that seem really odd or poorly specified. For reference, I’m looking at OpenCL 2.0 (document revision 29).

## Setting rounding modes

Section 7.1, Rounding Modes, states:

> Round to nearest even is currently the only rounding mode required by the OpenCL specification for single precision and double precision operations and is therefore the default rounding mode. In addition, only static selection of rounding mode is supported. Dynamically reconfiguring the rounding modes as specified by the IEEE 754 spec is unsupported.

This misleadingly hints that the rounding mode can be changed at compile time. It seems devices can advertise which rounding modes they support (clGetDeviceInfo(…, CL_DEVICE_SINGLE_FP_CONFIG, …)), but the OpenCL runtime API provides no way of selecting one at compile time.
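To see what a device actually advertises, the bitfield returned for CL_DEVICE_SINGLE_FP_CONFIG can be decoded. Here is a rough sketch in C. It assumes the bit positions match the CL_FP_ROUND_TO_* definitions in CL/cl.h. The FP_* macros and the helper describe_rounding_modes are hypothetical names of mine, redefined locally so the snippet builds without an OpenCL SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the CL_FP_ROUND_TO_* bits in CL/cl.h. The FP_* names are
   hypothetical stand-ins so this builds without the OpenCL headers. */
#define FP_ROUND_TO_NEAREST (UINT64_C(1) << 2)
#define FP_ROUND_TO_ZERO    (UINT64_C(1) << 3)
#define FP_ROUND_TO_INF     (UINT64_C(1) << 4)

/* Hypothetical helper: writes a comma-separated list of the rounding modes
   set in `config` into `out` and returns how many were found.
   In a real host program `config` would be a cl_device_fp_config filled in by
   clGetDeviceInfo(device, CL_DEVICE_SINGLE_FP_CONFIG, sizeof config, &config, NULL). */
int describe_rounding_modes(uint64_t config, char *out, size_t out_size)
{
    static const struct { uint64_t bit; const char *name; } modes[] = {
        { FP_ROUND_TO_NEAREST, "rte" },
        { FP_ROUND_TO_ZERO,    "rtz" },
        { FP_ROUND_TO_INF,     "rtp/rtn" },
    };
    int count = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof modes / sizeof modes[0]; ++i) {
        if (config & modes[i].bit) {
            /* strncat always NUL-terminates; leave room for the terminator. */
            if (count > 0)
                strncat(out, ",", out_size - strlen(out) - 1);
            strncat(out, modes[i].name, out_size - strlen(out) - 1);
            ++count;
        }
    }
    return count;
}
```

Note that this only reports capabilities. Nothing here (or anywhere else in the runtime API, as far as I can tell) selects one of the reported modes.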

It seems that OpenCL 1.0 supported setting the rounding mode via pragmas with the cl_khr_select_fprounding_mode extension, but this disappeared in OpenCL 1.1.

I think the statement that “only static selection of rounding mode is supported” should be dropped, because even if a device supports multiple rounding modes, the user has no way to change the default.

## Ambiguity for vstore_halfX() built-ins

Several groups of functions use a default rounding mode (e.g. vstore_half), and their descriptions state the following:

> vstore_half uses the default rounding mode. The default rounding mode is round to nearest even.

This is confusing in the case of the OpenCL embedded profile, because there the default rounding mode of the environment can also be CL_FP_ROUND_TO_ZERO, so “round to nearest even” is not necessarily the default. The text should probably just refer to section 7.1 to avoid confusion.
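To make the difference concrete, here is a simplified sketch in plain C of a float-to-half conversion under both candidate defaults. It is not how any OpenCL implementation does it. The function float_to_half_bits and the enum are hypothetical names of mine, and the sketch only handles inputs that land in the normal half-precision range (no NaN, infinity, or subnormal inputs).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum half_rounding { HALF_RTE, HALF_RTZ };

/* Converts a float to IEEE 754 binary16 bits.
   Simplifying assumption: |x| lies in the normal half range, i.e. the
   float's biased exponent is between 113 and 142 inclusive. */
uint16_t float_to_half_bits(float x, enum half_rounding mode)
{
    uint32_t f;
    memcpy(&f, &x, sizeof f);              /* reinterpret the float's bits */

    uint32_t sign  = (f >> 16) & 0x8000u;  /* sign moved to bit 15 */
    uint32_t exp32 = (f >> 23) & 0xFFu;    /* biased float exponent */
    uint32_t man32 = f & 0x7FFFFFu;        /* 23-bit float mantissa */

    assert(exp32 >= 113u && exp32 <= 142u);

    /* Rebias the exponent (127 -> 15) and keep the top 10 mantissa bits.
       Stopping here is round toward zero. */
    uint32_t half    = ((exp32 - 112u) << 10) | (man32 >> 13);
    uint32_t dropped = man32 & 0x1FFFu;    /* the 13 bits that were cut off */

    if (mode == HALF_RTE) {
        /* Round up when past the halfway point, or exactly halfway with an
           odd result. A carry out of the mantissa bumps the exponent. */
        if (dropped > 0x1000u || (dropped == 0x1000u && (half & 1u)))
            ++half;
    }
    return (uint16_t)(sign | half);
}
```

For example, 1.000732421875f (that is, 1 + 3/4096) comes out as 0x3C01 with HALF_RTE but 0x3C00 with HALF_RTZ, so the same vstore_half call could store different bits depending on which mode is really "the default".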

## Conclusion

It seems there is no control over the rounding mode of the environment at all. Unless a device uses the embedded profile and does not support CL_FP_ROUND_TO_NEAREST, all rounding is done using CL_FP_ROUND_TO_NEAREST (except when conversion functions are used).
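For anyone unfamiliar with the exception, the conversion functions carry their rounding mode in the name (convert_int_rte, convert_int_rtz, convert_int_rtp, convert_int_rtn), so the mode is chosen per call rather than through the environment. Below is a host-side sketch in plain C of what those four suffixes mean for float-to-int. The to_int_* names are hypothetical, this is not OpenCL C, and it assumes the input is small enough to fit in an int (no saturation handling).

```c
/* Hypothetical host-side analogues of the _rtz/_rtn/_rtp/_rte suffixes.
   Assumes x is finite and well inside the range of int. */

/* Round toward zero: a C float-to-int cast already truncates. */
int to_int_rtz(float x)
{
    return (int)x;
}

/* Round toward negative infinity: step down if truncation went up. */
int to_int_rtn(float x)
{
    int t = (int)x;
    return ((float)t > x) ? t - 1 : t;
}

/* Round toward positive infinity: step up if truncation went down. */
int to_int_rtp(float x)
{
    int t = (int)x;
    return ((float)t < x) ? t + 1 : t;
}

/* Round to nearest, ties to even. */
int to_int_rte(float x)
{
    int lo = to_int_rtn(x);           /* largest integer <= x */
    float frac = x - (float)lo;       /* in [0, 1) */

    if (frac > 0.5f)
        return lo + 1;
    if (frac < 0.5f)
        return lo;
    return (lo % 2 == 0) ? lo : lo + 1;   /* exact tie: pick the even one */
}
```

With an input of 2.5f the four functions give 2 (rtz), 2 (rtn), 3 (rtp), and 2 (rte). With -2.5f they give -2, -3, -2, and -2.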