I am trying to use the Khronos sample implementation of OpenVX 1.3 for Raspberry Pi and I have the following questions:
I want to implement an application on a Raspberry Pi 3B (it will be fed with a simple CNN trained in TensorFlow Keras) using the OpenVX 1.3 sample implementation from KhronosGroup together with the Neural Network extension. The weights and activations of the Keras model are 32-bit floats, which are not supported in this implementation of OpenVX. I quantized the model using the TensorFlow Lite converter, which produced a .tflite file with uint8- and int32-typed tensors. My first question is: how can I map these tensors (if that's possible) to the int8 types of the OpenVX implementation?
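To make the mapping I have in mind concrete: since TFLite's uint8 and int8 schemes are both affine (real = scale * (q - zero_point)), I understand that a uint8 tensor can be shifted to int8 by subtracting 128 from both the values and the zero point, keeping the scale. A sketch of what I mean (NumPy, with a hypothetical helper name):

```python
import numpy as np

def uint8_to_int8(q_u8, scale, zero_point_u8):
    """Shift an asymmetric uint8 tensor to int8 without changing real values.

    real = scale * (q - zero_point) is preserved because the quantized
    values and the zero point are both shifted by the same offset of 128.
    """
    q_i8 = (q_u8.astype(np.int16) - 128).astype(np.int8)
    zero_point_i8 = zero_point_u8 - 128
    return q_i8, scale, zero_point_i8

# round-trip check: dequantized values must be identical before and after
q = np.array([0, 37, 128, 255], dtype=np.uint8)
scale, zp = 0.05, 128
real_before = scale * (q.astype(np.int32) - zp)
q8, s8, zp8 = uint8_to_int8(q, scale, zp)
real_after = s8 * (q8.astype(np.int32) - zp8)
assert np.allclose(real_before, real_after)
```

Is this the correct way to think about it, or does the OpenVX int8 tensor type assume a different (e.g. fixed-point Q-format) representation that makes this shift insufficient?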
I converted the above tflite model to NNEF format in order to use the parser provided in NNEF-Tools to import the quantized model into the OpenVX-based application. Unfortunately, the converted NNEF model is not accepted by the parser provided in NNEF-Tools. The resulting graph contains a "tflite_quantize" compound fragment, and it seems that NNEF does not support the tflite quantization scheme (see https://github.com/KhronosGroup/NNEF-Tools/issues/118). In the NNEF 1.0 specification I read that a "linear_quantize" operation is specified (also as a compound fragment), so my question is: would it be possible to use this "linear_quantize" operation somehow instead of "tflite_quantize"? I am a newbie in quantization; am I missing or misunderstanding something?
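My reasoning for why the two operations might be interchangeable, expressed as a small check (NumPy; the parameter mapping helper is my own assumption, and `linear_quantize` below is my reading of the NNEF 1.0 compound fragment, which takes min, max, and bits): a tflite (scale, zero_point) pair should correspond to min = scale * (0 - zero_point) and max = scale * (2^bits - 1 - zero_point), since then (max - min) / (2^bits - 1) equals the original scale.

```python
import numpy as np

def tflite_to_linear_quantize_params(scale, zero_point, bits=8):
    # Hypothetical mapping: tflite (scale, zero_point) for an unsigned
    # 'bits'-wide range -> NNEF linear_quantize (min, max, bits).
    qmax = 2 ** bits - 1
    lo = scale * (0 - zero_point)
    hi = scale * (qmax - zero_point)
    return lo, hi, bits

def linear_quantize(x, lo, hi, bits):
    # NNEF 1.0 linear_quantize compound fragment, written out in NumPy:
    # clamp to [min, max], quantize to 2^bits - 1 levels, dequantize back.
    r = 2.0 ** bits - 1
    z = np.clip(x, lo, hi)
    q = np.round((z - lo) / (hi - lo) * r)
    return q / r * (hi - lo) + lo

# Check: linear_quantize with the mapped parameters reproduces the
# tflite quantize/dequantize round trip on in-range values.
scale, zp = 0.05, 128
x = np.array([-1.0, 0.0, 0.5, 3.0])
q_tfl = np.clip(np.round(x / scale) + zp, 0, 255)
x_tfl = scale * (q_tfl - zp)
x_nnef = linear_quantize(x, *tflite_to_linear_quantize_params(scale, zp))
assert np.allclose(x_tfl, x_nnef)
```

If this equivalence is right, is there a reason the converter emits "tflite_quantize" instead of rewriting it to "linear_quantize" with these parameters?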
Is the above approach viable with the current state of development of both NNEF-Tools and this OpenVX sample implementation? Or is there a misconception in trying to use the OpenVX NN extension with a quantized CNN at all?