I am trying to understand quantization in TensorFlow and am following this tutorial:

https://heartbeat.fritz.ai/8-bit-quantization-and-tensorflow-lite-speeding-up-mobile-inference-with-low-precision-a882dfcafbbd

The tutorial says that the quantization equation is:

r = S(q - z)

  • r is the real value (usually float32)
  • q is its quantized representation as a B-bit integer (uint8, uint32, etc.)
  • S (float32) and z (uint) are the factors by which we scale and shift the number line. z is the quantized ‘zero-point’ which will always map back exactly to 0.f.

I am struggling to understand the meaning of the zero point. Could somebody explain it with an example?

1 Answer

If your data contains negative values, the zero point offsets the range so that they can be stored in an unsigned integer. For example, with a zero point of 128 (and a scale of 1), the negative values -128 to -1 are represented by 0 to 127, and the non-negative values 0 to 127 are represented by 128 to 255.
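To make that offset concrete, here is a minimal Python sketch of the shift alone. It assumes a scale of 1 and a zero point of 128; the names to_stored and to_real are made up for illustration and do not come from TensorFlow.

```python
ZERO_POINT = 128  # assumed zero point, matching the sentence above


def to_stored(real_int: int, zero_point: int = ZERO_POINT) -> int:
    """Shift a signed value into the unsigned 0..255 storage range (scale of 1)."""
    return real_int + zero_point


def to_real(stored: int, zero_point: int = ZERO_POINT) -> int:
    """Undo the shift: the stored zero point maps back to exactly 0."""
    return stored - zero_point


for value in (-128, -1, 0, 127):
    print(value, "->", to_stored(value))
```

The stored value 128 is the only one that maps back to a real 0, which is why it is called the zero point.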

Given an input tensor with data ranging from -1000 to +1000, and an element with value 39.215686275:

realValue = 39.215686275
zeroPoint = 128   // 256/2, which is symmetric
realRangeMinValue = -1000
realRangeMaxValue = 1000
integerRangeMinValue = 0
integerRangeMaxValue = 2 ^ 8 - 1 = 255
quantizedRangeMinValue = integerRangeMinValue - zeroPoint = -128
quantizedRangeMaxValue = integerRangeMaxValue - zeroPoint = 127

scale = integerRangeMaxValue / (realRangeMaxValue - realRangeMinValue) = 0.1275
// scale = 255 / (1000 - -1000); note that this is 1/S in the question's notation
quantizedValue = realValue * 255 / (1000 - -1000) + 128 = 133
// quantizedValue = 39.215686275 * 255 / (1000 - -1000) + 128 = 133

Conversely:

realValue = (1000 - -1000) / 255 * (133 - 128) = 39.215686275
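The same arithmetic can be sketched in Python as a round trip. This is only an illustration under the assumptions of the example above (8 bits, zero point fixed at 128, range -1000 to 1000); quantize and dequantize are hypothetical helper names, not TensorFlow functions. Here scale is S from the question's equation r = S(q - z), i.e. the reciprocal of the 0.1275 used above, and the clamp to 0..255 is an added assumption so that out-of-range results still fit in a uint8.

```python
def quantize(real_value, real_min, real_max, zero_point=128, num_bits=8):
    """Map a real value to an unsigned integer using q = round(r / S) + z."""
    q_max = 2 ** num_bits - 1                # 255 for 8 bits
    scale = (real_max - real_min) / q_max    # S, the real-valued width of one integer step
    q = round(real_value / scale) + zero_point
    q = max(0, min(q_max, q))                # clamp into the representable range (assumption)
    return q, scale


def dequantize(q, scale, zero_point=128):
    """Map a quantized integer back to a real value using r = S * (q - z)."""
    return scale * (q - zero_point)


q, scale = quantize(39.215686275, -1000, 1000)
print(q)                        # quantized value
print(dequantize(q, scale))     # approximately the original real value
print(dequantize(128, scale))   # the zero point maps back to 0.0
```

With these inputs q comes out as 133, and dequantizing it returns roughly 39.2157. Dequantizing the zero point itself (128) gives exactly 0.0, which is the property the tutorial describes.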