SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited entry codebook. At very low precisions, such as binary or ternary networks with 1-8-bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation imposes minimal or no hardware implications relative to more coarse-grained approaches.

https://github.com/julianfaraone/SYQ
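
To make the abstract's core idea concrete, here is a minimal sketch of symmetric quantization with per-subgroup learned scaling factors: each row of the weight matrix is mapped onto its own symmetric codebook {-α, 0, +α}. The per-row grouping, the threshold rule, and all names below are illustrative assumptions based on the abstract, not the authors' exact implementation (see the repository above).

```python
# Minimal sketch: ternary symmetric quantization with one learned scaling
# factor (alpha) per weight subgroup. The grouping here is per row of a
# 2-D weight matrix; SYQ's actual subgroup definition and training loop
# differ and live in the linked repository.
import numpy as np

def symmetric_ternary_quantize(W, alphas, threshold=0.05):
    """Quantize each row of W onto its own symmetric codebook {-a, 0, +a}.

    W       : (G, K) float weight matrix, one subgroup per row.
    alphas  : (G,)   positive scaling factors, one per subgroup (learned).
    returns : (G, K) quantized weights.
    """
    mask = (np.abs(W) > threshold).astype(W.dtype)   # zero out small weights
    signs = np.sign(W) * mask                        # entries in {-1, 0, +1}
    return signs * alphas[:, None]                   # apply per-subgroup scale

# Toy usage: a 4x8 weight matrix split into 4 row subgroups.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 8))
alphas = np.abs(W).mean(axis=1)            # a common initialization heuristic
Wq = symmetric_ternary_quantize(W, alphas)
print(np.unique(np.round(Wq, 4)))          # only {-a_g, 0, +a_g} per row g
```

During training, the scaling factors would be updated by gradient descent while the quantization of the signs is handled with a straight-through estimator in the backward pass; the sketch above shows only the forward mapping.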
