Despite the many modern applications of Deep Neural Networks (DNNs), the large number of parameters in their hidden layers makes them unattractive for deployment on devices with limited storage capacity. In this paper we propose a Data-Driven Low-rank (DDLR) method that reduces the number of parameters of pretrained DNNs and expedites inference by imposing low-rank structure on the fully connected layers, while controlling the overall accuracy and without requiring any retraining. We pose the problem as finding the lowest-rank approximation of each fully connected layer subject to given performance guarantees, and relax it to a tractable convex optimization problem. We show that it is possible to significantly reduce the number of parameters in common DNN architectures with only a small reduction in classification accuracy. We compare DDLR with Net-Trim, another data-driven DNN compression technique based on sparsity, and show that DDLR consistently produces more compressed neural networks while maintaining higher accuracy.
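To make the idea of a data-driven low-rank layer concrete, the following is a minimal sketch and not the DDLR algorithm itself. It replaces the paper's convex relaxation with a plain truncated SVD and searches for the smallest rank whose post-ReLU outputs on sample inputs stay within a relative tolerance. The function name `lowest_rank_within_tolerance`, the tolerance `rel_tol`, the ReLU activation, and the synthetic weights and data are all assumptions made for illustration.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def lowest_rank_within_tolerance(W, X, rel_tol):
    """Smallest-rank truncated SVD of W whose ReLU outputs on X stay close.

    Illustrative stand-in only: the paper solves a convex relaxation,
    whereas this sketch simply scans truncated-SVD ranks.
    Returns factors A (m x r), B (r x n) and the rank r, with W ~ A @ B.
    """
    target = relu(W @ X)                       # outputs of the original layer
    budget = rel_tol * np.linalg.norm(target)  # allowed Frobenius-norm error
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    for r in range(1, len(s) + 1):
        A = U[:, :r] * s[:r]                   # fold singular values into A
        B = Vt[:r, :]
        err = np.linalg.norm(relu(A @ (B @ X)) - target)
        if err <= budget:
            return A, B, r
    return U * s, Vt, len(s)                   # fall back to full rank


# Synthetic example (assumed data): a 64 x 128 layer that is nearly rank 8.
rng = np.random.default_rng(0)
m, n, true_rank, num_samples = 64, 128, 8, 200
W = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))
W += 0.01 * rng.standard_normal((m, n))        # small full-rank perturbation
X = rng.standard_normal((n, num_samples))      # sample inputs to the layer

A, B, r = lowest_rank_within_tolerance(W, X, rel_tol=0.05)
dense_params = m * n                           # parameters in the original layer
low_rank_params = r * (m + n)                  # parameters in the two factors
print(f"rank={r}, parameters: {dense_params} -> {low_rank_params}")
```

Storing the two factors costs r(m + n) parameters instead of mn, and applying `B` and then `A` also reduces the multiply count at inference time, which is the storage and speed benefit the abstract refers to.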
Data-Driven Low-Rank Neural Network Compression
Research Paper / Sep 2021 / Video coding, Compression, Machine learning / Deep learning / Artificial Intelligence, Neural network