Approximate CNN Hardware Accelerators for Resource Constrained Devices
Implementing Convolutional Neural Networks (CNNs) on edge devices requires a reduction in computational complexity. Leveraging optimization or approximate computing techniques can reduce the overhead associated with hardware implementation. In this paper, we propose a modular pipelined Feedforward CNN Hardware Accelerator (FHA) and a novel Approximate Feedforward CNN Hardware Accelerator (AFHA). The AFHA design incorporates hardware pruning and Approximate Multiply Accumulate (AMAC) units. Our proposed architectures are functionally validated on an image classification application, utilising the popular MNIST dataset at 8-bit, 16-bit and 32-bit operational word sizes. Performance analysis shows that the 32-bit FHA consumes 307.04 pJ of energy while achieving an acceleration of 76.91x; the AFHA attains an acceleration of 120.69x with an energy consumption of 295.85 pJ. The 16-bit and 8-bit architectures likewise demonstrate substantial acceleration with significantly reduced power consumption. Compared to the popular edge machine learning framework TinyML TensorFlow Lite, FHA achieves an accuracy improvement of 6.2% with a speedup of 1.07x, and AFHA achieves an accuracy improvement of 4.3% with a speedup of 1.42x.
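The abstract attributes AFHA's gains to hardware pruning and AMAC units but does not specify their design. As a rough illustration only, the C sketch below models one common way to build an approximate multiply-accumulate: truncating the low-order bits of each operand so the multiplier logic shrinks, trading a small bounded error for energy and latency. The `amac` function, the `TRUNC_BITS` parameter and the toy data are assumptions for illustration, not the paper's implementation.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical truncation-based approximate multiply-accumulate (MAC).
 * This record does not describe the paper's AMAC internals; the sketch
 * assumes the common technique of zeroing the TRUNC_BITS low-order bits
 * of each operand before multiplying, which shortens the effective
 * multiplier at the cost of a small, bounded numerical error. */
#define TRUNC_BITS 4

static int32_t amac(int32_t acc, int16_t a, int16_t b) {
    /* Zero out the low-order bits of both operands. */
    int16_t a_t = (int16_t)(a & ~((1 << TRUNC_BITS) - 1));
    int16_t b_t = (int16_t)(b & ~((1 << TRUNC_BITS) - 1));
    return acc + (int32_t)a_t * (int32_t)b_t;
}

int main(void) {
    /* Compare an exact and an approximate dot product on toy data,
     * mimicking one convolution tap's worth of MAC operations. */
    int16_t x[4] = {1200, -340, 560, 780};
    int16_t w[4] = {90, 410, -230, 150};
    int32_t exact = 0, approx = 0;
    for (int i = 0; i < 4; i++) {
        exact += (int32_t)x[i] * (int32_t)w[i];
        approx = amac(approx, x[i], w[i]);
    }
    printf("exact=%" PRId32 "  approx=%" PRId32 "\n", exact, approx);
    return 0;
}
```

For scale, the reported 32-bit figures imply AFHA delivers about 1.57x the acceleration of FHA (120.69x vs. 76.91x) while consuming slightly less energy (295.85 pJ vs. 307.04 pJ), which is exactly the trade that approximate MAC units are meant to buy.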
Main Authors: | P Thejaswini, Gautham Suresh, V. Chiraag, Sukumar Nandi |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Approximate computing; convolutional neural networks (CNNs); hardware accelerators; edge computing; image classification |
Online Access: | https://ieeexplore.ieee.org/document/10840189/ |
collection | DOAJ |
id | doaj-art-4d47e98b52d84e9f965886b1e008bdea |
institution | Kabale University |
issn | 2169-3536 |
doi | 10.1109/ACCESS.2025.3529668
citation | IEEE Access, vol. 13, pp. 12542-12553, 2025-01-01 (IEEE article no. 10840189); last indexed 2025-01-25T00:02:34Z
authors and affiliations | P Thejaswini (ORCID: 0000-0003-1240-6860), Gautham Suresh (ORCID: 0009-0004-7414-0476) and V. Chiraag, all of JSS Academy of Technical Education, Bengaluru, Karnataka, India; Sukumar Nandi (ORCID: 0000-0002-5869-1057), Indian Institute of Technology Guwahati, Guwahati, Assam, India