Hardware for Deep Learning Acceleration

Bibliographic Details
Main Authors: Choongseok Song, ChangMin Ye, Yonguk Sim, Doo Seok Jeong
Format: Article
Language: English
Published: Wiley 2024-10-01
Series: Advanced Intelligent Systems
Online Access: https://doi.org/10.1002/aisy.202300762
Description
Summary: Deep learning (DL) has proven to be one of the most pivotal components of machine learning, given its notable performance in a variety of application domains. Neural networks (NNs) for DL are tailored to specific application domains by varying their topology and activation nodes. Nevertheless, the dominant operation type (the one with the largest computational complexity) is commonly the multiply‐accumulate operation, irrespective of topology. Recent trends in DL show NNs becoming deeper and larger, and their computational complexity becoming prohibitive as a result. To cope with the consequent prohibitive computation latency, 1) general‐purpose hardware, e.g., central processing units and graphics processing units, has been redesigned, and 2) various DL accelerators have been newly introduced, e.g., neural processing units and computing‐in‐memory units for deep NN‐based DL, and neuromorphic processors for spiking NN‐based DL. This review gives an overview of these accelerators and their pros and cons, with particular focus on their performance and memory bandwidth.
ISSN: 2640-4567
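
Illustrative note (not part of the record or the article): the summary's statement that multiply‐accumulate (MAC) operations dominate the computation regardless of topology can be sketched in a few lines of Python. The function names below (`dense_forward`, `dense_macs`, `conv2d_macs`) are hypothetical helpers chosen for this sketch, and the counts assume plain dense and 2D convolution layers without bias, stride tricks, or sparsity.

```python
def dense_forward(weights, inputs):
    """Naive fully connected layer; performs one MAC per (output, input) pair."""
    outputs = []
    macs = 0
    for row in weights:              # one row of weights per output neuron
        acc = 0.0
        for w, x in zip(row, inputs):
            acc += w * x             # the multiply-accumulate step
            macs += 1
        outputs.append(acc)
    return outputs, macs


def dense_macs(n_in, n_out):
    """MAC count of a dense layer (assumption: no bias term counted)."""
    return n_in * n_out


def conv2d_macs(h_out, w_out, c_in, c_out, k_h, k_w):
    """MAC count of a standard 2D convolution, given its output size."""
    return h_out * w_out * c_out * c_in * k_h * k_w


if __name__ == "__main__":
    outputs, macs = dense_forward([[1, 2], [3, 4]], [1, 1])
    print(outputs, macs)
    print(conv2d_macs(h_out=2, w_out=2, c_in=3, c_out=4, k_h=3, k_w=3))
```

Both layer types reduce to the same inner `acc += w * x` step, and the MAC count grows multiplicatively with layer width, channel count, and kernel size, which is consistent with the summary's point that deeper and larger networks become costly to compute.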