Hardware-Software Stitching Algorithm in Lightweight Q-Learning System on Chip (SoC) for Shortest Path Optimization
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11030563/ |
| Summary: | This paper presents a novel hardware-software co-design approach to accelerate Q-learning algorithms using a RISC-V-based System-on-Chip (SoC) design. We introduce a maze-stitching algorithm that enables efficient solving of large, complex mazes by decomposing them into smaller sub-mazes, so that the Q-learning computation can be performed on a low-complexity hardware accelerator. Furthermore, we provide a comprehensive analysis of the algorithm’s complexity, theoretical performance gains, and practical implementation results. The proposed implementation demonstrates significant performance improvements over traditional software approaches, achieving speedups ranging from $84\times$ to $233\times$ for complex maze-solving tasks while maintaining a small footprint. Even without the accelerator, the algorithm achieves speedups ranging from $13\times$ to $36\times$. The proposed system combines a 64-bit RISC-V core with a Q-learning accelerator, operating at 50 MHz on an Arty A7-100T FPGA. The maze-stitching technique allows scaling to larger problem sizes while maintaining hardware efficiency. The proposed work can be applied to a lightweight accelerator and provides a scalable solution for resource-constrained edge computing environments. |
| ISSN: | 2169-3536 |
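For readers unfamiliar with the method named in the summary, the sketch below shows plain tabular Q-learning finding a shortest path through a small grid maze, i.e. the kind of computation a Q-learning accelerator would evaluate. It is a minimal illustration only: the maze layout, reward scheme, and hyperparameters are assumptions made here, and it does not reproduce the paper's maze-stitching decomposition or the RISC-V/accelerator partitioning.

```python
# Minimal sketch of tabular Q-learning for shortest-path maze solving.
# NOT the paper's implementation; grid, rewards, and hyperparameters are
# illustrative assumptions.
import random

# 0 = free cell, 1 = wall; start at top-left, goal at bottom-right.
MAZE = [
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
ROWS, COLS = len(MAZE), len(MAZE[0])
START, GOAL = (0, 0), (ROWS - 1, COLS - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

ALPHA, GAMMA, EPSILON, EPISODES = 0.5, 0.9, 0.2, 500

# Q-table: one vector of action values per free maze cell.
Q = {(r, c): [0.0] * len(ACTIONS)
     for r in range(ROWS) for c in range(COLS) if MAZE[r][c] == 0}

def step(state, action_idx):
    """Apply an action; hitting a wall or the border keeps the agent in place."""
    dr, dc = ACTIONS[action_idx]
    nr, nc = state[0] + dr, state[1] + dc
    nxt = (nr, nc) if 0 <= nr < ROWS and 0 <= nc < COLS and MAZE[nr][nc] == 0 else state
    reward = 10.0 if nxt == GOAL else -1.0  # -1 per move favors short paths
    return nxt, reward, nxt == GOAL

for _ in range(EPISODES):
    state, done = START, False
    while not done:
        # Epsilon-greedy action selection.
        if random.random() < EPSILON:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[state][i])
        nxt, reward, done = step(state, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][a])
        state = nxt

# Greedy rollout of the learned policy.
state, path = START, [START]
while state != GOAL and len(path) < ROWS * COLS:
    a = max(range(len(ACTIONS)), key=lambda i: Q[state][i])
    state, _, _ = step(state, a)
    path.append(state)
print("Greedy path:", path)
```

Under the maze-stitching idea described in the summary, each sub-maze would carry its own small Q-table like the one above, keeping the accelerator's state-action storage bounded regardless of the overall maze size; how the sub-maze solutions are stitched together is specific to the paper and not reproduced here.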