Neuro-Mimetic Developmental Architecture for Continual Learning Through Self-Organizing Multimodal Perception Coordination


Bibliographic Details
Main Authors: Farhan Dawood, Naoki Masuyama, Chu Kiong Loo
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10949081/
Description
Summary: Developing cognitive capabilities in autonomous agents stands as a challenging yet pivotal task. To construct cohesive representations and extract meaningful insights from the environment, the brain utilizes patterns from a variety of sensory inputs, including vision and sound. Furthermore, related sensory inputs can activate each other’s representations, highlighting the brain’s ability to associate and integrate information across different sensory channels – a key factor in cognitive development and adaptive learning. Current learning methods often mirror the developmental process of infants, who enhance their cognition through guidance and exploration; however, these methods struggle with issues such as catastrophic forgetting and the stability-plasticity trade-off. This study presents a novel brain-inspired hierarchical autonomous framework, the Cognitive Deep Self-Organizing Neural Network (CDSN), designed to enable autonomous agents to acquire object concepts dynamically. The architecture comprises dual parallel audio-visual information pathways, each incorporating three layers based on a Topological Kernel CIM-based Adaptive Resonance Neural Network (TC-ART). The first layer, referred to as the receptive layer, learns and organizes visual attributes and object names autonomously in an unsupervised manner. The second layer, the concept layer, distills clustered results from the corresponding receptive layer to create succinct symbol representations. Visual and auditory concepts are then combined concurrently in the third layer, the associative layer, to establish real-time associative connections between the modalities. Furthermore, this layer introduces a top-down response mechanism, allowing agents to independently retrieve associated modalities and adapt acquired knowledge in a hierarchical manner.
Experimental evaluations conducted on object datasets demonstrate the proposed architecture’s efficacy in online learning and in associating object viewpoints with labels. Our approach achieves superior recall accuracy, with rates of 92.25% for visual recall and 92.45% for auditory recall. A real-world simulation on a humanoid platform further validates the proposed neuromimetic architecture’s capabilities.
ISSN: 2169-3536
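The abstract's pipeline – unsupervised per-modality clustering (receptive/concept layers) followed by Hebbian-style cross-modal links with top-down recall (associative layer) – can be sketched in miniature. This is an illustrative approximation only: the `SimpleART` class below uses a plain distance-based vigilance test and running-mean prototypes as a stand-in for the paper's actual kernel CIM-based TC-ART, and all class and variable names are hypothetical.

```python
import numpy as np

class SimpleART:
    """ART-style online clusterer (simplified stand-in for TC-ART).
    A sample joins the first category whose prototype passes the
    vigilance test; otherwise it founds a new category."""
    def __init__(self, vigilance=0.5):
        self.vigilance = vigilance
        self.prototypes = []   # one prototype vector per category
        self.counts = []       # samples absorbed per category

    def learn(self, x):
        """Return the winning category index, creating one if needed."""
        x = np.asarray(x, dtype=float)
        for i, p in enumerate(self.prototypes):
            sim = 1.0 / (1.0 + np.linalg.norm(x - p))  # toy similarity
            if sim >= self.vigilance:
                self.counts[i] += 1
                # incremental running mean keeps learning online
                self.prototypes[i] = p + (x - p) / self.counts[i]
                return i
        self.prototypes.append(x)
        self.counts.append(1)
        return len(self.prototypes) - 1

class AssociativeLayer:
    """Co-activation counts linking visual and auditory concepts."""
    def __init__(self):
        self.weights = {}      # (visual_concept, auditory_concept) -> count

    def associate(self, v, a):
        self.weights[(v, a)] = self.weights.get((v, a), 0) + 1

    def recall_auditory(self, v):
        """Top-down recall: strongest auditory concept linked to v."""
        cands = {a: w for (vi, a), w in self.weights.items() if vi == v}
        return max(cands, key=cands.get) if cands else None

# Paired presentation: each visual feature vector arrives with a
# spoken-label feature vector, mimicking guided infant-style learning.
visual_art, audio_art, assoc = SimpleART(), SimpleART(), AssociativeLayer()
samples = [([1.0, 0.0], [0.0, 1.0]),
           ([0.9, 0.1], [0.1, 0.9]),
           ([0.0, 1.0], [1.0, 0.0])]
for vx, ax in samples:
    v = visual_art.learn(vx)   # receptive + concept layers (collapsed here)
    a = audio_art.learn(ax)
    assoc.associate(v, a)      # associative layer links the modalities
print(assoc.recall_auditory(0))  # → 0 (label concept tied to visual concept 0)
```

The first two samples collapse into one visual category while the third founds a second, so recall returns the auditory concept most often co-activated with a given visual concept; the real architecture adds topological kernel similarity and hierarchical knowledge adaptation on top of this basic scheme.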