MobileNet-HeX: Heterogeneous Ensemble of MobileNet eXperts for Efficient and Scalable Vision Model Optimization
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Big Data and Cognitive Computing |
Subjects: | |
Online Access: | https://www.mdpi.com/2504-2289/9/1/2 |
Summary: | Efficient and accurate vision models are essential for real-world applications such as medical imaging and deepfake detection, where both performance and computational efficiency are critical. While recent vision models achieve high accuracy, they often come with the trade-off of increased size and computational demands. In this work, we propose MobileNet-HeX, a new ensemble model based on Heterogeneous MobileNet eXperts, designed to achieve top-tier performance while minimizing computational demands in real-world vision tasks. By utilizing a two-step Expand-and-Squeeze mechanism, MobileNet-HeX first expands a MobileNet population through diverse random training setups. It then squeezes the population through pruning, selecting the top-performing models based on heterogeneity and validation performance metrics. Finally, the selected Heterogeneous eXpert MobileNets are combined via sequential quadratic programming to form an efficient super-learner. MobileNet-HeX is benchmarked against state-of-the-art vision models in challenging case studies, such as skin cancer classification and deepfake detection. The results demonstrate that MobileNet-HeX not only surpasses these models in performance but also excels in speed and memory efficiency. By effectively leveraging a diverse set of MobileNet eXperts, we experimentally show that small, yet highly optimized, models can outperform even the most powerful vision networks in both accuracy and computational efficiency. |
---|---|
ISSN: | 2504-2289 |
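
The summary describes a pipeline that expands a population of models, squeezes it by pruning on heterogeneity and validation performance, and then combines the survivors with sequential quadratic programming. The sketch below illustrates the last two steps on synthetic validation probabilities. It is not the authors' implementation: the greedy selection rule, the disagreement-based heterogeneity measure, the `alpha` trade-off, the log-loss objective, and the function names `squeeze` and `fit_weights` are all assumptions made for illustration, since the abstract does not specify them. The only library routine relied on for the optimization is SciPy's `minimize` with `method="SLSQP"`.

```python
import numpy as np
from scipy.optimize import minimize


def squeeze(probs, y_val, keep, alpha=0.5):
    """Greedily keep `keep` models, trading off accuracy and disagreement.

    probs: array (n_models, n_samples, n_classes) of validation probabilities.
    alpha: hypothetical trade-off weight (an assumption, not from the paper).
    """
    preds = probs.argmax(axis=2)             # hard predictions per model
    acc = (preds == y_val).mean(axis=1)      # per-model validation accuracy
    selected = [int(acc.argmax())]           # start from the most accurate model
    while len(selected) < keep:
        best, best_score = None, -np.inf
        for m in range(len(probs)):
            if m in selected:
                continue
            # Mean disagreement rate with the models already selected.
            div = np.mean([(preds[m] != preds[s]).mean() for s in selected])
            score = alpha * acc[m] + (1 - alpha) * div
            if score > best_score:
                best, best_score = m, score
        selected.append(best)
    return selected


def fit_weights(probs, y_val):
    """Fit convex combination weights that minimize validation log-loss (SLSQP)."""
    n_models = probs.shape[0]
    idx = np.arange(len(y_val))

    def log_loss(w):
        mix = np.tensordot(w, probs, axes=1)  # weighted mixture, (n_samples, n_classes)
        return -np.mean(np.log(mix[idx, y_val] + 1e-12))

    result = minimize(
        log_loss,
        x0=np.full(n_models, 1.0 / n_models),  # start from uniform averaging
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_models,        # weights stay non-negative
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    )
    return result.x


# Synthetic stand-in for an expanded population: six noisy binary classifiers.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, size=200)
logits = np.eye(2)[y_val][None] * 2.0 + rng.normal(size=(6, 200, 2))
probs = np.exp(logits) / np.exp(logits).sum(axis=2, keepdims=True)

kept = squeeze(probs, y_val, keep=3)
weights = fit_weights(probs[kept], y_val)
print("kept models:", kept)
print("ensemble weights:", np.round(weights, 3))
```

Constraining the weights to be non-negative and to sum to one keeps the combined output a valid probability distribution, which is one common way to frame a super-learner; the paper may use a different objective or constraint set.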