Digital chefs and intelligent cooking systems based on multimodal large language model
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | zho |
Published: | POSTS&TELECOM PRESS Co., LTD, 2024-12-01 |
Series: | 智能科学与技术学报 (Chinese Journal of Intelligent Science and Technology) |
Online Access: | http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202448/ |
Summary: | This article proposes a digital chef and an intelligent cooking method aimed at high-quality, precise cooking results. In the offline phase, visual, auditory, and thermal sensors record professional chefs' continuous cooking operations, and the collected frame-by-frame images and multi-round Q&A texts form a culinary expert knowledge base. A low-rank adaptation (LoRA) method is applied to fine-tune a pretrained multimodal large language model so that it can understand cooking intentions. In the online phase, real-time sensor data are converted into image-text inputs for the fine-tuned model, which then generates cooking instructions that guide users through the cooking steps. A hardware-software cooking system was implemented and tested on a pan-fried steak task. The experimental results show that the fine-tuned system effectively controls the steak's doneness and quality and, compared with the model before fine-tuning, significantly improves the accuracy and rationality of the cooking instructions. |
ISSN: | 2096-6652 |
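
The summary mentions low-rank adaptation but this record gives no configuration details. As a rough illustration only, the sketch below shows the standard LoRA idea: a frozen weight matrix W is augmented with a trainable low-rank product, giving W + (alpha / r) * B A. All function names, the rank, the scaling factor, and the matrix sizes are assumptions chosen for illustration; they are not taken from the article.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen weight, shape (d_out, d_in)
    a: trainable matrix, shape (r, d_in)
    b: trainable matrix, shape (d_out, r)
    alpha: scaling hyperparameter (hypothetical value chosen by the caller)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def init_adapter(d_out, d_in, r, seed=0):
    """Initialize A with small random values and B with zeros.

    With B at zero, the adapted weight equals the pretrained weight
    before any training step is taken.
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
    b = [[0.0] * r for _ in range(d_out)]
    return a, b

def trainable_params(d_out, d_in, r):
    """Number of trainable adapter parameters for one weight matrix."""
    return r * (d_in + d_out)
```

For a hypothetical 4096 x 4096 projection with rank 8, `trainable_params(4096, 4096, 8)` counts 65,536 adapter parameters against 16,777,216 in the full matrix, which is why this kind of fine-tuning is comparatively cheap.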