VARAT: Variable Annotation Tool for Documents on Manufacturing Processes

Building physical models is essential for realizing digital twins in the manufacturing industry. This task, however, is labor-intensive and requires a deep understanding of target processes and extensive knowledge from various literature sources. Although this extensive workload can be mitigated by...

Bibliographic Details
Main Authors: Shota Kato, Manabu Kano
Format: Article
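Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Journal of Chemical Engineering of Japan
Subjects: Annotation; Information extraction; Document understanding; Mathematical expressions; Variable extraction
Online Access: https://www.tandfonline.com/doi/10.1080/00219592.2025.2454461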
author Shota Kato
Manabu Kano
collection DOAJ
description Building physical models is essential for realizing digital twins in the manufacturing industry. This task, however, is labor-intensive and requires a deep understanding of target processes and extensive knowledge from various literature sources. Although this extensive workload can be mitigated by automated extraction of information from the literature, developing such methods necessitates domain-specific datasets lacking in chemical engineering. To address this problem, we developed an algorithm for extracting variable symbols from documents and a variable annotation tool, VARAT, based on this algorithm. Our proposed algorithm, tested on 47 papers on physical models of five manufacturing processes, achieved a recall of 97% and a precision of 96%. VARAT was subsequently employed to create a dataset containing 1,988 variable symbols from the 47 papers. This tool reduced the annotation time per paper by more than half. VARAT is expected to accelerate the development of datasets vital for chemical engineering information extraction and ultimately facilitate the development of physical models.
format Article
id doaj-art-f58364902c5e44c1bee96879c41b76b5
institution Kabale University
issn 0021-9592
1881-1299
language English
publishDate 2025-12-01
publisher Taylor & Francis Group
record_format Article
series Journal of Chemical Engineering of Japan
spelling doaj-art-f58364902c5e44c1bee96879c41b76b5; 2025-02-05T16:40:52Z; eng; Taylor & Francis Group; Journal of Chemical Engineering of Japan; 0021-9592; 1881-1299; 2025-12-01; 58; 1; 10.1080/00219592.2025.2454461; VARAT: Variable Annotation Tool for Documents on Manufacturing Processes; Shota Kato (0); Manabu Kano (1); (0) Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan; (1) Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
title VARAT: Variable Annotation Tool for Documents on Manufacturing Processes
topic Annotation
Information extraction
Document understanding
Mathematical expressions
Variable extraction
url https://www.tandfonline.com/doi/10.1080/00219592.2025.2454461