Interval evaluation of temporal (in)stability for neural machine translation
Abstract Though neural machine translation (NMT) has become the leading machine translation (MT) paradigm, its output may still contain errors. To improve NMT quality, it is important to investigate these errors and to see how NMT quality changes with time. The primary focus of the paper is on what...
Main Authors: | Anna Egorova, Mikhail Kruzhkov, Vitaly Nuriev, Igor Zatsman |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2025-01-01 |
Series: | Discover Artificial Intelligence |
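Subjects: | Neural machine translation; Temporal evaluation; Temporal (in)stability of neural machine translation; Indicator-based evaluation; Linguistic annotation; Error typology |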
Online Access: | https://doi.org/10.1007/s44163-025-00222-y |
author | Anna Egorova, Mikhail Kruzhkov, Vitaly Nuriev, Igor Zatsman |
collection | DOAJ |
description | Abstract: Though neural machine translation (NMT) has become the leading machine translation (MT) paradigm, its output may still contain errors. To improve NMT quality, it is important to investigate these errors and to see how NMT quality changes over time. The paper focuses on what it calls the “temporal (in)stability of NMT”, a phenomenon that was uncovered in a year-long experiment and can be studied using interval evaluation methods. The paper presents data collected while observing how far, if at all, Google’s Neural Machine Translation (GNMT) system progressed over a year. The data were qualitatively evaluated against a set of indicators. To that end, 250 Russian sentences were chosen. Over the course of a year, each sentence was translated into French with the GNMT engine once a month. The resulting translations were recorded and annotated in a specially designed supracorpora database, yielding a series of 12 translations for each of the 250 Russian sentences. Annotating the translations required an error typology that would reveal whether the NMT system improved its output quality. The year-long experiment shows that NMT quality not only improves but may also decline over time. |
format | Article |
id | doaj-art-1730881d007f47789051af3299af8e6e |
institution | Kabale University |
issn | 2731-0809 |
language | English |
publishDate | 2025-01-01 |
publisher | Springer |
record_format | Article |
series | Discover Artificial Intelligence |
spelling | doaj-art-1730881d007f47789051af3299af8e6e; 2025-01-26T12:43:00Z; eng; Springer; Discover Artificial Intelligence; 2731-0809; 2025-01-01; 51117; 10.1007/s44163-025-00222-y; Interval evaluation of temporal (in)stability for neural machine translation; Anna Egorova (0), Mikhail Kruzhkov (1), Vitaly Nuriev (2), Igor Zatsman (3); Institute of Informatics Problems, Federal Research Center Computer Science and Control of the Russian Academy of Sciences (FRC CSC RAS); Independent researcher Center for Emerging Practices, Institute of Scientific Information for Social Sciences of the Russian Academy of Sciences (INION RAN); Institute of Informatics Problems, Federal Research Center Computer Science and Control of the Russian Academy of Sciences (FRC CSC RAS); https://doi.org/10.1007/s44163-025-00222-y |
title | Interval evaluation of temporal (in)stability for neural machine translation |
topic | Neural machine translation; Temporal evaluation; Temporal (in)stability of neural machine translation; Indicator-based evaluation; Linguistic annotation; Error typology |
url | https://doi.org/10.1007/s44163-025-00222-y |