Constructing meaningful code changes via graph transformer

Bibliographic Details
Main Authors: Shikai Guo, Mengxuan Li, Xin Ge, Hui Li, Rong Chen, Tingting Li
Format: Article
Language: English
Published: Wiley 2023-04-01
Series: IET Software
Subjects: software engineering, software maintenance
Online Access: https://doi.org/10.1049/sfw2.12097
collection DOAJ
description Abstract: The rapid development of Open‐Source Software (OSS) has created significant demand for code changes to maintain it. Code changes often show symptoms of poor design and implementation choices, which makes it much harder for code reviewers to verify their correctness and soundness. Researchers have investigated how to learn meaningful code changes in order to help developers anticipate the changes that code reviewers may suggest for submitted code. However, two main limitations remain to be addressed: the long‐range dependencies of the source code and the missing syntactic structural information of the source code. To address these limitations, a novel method named Graph Transformer for learning meaningful Code Transformations (GTCT) is proposed. It gives developers quick, preliminary feedback when they submit code changes, which can improve both the quality of code changes and the efficiency of code review. GTCT comprises two components: code graph embedding and code transformation learning. To address the missing syntactic structural information, the code graph embedding component captures the types and patterns of code changes by encoding the source code into a code graph structure built from its lexical and syntactic representations. The code transformation learning component then uses multi‐head attention and positional encoding to address the long‐range dependencies limitation. Extensive experiments evaluate the performance of GTCT through both quantitative and qualitative analyses. In the quantitative analysis, GTCT achieves relative improvements over the baseline of 210%, 342.86%, 135%, 29.41%, 109.09%, and 91.67% on six datasets in terms of perfect prediction. In the qualitative analysis, GTCT outperforms the baseline method on each type of code change in the taxonomy: bug fixes, refactoring, and other changes.
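The percentages quoted in the abstract are relative gains in perfect prediction. The record does not define the metric, so the following is only a minimal sketch under two assumptions: that a "perfect prediction" is a generated code change that matches the reference exactly once whitespace is normalised, and that the relative gain is computed as (model rate - baseline rate) / baseline rate. The function names and the toy data are hypothetical and are not taken from the paper.

```python
def perfect_prediction_rate(predictions, references):
    """Fraction of predictions that equal their reference exactly.

    Assumption: whitespace differences are ignored, everything else
    must match token for token.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(
        " ".join(pred.split()) == " ".join(ref.split())
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


def relative_improvement(model_rate, baseline_rate):
    """Relative gain of the model over the baseline, in percent."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return (model_rate - baseline_rate) / baseline_rate * 100.0


# Toy data, invented purely for illustration.
references = ["return a + b;", "if (x == null) return;", "i++;", "foo(bar);"]
baseline_out = ["return a + b;", "if (x != null) return;", "i--;", "foo();"]
model_out = ["return a  +  b;", "if (x == null) return;", "i--;", "foo();"]

baseline_rate = perfect_prediction_rate(baseline_out, references)
model_rate = perfect_prediction_rate(model_out, references)
gain = relative_improvement(model_rate, baseline_rate)
```

Under this reading, a relative gain of 100% means the model produces twice as many exact matches as the baseline, so a large percentage can still correspond to a small absolute rate when the baseline rate is low.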
id doaj-art-739bb11a995d4194982a25cf6b6a7638
institution Kabale University
issn 1751-8806
1751-8814
volume 17
issue 2
pages 154-167
doi 10.1049/sfw2.12097
affiliation Shikai Guo, Mengxuan Li, Xin Ge, Hui Li, Rong Chen: The College of Information Science and Technology, Dalian Maritime University, Dalian, China
Tingting Li: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
topic software engineering
software maintenance