CodeContrast: A Contrastive Learning Approach for Generating Coherent Programming Exercises

Generating high-quality programming exercises with well-aligned problem descriptions, test cases, and code solutions is crucial for computer science education. However, current methods often lack coherence among these components, reducing their educational value. We present CodeContrast, a novel gen...

Bibliographic Details
Main Author: Nicolás Torres
Format: Article
Language: English
Published: MDPI AG 2025-01-01
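Series: Education Sciences
Subjects: contrastive learning; programming exercise generation; computer science education; code generation; educational content creation
Online Access: https://www.mdpi.com/2227-7102/15/1/80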
_version_ 1832588621952057344
author Nicolás Torres
collection DOAJ
description Generating high-quality programming exercises with well-aligned problem descriptions, test cases, and code solutions is crucial for computer science education. However, current methods often lack coherence among these components, reducing their educational value. We present CodeContrast, a novel generative model that uses contrastive learning to map programming problems, test cases, and solutions into a shared feature space. By minimizing the distance between matched components and maximizing it for non-matched ones, CodeContrast learns the intricate relationships necessary to generate coherent programming exercises. Our model architecture includes three encoder networks for problem descriptions, test cases, and solutions. During training, CodeContrast processes positive triplets (matching problem, test case, solution) and negative triplets (non-matching combinations) and uses a contrastive loss to position positive triplets close in the feature space while separating negative ones. Comprehensive evaluations of CodeContrast—through automatic metrics, expert ratings, and student studies—demonstrate its effectiveness. Results show high code correctness (92.3% of test cases passed), strong problem–solution alignment (BLEU score up to 0.826), and robust test case coverage (85.7% statement coverage). Expert feedback and student performance further support the pedagogical value of these generated exercises, with students performing comparably to those using manually curated content. CodeContrast advances the automated generation of high-quality programming exercises, capturing relationships among programming components to enhance educational content and improve the learning experience for students and instructors.
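The description above gives the training objective only in outline: embeddings of a matching problem, test case, and solution are pulled together, and embeddings of non-matching combinations are pushed apart. A minimal sketch of one loss with that behaviour follows. The record does not state the exact loss used, so the margin-based pairwise form, the function names (`euclidean`, `triplet_contrastive_loss`), the `margin` parameter, and the plain-list embeddings are all assumptions made for illustration. The three encoder networks are not modelled here; the inputs stand in for their outputs.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_contrastive_loss(problem, tests, solution, is_match, margin=1.0):
    """Illustrative margin-based contrastive loss over one triplet.

    ASSUMPTION: this pairwise form is a common textbook variant, not
    necessarily the loss used in the paper.

    The three embeddings give three pairs (problem/tests, problem/solution,
    tests/solution). For a matching triplet each pair is penalised by its
    squared distance; for a non-matching triplet each pair is penalised
    only while it is closer than `margin`.
    """
    total = 0.0
    for u, v in combinations((problem, tests, solution), 2):
        d = euclidean(u, v)
        if is_match:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
    return total / 3.0  # average over the three pairs


# Matching triplet whose embeddings already coincide: no penalty.
matched = triplet_contrastive_loss([0.1, 0.2], [0.1, 0.2], [0.1, 0.2], is_match=True)
print(matched)  # 0.0

# Non-matching triplet collapsed onto one point: full margin penalty.
collapsed = triplet_contrastive_loss([0.0, 0.0], [0.0, 0.0], [0.0, 0.0], is_match=False)
print(collapsed)  # 1.0
```

Under these assumptions, minimising the loss over many triplets moves matching components toward a common point in the shared feature space, while non-matching components stop contributing once they are at least `margin` apart.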
format Article
id doaj-art-554813b7ddf1446ebca4785e08d89ed0
institution Kabale University
issn 2227-7102
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Education Sciences
spelling doaj-art-554813b7ddf1446ebca4785e08d89ed0; 2025-01-24T13:30:31Z; eng; MDPI AG; Education Sciences; ISSN 2227-7102; 2025-01-01; vol. 15, no. 1, art. 80; DOI 10.3390/educsci15010080; Nicolás Torres (Departamento de Electrónica, Universidad Técnica Federico Santa María, Santiago 8940897, Chile); https://www.mdpi.com/2227-7102/15/1/80
title CodeContrast: A Contrastive Learning Approach for Generating Coherent Programming Exercises
topic contrastive learning
programming exercise generation
computer science education
code generation
educational content creation
url https://www.mdpi.com/2227-7102/15/1/80