The Evolution of the Idiolect over the Lifetime: A Quantitative and Qualitative Study of French 19th Century Literature

Bibliographic Details
Main Authors: Olga Seminck, Philippe Gambette, Dominique Legallois, Thierry Poibeau
Format: Article
Language: English
Published: Department of Languages, Literatures, and Cultures at McGill University 2022-09-01
Series: Journal of Cultural Analytics
Online Access: https://doi.org/10.22148/001c.37588
Collection: DOAJ
ISSN: 2371-4549
Description: The way in which authors express themselves is unique but changes over their lifetime. However, quantitative studies of this idiolectal evolution are rare. Using the Corpus for Idiolectal Research (CIDRE) that contains the dated works of 11 prolific 19th century French fiction writers, we propose new methods to identify, quantify and describe the grammatical-stylistic changes that take place using lexico-morphosyntactic patterns, also called motifs. To examine the strength of the chronological signal of change, we developed a method to calculate if a distance matrix of literary works contains a stronger chronological signal than expected by chance. Ten out of 11 corpora showed a higher than chance chronological signal, leading us to conclude that the evolution of the idiolect is in a mathematical sense monotonic, supporting the rectilinearity hypothesis previously put forward in the stylometric literature. The rectilinear property of the evolution of the idiolect found for most authors in CIDRE subsequently enabled us to propose a machine learning task: predicting the year in which a work was written. For the majority of the authors in our corpus, the accuracy and the amount of variance that is explained by the model were high and we discuss why the technique might fail for others. After applying a feature selection algorithm, we examined the most important features, i.e. the motifs that have the greatest influence on idiolectal evolution. We find that some of those features are stylistic and have been previously identified in qualitative literature studies. We report some remarkable stylistic constructions revealed by our algorithm to illustrate which kind of stylistic patterns can be extracted using our method.
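Illustrative note: the abstract describes testing whether a distance matrix of works carries more chronological signal than chance would produce. The record does not specify how the authors compute this, so the sketch below substitutes a generic Mantel-style permutation test as a stand-in. The function name `chronological_signal`, the toy years, and the synthetic distance matrix are all assumptions made for illustration and are not taken from the article or from CIDRE.

```python
import numpy as np

def chronological_signal(dist, years, n_perm=999, seed=0):
    """Mantel-style permutation test (assumed stand-in, not the article's method).

    Correlates pairwise stylistic distance with pairwise gap in writing year,
    then compares that correlation to a null built by shuffling the years.
    """
    dist = np.asarray(dist, dtype=float)
    years = np.asarray(years, dtype=float)
    upper = np.triu_indices(len(years), k=1)  # each unordered pair once
    pair_dist = dist[upper]

    def corr(y):
        gaps = np.abs(y[:, None] - y[None, :])[upper]
        return np.corrcoef(pair_dist, gaps)[0, 1]

    observed = corr(years)
    rng = np.random.default_rng(seed)
    null = np.array([corr(rng.permutation(years)) for _ in range(n_perm)])
    # One-sided p-value with the usual +1 correction for permutation tests.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

# Hypothetical toy data: eight works whose stylistic distance grows with the
# gap between their writing years, plus a little symmetric noise.
years = np.array([1830, 1834, 1839, 1845, 1852, 1860, 1869, 1879])
toy_rng = np.random.default_rng(1)
noise = toy_rng.normal(0.0, 0.05, size=(len(years), len(years)))
dist = np.abs(years[:, None] - years[None, :]) / 50 + (noise + noise.T) / 2
np.fill_diagonal(dist, 0.0)

observed, p_value = chronological_signal(dist, years)
print(f"correlation={observed:.3f}, p={p_value:.3f}")
```

A small p-value here only says that distance tracks time more closely than shuffled dates would. It says nothing about which motifs drive the change, which the abstract addresses separately through feature selection.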