MDMA. Un modèle pour l’identification et l’annotation des marqueurs discursifs « potentiels » en contexte

Bibliographic Details
Main Authors:	Catherine T. Bolly, Ludivine Crible, Liesbeth Degand, Deniz Uygur-Distexhe
Format:	Article
Language:	English
Published:	Presses universitaires de Caen 2015-09-01
Series:	Discours
Subjects:	discourse markers; annotation model; corpus-based; multivariate analysis; spoken French
Online Access:	https://journals.openedition.org/discours/9009

collection	DOAJ
description	Starting from the common observation that there is no recognized closed class of Discourse Markers (DMs) and that their definition may vary from one theoretical framework to another, the aim of the MDMA project (“Model for Discourse Marker Annotation”) is to establish an empirical method for the identification and annotation of DMs in spoken French. Central to our proposal is that DMs may be described as clusters of features that, in specific patterns of combination, make it possible to distinguish between more or less prototypical uses of DMs in context. We proceeded in three steps: (i) manual identification of all so-called “potential” DMs in a balanced corpus of spoken French (5,000 words; Belgium and France); (ii) automatic extraction from the corpus of every token corresponding to the candidate DMs previously identified (1,181 tokens); and (iii) parameter analysis of a random sample of 200 potential DMs (syntactic, formal and semantic-pragmatic variables). The hypothesis is that the statistical analysis – based on the distributional constraints of the potential DMs at stake – should uncover a certain hierarchy between the different features under scrutiny, regarding their relevance, reliability, and generalizability (or even specificity). In the present paper, we first present the annotation procedure, then discuss several aspects of inter-rater agreement, and finally report the results of the in-depth corpus-based and statistical analyses.
id	doaj-art-e088650753594906b82fdee20ae31c44
institution	Kabale University
issn	1963-1723
issue	16
doi	10.4000/discours.9009


Similar Items