De-identification of Emergency Medical Records in French: Survey and Comparison of State-of-the-Art Automated Systems
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | LibraryPress@UF, 2021-04-01 |
| Series: | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| Online Access: | https://journals.flvc.org/FLAIRS/article/view/128480 |
| Summary: | In France, structured data from emergency room (ER) visits are aggregated at the national level to build a syndromic surveillance system for several health events. For visits motivated by a traumatic event, information on the causes is stored in free-text clinical notes. To exploit these data, an automated de-identification system that guarantees privacy protection is required.
In this study, we review the tools available for de-identifying free-text clinical documents in French. A key point is how to overcome the resource barrier that hampers NLP applications in languages other than English. We compare rule-based, named entity recognition, new Transformer-based deep learning, and hybrid systems using, when required, a fine-tuning set of 30,000 unlabeled clinical notes. The evaluation is performed on a test set of 3,000 manually annotated notes.
Hybrid systems, which combine capabilities in complementary tasks, show the best performance. This work is a first step toward a national surveillance system based on the exhaustive collection of ER visit reports for automated trauma monitoring. |
|---|---|
| ISSN: | 2334-0754 2334-0762 |