GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction
Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus for this journal, annotated with various levels of linguistic information, would be a valuable resource, as the process of information extraction requires syntactic...
Main Authors: | So-Yeon Oh, Ji-Hyeon Kim, Seo-Jin Kim, Hee-Jo Nam, Hyun-Seok Park |
---|---|
Format: | Article |
Language: | English |
Published: | BioMed Central, 2018-09-01 |
Series: | Genomics & Informatics |
Subjects: | biomedical text mining, corpus linguistics, text analytics |
Online Access: | http://genominfo.org/upload/pdf/gi-2018-16-3-75.pdf |
_version_ | 1832574087439843328 |
---|---|
author | So-Yeon Oh, Ji-Hyeon Kim, Seo-Jin Kim, Hee-Jo Nam, Hyun-Seok Park |
author_facet | So-Yeon Oh, Ji-Hyeon Kim, Seo-Jin Kim, Hee-Jo Nam, Hyun-Seok Park |
author_sort | So-Yeon Oh |
collection | DOAJ |
description | Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus for this journal, annotated with various levels of linguistic information, would be a valuable resource, as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus, called GNI Corpus version 1.0, extracted and annotated from the full texts of Genomics & Informatics with an NLTK (Natural Language Toolkit)-based text mining script. This preliminary version of the corpus could be used as a training and testing set for a system that serves a variety of functions in future biomedical text mining. |
format | Article |
id | doaj-art-342d4b9b8a7b4fb9a8f0d9cc793e8b89 |
institution | Kabale University |
issn | 2234-0742 |
language | English |
publishDate | 2018-09-01 |
publisher | BioMed Central |
record_format | Article |
series | Genomics & Informatics |
spelling | doaj-art-342d4b9b8a7b4fb9a8f0d9cc793e8b89; 2025-02-02T01:01:00Z; eng; BioMed Central; Genomics & Informatics; 2234-0742; 2018-09-01; 16; 3; 75; 77; 10.5808/GI.2018.16.3.75; 517; GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction; So-Yeon Oh, Ji-Hyeon Kim, Seo-Jin Kim, Hee-Jo Nam, Hyun-Seok Park (all authors: Bioinformatics Laboratory, ELTEC College of Engineering, Ewha Womans University, Seoul 03760, Korea); Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus for this journal, annotated with various levels of linguistic information, would be a valuable resource, as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus, called GNI Corpus version 1.0, extracted and annotated from the full texts of Genomics & Informatics with an NLTK (Natural Language Toolkit)-based text mining script. This preliminary version of the corpus could be used as a training and testing set for a system that serves a variety of functions in future biomedical text mining.; http://genominfo.org/upload/pdf/gi-2018-16-3-75.pdf; biomedical text mining, corpus linguistics, text analytics |
title | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_full | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_fullStr | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_full_unstemmed | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_short | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_sort | gni corpus version 1 0 annotated full text corpus of to support biomedical information extraction |
topic | biomedical text mining, corpus linguistics, text analytics |
url | http://genominfo.org/upload/pdf/gi-2018-16-3-75.pdf |