Rule extraction from scientific texts: Evaluation in the specialty of gynecology

Due to the considerable increase in freely available data (especially on the Web), extracting relevant information from textual content is a critical challenge. Most of the available data is embedded in unstructured texts and is not linked to formalized knowledge structures such as ontologies or rul...

Full description

Saved in:
Bibliographic Details
Main Authors: Amina Boufrida, Zizette Boufaida
Format: Article
Language:English
Published: Springer 2022-04-01
Series:Journal of King Saud University: Computer and Information Sciences
Subjects:
Online Access:http://www.sciencedirect.com/science/article/pii/S1319157820303736
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Due to the considerable increase in freely available data (especially on the Web), extracting relevant information from textual content is a critical challenge. Most of the available data is embedded in unstructured texts and is not linked to formalized knowledge structures such as ontologies or rules. A potential solution to this problem is to acquire such knowledge through natural language processing (NLP) tools and text mining techniques. Prior work has focused on the automatic extraction of ontologies from texts, but the acquired knowledge is generally limited to simple hierarchies of terms. This paper presents a polyvalent framework for acquiring complex relationships from texts and coding these in the form of rules. Our approach begins with existing domain knowledge represented as an OWL ontology, and applies NLP tools and text matching techniques to deduce different atoms, such as classes, properties and literals, to capture deductive knowledge in the form of new rules. For the reason, to enrich the existing domain ontology by these rules, in order to obtain higher relational expressiveness, make reasoning and produce new facts. The approach was tested using medical reports, specifically, in the specialty of gynecology. It reports an F-measure of 95.83% on test our corpus.
ISSN:1319-1578