Rule-Based Knowledge Acquisition Method for Promoter Prediction in Human and Drosophila Species
Rapid and reliable identification of promoter regions is important as the number of genomes being sequenced grows quickly. Various methods have been developed, but few investigate the effectiveness of sequence-based features in promoter prediction. This study proposes a kn...
Main Authors: | Wen-Lin Huang, Chun-Wei Tung, Chyn Liaw, Hui-Ling Huang, Shinn-Ying Ho |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2014-01-01 |
Series: | The Scientific World Journal |
Online Access: | http://dx.doi.org/10.1155/2014/327306 |
_version_ | 1832563882635296768 |
---|---|
author | Wen-Lin Huang Chun-Wei Tung Chyn Liaw Hui-Ling Huang Shinn-Ying Ho |
collection | DOAJ |
description | Rapid and reliable identification of promoter regions is important as the number of genomes being sequenced grows quickly. Various methods have been developed, but few investigate the effectiveness of sequence-based features in promoter prediction. This study proposes a knowledge acquisition method (named PromHD) based on if-then rules for promoter prediction in human and Drosophila species. PromHD uses a feature-mining algorithm and a reference feature set of 167 DNA sequence descriptors (DNASDs), comprising three descriptors of physicochemical properties (absorption maxima, molecular weight, and molar absorption coefficient), 128 top-ranked descriptors of 4-mer motifs, and 36 global sequence descriptors. PromHD identifies two feature subsets with 99 and 74 DNASDs and yields test accuracies of 96.4% and 97.5% in human and Drosophila species, respectively. From the 99- and 74-dimensional feature vectors, PromHD generates several if-then rules for promoter prediction by using a decision tree mechanism. The top-ranked informative rules with high certainty grades reveal that a global sequence descriptor (the length of nucleotide A at the first position of the sequence) and two physicochemical properties (absorption maxima and molecular weight) are effective in distinguishing promoters from non-promoters in human and Drosophila species, respectively. |
format | Article |
id | doaj-art-f1e97e167f824d73a52d09140818b1f6 |
institution | Kabale University |
issn | 2356-6140 1537-744X |
language | English |
publishDate | 2014-01-01 |
publisher | Wiley |
record_format | Article |
series | The Scientific World Journal |
spelling | doaj-art-f1e97e167f824d73a52d09140818b1f6; 2025-02-03T01:12:23Z; eng; Wiley; The Scientific World Journal; 2356-6140; 1537-744X; 2014-01-01; 2014; 10.1155/2014/327306; 327306; Rule-Based Knowledge Acquisition Method for Promoter Prediction in Human and Drosophila Species; Wen-Lin Huang (Department of Management Information System, Asia Pacific Institute of Creativity, Miaoli 351, Taiwan); Chun-Wei Tung (School of Pharmacy, College of Pharmacy, Kaohsiung Medical University, Kaohsiung 807, Taiwan); Chyn Liaw, Hui-Ling Huang, and Shinn-Ying Ho (Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan); http://dx.doi.org/10.1155/2014/327306 |
title | Rule-Based Knowledge Acquisition Method for Promoter Prediction in Human and Drosophila Species |
url | http://dx.doi.org/10.1155/2014/327306 |
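
The description mentions descriptors of 4-mer motifs and if-then rules for separating promoters from non-promoters. As a rough illustration only, the Python sketch below computes relative frequencies for all 256 possible 4-mers of a DNA sequence and applies a single if-then rule to one of them. It is not the PromHD implementation: the function names `kmer_frequencies` and `apply_rule`, the example sequence, the chosen motif, and the threshold are all assumptions made for illustration, and the record does not say how PromHD ranks motifs or derives its rules beyond using a decision tree mechanism.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(seq, k=4):
    """Map each of the 4**k possible k-mers to its relative frequency in seq."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in counts}
    return {kmer: c / total for kmer, c in counts.items()}


def apply_rule(features, kmer, threshold):
    """Hypothetical rule: IF freq(kmer) > threshold THEN promoter ELSE non-promoter."""
    return "promoter" if features[kmer] > threshold else "non-promoter"


# Illustrative input and threshold; neither value comes from the article.
features = kmer_frequencies("TATAAAAGGCGC")
print(len(features))
print(apply_rule(features, "TATA", 0.05))
```

A full decision tree would chain several such threshold tests over different descriptors; the single test here only shows the shape of one rule.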