PARALLEL ALGORITHMS OF RANDOM FORESTS FOR CLASSIFYING VERY LARGE DATASETS
The random forests algorithm proposed by Breiman is an ensemble-based approach with very high accuracy. The learning and classification tasks of a set of decision trees take a lot of time, making them intractable when dealing with very large datasets. There is a need to scale up the random forests algor...
Main Authors: | Do Thanh Nghi, Pham Nguyen Khang, Nguyen Van Hoa, Ly Hoang Trong |
---|---|
Format: | Article |
Language: | English |
Published: | Dalat University, 2013-06-01 |
Series: | Tạp chí Khoa học Đại học Đà Lạt |
Subjects: | Random forest; Decision tree; Bagging; Boosting; MPI; Grids |
Online Access: | https://tckh.dlu.edu.vn/index.php/tckhdhdl/article/view/247 |
_version_ | 1832570833414914048 |
---|---|
author | Do Thanh Nghi; Pham Nguyen Khang; Nguyen Van Hoa; Ly Hoang Trong |
collection | DOAJ |
description | The random forests algorithm proposed by Breiman is an ensemble-based approach with very high accuracy. The learning and classification tasks of a set of decision trees take a lot of time, making them intractable when dealing with very large datasets. There is a need to scale up the random forests algorithm to handle massive datasets. We propose parallel random forests algorithms that take advantage of grid computing. These algorithms improve training and classification time compared with the original ones. Experimental results on large datasets from the UCI data repository, including Forest Cover Type, KDD Cup 1999, and Connect-4, show that the parallel algorithms significantly reduce training and classification time. |
format | Article |
id | doaj-art-6359738de3634b1c9cb675977c075c46 |
institution | Kabale University |
issn | 0866-787X |
language | English |
publishDate | 2013-06-01 |
publisher | Dalat University |
record_format | Article |
series | Tạp chí Khoa học Đại học Đà Lạt |
spelling | doaj-art-6359738de3634b1c9cb675977c075c46; 2025-02-02T13:53:50Z; eng; Dalat University; Tạp chí Khoa học Đại học Đà Lạt; 0866-787X; 2013-06-01; 3; 2; 10.37569/DalatUniversity.3.2.247(2013); PARALLEL ALGORITHMS OF RANDOM FORESTS FOR CLASSIFYING VERY LARGE DATASETS; Do Thanh Nghi (College of Information Technology, Cantho University); Pham Nguyen Khang (College of Information Technology, Cantho University); Nguyen Van Hoa (Faculty of Technology, Engineering and Environment, Angiang University); Ly Hoang Trong (College of Information Technology, Cantho University); https://tckh.dlu.edu.vn/index.php/tckhdhdl/article/view/247; Random forest; Decision tree; Bagging; Boosting; MPI; Grids |
title | PARALLEL ALGORITHMS OF RANDOM FORESTS FOR CLASSIFYING VERY LARGE DATASETS |
topic | Random forest; Decision tree; Bagging; Boosting; MPI; Grids |
url | https://tckh.dlu.edu.vn/index.php/tckhdhdl/article/view/247 |