Ruggieri, Salvatore (2000) Efficient C4.5. Technical Report del Dipartimento di Informatica. Università di Pisa, Pisa, IT.
Postscript (GZip), Published Version. Available under License Creative Commons Attribution No Derivatives.
Abstract
We present an analytic evaluation of the run-time behavior of the C4.5 algorithm which highlights some efficiency improvements. We have implemented a more efficient version of the algorithm, called EC4.5, that improves on C4.5 by adopting the best among three strategies at each node construction. The first strategy uses a binary search of thresholds instead of the linear search of C4.5. The second strategy adopts a counting sort method instead of the quicksort of C4.5. The third strategy uses a main-memory version of the RainForest algorithm for constructing decision trees. Our implementation computes the same decision trees as C4.5 with a performance gain of up to 5 times.
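The first two strategies named in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption: the function names `counting_sort_by_rank` and `snap_threshold` are hypothetical, the attribute is assumed to take values from a small, known, sorted domain, and the threshold search is assumed to look for the largest known attribute value that does not exceed a candidate split point. It is not the EC4.5 code.

```python
from bisect import bisect_right


def counting_sort_by_rank(values, sorted_domain):
    """Sort `values` in O(n + d) time, where d = len(sorted_domain).

    Assumes every element of `values` appears in `sorted_domain`
    (a hypothetical precomputed list of the attribute's distinct values).
    """
    rank = {v: i for i, v in enumerate(sorted_domain)}
    counts = [0] * len(sorted_domain)
    for v in values:
        counts[rank[v]] += 1  # tally occurrences instead of comparing pairs
    out = []
    for v, c in zip(sorted_domain, counts):
        out.extend([v] * c)  # emit each value as many times as it was seen
    return out


def snap_threshold(sorted_domain, candidate):
    """Return the largest value in `sorted_domain` that is <= `candidate`.

    Uses binary search (O(log d)) rather than a linear scan over all cases.
    """
    i = bisect_right(sorted_domain, candidate)
    if i == 0:
        raise ValueError("candidate is below every known attribute value")
    return sorted_domain[i - 1]


# Usage with made-up attribute values:
domain = [1.0, 2.5, 4.0, 7.0]
ordered = counting_sort_by_rank([4.0, 1.0, 7.0, 1.0, 2.5], domain)
threshold = snap_threshold(domain, 3.25)
```

Counting sort pays off only when the number of distinct values is small relative to the number of cases at a node, which is presumably why the abstract describes choosing the best strategy at each node rather than fixing one globally.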
Item Type: | Book |
---|---|
Uncontrolled Keywords: | C4.5, Decision Trees Induction Algorithms, Supervised Machine Learning, Data Mining |
Subjects: | Area01 - Scienze matematiche e informatiche > INF/01 - Informatica |
Divisions: | Dipartimenti (until 2012) > DIPARTIMENTO DI INFORMATICA |
Depositing User: | dott.ssa Sandra Faita |
Date Deposited: | 27 Jan 2015 09:54 |
Last Modified: | 27 Jan 2015 09:54 |
URI: | http://eprints.adm.unipi.it/id/eprint/2026 |