Decision tree research papers

In this paper, pruning with Bayes minimum risk is introduced for estimating the risk-rate.

Error-based pruning produced an applicable tree size with good accuracy compared to reduced-error pruning.

Research issue

A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label.
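The structure just described can be sketched in a few lines of Python. The names (`Node`, `Leaf`, `classify`) and the toy weather example are illustrative, not from the paper:

```python
# Minimal decision-tree structure: an internal node tests an attribute,
# each branch is one outcome of the test, and a leaf holds a class label.
class Leaf:
    def __init__(self, label):
        self.label = label  # class label held by this leaf

class Node:
    def __init__(self, attribute, branches):
        self.attribute = attribute  # attribute tested at this node
        self.branches = branches    # maps each test outcome -> child subtree

def classify(tree, record):
    """Follow test outcomes from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[record[tree.attribute]]
    return tree.label

# Tiny example: the root tests 'outlook'; one branch tests 'windy'.
tree = Node("outlook", {
    "sunny": Leaf("play"),
    "rainy": Node("windy", {True: Leaf("stay"), False: Leaf("play")}),
})
```

Classifying a record is then a single root-to-leaf walk, e.g. `classify(tree, {"outlook": "rainy", "windy": True})`.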


Several experiments are conducted to investigate the effectiveness of the proposed PBMR method, and its results are compared with those of the reduced-error pruning and minimum-error pruning approaches. Decision trees suffer from an over-fitting problem that appears during the data classification process and sometimes produces a tree that is large in size with unwanted branches. Pruning methods are introduced to combat this problem by removing the non-productive and meaningless branches and so avoiding unnecessary tree complexity. Pre-pruning methods navigate the tree in a top-down approach, while post-pruning methods navigate the tree in a bottom-up approach; the proposed algorithm adopts a post-pruning, bottom-up method for C4.5. Moreover, the flowchart in Fig 2 indicates the structure of the proposed algorithm and the way it proceeds.

Once a set of relevant variables is identified, researchers may want to know which variables play major roles. Generally, variable importance is computed based on the reduction of model accuracy, or in the purities of nodes in the tree, when the variable is removed. In most circumstances, the more records a variable has an effect on, the greater the importance of the variable.

Basic concepts

Figure 1 illustrates a simple decision tree model that includes a single binary target variable Y (0 or 1) and two continuous variables, x1 and x2, that range from 0 to 1.
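The accuracy-reduction idea behind variable importance can be sketched as follows. This is a permutation-style estimate (shuffling a variable's values to break its link with the target); the helper names and toy model are hypothetical, and the paper does not prescribe this exact implementation:

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of records the model classifies correctly."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, var, seed=0):
    """Importance of `var` = drop in accuracy after shuffling its values,
    which breaks the variable's relationship with the target."""
    base = accuracy(predict, rows, labels)
    shuffled = [r[var] for r in rows]
    random.Random(seed).shuffle(shuffled)
    broken = [{**r, var: v} for r, v in zip(rows, shuffled)]
    return base - accuracy(predict, broken, labels)

# Toy model that predicts 1 exactly when x1 > 0.5; x2 is ignored,
# so shuffling x2 should yield zero importance.
predict = lambda r: int(r["x1"] > 0.5)
rows = [{"x1": i / 10, "x2": (i * 7) % 10 / 10} for i in range(10)]
labels = [int(r["x1"] > 0.5) for r in rows]
```

A variable the model never consults (here `x2`) produces no accuracy drop, while a variable that affects many records produces a larger one, matching the intuition above.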

Motivation

The rapid progress of information technologies results in a large amount of data that needs to be analysed and managed to gain useful knowledge and predict future behaviour.


If the parent node has a lower risk-rate than its leaves, the parent node is converted to a leaf node; otherwise, the parent node is retained. Nevertheless, in terms of simplification and complexity, a post-pruning algorithm is more robust, since it has access to the full tree.

More recently, decision tree methodology has become popular in medical research. An example of the medical use of decision trees is in the diagnosis of a medical condition from the pattern of symptoms, in which the classes defined by the decision tree could be either different clinical subtypes of a condition, or patients with a condition who should receive different therapies. Both discrete and continuous variables can be used either as target variables or as independent variables. Many of these variables are of marginal relevance and, thus, should probably not be included in data mining exercises.

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
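The parent-versus-leaves rule can be sketched as a bottom-up pass over the tree. The risk estimate below (a Laplace-corrected error rate) is an illustrative placeholder for the paper's Bayes minimum-risk formula, and the node layout is hypothetical:

```python
class Leaf:
    def __init__(self, errors, n):
        self.errors, self.n = errors, n  # misclassified / total records

class Node:
    def __init__(self, errors, n, children):
        self.errors, self.n, self.children = errors, n, children

def risk(errors, n, classes=2):
    # Laplace-corrected error rate as a stand-in risk estimate.
    return (errors + 1) / (n + classes)

def subtree_risk(t):
    """Record-weighted risk of the leaves under t."""
    if isinstance(t, Leaf):
        return risk(t.errors, t.n)
    return sum(subtree_risk(c) * c.n for c in t.children) / t.n

def prune(t):
    """Bottom-up: convert a node to a leaf when the node's own
    risk-rate is lower than the combined risk of its leaves;
    otherwise retain the node."""
    if isinstance(t, Leaf):
        return t
    t.children = [prune(c) for c in t.children]
    if risk(t.errors, t.n) < subtree_risk(t):
        return Leaf(t.errors, t.n)
    return t
```

For example, a parent making 1 error on 20 records beats two children making 3 and 2 errors on 10 records each, so the subtree collapses to a single leaf.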


Related works

Several post-pruning algorithms for decision trees, such as reduced-error pruning, pessimistic pruning, error-based pruning, cost-complexity pruning and minimum-error pruning, have been introduced in the literature [9–11].
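Reduced-error pruning, the simplest of these, can be sketched as follows: an internal node is replaced by a leaf predicting its majority class whenever that does not increase the error on a held-out pruning set. The dict-based tree layout and the `majority` field are assumptions for the sketch, not the paper's representation:

```python
def predict(tree, record):
    # Descend through attribute tests until a leaf is reached.
    while not tree["leaf"]:
        tree = tree["children"][record[tree["attr"]]]
    return tree["label"]

def errors(tree, rows):
    """Misclassifications on labelled rows: (feature dict, label) pairs."""
    return sum(predict(tree, r) != y for r, y in rows)

def rep(tree, rows):
    """Reduced-error pruning against a held-out pruning set `rows`."""
    if tree["leaf"]:
        return tree
    # Prune each subtree bottom-up on the rows routed to it.
    tree["children"] = {
        k: rep(c, [(r, y) for r, y in rows if r[tree["attr"]] == k])
        for k, c in tree["children"].items()
    }
    # Replace the node with a majority-class leaf if that is no worse.
    as_leaf = {"leaf": True, "label": tree["majority"]}
    if errors(as_leaf, rows) <= errors(tree, rows):
        return as_leaf
    return tree
```

On a pruning set where the split carries no signal, the node collapses to a leaf; where the split still reduces validation error, it is kept.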

In this paper, we adopt a post-pruning approach to combat the over-fitting problem that arises during the data classification process and leads to a complex tree that is large in size and difficult to understand.


Post-pruning is implemented after the tree is grown.


A novel decision tree classification based on post-pruning