We present an application of the measure of total uncertainty on convex sets of probability distributions, also called credal sets, to the construction of classification trees. In these classification trees, the probabilities of the classes in each leaf are estimated using the imprecise Dirichlet model. In this way, smaller samples give rise to wider probability intervals. Branching a classification tree can decrease the entropy associated with the classes but, at the same time, the non-specificity increases because the sample is divided among the branches. We use a total uncertainty measure (entropy + non-specificity) as the branching criterion. The stopping rule is not to increase the total uncertainty. We show that this procedure behaves well on standard classification problems. Notably, it does not suffer from overfitting: results on the training and test samples are similar.
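The effect of sample size on interval width can be illustrated with a minimal sketch. It assumes the standard form of the imprecise Dirichlet model, in which a class observed n times in a sample of size N receives the probability interval [n/(N+s), (n+s)/(N+s)]. The function names and the default value s = 1 are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def idm_intervals(counts, s=1):
    """Return one (lower, upper) probability bound per class.

    Assumes the standard imprecise Dirichlet model bounds:
    lower = n / (N + s), upper = (n + s) / (N + s).
    `s` is a hypothetical choice of hyperparameter and must be an
    integer or a Fraction so that the arithmetic stays exact.
    """
    total = sum(counts)
    return [(Fraction(n, total + s), Fraction(n + s, total + s)) for n in counts]


def interval_width(counts, s=1):
    """Width of every class interval: s / (N + s), the same for all classes."""
    return Fraction(s, sum(counts) + s)


# Two leaves with the same class proportions but different sample sizes.
small_leaf = [3, 1]     # 4 observations
large_leaf = [30, 10]   # 40 observations

print(idm_intervals(small_leaf))
print(idm_intervals(large_leaf))
print(interval_width(small_leaf), interval_width(large_leaf))
```

Under these assumptions the width depends only on the sample size, so the leaf with 4 observations gets wider intervals than the leaf with 40, even though the class proportions are identical.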
Keywords. Imprecise probabilities, uncertainty, imprecision, non-specificity, classification, classification trees, credal sets
Format. Postscript
Paper Download
The paper is available at the following sites:
Authors' addresses:
Joaquín Abellán
Dpto. Ciencias de la Computación
ETSI Informática
18071 Granada
SPAIN
Serafín Moral
Dpto. Ciencias de la Computación e IA
ETSI Informática
Universidad de Granada
18071 Granada - Spain
E-mail addresses:
Joaquín Abellán: jabemu@teleline.es
Serafín Moral: smc@decsai.ugr.es
Related Web Sites