Prototype selection for dissimilarity-based classifiers

EM Pekalska, RPW Duin, P Paclik

Research output: Contribution to journal › Article › Scientific › peer-review

268 Citations (Scopus)

Abstract

A conventional way to discriminate between objects represented by dissimilarities is the nearest neighbor method. A more efficient and sometimes more accurate solution is offered by other dissimilarity-based classifiers. They construct a decision rule based on the entire training set, but they need just a small set of prototypes, the so-called representation set, as a reference for classifying new objects. Such alternative approaches may be especially advantageous for non-Euclidean or even non-metric dissimilarities. The choice of a proper representation set for dissimilarity-based classifiers has not yet been fully investigated. It appears that a random selection may work well. In this paper, a number of experiments have been conducted on various metric and non-metric dissimilarity representations and prototype selection methods. Several procedures, such as traditional feature selection methods (here effectively searching for prototypes), mode seeking and linear programming, are compared with random selection. In general, we find that systematic approaches lead to better results than random selection, especially for a small number of prototypes. Although there is no single winner, as the outcome depends on data characteristics, the k-centres method works well in general. For two-class problems, an important observation is that our dissimilarity-based discrimination functions relying on significantly reduced prototype sets (3–10% of the training objects) offer a similar or much better classification accuracy than the best k-NN rule on the entire training set. This may be reached for multi-class data as well; however, such problems are more difficult.

Keywords: Dissimilarity; Representation; Prototype selection; Normal density based classifiers; Nearest neighbor rule
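The idea in the abstract (select a small representation set, describe every object by its dissimilarities to those prototypes, then train a classifier on that representation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the Euclidean measure, the synthetic two-class data, the greedy farthest-first traversal (a simple stand-in, not the k-centres procedure the authors evaluate), the regularised linear discriminant with equal priors (a stand-in for the normal density based classifiers named in the keywords), and all function names.

```python
import numpy as np

def dissimilarities(A, B):
    """Euclidean distances between rows of A and rows of B (assumed measure)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def random_prototypes(n_train, k, rng):
    """Baseline: pick k distinct training indices uniformly at random."""
    return rng.choice(n_train, size=k, replace=False)

def farthest_first_prototypes(D, k, start=0):
    """Greedy farthest-first traversal on a square dissimilarity matrix D.

    Illustrative stand-in for a systematic selection method; it is not
    the k-centres algorithm studied in the paper.
    """
    chosen = [start]
    nearest = D[:, start].copy()  # distance to the closest chosen prototype
    while len(chosen) < k:
        nxt = int(np.argmax(nearest))
        chosen.append(nxt)
        nearest = np.minimum(nearest, D[:, nxt])
    return np.array(chosen)

def fit_linear_discriminant(X, y, reg=1e-3):
    """Two-class linear discriminant, pooled covariance, equal priors (assumed)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    pooled = (np.cov(X0, rowvar=False) * (len(X0) - 1)
              + np.cov(X1, rowvar=False) * (len(X1) - 1)) / (len(X) - 2)
    pooled = np.atleast_2d(pooled) + reg * np.eye(X.shape[1])
    w = np.linalg.solve(pooled, m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

def predict(X, w, b):
    """Assign class 1 where the discriminant is positive, else class 0."""
    return (X @ w + b > 0).astype(int)

# Synthetic two-class data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                   rng.normal(4.0, 1.0, (100, 2))])
labels = np.repeat([0, 1], 100)

# Full training dissimilarity matrix, then 10 prototypes (5% of the objects).
D_train = dissimilarities(train, train)
protos = farthest_first_prototypes(D_train, k=10)

# The classifier is trained on all objects, but only on the prototype columns.
w, b = fit_linear_discriminant(D_train[:, protos], labels)

# New objects only need dissimilarities to the prototypes.
test = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
                  rng.normal(4.0, 1.0, (50, 2))])
test_labels = np.repeat([0, 1], 50)
accuracy = (predict(dissimilarities(test, train[protos]), w, b) == test_labels).mean()
```

Swapping `farthest_first_prototypes` for `random_prototypes(len(train), 10, rng)` gives the random-selection baseline that the abstract compares against. Classifying a new object requires only its dissimilarities to the selected prototypes, not to the entire training set.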
Original language: Undefined/Unknown
Pages (from-to): 189-208
Number of pages: 20
Journal: Pattern Recognition
Volume: 39
Issue number: 2
Publication status: Published - 2006

Keywords

  • Mathematics and Computer Science
  • Engineering
  • Technical Mathematics and Computer Science
  • academic journal papers
  • CWTS 0.75 <= JFIS < 2.00
