Wolfram Library Archive

Courseware Demos MathSource Technical Notes
All Collections Articles Books Conference Proceedings

On the Finite Sample Performance of the Nearest Neighbor Classifier

D. Psaltis
R. Snapp
S. Venkatesh
Journal / Anthology

IEEE Transactions on Information Theory
Year: 1994
Volume: 40
Issue: 3
Page range: 820-837

The finite sample performance of a nearest neighbor classifier is analyzed for a two-class pattern recognition problem. An exact integral expression is derived for the m-sample risk Rm given that a reference m-sample of labeled points is available to the classifier. The statistical setup assumes that the pattern classes arise in nature with fixed a priori probabilities and that points representing the classes are drawn from Euclidean n-space according to fixed class-conditional probability distributions. The sample is assumed to consist of m independently generated class-labeled points. For a family of smooth class-condition distributions characterized by asymptotic expansions in general form, it is shown that the m-sample risk Rm has a complete asymptotic expansion...where Rinfinity denotes the nearest neighbor risk in the infinite sample limit and the coefficients ck are distribution-dependent constants independent of the sample size m. This analysis thus provides further analytic validation of Bellman's curse of dimensionality. Numerical simulations corroborating the formal results are included, and extensions of the theory discussed. The analysis also contains a novel application of Laplace's asymptotic method of integration to a multidimensional integral where the integrand attains its maximum on a continuum of points.

*Applied Mathematics > Information Theory

nearest neighbor classifier, pattern classification, curse of dimensionality, Laplace's method of integration