The relationship between the distribution of data and the performance of non-parametric classifiers has been studied. It is shown that predictable factors significantly influence such classifiers: the amount of available training data relative to the dimensionality of the feature space, the spatial variability of the effective average distance between data samples, and the type and amount of noise in the data set. The methods developed here can be used to gain a detailed understanding of classifier design and selection.
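As a rough illustration of two of the factors named above, the following Python sketch computes a samples-per-dimension ratio and a simple measure of how much the distance between neighbouring samples varies across a data set. The function names, the use of nearest-neighbour distance, and the coefficient of variation are assumptions made for this example; they are not necessarily the measures used in the paper.

```python
import math
import statistics


def samples_per_dimension(points):
    """Number of samples divided by the feature-space dimensionality."""
    return len(points) / len(points[0])


def nearest_neighbour_distances(points):
    """Euclidean distance from each sample to its nearest other sample."""
    distances = []
    for i, p in enumerate(points):
        nearest = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        distances.append(nearest)
    return distances


def distance_variability(points):
    """Coefficient of variation of nearest-neighbour distances.

    Illustrative stand-in (an assumption, not the paper's definition) for the
    spatial variability of inter-sample distance; 0 means uniform spacing.
    """
    d = nearest_neighbour_distances(points)
    return statistics.pstdev(d) / statistics.mean(d)


# Toy data: four samples on a unit square, then the same with one distant sample.
grid = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
with_outlier = grid + [(5.0, 5.0)]

print(samples_per_dimension(grid))
print(distance_variability(grid))
print(distance_variability(with_outlier))
```

On the toy data, the evenly spaced grid has zero variability, while adding a single distant sample makes the measure positive. A real analysis would need a more careful density estimate and a treatment of noise, which this sketch leaves out.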
Reference:
Van der Walt, C and Barnard, E. 2006. Data characteristics that determine classifier performance. 17th Annual Symposium of the Pattern Recognition Association of South Africa, Parys, South Africa, 29 Nov - 1 Dec 2006, pp 6. http://hdl.handle.net/10204/1038