
Machine Learning
intuitive information about the similarity characteristics of the data. The SDA-generated
probability estimates are useful for interpreting the results in a probabilistic framework, and
allow for class priors and costs to be seamlessly integrated into the classification rules.
7. References
S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape
contexts. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(4): 509-522,
April 2002.
M. Bicego, V. Murino, M. Pelillo, and A. Torsello. Special issue on similarity-based
classification. Pattern Recognition, 39, October 2006.
L. Cazzanti and M. R. Gupta. Local similarity discriminant analysis. In Intl. Conf. on Machine
Learning (ICML), 2007.
L. Cazzanti and M. R. Gupta. Information-theoretic and set-theoretic similarity. In Proc. of
the IEEE Intl. Symposium on Information Theory, pages 1836-1840, 2006.
L. Cazzanti, M. R. Gupta, and A. J. Koppal. Generative models for similarity-based
classification. Pattern Recognition, 41(7):2289-2297, 2008.
S. Cost and S. Salzberg. A weighted nearest neighbor algorithm for learning with symbolic
features. Machine Learning, 10(1):57-78, 1993.
T. Cover and J. Thomas. Elements of Information Theory. John Wiley and Sons, New York,
1991.
L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer-
Verlag Inc., New York, 1996.
R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience, 2001.
B. S. Everitt and S. Rabe-Hesketh. The Analysis of Proximity Data. Arnold, London, 1997.
I. Gati and A. Tversky. Weighting common and distinctive features in perceptual and
conceptual judgments. Cognitive Psychology, 16:341-370, 1984.
M. R. Gupta, L. Cazzanti, and A. J. Koppal. Maximum entropy generative models for
similarity-based learning. In Proc. IEEE Intl. Symposium on Information Theory, 2007.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer-Verlag,
New York, 2001.
S. Hochreiter and K. Obermayer. Support vector machines for dyadic data. Neural
Computation, 18(6):1472-1510, 2006.
T. Hofmann and J.M. Buhmann. Pairwise data clustering by deterministic annealing. IEEE
Trans. on Pattern Analysis and Machine Intelligence, 19(1), January 1997.
D. W. Jacobs, D. Weinshall, and Y. Gdalyahu. Classification with nonmetric distances: Image
retrieval and class representation. IEEE Trans. on Pattern Analysis and Machine
Intelligence, 22(6):583-600, June 2000.
E. T. Jaynes. On the rationale for maximum entropy methods. Proc. of the IEEE, 70(9):939-952,
September 1982.
E. T. Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, 2003.
M. I. Jordan. An Introduction to Probabilistic Graphical Models. To be published, 20xx.
W. Lam, C. Keung, and D. Liu. Discovering useful concept prototypes for classification
based on filtering and abstraction. IEEE Trans. on Pattern Analysis and Machine
Intelligence, 24(8):1075-1090, August 2002.