Borgwardt, K., Schraudolph, N., and Vishwanathan, S. V. N. (2006).
Fast computation of graph kernels. In B. Schölkopf, J. Platt, and T. Hoffman, editors,
Advances in neural information processing systems, Vol. 19. MIT Press.
Chaudhuri, K., and Dasgupta, S. (2014).
Rates of convergence for nearest neighbor classification. In Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger, editors,
Advances in neural information processing systems, Vol. 27. Curran Associates, Inc.
Chopra, S., Hadsell, R., and LeCun, Y. (2005).
Learning a similarity metric discriminatively, with application to face verification.
In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 1, pages 539–546.
Faghri, F., Fleet, D. J., Kiros, J. R., and Fidler, S. (2018).
VSE++: Improving visual-semantic embeddings with hard negatives. In Proceedings of the British Machine Vision Conference (BMVC).
Gibbs, M. N. (1997).
Bayesian Gaussian process regression and classification (PhD thesis). University of Cambridge. Retrieved from
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=b5a0c62c8d7cf51137bfb079947b8393c00ed169
Goldberger, J., Hinton, G. E., Roweis, S., and Salakhutdinov, R. R. (2004).
Neighbourhood components analysis. In L. Saul, Y. Weiss, and L. Bottou, editors,
Advances in neural information processing systems, Vol. 17. MIT Press.
Heinonen, M., Mannerström, H., Rousu, J., Kaski, S., and Lähdesmäki, H. (2016).
Non-stationary Gaussian process regression with Hamiltonian Monte Carlo. In A. Gretton and C. C. Robert, editors,
Proceedings of the 19th international conference on artificial intelligence and statistics, Vol. 51, pages 732–740. Cadiz, Spain: PMLR.
Jona-Lasinio, G., Gelfand, A., and Jona-Lasinio, M. (2012).
Spatial analysis of wave direction data using wrapped Gaussian processes.
The Annals of Applied Statistics,
6(4), 1478–1498.
Kriege, N. M., Johansson, F. D., and Morris, C. (2020).
A survey on graph kernels.
Applied Network Science,
5(1), 6.
Liu, H., Ong, Y.-S., Shen, X., and Cai, J. (2020).
When gaussian process meets big data: A review of scalable GPs.
IEEE Transactions on Neural Networks and Learning Systems,
31(11), 4405–4423.
Lloyd, S. (1982).
Least squares quantization in PCM.
IEEE Transactions on Information Theory,
28(2), 129–137.
Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N., and Watkins, C. (2002).
Text classification using string kernels.
Journal of Machine Learning Research,
2, 419–444.
Loeliger, H.-A., Bruderer, L., Malmberg, H., Wadehn, F., and Zalmai, N. (2016).
On sparsity by NUV-EM, Gaussian message passing, and Kalman smoothing. In
2016 information theory and applications workshop (ITA), pages 1–10.
MacKay, D. J. C. (1994).
Bayesian nonlinear modeling for the prediction competition (No. 2), Vol. 100. American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE). Retrieved from
https://www.osti.gov/biblio/33309
MacQueen, J. (1967).
Some methods for classification and analysis of multivariate observations. In
Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Vol. 1, pages 281–297.
Movshovitz-Attias, Y., Toshev, A., Leung, T. K., Ioffe, S., and Singh, S. (2017).
No fuss distance metric learning using proxies. In
2017 IEEE international conference on computer vision (ICCV), pages 360–368. Los Alamitos, CA, USA: IEEE Computer Society.
Musgrave, K., Belongie, S., and Lim, S.-N. (2020). A metric learning reality check. In A. Vedaldi, H. Bischof, T. Brox, and J.-M. Frahm, editors, Computer vision – ECCV 2020, pages 681–699. Cham: Springer International Publishing.
Neal, R. M. (1996).
Bayesian learning for neural networks, Vol. 118. Springer New York.
Oord, A. van den, Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., … Hassabis, D. (2018).
Parallel WaveNet: Fast high-fidelity speech synthesis. In J. Dy and A. Krause, editors,
Proceedings of the 35th international conference on machine learning, Vol. 80, pages 3918–3926. PMLR.
Qian, Q., Shang, L., Sun, B., Hu, J., Li, H., and Jin, R. (2019).
SoftTriple loss: Deep metric learning without triplet sampling. In
Proceedings of the IEEE/CVF international conference on computer vision (ICCV).
Rahimi, A., and Recht, B. (2007).
Random features for large-scale kernel machines. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors,
Advances in neural information processing systems, Vol. 20. Curran Associates, Inc.
Rahimi, A., and Recht, B. (2008).
Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors,
Advances in neural information processing systems, Vol. 21. Curran Associates, Inc.
Rasmussen, C. E., and Williams, C. K. I. (2006).
Gaussian processes for machine learning. The MIT Press.
Remes, S., Heinonen, M., and Kaski, S. (2017).
Non-stationary spectral kernels. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors,
Advances in neural information processing systems, Vol. 30. Curran Associates, Inc.
Schroff, F., Kalenichenko, D., and Philbin, J. (2015).
FaceNet: A unified embedding for face recognition and clustering. In
Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
Shervashidze, N., Schweitzer, P., Leeuwen, E. J. van, Mehlhorn, K., and Borgwardt, K. M. (2011).
Weisfeiler-Lehman graph kernels.
Journal of Machine Learning Research,
12(77), 2539–2561.
Sohn, K. (2016).
Improved deep metric learning with multi-class n-pair loss objective. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors,
Advances in neural information processing systems, Vol. 29. Curran Associates, Inc.
Sutherland, D. J., and Schneider, J. (2015). On the error of random Fourier features. In Proceedings of the thirty-first conference on uncertainty in artificial intelligence, pages 862–871. Arlington, Virginia, USA: AUAI Press.
Tipping, M. E. (2001).
Sparse Bayesian learning and the relevance vector machine.
Journal of Machine Learning Research,
1, 211–244.
Weinberger, K. Q., Blitzer, J., and Saul, L. (2005).
Distance metric learning for large margin nearest neighbor classification. In Y. Weiss, B. Schölkopf, and J. Platt, editors,
Advances in neural information processing systems, Vol. 18. MIT Press.
Weinberger, K. Q., and Saul, L. K. (2009).
Distance metric learning for large margin nearest neighbor classification.
Journal of Machine Learning Research,
10(9), 207–244.
Wilson, A., and Adams, R. (2013).
Gaussian process kernels for pattern discovery and extrapolation. In S. Dasgupta and D. McAllester, editors,
Proceedings of the 30th international conference on machine learning, Vol. 28, pages 1067–1075. Atlanta, Georgia, USA: PMLR.
Yu, F. X. X., Suresh, A. T., Choromanski, K. M., Holtmann-Rice, D. N., and Kumar, S. (2016).
Orthogonal random features. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors,
Advances in neural information processing systems, Vol. 29. Curran Associates, Inc.
Mochihashi, D., and Oba, S. (2019).
Gaussian processes and machine learning (ガウス過程と機械学習). Kodansha.