
On public and private binary classification with metric space valued predictors

:::info Stub: Full Engineering Breakdown Coming
This paper was auto-fetched from arXiv on 2026-06-01. A full breakdown with production viability rating, implementation notes, and honest limitations is being written.
:::

Authors: László Györfi et al.
Year: 2026
Field: Statistics / ML
arXiv: 2605.31184
Categories: stat.ML

Abstract

We consider the problem of binary classification in a framework where the predictor $X$ takes values in an arbitrary separable metric space $\mathcal X$ and the label $Y$ takes values in $\{\pm 1\}$. In the first part of this work, we assume that one has direct access to an i.i.d. sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ from the unknown distribution of the pair $(X,Y)$. We derive a convergence rate for the Proto-NN classifier, which was recently introduced as a classifier in the presence of metric space-valued predictors. In the second part of the paper, we reconsider the same problem under an additional privacy constraint. More precisely, we work in the framework of local differential privacy, where one assumes that the data $(X_1,Y_1),\ldots,(X_n,Y_n)$ cannot be directly observed and only a privatised surrogate, obtained through a suitable mechanism satisfying the privacy constraint, is available. The statistician should select an optimal privacy mechanism from the class of all mechanisms that guarantee local differential privacy. Our method of choice is to add Laplace distributed noise to both a set of in […] Proto-NN classifier using the privatised data only is universally consistent. Finally, a rate of convergence for the privatised Proto-NN classifier is derived.


Engineering Breakdown

The Problem

We consider the problem of binary classification in a framework where the predictor $X$ takes values in an arbitrary separable metric space $\mathcal X$ and the label $Y$ takes values in $\{\pm 1\}$. In the second part of the paper, we reconsider the same problem under an additional privacy constraint.
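To make the setting concrete, here is a minimal sketch of a prototype-based nearest-neighbour rule that only needs a distance function, so it works in any metric space. This page does not spell out the Proto-NN definition; the sketch assumes one common reading (each prototype owns the points closest to it, and a majority vote over the $\pm 1$ labels in that cell decides the prediction). The function names, the Hamming metric on strings, and the toy data are all hypothetical choices for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Metric = Callable[[T, T], float]


def nearest_prototype(x: T, prototypes: Sequence[T], metric: Metric) -> int:
    """Index of the prototype closest to x (ties go to the lowest index)."""
    return min(range(len(prototypes)), key=lambda j: metric(x, prototypes[j]))


def fit_cell_votes(
    prototypes: Sequence[T], sample: List[Tuple[T, int]], metric: Metric
) -> Dict[int, int]:
    """Sum the +/-1 labels of the sample points that fall in each prototype's cell."""
    votes: Dict[int, int] = defaultdict(int)
    for x, y in sample:
        votes[nearest_prototype(x, prototypes, metric)] += y
    return votes


def classify(
    x: T, prototypes: Sequence[T], votes: Dict[int, int], metric: Metric
) -> int:
    """Predict +1 or -1 by majority vote in the cell of x (empty cells and ties give +1)."""
    j = nearest_prototype(x, prototypes, metric)
    return 1 if votes.get(j, 0) >= 0 else -1


def hamming(a: str, b: str) -> int:
    """Hamming distance between equal-length strings (an assumed example metric)."""
    return sum(ca != cb for ca, cb in zip(a, b))


# Toy data: strings as predictors, labels in {-1, +1}.
prototypes = ["aaaa", "bbbb"]
sample = [("aaab", -1), ("aaba", -1), ("abbb", 1), ("bbbb", 1), ("bbab", 1)]
votes = fit_cell_votes(prototypes, sample, hamming)

print(classify("aaaa", prototypes, votes, hamming))  # -1
print(classify("babb", prototypes, votes, hamming))  # 1
```

The classifier never touches coordinates, only `metric(x, p)`, which is the point of working with metric space valued predictors. How the prototypes are chosen and how their number should grow with $n$ are left open here.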

The Approach

In the first part of this work, we assume that one has direct access to an i.i.d. sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ from the unknown distribution of the pair $(X,Y)$. Our method of choice is to add Laplace distributed noise to both a set of in […] Proto-NN classifier using the privatised data only is universally consistent.
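The Laplace noise step can be illustrated with the standard Laplace mechanism. A mechanism with conditional distribution $Q(\cdot \mid z)$ is $\varepsilon$-locally differentially private if $Q(A \mid z) \le e^{\varepsilon} Q(A \mid z')$ for all inputs $z, z'$ and all measurable sets $A$; releasing a bounded scalar plus Laplace noise with scale (range width)/$\varepsilon$ satisfies this. The sentence describing exactly which quantities the paper perturbs is truncated on this page, so the sketch below only privatises a single $\pm 1$ label and should not be read as the authors' full mechanism. The function names and the toy data are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def privatise_label(y: int, epsilon: float, rng: random.Random) -> float:
    """Release a label y in {-1, +1} with Laplace noise of scale 2 / epsilon."""
    if y not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 2.0  # |(+1) - (-1)|: the largest possible change in the input
    return y + laplace_noise(sensitivity / epsilon, rng)


# Each data holder perturbs their own label before sharing it.
rng = random.Random(0)
labels = [1] * 700 + [-1] * 300
released = [privatise_label(y, epsilon=1.0, rng=rng) for y in labels]

# The noise has mean zero, so averages of released values still
# estimate averages of the true labels (here the true mean is 0.4).
estimate = sum(released) / len(released)
print(estimate)
```

Individual released values are nearly uninformative, but sums over many of them remain usable, which is why a vote-based rule can still be built from privatised data. Smaller `epsilon` means stronger privacy and a noisier estimate.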

Key Results

Finally, a rate of convergence for the privatised Proto-NN classifier is derived.

Research Areas

This paper contributes to the following areas of AI/ML engineering:

  • Machine learning
  • Classification
  • Nearest-neighbour methods
  • Local differential privacy


© 2026 EngineersOfAI. All rights reserved.