Anonymous Learning via Look-Alike Clustering: A Precise Analysis of Model Generalization

Javanmard, Adel; Mirrokni, Vahab

arXiv:2310.04015 (cs)

[Submitted on 6 Oct 2023 (v1), last revised 2 Nov 2023 (this version, v3)]

Abstract: While personalized recommendation systems have become increasingly popular, ensuring user data protection remains a top concern in the development of these learning systems. A common approach to enhancing privacy involves training models using anonymous data rather than individual data. In this paper, we explore a natural technique called look-alike clustering, which involves replacing sensitive features of individuals with the cluster's average values. We provide a precise analysis of how training models using anonymous cluster centers affects their generalization capabilities. We focus on an asymptotic regime where the size of the training set grows in proportion to the feature dimension. Our analysis is based on the Convex Gaussian Minimax Theorem (CGMT) and allows us to theoretically understand the role of different model components on the generalization error. In addition, we demonstrate that in certain high-dimensional regimes, training over anonymous cluster centers acts as a form of regularization and improves the generalization error of the trained models. Finally, we corroborate our asymptotic theory with finite-sample numerical experiments, where we observe a perfect match when the sample size is only on the order of a few hundred.
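To make the look-alike clustering step described in the abstract concrete, here is a minimal sketch in Python with NumPy. It assumes the cluster assignments are already given (the abstract does not say how clusters are formed), and the function name `look_alike_features` and its parameters are hypothetical, chosen for illustration only. It is not the authors' implementation.

```python
import numpy as np


def look_alike_features(X, sensitive_cols, cluster_ids):
    """Replace the sensitive columns of each row with its cluster's average.

    Hypothetical helper for illustration: X is an (n, d) feature matrix,
    sensitive_cols lists the column indices treated as sensitive, and
    cluster_ids gives an assumed, precomputed cluster label per row.
    """
    X = np.asarray(X, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    sensitive_cols = np.asarray(sensitive_cols)
    X_anon = X.copy()
    for c in np.unique(cluster_ids):
        rows = np.flatnonzero(cluster_ids == c)
        # Cluster center restricted to the sensitive columns.
        center = X[np.ix_(rows, sensitive_cols)].mean(axis=0)
        # Every member of the cluster receives the same center values;
        # non-sensitive columns are left untouched.
        X_anon[np.ix_(rows, sensitive_cols)] = center
    return X_anon


# Toy data: columns 0 and 1 are treated as sensitive, column 2 is not.
X = [[1, 10, 0],
     [3, 20, 0],
     [5, 30, 1],
     [7, 50, 1]]
X_anon = look_alike_features(X, sensitive_cols=[0, 1], cluster_ids=[0, 0, 1, 1])
print(X_anon)
```

A model would then be trained on `X_anon` in place of `X`, so the individual sensitive values never enter training; only the cluster averages do.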

Comments:	Accepted at the Conference on Neural Information Processing Systems (NeurIPS 2023)
Subjects:	Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2310.04015 [cs.LG]
	(or arXiv:2310.04015v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.04015

Submission history

From: Adel Javanmard
[v1] Fri, 6 Oct 2023 04:52:46 UTC (592 KB)
[v2] Mon, 9 Oct 2023 16:20:49 UTC (592 KB)
[v3] Thu, 2 Nov 2023 02:40:07 UTC (592 KB)
