Abstract
Techniques are described for privacy-preserving feature imputation in environments where a limited-personalization policy deterministically withholds a subset of user features. Complete feature vectors from consenting users are collected and normalized using automated statistical classification of features. A deterministic binary mask derived from a privacy policy configuration is applied during training to simulate restricted-feature conditions. A denoising autoencoder (or another encoder-decoder model) is trained to reconstruct full vectors from masked inputs, with reconstruction loss computed only on masked positions and with noise optionally added to visible features. During inference, the visible features of a privacy-restricted user are normalized and passed through the trained model to predict the restricted features; the predictions are then denormalized and merged with the observed values to form a complete feature vector for downstream ranking or ad delivery. Evaluation masks the features of held-out consenting users and compares the resulting predictions against the known ground truth.
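The masking, masked-position loss, and inference-time merge described above can be illustrated with a minimal NumPy sketch. It is not the disclosure's implementation: the four-feature `RESTRICTED` mask, the zero-fill placeholder for withheld features, the z-score normalization, and all function names are assumptions made for illustration, and the encoder-decoder model itself is left out.

```python
import numpy as np

# Hypothetical policy-derived mask: True = feature withheld, False = visible.
RESTRICTED = np.array([False, True, False, True])

def normalize(x, mean, std):
    """Z-score normalization (assumed; the disclosure does not fix the scheme)."""
    return (x - mean) / std

def denormalize(z, mean, std):
    """Inverse of normalize()."""
    return z * std + mean

def mask_inputs(x_norm, restricted, noise_std=0.0, rng=None):
    """Zero-fill restricted features; optionally add Gaussian noise to visible ones."""
    x_in = np.where(restricted, 0.0, x_norm)
    if noise_std > 0.0:
        if rng is None:
            rng = np.random.default_rng(0)
        noise = rng.normal(0.0, noise_std, size=x_norm.shape)
        # Noise is applied only to visible positions; restricted ones stay at zero.
        x_in = x_in + np.where(restricted, 0.0, noise)
    return x_in

def masked_mse(pred, target, restricted):
    """Mean squared error over restricted (masked) positions only."""
    diff = (pred - target)[..., restricted]
    return float(np.mean(diff ** 2))

def merge(observed, predicted, restricted):
    """Keep observed values where visible; use predictions where restricted."""
    return np.where(restricted, predicted, observed)
```

During training, `mask_inputs` would be applied to normalized vectors from consenting users and `masked_mse` would score the model output against the unmasked vector. At inference, `merge` combines the denormalized predictions with the values actually observed for the privacy-restricted user.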
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Anonymous, "Neural Network-Based Feature Imputation for Privacy-Constrained Environments Using Cross-Population Training", Technical Disclosure Commons, ()
https://www.tdcommons.org/dpubs_series/10625