Techniques are described herein for securing data used by a machine learning algorithm. Instead of the actual data or a set thereof, the frequency or top-k values of the respective network traffic feature data sets, calculated over time, are used (this can also be extended to any other data sets). Here, the frequency stands in for the actual data and thereby obfuscates potentially sensitive information that should not be exposed to a cloud machine learning application, which is often shared.
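As an illustration of the idea, the following minimal Python sketch replaces raw feature values with either their relative frequency or their rank among the k most frequent values. It is not the disclosure's implementation: the function names `frequency_encode` and `top_k_encode`, the `-1` bucket for values outside the top k, and the sample port numbers are all assumptions made for the example, and a single list stands in for one time window of observations.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each raw value with its relative frequency in the window."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def top_k_encode(values, k):
    """Replace each raw value with its frequency rank (0 = most frequent).

    Values outside the k most frequent collapse into one bucket, -1
    (an assumption of this sketch, not something the source specifies).
    """
    ranked = Counter(values).most_common(k)
    rank_of = {value: rank for rank, (value, _count) in enumerate(ranked)}
    return [rank_of.get(v, -1) for v in values]


# Hypothetical window of destination ports seen in network traffic.
ports = [443, 443, 443, 443, 80, 80, 22, 8080]

freq = frequency_encode(ports)   # raw ports never leave this function
ranks = top_k_encode(ports, k=2)

print(freq)   # [0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.125, 0.125]
print(ranks)  # [0, 0, 0, 0, 1, 1, -1, -1]
```

Only the encoded lists would be handed to the shared machine learning application; the mapping from frequency or rank back to the original value stays with the data owner.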
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Rantzau, Ralf; Jeuk, Sebastian; and Salgueiro, Gonzalo, "OBFUSCATION AND ANONYMIZATION TECHNIQUES FOR NETWORK DATA SETS FOR MACHINE LEARNING", Technical Disclosure Commons (December 21, 2018)