The present disclosure describes computer-implemented systems and methods for automatically expanding data sets for machine learning models. The system automatically creates crowdsourcing tasks, sends them to users around the world who describe the data set in their own languages, and returns the expanded data set for use with the machine learning model. A user may provide an initial data set and receive a diverse and complete expanded data set with annotations in multiple languages, in less time than traditional data gathering for machine learning models requires and without any additional effort from the user.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Mikolajewski, Tomasz, "Automatic Expansion of Data Sets for Machine Learning Models Using Crowdsourcing", Technical Disclosure Commons, (December 26, 2022)