Disclosed herein is a mechanism for detecting spam content by clustering media channels having similar thumbnail images. In some instances, the mechanism can construct an index of video thumbnail images. For example, the index can be a unique set of the most frequently used thumbnail images, selected from the thumbnail images associated with each video in a corpus of videos. The mechanism can use the index to cluster videos having thumbnail images that are the same as, or substantially similar to, those in the index. The mechanism can then, based on the video clusters, determine clusters of media channels that share thumbnail images that are the same as, or substantially similar to, those in the index. Upon determining clusters of media channels, the mechanism can perform further analysis, such as determining whether the content associated with a cluster of media channels contains spam content or indicates scaled abuse at the user account level.
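The three stages described above (index construction, video clustering, channel clustering) can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It assumes each thumbnail has already been reduced to a 64-bit perceptual hash and that "substantially similar" means a small Hamming distance between hashes; the disclosure specifies neither. The names `Video`, `build_index`, `cluster_videos`, and `cluster_channels`, and the parameters `top_k`, `max_distance`, and `min_channels`, are hypothetical.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Video(NamedTuple):
    video_id: str
    channel_id: str
    thumb_hash: int  # assumption: precomputed perceptual hash of the thumbnail


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def build_index(videos, top_k):
    """Index = the top_k most frequently used thumbnail hashes (each listed once)."""
    counts = Counter(v.thumb_hash for v in videos)
    return [h for h, _ in counts.most_common(top_k)]


def cluster_videos(videos, index, max_distance):
    """Assign each video to its nearest index entry if it is close enough."""
    clusters = defaultdict(list)
    for v in videos:
        best = min(index, key=lambda h: hamming(h, v.thumb_hash), default=None)
        if best is not None and hamming(best, v.thumb_hash) <= max_distance:
            clusters[best].append(v)
    return dict(clusters)


def cluster_channels(video_clusters, min_channels=2):
    """Keep, per index entry, the set of channels sharing that thumbnail."""
    channel_clusters = {}
    for h, vids in video_clusters.items():
        channels = {v.channel_id for v in vids}
        if len(channels) >= min_channels:
            channel_clusters[h] = channels
    return channel_clusters


# Toy corpus: three channels reuse (nearly) the same thumbnail, one does not.
videos = [
    Video("v1", "chA", 0b11110000),
    Video("v2", "chB", 0b11110000),
    Video("v3", "chC", 0b11110001),  # one bit away from the popular thumbnail
    Video("v4", "chD", 0b00001111),
]
index = build_index(videos, top_k=1)
video_clusters = cluster_videos(videos, index, max_distance=2)
channel_clusters = cluster_channels(video_clusters)
```

In this toy corpus, channels `chA`, `chB`, and `chC` end up in one cluster and `chD` is left out. The resulting channel clusters would then be handed to whatever further analysis (spam detection, account-level abuse review) is applied; that step is outside this sketch.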
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Jian, Bing and Curry, Charles, "Detecting Spam Content By Clustering Media Channels Having Similar Thumbnail Images", Technical Disclosure Commons, (December 12, 2017)