Matching image or video content to other content is an important requirement for content hosting platforms. A common mechanism is to construct an index of known content, e.g., one that includes multi-dimensional embeddings generated from the content, and to match new content against the index. The precision and recall of such techniques depend on a high-quality fingerprint and involve a tradeoff between recall and the cost of filtering out false positives. This disclosure describes improvements to content matching techniques that generate multiple transformations of the input content, look up each transformation in the index, and limit false-positive filtering and other downstream analysis to content that has at least a threshold number of matches. The recall vs. cost tradeoff improves because the matching volume in the embedding space is no longer a single sphere and instead comprises many smaller spheres around the different transformed versions.
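The lookup-and-threshold step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the names `count_matches` and `candidate_matches`, the parameters `radius` and `min_matches`, the brute-force Euclidean search, and the layout of the index (a mapping from a reference ID to one or more embeddings) are all hypothetical. A production system would likely use an approximate nearest-neighbor index instead of a linear scan.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def count_matches(
    query_embeddings: List[Vector],
    index: Dict[str, List[Vector]],
    radius: float,
) -> Dict[str, int]:
    """For each reference ID, count how many query embeddings fall
    within `radius` of at least one of that reference's embeddings."""
    counts: Dict[str, int] = {}
    for query in query_embeddings:
        for ref_id, ref_vectors in index.items():
            if any(math.dist(query, ref) <= radius for ref in ref_vectors):
                counts[ref_id] = counts.get(ref_id, 0) + 1
    return counts


def candidate_matches(
    content: object,
    transforms: List[Callable[[object], object]],
    embed: Callable[[object], Vector],
    index: Dict[str, List[Vector]],
    radius: float,
    min_matches: int,
) -> List[str]:
    """Embed every transformed version of `content`, look each one up,
    and keep only references matched by at least `min_matches` versions.
    Only these survivors would go on to false-positive filtering."""
    query_embeddings = [embed(transform(content)) for transform in transforms]
    counts = count_matches(query_embeddings, index, radius)
    return sorted(ref_id for ref_id, n in counts.items() if n >= min_matches)


# Toy setup (assumption): content is already a 2-D point, the embedding is
# the identity, and the "transformations" are small shifts.
toy_index = {"ref_a": [(0.0, 0.0)], "ref_b": [(5.0, 5.0)]}
toy_transforms = [
    lambda v: v,
    lambda v: (v[0] + 0.2, v[1]),
    lambda v: (v[0], v[1] + 0.2),
]
result = candidate_matches(
    (0.1, 0.0), toy_transforms, lambda v: v, toy_index,
    radius=0.5, min_matches=2,
)
```

In this sketch, a smaller `radius` per lookup combined with a match-count threshold plays the role of the "many smaller spheres": a reference becomes a candidate only when several transformed versions of the input land close to it, which is intended to reduce the number of candidates sent to the more expensive downstream checks.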

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.