A rights management database maps content to content owners. Such a database can have duplicates of essentially identical assets, a problem referred to as undermerging. Conversely, a single asset can erroneously include multiple distinct assets, a problem referred to as overmerging. This disclosure describes techniques to automatically resolve overmerged assets and undermerged assets in a rights management database. Per the techniques, a consistency signal for an asset is computed. A logic module uses the consistency signal and a strict-match signal to merge assets that correspond to the same content. Another logic module detects duplicate reference material and builds a graph of thus-far undermerged assets that can be merged. The techniques detect and resolve undermerged and overmerged assets at scale, correctly handling slightly transformed duplicates such as remasters, trimmed versions, and sped-up versions.
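
The graph-building step for undermerged assets could be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: it assumes duplicate detection has already produced pairs of asset IDs whose reference material matches, and it treats each connected component of the resulting graph as one merge candidate. The function name `merge_candidates`, the union-find approach, and the asset IDs are hypothetical choices made for this example.

```python
from collections import defaultdict


def merge_candidates(duplicate_pairs):
    """Group asset IDs into candidate merge sets.

    duplicate_pairs: iterable of (asset_id_a, asset_id_b) tuples, each
    meaning the two assets were found to share duplicate reference
    material (hypothetical input; how pairs are detected is out of scope).
    Returns a sorted list of sorted ID lists, one per connected component
    that contains at least two assets.
    """
    parent = {}  # union-find forest: asset ID -> parent asset ID

    def find(asset_id):
        # Register unseen IDs as their own root, then walk to the root.
        parent.setdefault(asset_id, asset_id)
        while parent[asset_id] != asset_id:
            parent[asset_id] = parent[parent[asset_id]]  # path halving
            asset_id = parent[asset_id]
        return asset_id

    # Each duplicate pair is an edge; joining the roots merges components.
    for a, b in duplicate_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of every component under its root.
    groups = defaultdict(list)
    for asset_id in list(parent):
        groups[find(asset_id)].append(asset_id)

    # Components of size one have nothing to merge with.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

For example, `merge_candidates([("A1", "A2"), ("A2", "A3"), ("B1", "B2")])` returns `[["A1", "A2", "A3"], ["B1", "B2"]]`: the first three assets are linked transitively and form one candidate, while the last two form another. Whether a candidate group is actually merged would, per the disclosure, depend on further signals such as the consistency and strict-match signals, which this sketch does not model.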

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.