Abstract
Conventional image or video compression is based on the local statistics of an image and ignores statistical correlations across similar images. Such compression may not reach a globally optimal compression ratio for a collection of images that share similar features. This disclosure describes techniques to compress images by using large visual machine-learning models to generate a common embedding, shared by similar images, and a unique embedding, specific to each image, to improve compression efficiency. The techniques leverage the ability of large visual models (LVMs) to generate or reconstruct high-quality images from small embeddings. They also leverage the observation that a universe of images can be split into subsets with similar scene-segment compressibility properties, which can therefore be expected to share common embeddings. In contrast to conventional image compression, which compresses images independently of one another, the described techniques exploit statistical properties across similar images to achieve higher compression ratios.
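The scheme above can be sketched in toy form. The snippet below is a minimal illustration, not the disclosed implementation: the LVM encoder and decoder are modeled as fixed random linear maps (`ENC`, `DEC`), and the names `compress_cluster` and `decompress` are hypothetical. The key idea it demonstrates is the split of each image's embedding into a common part, stored once per cluster of similar images, and a small unique residual, stored per image.

```python
import numpy as np

# Toy stand-ins for an LVM encoder/decoder (illustrative assumption):
# a random linear projection and its pseudo-inverse.
rng = np.random.default_rng(0)
D_IMG, D_EMB = 64, 8                    # toy image and embedding sizes
ENC = rng.normal(size=(D_EMB, D_IMG))   # stand-in "encoder"
DEC = np.linalg.pinv(ENC)               # stand-in "decoder"

def encode(image):
    """Map an image to a small embedding (LVM encoder stand-in)."""
    return ENC @ image

def decode(embedding):
    """Reconstruct an image from an embedding (LVM decoder stand-in)."""
    return DEC @ embedding

def compress_cluster(images):
    """Split each image's embedding into a shared common embedding
    (stored once for the whole cluster) and a per-image unique residual."""
    embs = np.stack([encode(im) for im in images])
    common = embs.mean(axis=0)   # shared across similar images
    uniques = embs - common      # small residual, stored per image
    return common, uniques

def decompress(common, unique):
    """Reconstruct one image from the shared and unique embeddings."""
    return decode(common + unique)
```

In this sketch the storage cost per image drops from one full embedding to one residual, with the common embedding amortized across the cluster; a real system would additionally quantize or entropy-code the residuals.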
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Shin, D., "Efficient Image Compression via Shared Embeddings Generated by a Large Visual Model," Technical Disclosure Commons (August 23, 2024).
https://www.tdcommons.org/dpubs_series/7307