Abstract

Data compression reduces the cost of data storage and communication. Storing data obtained from a variety of sensors and other data sources is expensive. Conventional compression techniques are specific to individual data types and are not suitable for many data modalities, such as location data and user interaction data. Contrastive pre-training makes it possible to generate a text representation of an image that preserves valuable information, e.g., the subject of the image. This disclosure describes the use of modality-specific encoders in combination with a text encoder to generate a text summary of the data in each modality. A large language model merges the text summaries into a single text, which can then be compressed using a text compression technique.
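The pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the toy three-dimensional embeddings stand in for the outputs of contrastively pre-trained modality and text encoders, `summarize` picks the candidate text whose embedding is closest to the data embedding, `merge_summaries` is a placeholder for the large language model call (it only concatenates), and `zlib` stands in for whichever text compression technique is chosen.

```python
import math
import zlib


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def summarize(data_embedding, candidates):
    """Return the candidate text whose embedding is closest to the data.

    `candidates` maps text -> embedding. Both embeddings are assumed to
    live in the shared space that contrastive pre-training produces.
    """
    return max(candidates, key=lambda text: cosine(data_embedding, candidates[text]))


def merge_summaries(summaries):
    """Hypothetical placeholder for the LLM merge step: plain concatenation."""
    return " ".join(f"[{modality}] {text}" for modality, text in sorted(summaries.items()))


def compress_text(text):
    """Compress the merged text; zlib is one possible text compressor."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_text(blob):
    """Recover the merged text (not the original sensor data)."""
    return zlib.decompress(blob).decode("utf-8")


# Toy embeddings (made up) for one image and one location trace.
image_embedding = [0.9, 0.1, 0.0]
image_candidates = {
    "a dog playing in a park": [1.0, 0.0, 0.0],
    "a city street at night": [0.0, 1.0, 0.0],
}
location_embedding = [0.1, 0.2, 0.9]
location_candidates = {
    "user commuting to work": [0.0, 0.0, 1.0],
    "user staying at home": [0.0, 1.0, 0.0],
}

summaries = {
    "image": summarize(image_embedding, image_candidates),
    "location": summarize(location_embedding, location_candidates),
}
merged = merge_summaries(summaries)
blob = compress_text(merged)
```

In this sketch only the merged text survives compression: `decompress_text(blob)` returns the summary, not the raw image or location data. A real system would replace the toy vectors with encoder outputs and the concatenation with an actual language model call.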

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Shin, D, "Data Compression Using Contrastive Pre-Training and Large Language Models", Technical Disclosure Commons, (March 29, 2023)
https://www.tdcommons.org/dpubs_series/5765

Technical Disclosure Commons

Defensive Publications Series

Data Compression Using Contrastive Pre-Training and Large Language Models
