Defensive Publications Series

Automatic Dish Name Extraction from User-generated Content Using LLM

Abstract

Extracting dish names from user-provided content such as food photographs and captions, restaurant reviews, and other free-form text is a challenging task. Rule-based approaches are difficult to maintain and improve. Pattern matching against a predefined dictionary often suffers from low recall. Conventional machine learning models require large amounts of labeled data to perform named entity recognition (e.g., to recognize dish names); collecting such labels is often costly and does not scale well across multiple languages and countries. This disclosure describes the use of a multimodal large language model (LLM) to automatically extract dish names from user-generated content such as food photographs and associated free-form text such as tags and captions. Dish name extraction from user-provided tags can be formulated as an open-vocabulary named entity recognition and discovery task. This formulation fits naturally within the framework of pre-trained LLMs and leverages the model's capability for multilingual, multicultural text understanding.
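As an illustration of the open-vocabulary formulation, the following is a minimal, text-only sketch and not the disclosure's actual implementation. It assumes the model is reachable through a hypothetical `generate` callable that takes a prompt string and returns the model's reply as text; the prompt wording, the JSON output convention, and all function names are assumptions made for this example. The photograph input of a multimodal model is omitted.

```python
import json
from typing import Callable, List


def build_prompt(tags: List[str]) -> str:
    """Build an instruction asking for dish names as a JSON array (assumed format)."""
    tag_lines = "\n".join(f"- {tag}" for tag in tags)
    return (
        "The following tags and captions were attached to a food photograph.\n"
        "List every dish name they mention, in the original language.\n"
        "Answer with a JSON array of strings and nothing else.\n"
        f"{tag_lines}"
    )


def parse_dish_names(response: str) -> List[str]:
    """Parse the reply; return an empty list if it is not a JSON array."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    seen = set()
    names: List[str] = []
    for item in data:
        if not isinstance(item, str):
            continue  # ignore non-string entries
        name = item.strip()
        key = name.casefold()  # case-insensitive de-duplication
        if name and key not in seen:
            seen.add(key)
            names.append(name)
    return names


def extract_dish_names(tags: List[str], generate: Callable[[str], str]) -> List[str]:
    """Run the hypothetical model callable on the prompt and parse its reply."""
    return parse_dish_names(generate(build_prompt(tags)))
```

No dictionary of known dishes appears anywhere in the sketch; the set of recognizable names is left entirely to the model, which is what makes the task open-vocabulary. Replies that are not a JSON array are treated as yielding no dish names rather than being repaired.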

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Lin, Bo; Hibschman, Johann; and Oshima, Kathleen, "Automatic Dish Name Extraction from User-generated Content Using LLM", Technical Disclosure Commons, (November 07, 2023)
https://www.tdcommons.org/dpubs_series/6399
