Abstract
Digital map services offer a live view feature that gives users an augmented view of their surroundings. However, such features are accessible only to users who can see a display screen. While screen reader technology can provide voice output of on-screen content, it is poorly suited to the type of information displayed in a live view. This disclosure leverages the summarization capabilities of a large language model (LLM) to provide a scene description to a user who is using a live map view. In response to a request from a user device, a digital map service retrieves data related to the user's location, such as roads, traffic lanes, and crosswalks; real-time traffic, weather, and air quality; and businesses near the location. The service then generates text from the retrieved data according to simple text templates, e.g., by concatenating the information into a string. The string is provided to an LLM with appropriate prompts to generate a summary, which is narrated to the user via their device.
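The flow described in the abstract (retrieve location data, fill simple text templates, concatenate into a string, prompt an LLM) might be sketched roughly as follows. This is a minimal illustration, not the disclosure's implementation: the template wording, the dictionary keys, the prompt text, and the function names are all hypothetical, and the LLM is represented by a caller-supplied function rather than any specific model API.

```python
from typing import Callable

# Hypothetical templates; the disclosure only says "simple text templates".
TEMPLATES = {
    "roads": "Nearby roads: {}.",
    "crosswalks": "Crosswalks: {}.",
    "traffic": "Traffic: {}.",
    "weather": "Weather: {}.",
    "air_quality": "Air quality: {}.",
    "businesses": "Nearby businesses: {}.",
}


def build_scene_text(data: dict) -> str:
    """Fill each template that has data and concatenate the results."""
    parts = []
    for key, template in TEMPLATES.items():
        value = data.get(key)
        if not value:
            continue  # skip categories with no retrieved data
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        parts.append(template.format(value))
    return " ".join(parts)


def build_prompt(scene_text: str) -> str:
    """Wrap the concatenated string in an assumed summarization prompt."""
    return (
        "Summarize the following location data as a short spoken scene "
        "description for a user who cannot see the screen.\n\n" + scene_text
    )


def summarize_scene(data: dict, llm: Callable[[str], str]) -> str:
    """Run the pipeline; `llm` is any function mapping a prompt to text."""
    return llm(build_prompt(build_scene_text(data)))
```

In this sketch, the string returned by `summarize_scene` is what would be handed to the device's text-to-speech output for narration. Passing the model in as a plain callable keeps the example independent of any particular LLM service.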
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Shuma, Jim and Oda, Ohan, "Generating Geospatial Scene Summaries Using a Large Language Model", Technical Disclosure Commons, (December 21, 2023)
https://www.tdcommons.org/dpubs_series/6532