Abstract

Digital map services offer a live view feature that gives users an augmented view of their surroundings. However, such features are accessible only to users who can view a display screen. While screen reader technology can provide voice output of on-screen content, it is not well suited to the type of information displayed in a live view. This disclosure leverages the summarization capabilities of a large language model (LLM) to provide a scene description to a user who is using a live map view. Per the techniques, in response to a request from a user device, a digital map service retrieves data related to the user's location, such as roads, traffic lanes, and crosswalks; real-time traffic, weather, and air quality; and businesses near the location. The service then generates text from the retrieved data according to simple text templates, e.g., by concatenating the information into a string. The string is provided to an LLM with appropriate prompts to generate a summary. The summary is narrated to the user via their device.
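
The flow described in the abstract (retrieve location data, fill simple text templates, prompt an LLM for a summary) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the `LocationData` fields, the template wording, the prompt text, and the `llm` callable are all hypothetical placeholders, and no real map or LLM API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LocationData:
    """Hypothetical container for data a map service might return."""
    roads: List[str] = field(default_factory=list)
    crosswalks: List[str] = field(default_factory=list)
    traffic: str = ""
    weather: str = ""
    air_quality: str = ""
    businesses: List[str] = field(default_factory=list)


def render_templates(data: LocationData) -> str:
    """Fill simple text templates and concatenate them into one string."""
    parts = []
    if data.roads:
        parts.append("Roads: " + ", ".join(data.roads) + ".")
    if data.crosswalks:
        parts.append("Crosswalks: " + ", ".join(data.crosswalks) + ".")
    if data.traffic:
        parts.append("Traffic: " + data.traffic + ".")
    if data.weather:
        parts.append("Weather: " + data.weather + ".")
    if data.air_quality:
        parts.append("Air quality: " + data.air_quality + ".")
    if data.businesses:
        parts.append("Nearby businesses: " + ", ".join(data.businesses) + ".")
    return " ".join(parts)


def build_prompt(scene_text: str) -> str:
    """Wrap the templated text in an assumed summarization prompt."""
    return (
        "Summarize the following map data as a short spoken scene "
        "description for a user who cannot see the screen.\n\n" + scene_text
    )


def describe_scene(data: LocationData, llm: Callable[[str], str]) -> str:
    """Return the summary text that the device would narrate.

    `llm` is any function mapping a prompt to a completion; the actual
    model and client library are left unspecified here.
    """
    return llm(build_prompt(render_templates(data)))
```

Passing the model in as a plain callable keeps the templating step testable without a network call; in a deployed system the returned summary would be handed to the device's text-to-speech output.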

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
