Rob Hanton


When someone is making a video or audio call, manually typing in the address or the telephone number is both slow and error-prone. This is particularly true when the person is using a dedicated video endpoint with a contextual touchscreen keyboard. The situation is further exacerbated when video endpoints are employed as monitors, a mode in which the calling user interface (UI) of an endpoint frequently obscures or completely occludes the desktop screen. To address these challenges, techniques are presented herein that support, for situations in which a video endpoint is being used as a monitor, the video endpoint capturing text (when a call is placed) from the monitor feed to which it has access, automatically identifying any addresses or contact numbers, and providing such information as destinations that may be called either directly or once a user starts typing (e.g., through predictions). Aspects of the presented techniques may perform optical character recognition (OCR) during the analysis of text, may employ algorithms to calculate a confidence value for an identified address, and may employ one or more methods to suggest identified addresses to a user.
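As one illustrative sketch (not the published technique itself), the identification and confidence-scoring steps described above could be approximated as follows. The patterns, scoring values, and function names here are hypothetical assumptions chosen for clarity; a production system would operate on real OCR output and use far more robust address parsing.

```python
import re

# Hypothetical patterns: a user@domain video-call address and a telephone number.
ADDRESS_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")

def extract_destinations(ocr_text):
    """Scan OCR'd monitor text for dialable destinations, each with a confidence value."""
    candidates = []
    for match in ADDRESS_RE.finditer(ocr_text):
        # A full user@domain match is treated as high confidence (assumed score).
        candidates.append({"dest": match.group(), "kind": "address", "confidence": 0.9})
    for match in PHONE_RE.finditer(ocr_text):
        digits = re.sub(r"\D", "", match.group())
        # Digit strings of plausible telephone length score higher (assumed heuristic).
        confidence = 0.8 if 10 <= len(digits) <= 15 else 0.4
        candidates.append({"dest": digits, "kind": "number", "confidence": confidence})
    return candidates

def suggest(candidates, prefix="", threshold=0.5):
    """Offer high-confidence destinations, optionally filtered as the user types."""
    return [c["dest"] for c in candidates
            if c["confidence"] >= threshold and c["dest"].startswith(prefix)]
```

In this sketch, `extract_destinations` would run when a call is placed, and `suggest` would feed the endpoint's dialing UI either directly (empty prefix) or as typing predictions.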

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.