Abstract

Documents in Portable Document Format (PDF) that lack an accessibility tree are inaccessible to screen readers, which are commonly used by visually impaired users and by anyone who is unable to view the screen on which the PDF is displayed. This disclosure leverages optical character recognition (OCR) to make the text content of such documents accessible via screen readers, select-to-speak tools, and other assistive technologies. OCR is performed, e.g., by an on-device machine learning model, and recovers both the text and its layout, from which an accessibility tree for the document is generated. The accessibility tree is then provided to the assistive technology. The techniques can be implemented in a standalone application, as a browser extension, or integrated into a browser or operating system.
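
The step from OCR output to an accessibility tree can be illustrated with a minimal sketch. It assumes the OCR stage has already produced words with bounding boxes; the `OcrWord` and `AXNode` types, the role names, and the line-grouping heuristic are hypothetical and chosen only for illustration, not taken from the disclosure or from any particular OCR engine or accessibility API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OcrWord:
    """One recognized word plus its bounding box (hypothetical OCR output)."""
    text: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


@dataclass
class AXNode:
    """A simplified accessibility-tree node (role names are illustrative)."""
    role: str
    name: str = ""
    children: List["AXNode"] = field(default_factory=list)


def _center_y(word: OcrWord) -> float:
    return word.y + word.height / 2


def group_into_lines(words: List[OcrWord], tolerance: float = 0.5) -> List[List[OcrWord]]:
    """Group words into lines by vertical proximity, then order each line left to right."""
    lines: List[List[OcrWord]] = []
    for word in sorted(words, key=_center_y):
        if lines:
            last = lines[-1]
            last_center = sum(_center_y(w) for w in last) / len(last)
            # Same line if the vertical centers differ by less than a fraction of the word height.
            if abs(_center_y(word) - last_center) <= tolerance * word.height:
                last.append(word)
                continue
        lines.append([word])
    return [sorted(line, key=lambda w: w.x) for line in lines]


def build_accessibility_tree(words: List[OcrWord]) -> AXNode:
    """Build a document -> line -> word tree that preserves reading order."""
    root = AXNode(role="document")
    for line in group_into_lines(words):
        line_node = AXNode(role="line", name=" ".join(w.text for w in line))
        line_node.children = [AXNode(role="staticText", name=w.text) for w in line]
        root.children.append(line_node)
    return root


def reading_order_text(root: AXNode) -> str:
    """Text a screen reader would speak, one line node at a time."""
    return "\n".join(child.name for child in root.children)
```

Given such a tree, a screen reader or select-to-speak tool could walk the nodes in order; a real implementation would additionally need to handle columns, tables, headings, and right-to-left scripts, which this sketch does not attempt.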

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
