Transforming Inaccessible PDF Documents into Accessible, Interactive Documents

Abstract

Documents in portable document format (PDF) that have no accessibility tree are inaccessible to screen readers. Such documents, rendered as images, cannot be searched or interacted with, and their textual elements cannot be selected or copied. This disclosure leverages optical character recognition (OCR) to automatically recognize otherwise inaccessible text content in scanned PDF documents and to enable access to such documents via screen readers, select-to-speak tools, and/or other assistive technologies. Once text in the scanned document is recognized, an accessibility tree is generated and an invisible text layer is overlaid on the page image, which makes it possible to read the scanned document aloud (via a screen reader), search it, select and copy its elements, and interact with it. A previously passive image of a document thereby gains the qualities of an editable document or a searchable webpage that supports copy-paste and other text-related operations. In addition to widening accessibility, the described techniques are useful to those who prefer to consume content aurally rather than visually.
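
The step from recognized text to an accessibility tree can be illustrated with a minimal Python sketch. It is not the implementation described in the disclosure. It assumes that an OCR engine has already returned words with pixel bounding boxes, and every name in it (Word, Node, build_tree, find, the role strings, the line-grouping tolerance) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (x, y, width, height) in page pixels


@dataclass
class Word:
    """One OCR result: recognized text plus its box (assumed input format)."""
    text: str
    x: float
    y: float
    w: float
    h: float


@dataclass
class Node:
    """A node of a simplified, illustrative accessibility tree."""
    role: str
    name: str
    bbox: BBox
    children: List["Node"] = field(default_factory=list)


def union_bbox(boxes: List[BBox]) -> BBox:
    """Smallest box that encloses all given boxes; (0, 0, 0, 0) if there are none."""
    if not boxes:
        return (0, 0, 0, 0)
    x0 = min(b[0] for b in boxes)
    y0 = min(b[1] for b in boxes)
    x1 = max(b[0] + b[2] for b in boxes)
    y1 = max(b[1] + b[3] for b in boxes)
    return (x0, y0, x1 - x0, y1 - y0)


def group_into_lines(words: List[Word], tolerance: float = 0.5) -> List[List[Word]]:
    """Put a word on an existing line when its vertical center is within
    tolerance * height of that line's first word; otherwise start a new line."""
    lines: List[List[Word]] = []
    for word in sorted(words, key=lambda w: (w.y, w.x)):
        center = word.y + word.h / 2
        for line in lines:
            ref = line[0]
            if abs(center - (ref.y + ref.h / 2)) <= tolerance * ref.h:
                line.append(word)
                break
        else:
            lines.append([word])
    # Within each line, read left to right.
    return [sorted(line, key=lambda w: w.x) for line in lines]


def build_tree(words: List[Word]) -> Node:
    """Build a document -> line -> word tree in reading order."""
    line_nodes = []
    for line in group_into_lines(words):
        word_nodes = [Node("staticText", w.text, (w.x, w.y, w.w, w.h)) for w in line]
        line_nodes.append(Node(
            "line",
            " ".join(w.text for w in line),
            union_bbox([n.bbox for n in word_nodes]),
            word_nodes,
        ))
    return Node("document", "", union_bbox([n.bbox for n in line_nodes]), line_nodes)


def reading_order_text(root: Node) -> str:
    """Text that a screen reader could speak, one line of the page per text line."""
    return "\n".join(line.name for line in root.children)


def find(root: Node, query: str) -> List[BBox]:
    """Boxes of words containing the query (case-insensitive), e.g. for highlighting."""
    q = query.lower()
    return [w.bbox for line in root.children for w in line.children if q in w.name.lower()]
```

In this sketch, the bounding box stored on each word node plays the part of the invisible text layer: a viewer could place transparent, selectable text at those coordinates over the page image, so that search hits and text selections line up with the visible scan.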

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Bernal Castano, Jonathan Ray; Halavati, Ramin; Tseng, David; Lee, Kyungjun; Yan, Jason; Ley-Wild, Ruy; Paisios, Nektarios; Bissacco, Alessandro; Kulkarni, Sameer; Zhang, Lei; Payne, Jennifer; and Yang, Chu-Hsuan, "Transforming Inaccessible PDF Documents into Accessible, Interactive Documents", Technical Disclosure Commons, (February 06, 2025)
https://www.tdcommons.org/dpubs_series/7818

Technical Disclosure Commons

Defensive Publications Series
