Abstract
We propose a machine learning model for document enhancement that is invariant to small
pixel shifts between the two images of a pair. The model achieves this by mapping the input
image (i.e., the low-quality raw image to be enhanced) into the same latent space (or feature space)
that is learned from the corresponding high-quality image (i.e., the enhanced ground-truth version).
Furthermore, our algorithm can enhance photographed documents by mapping those
photographs to the same latent space learned from the corresponding expected enhancements.
Because this feature mapping between the photograph and the reference image supplies the
features needed to decode an enhanced version of the photographed document, no separate
image alignment step is required.
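The latent-matching idea described above can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the 1-D toy "documents", the linear (PCA-style) reference autoencoder, the least-squares photo encoder, and all function names are hypothetical. A real system would most likely use learned nonlinear encoders and a learned decoder.

```python
import numpy as np

# Hypothetical linear stand-ins for the encoders and decoder (assumptions).

def fit_reference_autoencoder(ref, k):
    """Learn a k-dimensional latent space from the reference images (PCA)."""
    mean = ref.mean(axis=0)
    _, _, vt = np.linalg.svd(ref - mean, full_matrices=False)
    basis = vt[:k].T  # shape: (pixels, k)
    return mean, basis

def encode_reference(x, mean, basis):
    """Project images into the reference latent space."""
    return (x - mean) @ basis

def decode(z, mean, basis):
    """Map latent codes back to image space."""
    return z @ basis.T + mean

def fit_photo_encoder(photo, z_ref):
    """Fit an affine map from photographs to the reference latent codes."""
    design = np.hstack([photo, np.ones((photo.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, z_ref, rcond=None)
    return weights

def encode_photo(x, weights):
    """Apply the fitted affine photo encoder."""
    design = np.hstack([x, np.ones((x.shape[0], 1))])
    return design @ weights

def latent_loss(z_a, z_b):
    """Mean squared distance between two sets of latent codes."""
    return float(np.mean((z_a - z_b) ** 2))

def demo(seed=0, n=200, pixels=32, k=8):
    rng = np.random.default_rng(seed)
    # Toy reference "documents": random binary rows.
    ref = (rng.random((n, pixels)) > 0.5).astype(float)
    # Toy "photographs": shifted by -1, 0, or 1 pixel, with changed
    # contrast and additive noise, so they are misaligned with the reference.
    shifts = rng.integers(-1, 2, size=n)
    photo = np.stack([np.roll(r, s) for r, s in zip(ref, shifts)])
    photo = 0.7 * photo + 0.1 + 0.05 * rng.standard_normal(photo.shape)

    mean, basis = fit_reference_autoencoder(ref, k)
    z_ref = encode_reference(ref, mean, basis)

    # Train the photo encoder to land in the reference latent space.
    weights = fit_photo_encoder(photo, z_ref)
    z_photo = encode_photo(photo, weights)

    # Baseline: reuse the reference encoder on the photographs directly.
    baseline = latent_loss(encode_reference(photo, mean, basis), z_ref)
    fitted = latent_loss(z_photo, z_ref)

    # Decode from the matched latent codes; no alignment step is performed.
    enhanced = decode(z_photo, mean, basis)
    return baseline, fitted, enhanced.shape

if __name__ == "__main__":
    baseline, fitted, shape = demo()
    print("latent loss (reference encoder on photos):", baseline)
    print("latent loss (fitted photo encoder):", fitted)
    print("decoded output shape:", shape)
```

In this sketch the photograph is never registered against the reference image. The only supervision is the distance between the two latent codes, which is the property the abstract relies on. A linear model cannot be truly shift-invariant, so the toy only illustrates the training objective, not the quality of the resulting enhancement.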
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 4.0 License.
Recommended Citation
INC, HP, "DOCUMENT ENHANCEMENT OF MISALIGNED CONTENT BY LEARNING A COMMON LATENT SPACE BETWEEN DOCUMENT PHOTOGRAPHS AND REFERENCE IMAGES", Technical Disclosure Commons, (February 03, 2022)
https://www.tdcommons.org/dpubs_series/4883