This disclosure describes techniques for phonetic training of spelling models used in spell correction. Phonetic canonicalization is applied during training of the spelling model to generate rules that map similar-sounding words and portions of words to each other. Phonetic normalization is applied to reduce the space of phonetic representations. Corrected alternatives for a word are scored using a combination of a textual edit distance and a phonetic edit distance. The minimum of the textual edit distance and the phonetic edit distance is used as the noisy channel edit distance. Phonetic canonicalization can also be applied at runtime, in addition to its use in training the spelling model.
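The scoring idea in the abstract, taking the minimum of a textual and a phonetic edit distance, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `TOY_RULES` and `toy_phonetic_key` are a hypothetical stand-in for phonetic canonicalization and normalization (the disclosure's actual rules are not given in this abstract), and unit-cost Levenshtein distance is assumed for both distances.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical toy rewrite rules, applied in order. These are NOT the
# disclosure's rules; they only mimic mapping similar sounds together.
TOY_RULES = [("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s")]


def toy_phonetic_key(word: str) -> str:
    """Map a word to a crude phonetic form (assumed stand-in)."""
    key = word.lower()
    for src, dst in TOY_RULES:
        key = key.replace(src, dst)
    return key


def noisy_channel_distance(typed: str, candidate: str) -> int:
    """Minimum of the textual and the phonetic edit distance."""
    textual = edit_distance(typed, candidate)
    phonetic = edit_distance(toy_phonetic_key(typed),
                             toy_phonetic_key(candidate))
    return min(textual, phonetic)


print(edit_distance("fone", "phone"))           # 2
print(noisy_channel_distance("fone", "phone"))  # 0
```

Under these toy rules, "fone" and "phone" are two textual edits apart but share the same phonetic form, so the combined distance is 0 and "phone" would rank as a close alternative for the misspelling.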
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Jhala, Sanjit; Prajapati, Pratibha; Liu, Bing; and Wang, Grant, "Phonetic training of spelling models", Technical Disclosure Commons, (September 19, 2019)