Voice-based user queries sometimes include words from multiple languages, which can stymie speech recognition systems. Separately, different words in the same language may have similar pronunciations. A device that implements a voice-based user interface, e.g., a smart home speaker or smartphone, receives a spoken user command in which portions of the speech are parsed with low confidence, e.g., due to similar pronunciation or use of a different language. The described techniques improve voice-based interaction by permitting the user to provide corrections for such portions, e.g., via speech or via a displayed interface.
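
As an illustration only, the correction flow could be sketched as follows. The sketch assumes the recognizer exposes a per-word confidence score, which the description does not specify. The `Token` type, the `low_confidence_spans` and `apply_correction` helpers, and the 0.6 threshold are all hypothetical names and values chosen for this example, not part of the described techniques.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    confidence: float  # assumed per-word recognizer score in [0, 1]


def low_confidence_spans(tokens: List[Token],
                         threshold: float = 0.6) -> List[Tuple[int, int]]:
    """Group consecutive tokens scoring below threshold into (start, end) spans.

    The end index is exclusive. The threshold value is an arbitrary assumption.
    """
    spans = []
    start = None
    for i, tok in enumerate(tokens):
        if tok.confidence < threshold:
            if start is None:
                start = i  # open a new low-confidence span
        elif start is not None:
            spans.append((start, i))  # close the span at the first confident token
            start = None
    if start is not None:
        spans.append((start, len(tokens)))  # span runs to the end of the utterance
    return spans


def apply_correction(tokens: List[Token], span: Tuple[int, int],
                     corrected_text: str) -> List[Token]:
    """Replace the tokens in span with the user's correction.

    The correction may arrive by speech or by a displayed interface; here it
    is simply a string, and its words are treated as fully trusted.
    """
    start, end = span
    replacement = [Token(word, 1.0) for word in corrected_text.split()]
    return tokens[:start] + replacement + tokens[end:]


def transcript(tokens: List[Token]) -> str:
    """Join token texts into a single command string."""
    return " ".join(tok.text for tok in tokens)


if __name__ == "__main__":
    # Hypothetical recognizer output for a command containing a foreign-language title.
    heard = [
        Token("play", 0.95), Token("despair", 0.40), Token("seat", 0.35),
        Token("oh", 0.30), Token("on", 0.90), Token("the", 0.97),
        Token("speaker", 0.93),
    ]
    for span in low_confidence_spans(heard):
        # A real device would prompt the user here; the correction is hard-coded.
        heard = apply_correction(heard, span, "Despacito")
    print(transcript(heard))
```

Grouping adjacent low-confidence words into one span lets the device ask for a single correction per uncertain phrase instead of one per word, which fits the case where a foreign-language word is misheard as several words.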