Abstract
Existing explainable AI (XAI) attribution methods for natural language processing (NLP) models may struggle to determine token importance in code-switched text, potentially producing explanations that miss the emphasis conveyed by switching languages or the structural role of words at language boundaries. A method is described that can address this by performing token-level language identification to locate the boundaries between languages in an input text. A contextual weighting engine can then assign weights to tokens within a code-switched segment and to boundary tokens adjacent to the switch point. These weights may be used to modulate the raw attribution scores generated by an XAI algorithm, producing an explanation that reflects the linguistic aspects of code-switching relevant to a model's prediction.
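The pipeline described above can be sketched in a few steps. The following is a minimal, hypothetical illustration, not the disclosed implementation: the lexicon-lookup language identifier, the specific weight values, and all function names are assumptions for the example; a real system would use a trained token-level language classifier and tuned weights.

```python
# Hypothetical sketch: token-level language ID, boundary-aware contextual
# weighting, and modulation of raw XAI attribution scores.
# All names and weight values are illustrative assumptions.

SEGMENT_WEIGHT = 1.5   # assumed boost for tokens inside a switched segment
BOUNDARY_WEIGHT = 2.0  # assumed boost for tokens adjacent to a switch point

def identify_languages(tokens, lexicon):
    """Stand-in for token-level language identification: look each token
    up in a per-language lexicon. A real system would use a classifier."""
    return [lexicon.get(t.lower(), "unknown") for t in tokens]

def contextual_weights(langs):
    """Assign one weight per token: tokens adjacent to a language switch
    get the largest boost; other tokens in the minority-language segment
    get a smaller boost; remaining tokens keep weight 1.0."""
    if not langs:
        return []
    majority = max(set(langs), key=langs.count)
    weights = [1.0] * len(langs)
    for i in range(len(langs) - 1):
        if langs[i] != langs[i + 1]:          # switch between i and i+1
            weights[i] = BOUNDARY_WEIGHT
            weights[i + 1] = BOUNDARY_WEIGHT
    for i, lang in enumerate(langs):
        if lang != majority and weights[i] == 1.0:
            weights[i] = SEGMENT_WEIGHT
    return weights

def modulate(attributions, weights):
    """Scale raw attribution scores by the contextual weights, then
    renormalize so the total attribution mass is unchanged."""
    scaled = [a * w for a, w in zip(attributions, weights)]
    total_before = sum(abs(a) for a in attributions)
    total_after = sum(abs(s) for s in scaled) or 1.0
    return [s * total_before / total_after for s in scaled]

# Toy example: an English sentence with a Spanish code-switch.
tokens = ["I", "really", "loved", "it", "pero", "demasiado", "caro"]
lexicon = {"i": "en", "really": "en", "loved": "en", "it": "en",
           "pero": "es", "demasiado": "es", "caro": "es"}
langs = identify_languages(tokens, lexicon)
weights = contextual_weights(langs)
raw = [0.02, 0.03, 0.30, 0.05, 0.10, 0.20, 0.30]  # e.g. from an XAI method
adjusted = modulate(raw, weights)
```

In this sketch the tokens at the English/Spanish boundary ("it" and "pero") receive the largest weight, the remaining Spanish tokens a smaller boost, and the renormalization keeps the explanation's total magnitude comparable to the raw attributions.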
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Kumar S R, Mithu, "A Method for Assessing Token Importance in Code-Switched Text Through Language Boundary Analysis", Technical Disclosure Commons, (September 22, 2025).
https://www.tdcommons.org/dpubs_series/8610