Abstract
Dual encoder models can be effective for efficient information retrieval, but they are limited in how precisely they quantify the degree of a document's relevance. Relevant documents can lie at varying distances from the query in the vector space, which makes it difficult to apply a consistent relevance cutoff. Disclosed herein are systems and methods that address these drawbacks by incorporating a scoring head into a dual encoder architecture (e.g., a cross-attention module followed by a linear multi-layer perceptron (MLP) that projects the cross-attention output into a floating-point score) and by employing an additional scoring loss function alongside the conventional retrieval loss, optionally combining the two loss components with adjustable weights. The scoring mechanism may facilitate re-ranking and thresholding of retrieved candidates, providing a more fine-grained relevance assessment.
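The weighted combination of retrieval and scoring losses, and the thresholding step it enables, can be illustrated with a minimal sketch. The abstract does not specify the loss forms, so the sketch assumes an in-batch softmax contrastive loss for retrieval and a mean squared error for scoring. All function and parameter names (retrieval_loss, scoring_loss, combined_loss, rerank_and_threshold, w_retrieval, w_scoring) are hypothetical, and the scoring head itself is not implemented: pred_scores simply stands in for whatever float scores it would output.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieval_loss(query_vecs, doc_vecs):
    """Assumed retrieval loss: in-batch softmax cross-entropy.

    Document i is treated as the positive for query i; every other
    document in the batch acts as a negative.
    """
    total = 0.0
    for i, q in enumerate(query_vecs):
        logits = [dot(q, d) for d in doc_vecs]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(query_vecs)


def scoring_loss(pred_scores, target_scores):
    """Assumed scoring loss: mean squared error against relevance labels."""
    n = len(pred_scores)
    return sum((p - t) ** 2 for p, t in zip(pred_scores, target_scores)) / n


def combined_loss(query_vecs, doc_vecs, pred_scores, target_scores,
                  w_retrieval=1.0, w_scoring=1.0):
    """Weighted sum of the two loss components (weights are adjustable)."""
    return (w_retrieval * retrieval_loss(query_vecs, doc_vecs)
            + w_scoring * scoring_loss(pred_scores, target_scores))


def rerank_and_threshold(candidates, threshold):
    """Drop candidates scoring below the threshold, then sort best-first.

    candidates is a list of (doc_id, score) pairs.
    """
    kept = [c for c in candidates if c[1] >= threshold]
    return sorted(kept, key=lambda c: c[1], reverse=True)


# Toy batch: two queries, each matched with the document at the same index.
queries = [[1.0, 0.0], [0.0, 1.0]]
docs = [[1.0, 0.0], [0.0, 1.0]]
loss = combined_loss(queries, docs, [0.9, 0.2], [1.0, 0.0], w_scoring=0.5)
shortlist = rerank_and_threshold([("a", 0.2), ("b", 0.9), ("c", 0.6)], 0.5)
```

Because the scoring head is trained to produce scores that are comparable across queries, a single threshold such as the 0.5 used above can be applied to every candidate list, which raw embedding distances do not reliably allow.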
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Dua, Sahil; Moiseev, Fedor; Van Cleeff, Pascal; and Dong, Zhe, "Enhanced Dual Encoder with Retrieval and Scoring Loss (AttHyDE)", Technical Disclosure Commons, (October 03, 2025)
https://www.tdcommons.org/dpubs_series/8674