Abstract

Dual encoder models enable efficient information retrieval, but they are limited in how precisely they quantify the degree of document relevance. Relevant documents can lie at varying distances from a query in the vector space, which makes it difficult to apply a consistent relevance cutoff. Disclosed herein are systems and methods that address these drawbacks by incorporating a scoring head (e.g., a cross-attention module followed by a linear multi-layer perceptron (MLP) that projects the cross-attention output into a float score) into a dual encoder architecture, and by training with an additional scoring loss function alongside the conventional retrieval loss, with optional adjustable weights to combine the two loss components. The resulting score may be used to re-rank and threshold retrieved candidates, providing a more fine-grained relevance assessment.
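
The following is a minimal sketch of how the pieces described above might fit together. It is an illustration under stated assumptions, not the disclosed implementation. The abstract does not specify the form of either loss, so an in-batch softmax loss stands in for the retrieval loss, a binary cross-entropy over sigmoid-squashed scores stands in for the scoring loss, and the cross-attention is single-head with no learned projections. All function names, the mean pooling, the parameters `w` and `b`, and the weights `retrieval_weight` and `scoring_weight` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_pool(tokens):
    # Collapse per-token vectors into one embedding (stand-in for an encoder output).
    return [sum(col) / len(tokens) for col in zip(*tokens)]

def cross_attention(query_tokens, doc_tokens):
    # Assumption: single-head attention without learned projections.
    # Each query token attends over the document tokens; results are mean-pooled.
    d = len(query_tokens[0])
    attended = []
    for q in query_tokens:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in doc_tokens])
        attended.append([sum(w * k[i] for w, k in zip(weights, doc_tokens))
                         for i in range(d)])
    return mean_pool(attended)

def scoring_head(query_tokens, doc_tokens, w, b):
    # Linear projection of the cross-attention output to one float.
    # Assumption: a sigmoid maps the value into (0, 1) so a fixed cutoff is meaningful.
    logit = dot(cross_attention(query_tokens, doc_tokens), w) + b
    return 1.0 / (1.0 + math.exp(-logit))

def retrieval_loss(queries, docs):
    # Assumed in-batch softmax loss: docs[i] is the positive for queries[i].
    q_emb = [mean_pool(q) for q in queries]
    d_emb = [mean_pool(d) for d in docs]
    total = 0.0
    for i, q in enumerate(q_emb):
        probs = softmax([dot(q, d) for d in d_emb])
        total -= math.log(probs[i])
    return total / len(queries)

def scoring_loss(queries, docs, w, b):
    # Assumed binary cross-entropy: label 1 for matching pairs, 0 otherwise.
    eps = 1e-12
    total, count = 0.0, 0
    for i, q in enumerate(queries):
        for j, d in enumerate(docs):
            s = scoring_head(q, d, w, b)
            y = 1.0 if i == j else 0.0
            total -= y * math.log(s + eps) + (1.0 - y) * math.log(1.0 - s + eps)
            count += 1
    return total / count

def combined_loss(queries, docs, w, b, retrieval_weight=1.0, scoring_weight=1.0):
    # Weighted sum of the two loss components.
    return (retrieval_weight * retrieval_loss(queries, docs)
            + scoring_weight * scoring_loss(queries, docs, w, b))

def rerank_and_threshold(query, candidates, w, b, cutoff):
    # Score each retrieved candidate, drop those below the cutoff, sort best first.
    scored = [(i, scoring_head(query, c, w, b)) for i, c in enumerate(candidates)]
    kept = [pair for pair in scored if pair[1] >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy batch: each query or document is a list of 2-dimensional token vectors.
queries = [[[1.0, 0.0], [0.8, 0.2]], [[0.0, 1.0], [0.1, 0.9]]]
docs = [[[0.9, 0.1]], [[0.2, 0.8], [0.0, 1.0]]]
w, b = [0.5, -0.5], 0.0  # hypothetical scoring-head parameters

loss = combined_loss(queries, docs, w, b, retrieval_weight=1.0, scoring_weight=0.5)
ranked = rerank_and_threshold(queries[0], docs, w, b, cutoff=0.5)
```

In this sketch the retrieval loss only shapes the embedding space used for candidate lookup, while the scoring loss trains a bounded per-pair score. That bounded score is what makes a single cutoff usable across queries, where raw embedding distances may not be comparable.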

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
