Abstract

Vector search effectiveness can suffer from semantic mismatches in which nuisance dimensions, such as formality or complexity, degrade retrieval relevance. Addressing these dimensions directly could require resource-intensive model retraining. Systems and methods are described for dynamically modifying vector embeddings using prisms, which are vector representations of particular semantic concepts. For example, a prism can be calculated as the vector difference between the embeddings of two data items that differ primarily along a given semantic axis. To refine a search, a query or document embedding can be projected onto the subspace orthogonal to the prism vector. This operation attenuates the influence of the semantic concept that the prism represents, which may improve search relevance by focusing comparisons on topical content, potentially without altering the underlying embedding model.
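
The projection described above can be illustrated with a minimal sketch. It assumes NumPy, a single prism vector, and cosine similarity as the comparison metric. The function names (`make_prism`, `apply_prism`, `cosine`) and the three-dimensional toy vectors are hypothetical stand-ins chosen for illustration, not part of the described system. Given a unit prism p and an embedding v, the modified embedding is v - (v · p) p.

```python
import numpy as np


def make_prism(emb_a, emb_b):
    """Return a unit vector along the difference of two embeddings.

    The two items are assumed to differ mainly along one semantic axis
    (for example, a formal and a casual phrasing of the same content).
    """
    diff = np.asarray(emb_a, dtype=float) - np.asarray(emb_b, dtype=float)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("embeddings are identical; no prism direction exists")
    return diff / norm


def apply_prism(embedding, prism):
    """Project an embedding onto the subspace orthogonal to a unit prism."""
    v = np.asarray(embedding, dtype=float)
    return v - np.dot(v, prism) * prism


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy vectors standing in for real model embeddings (values are made up).
# The third coordinate plays the role of the nuisance axis.
formal_item = np.array([1.0, 0.0, 1.0])
casual_item = np.array([1.0, 0.0, -1.0])
prism = make_prism(formal_item, casual_item)

query = np.array([0.9, 0.1, 0.8])      # topical content, formal style
document = np.array([0.9, 0.1, -0.8])  # same topic, casual style

before = cosine(query, document)
after = cosine(apply_prism(query, prism), apply_prism(document, prism))
print(f"similarity before: {before:.3f}, after: {after:.3f}")
```

In this toy case the query and document share topical coordinates and differ only along the nuisance axis, so their similarity rises once that axis is projected out. Real embeddings would rarely separate this cleanly, and the choice of the two items used to build the prism would largely determine how well it isolates the intended concept.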

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.