Defensive Publications Series

Low-Latency Pointwise Language Model Ranker with Token-Probability Normalization and Coordinated Batch Inference for Online Content Ranking

Anonymous

Abstract

Techniques are described for low-latency pointwise content ranking using a fine-tuned student language model. A ranking service constructs per-candidate language-model inputs that include user-context signals and candidate item context, and sends the inputs to an inference server in coordinated batches. The inference server returns output values only for the designated positive and negative label tokens, rather than for the full vocabulary, which reduces accelerator-to-host transfer. A continuous relevance score is computed using token-probability normalization, score = P(pos)/(P(pos)+P(neg)), yielding a calibrated value in [0,1] for thresholding and ranking. Batch processing may include reuse of cached key/value states for shared user-context prefixes. The techniques enable scoring hundreds of candidates within tight latency and serving-cost constraints for online ranking surfaces such as feeds and search results.
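The token-probability normalization in the abstract can be illustrated with a minimal sketch. It assumes the inference server returns one raw logit per label token for each candidate. The function names, the candidate identifiers, and the logit-pair layout are hypothetical and are not taken from the disclosure. Because both probabilities share the same softmax denominator, the ratio P(pos)/(P(pos)+P(neg)) reduces to a logistic function of the logit difference, so the full vocabulary distribution is not needed.

```python
import math
from typing import Dict, List, Tuple


def relevance_score(pos_logit: float, neg_logit: float) -> float:
    """Return P(pos) / (P(pos) + P(neg)) from the two label-token logits.

    Assumption: the inputs are raw (pre-softmax) logits for the positive
    and negative label tokens. The softmax normalizer cancels in the
    ratio, leaving sigmoid(pos_logit - neg_logit).
    """
    diff = pos_logit - neg_logit
    # Branch on the sign so math.exp never receives a large positive value.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def rank_candidates(
    label_logits: Dict[str, Tuple[float, float]]
) -> List[Tuple[str, float]]:
    """Score every candidate and sort by descending relevance.

    label_logits maps a hypothetical candidate id to its
    (pos_logit, neg_logit) pair, as returned for one batch.
    """
    scored = [
        (cand_id, relevance_score(pos, neg))
        for cand_id, (pos, neg) in label_logits.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    batch = {"item_a": (0.2, 1.1), "item_b": (2.5, -0.3), "item_c": (0.0, 0.0)}
    for cand_id, score in rank_candidates(batch):
        print(cand_id, round(score, 4))
```

If the server instead returned log-probabilities for the two tokens, the same function would apply unchanged, since the difference of log-probabilities equals the difference of logits.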

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

Anonymous, "Low-Latency Pointwise Language Model Ranker with Token-Probability Normalization and Coordinated Batch Inference for Online Content Ranking", Technical Disclosure Commons, (June 30, 2026)
https://www.tdcommons.org/dpubs_series/10700
