🧲 gte-rs: general text embedding and re-ranking in Rust

March 28, 2025

💬 Introduction

This crate provides simple pipelines that can be used out of the box to perform text embedding and re-ranking with ONNX models.

They are built with 🧩 orp (which relies on the 🦀 ort runtime), and use 🤗 tokenizers for token encoding.

🎓 Examples

[dependencies]
"gte-rs" = "0.9.1"
"orp" = "0.9.2"

Embedding:

let params = Parameters::default();
let pipeline = TextEmbeddingPipeline::new("gte-modernbert-base/tokenizer.json", &params)?;
let model = Model::new("gte-modernbert-base/model.onnx", RuntimeParameters::default())?;
            
let inputs = TextInput::from_str(&[
    "text content", 
    "some more content",
    //...
]);

let embeddings = model.inference(inputs, &pipeline, &params)?;
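Embeddings are typically compared with cosine similarity. The following is a minimal sketch of that comparison, not part of this crate's API: it assumes each embedding has already been copied into a plain `f32` slice (the concrete output type of `inference` is not shown here), and `cosine_similarity` is a hypothetical helper name.

```rust
/// Cosine similarity between two vectors of equal length.
/// Returns `None` if the lengths differ or either vector has zero norm.
/// Hypothetical helper: it operates on plain slices, not on the crate's output type.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

fn main() {
    // Toy vectors standing in for two embedding rows (assumed values).
    let first = [1.0_f32, 0.0, 1.0];
    let second = [1.0_f32, 1.0, 0.0];
    match cosine_similarity(&first, &second) {
        Some(score) => println!("cosine similarity: {score:.3}"),
        None => println!("vectors are not comparable"),
    }
}
```

Returning `Option` rather than panicking keeps mismatched dimensions and all-zero vectors explicit at the call site.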

Re-ranking:

let params = Parameters::default();
let pipeline = RerankingPipeline::new("gte-reranker-modernbert-base/tokenizer.json", &params)?;
let model = Model::new("gte-reranker-modernbert-base/model.onnx", RuntimeParameters::default())?;

let inputs = TextInput::from_str(&[
    ("one candidate", "query"),
    ("another candidate", "query"),
    //...
]);

let similarities = model.inference(inputs, &pipeline, &params)?;
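Once scores are available, re-ranking usually amounts to sorting the candidates by descending score. The sketch below illustrates that step only. It assumes one `f32` score per input pair, in input order, already copied into a plain slice; `rank_candidates` is a hypothetical helper and not part of this crate.

```rust
/// Pairs each candidate with its score and sorts by descending score.
/// If the two slices differ in length, the extra items are ignored.
/// Hypothetical helper: the crate's actual output type may differ.
fn rank_candidates<'a>(candidates: &[&'a str], scores: &[f32]) -> Vec<(&'a str, f32)> {
    let mut ranked: Vec<(&'a str, f32)> = candidates
        .iter()
        .copied()
        .zip(scores.iter().copied())
        .collect();
    // `total_cmp` defines a total order on f32, so NaN scores are ordered
    // instead of making the comparison fail.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}

fn main() {
    // Assumed scores for illustration only.
    let candidates = ["one candidate", "another candidate", "a third candidate"];
    let scores = [0.12_f32, 0.87, 0.45];
    for (text, score) in rank_candidates(&candidates, &scores) {
        println!("{score:.2}  {text}");
    }
}
```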

Please refer to the source code in examples for complete examples.

🧬 Models

Alibaba's gte-modernbert

For English-language text, the gte-modernbert-base model outperforms larger models on retrieval with only 149M parameters, and runs efficiently on both GPU and CPU. The gte-reranker-modernbert-base variant performs re-ranking with similar characteristics. This post provides interesting insights about them.

Other

This crate should work out of the box with other models, or be easy to adapt to them. Please report your own tests or requirements!

This project follows the same principles as the projects below. Refer to their documentation for more details:

  • 🌿 gline-rs: inference engine for GLiNER models
  • 🏷️ gliclass-rs: inference engine for GLiClass models