OpenRouter
© 2026 OpenRouter, Inc

Best Rerank Models for Search and RAG

Model rankings updated June 2026 based on real usage data.

Rerank models improve retrieval systems by reordering candidate documents, passages, or search results according to relevance. They are commonly used in semantic search, retrieval-augmented generation (RAG), recommendations, and knowledge-base applications where the first retrieval step returns too many possible matches. Compare top reranking models on OpenRouter to find the best fit for your search or RAG pipeline.
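The reordering step described above can be sketched in a few lines. This is a minimal illustration, not any provider's API: `overlap_score` is a hypothetical toy scorer (shared-word overlap) standing in for a real rerank model, and `rerank` and the sample candidates are names invented for this example. In a real pipeline the scoring function would call a rerank model instead.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, document: str) -> float:
    """Toy relevance score (assumption): fraction of query terms found in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and return the top_n by score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Python's sort is stable, so ties keep their first-stage retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Candidates as a first retrieval step might return them (illustrative data only).
candidates = [
    "Billing FAQ and invoice history",
    "How to reset your account password",
    "Password policy for enterprise accounts",
    "Release notes for the mobile app",
]

top = rerank("reset account password", candidates, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Keeping the scorer as a parameter means the first-stage retriever and the reranker can be swapped independently, which is the usual reason for splitting retrieval into two stages.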

Top Rerank Models on OpenRouter


NVIDIA: Llama Nemotron Rerank VL 1B V2 (free)

3K tokens

Llama Nemotron Rerank VL 1B V2 is a 1.7B-parameter multimodal reranking model from NVIDIA. It scores the relevance of document images and text against user queries and is designed for vision RAG pipelines that handle charts, tables, infographics, and mixed-media documents. It works as a cross-encoder that accepts a text query paired with an image, text, or combined document input, and it delivers approximately 6-7% recall improvements over embedding-only baselines on visual document retrieval benchmarks.

by nvidia | 10K context | $0/M input tokens | $0/M output tokens
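The recall improvement quoted for the NVIDIA model refers to how many relevant documents appear near the top of the ranking. A minimal sketch of recall@k follows, assuming that is the metric in question. The document IDs and orderings are made up purely for illustration and are not benchmark data, and `recall_at_k` is a hypothetical helper name.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k of a ranking."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


# Illustrative toy data (assumption): two relevant documents out of four candidates.
relevant = {"doc_a", "doc_d"}
embedding_order = ["doc_b", "doc_a", "doc_c", "doc_d"]  # first-stage ranking
reranked_order = ["doc_a", "doc_d", "doc_b", "doc_c"]   # ranking after a reranker

before = recall_at_k(embedding_order, relevant, k=2)
after = recall_at_k(reranked_order, relevant, k=2)
print(f"recall@2 before rerank: {before:.2f}, after rerank: {after:.2f}")
```

Comparing the same metric before and after reranking on your own queries is a simple way to check whether a reranker helps your pipeline.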

Cohere: Rerank 4 Pro

Rerank 4 Pro is Cohere's AI search foundation model for improving the relevance of information surfaced in search and RAG systems. It features a 32K context window, multilingual support across 100+ languages, no required data pre-processing, and state-of-the-art performance with low latency.

by cohere | 33K context | $0/M input tokens | $0/M output tokens

Cohere: Rerank 4 Fast

Rerank 4 Fast is Cohere's AI search foundation model for improving the relevance of information surfaced in search and RAG systems. It features a 32K context window, multilingual support across 100+ languages, no required data pre-processing, and high performance with the lowest latency.

by cohere | 33K context | $0/M input tokens | $0/M output tokens

Cohere: Rerank v3.5

Rerank v3.5 is designed to reorder search results for improved relevance. It supports reranking of multi-aspect and semi-structured data across 100+ languages, and it is well suited to refining results from semantic or keyword search pipelines.

by cohere | 4K context | $0/M input tokens | $0/M output tokens