OpenRouter
© 2026 OpenRouter, Inc

Best Rerank Models for Search and RAG

Model rankings updated April 2026 based on real usage data.

Rerank models improve retrieval systems by reordering candidate documents, passages, or search results according to relevance. They are commonly used in semantic search, retrieval-augmented generation (RAG), recommendations, and knowledge-base applications where the first retrieval step returns too many possible matches. Compare top reranking models on OpenRouter to find the best fit for your search or RAG pipeline.
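The reordering step described above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_relevance` is a hypothetical token-overlap scorer standing in for a real rerank model, and `rerank` is an illustrative helper, not an OpenRouter or Cohere API. In a real pipeline the scoring call would go to a hosted rerank model instead.

```python
from typing import Callable, List, Tuple


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenization (deliberately naive)."""
    return set(text.lower().split())


def toy_relevance(query: str, document: str) -> float:
    """Hypothetical stand-in for a rerank model.

    Returns the fraction of query tokens that also appear in the document.
    A real reranker would score the (query, document) pair with a model.
    """
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(document)) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = toy_relevance,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every first-stage candidate and return the top_n by relevance."""
    scored = [(score_fn(query, doc), idx, doc) for idx, doc in enumerate(candidates)]
    # Highest score first; ties keep the original retrieval order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(doc, score) for score, _, doc in scored[:top_n]]


# Candidates as they might come back from a first retrieval step.
candidates = [
    "How to reset your router password",
    "Reranking reorders search results by relevance",
    "Semantic search retrieves candidate passages",
    "Rerank models score each candidate passage against the query",
]
results = rerank("rerank models score candidate", candidates, top_n=2)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

Passing the scorer as `score_fn` keeps the pipeline shape fixed while the model behind it changes, which makes it easier to compare rerank models on the same candidate set.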

Top Rerank Models on OpenRouter

Cohere: Rerank 4 Pro

Cohere's AI search foundation model for enhancing the relevance of information surfaced within search and RAG systems. Features a 32K context window, multilingual support across 100+ languages, no data pre-processing required, and state-of-the-art performance with low latency.

by cohere · 33K context · $0/M input tokens · $0/M output tokens

Cohere: Rerank 4 Fast

Cohere's AI search foundation model for enhancing the relevance of information surfaced within search and RAG systems. Features a 32K context window, multilingual support across 100+ languages, no data pre-processing required, and high performance with the lowest latency.

by cohere · 33K context · $0/M input tokens · $0/M output tokens

Cohere: Rerank v3.5

Rerank v3.5 is designed to reorder search results for improved relevance. It supports multi-aspect and semi-structured data reranking over 100+ languages. Ideal for refining results from semantic or keyword search pipelines.

by cohere · 4K context · $0/M input tokens · $0/M output tokens
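Rerankers score text, so semi-structured records usually need to be flattened into a string before they are sent for reranking. The sketch below shows one possible approach, assuming a simple "key: value" layout with the most important fields first. The `record_to_text` helper and the field names are hypothetical; the provider's documentation should be checked for the format it recommends.

```python
from typing import Dict, List


def record_to_text(record: Dict[str, object], field_order: List[str]) -> str:
    """Flatten a record into 'key: value' lines in the given field order.

    Missing or empty fields are skipped so they do not use up context.
    """
    lines = []
    for field in field_order:
        if field in record and record[field] not in (None, ""):
            lines.append(f"{field}: {record[field]}")
    return "\n".join(lines)


# Hypothetical email record; 'cc' is empty and gets dropped.
email = {
    "from": "ana@example.com",
    "subject": "Q3 invoice",
    "body": "Invoice attached.",
    "cc": "",
}
text = record_to_text(email, ["subject", "from", "body", "cc"])
print(text)
```

Putting the most informative fields first matters most with a short context window, since anything beyond the limit may be truncated.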