Skip to content
/
OpenRouter
© 2026 OpenRouter, Inc

Product

  • Chat
  • Rankings
  • Apps
  • Models
  • Providers
  • Pricing
  • Enterprise
  • Labs

Company

  • About
  • Announcements
  • CareersHiring
  • Privacy
  • Terms of Service
  • Support
  • State of AI
  • Works With OR
  • Data

Developer

  • Documentation
  • API Reference
  • SDK
  • Status

Connect

  • Discord
  • GitHub
  • LinkedIn
  • X
  • YouTube
Favicon for perceptron

Perceptron: Perceptron Mk1

perceptron/perceptron-mk1

ChatCompare

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning.** It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding responses, either structured or natural language. It excels at video understanding tasks like video QA, summarization, and event detection. On image inputs, it advances point-by-example grounding from multimodal prompts, OCR and document parsing on messy real-world inputs, open vocabulary object detection and counting, and hand pose estimation.

Reasoning can be enabled per request to trade latency for deeper analysis on harder tasks. Structured annotations are emitted inline with text only when explicitly requested via the annotation_format parameter (pass "point", "box", or "polygon" for spatial localization on images, or "clip" (start/end timestamps) for temporal segments in video). Without annotation_format, the model returns natural-language text only.

Modalities

In / Out Price

$0.15 / $1.50per 1M

Context

33K

Overview
Playground
Providers
Performance
Pricing
Apps
Activity
Uptime
API

Effective Pricing for Perceptron Mk1

Actual cost per million tokens across providers over the past hour