Measured savings across 11 LLMs, from Claude Opus 4.7 to Gemini Flash. → See per-model data
Connect your client
MCP Gateway

46% smaller tool responses on average. One bearer token.

Point your MCP client at https://api.gotcontext.ai/mcp, add your key, and every tool response is compressed before your agent reads it. Works with Claude Code, Cursor, Codex, and Gemini CLI — 150+ tools.

Get your free API key

Free tier — 1,000 compressions/month, no credit card.

Read the docs →
Try it live

Compatible with

Claude Code
Cursor
Gemini CLI
Codex
Windsurf
VS Code
Step 1 — Ingest
Document Analysis
Text chunked, analyzed, and scored semantically. Compression graph assembled.
Step 2 — Rank
PageRank Scoring
Graph edges weighted by semantic similarity. Importance propagated through the network.
Step 3 — Extract
Ranked extract (not generated)
Top-ranked nodes form the compressed output. Every output token appears in your input. Target ratio controls fidelity.
Step 4 — Deliver
Return to MCP client
Compressed output returned to your AI tool, typically 46% smaller on production traffic (87.4% on benchmark peak). Expandable on demand.
COMPRESSION
Semantic Graph
PageRank-based importance scoring
46% live average
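
The four steps above can be sketched in a few lines of Python. This is a minimal illustration of PageRank-based extractive ranking, not the gateway's actual implementation: the sentence splitter, the bag-of-words cosine similarity (standing in for semantic embeddings), the damping factor, and the function names `split_chunks`, `cosine`, `pagerank`, and `compress` are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def split_chunks(text):
    # Naive sentence splitter; an assumed stand-in for the real chunker.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters
    # (assumed proxy for semantic similarity).
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    # Power iteration over a weighted similarity graph.
    # Chunks with no edges keep only the baseline (1 - damping) / n score.
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores


def compress(text, ratio=0.5):
    # Step 1: chunk. Step 2: build the graph and rank.
    # Step 3: keep the top-ranked chunks, in their original order.
    chunks = split_chunks(text)
    if not chunks:
        return ""
    bags = [Counter(re.findall(r"\w+", c.lower())) for c in chunks]
    n = len(chunks)
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    keep = max(1, round(n * ratio))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return " ".join(chunks[i] for i in sorted(ranked))
```

Because the output is assembled only from chunks of the input, every output sentence appears verbatim in the source text, which mirrors the "ranked extract (not generated)" property described in Step 3.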
terminal
live
# 1. Get a free API key at gotcontext.ai/sign-up
# 2. Point your AI tool at our MCP endpoint:
https://api.gotcontext.ai/mcp
Authorization: Bearer gc_your_key
# 3. Call tools naturally — Claude Code / Cursor / etc:
> ingest_context(file_id="api.md", content="...")
> read_skeleton(file_id="api.md", ratio=0.15)
# Result: 485 → 61 tokens (87.4% reduction)
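
For clients that do not manage the connection for you, the request shape can be sketched as follows. The endpoint, the `Bearer gc_your_key` header, and the `read_skeleton` arguments come from the terminal example above; the JSON-RPC `tools/call` envelope and the `Accept` header follow the public MCP specification. The helper name `build_tool_call` is hypothetical, and whether the gateway accepts a call without a prior `initialize` handshake is not stated on this page, so the sketch only builds the request and does not send it.

```python
import json
import urllib.request

MCP_URL = "https://api.gotcontext.ai/mcp"  # endpoint shown on this page


def build_tool_call(api_key, tool, arguments, request_id=1):
    # JSON-RPC 2.0 envelope for an MCP tool call (shape per the MCP spec).
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # MCP's HTTP transport may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Placeholder key; replace with a real one before sending.
req = build_tool_call(
    "gc_your_key", "read_skeleton", {"file_id": "api.md", "ratio": 0.15}
)
```

In practice an MCP client such as Claude Code or Cursor performs the handshake and issues these calls itself; you normally supply only the URL and the key.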
46%
Live avg compression
<90ms
p95 pipeline latency
150+
MCP Tools

Try it now

Paste any text and see how much you can save. No signup required.

Text is processed in memory. It is not stored, not logged with PII, and not used for training. Do not paste secrets or production credentials. Privacy details →

Pricing

Pay for tool calls. Compression is included.

Every MCP tool response is compressed before it returns to your agent, so each call delivers more context per token. The gain scales with the live compression ratio (see hero). Plans cover solo developers to enterprise teams.

Free

$0/month
Free tier
No credit card. Built for evaluation and side projects.
  • 1,000 compressions/month
  • 100KB max document
  • Standard compression
  • Command Palette & shortcuts
  • Activity Feed
  • Dark/Light theme
  • Community support
Start free — 1,000 compressions/mo

Pro

$49/mo
For individual developers
All 150+ MCP tools, accelerated compression, priority queue with 2 reserved compression slots.
  • 50,000 compressions/month
  • All 150+ MCP tools (incl. ACE, knowledge mgmt, multimodal)
  • Priority queue: 2 concurrent compression slots
  • 1MB max document
  • Accelerated compression (3-5x faster)
  • Queue Monitor (real-time SSE)
  • Usage analytics
  • Webhook Notifications
  • Priority support
Start Pro Plan

Business

$199/mo
Shared infra with SOC2-ready logging, OIDC/SSO, and DPA
Self-hosted Docker, OIDC/SSO, audit-log export for SOC2, SBERT embeddings, named Customer Success Manager.
  • 500,000+ compressions/month
  • All 150+ MCP tools
  • Priority queue: 8 concurrent compression slots
  • Self-hosted Docker (run in your VPC)
  • OIDC federation (Okta, Auth0, Azure AD)
  • Audit-log export (NDJSON/CSV) for SOC2
  • SBERT embeddings (higher fidelity than the default MiniLM tier)
  • SSO / SAML
  • Email support · SLA on request (custom MSA)
  • DPA / IP indemnity / custom MSA
Contact Sales
See full plan comparison (Free · Pro · Team · Enterprise)
How we measure

How the numbers are measured.

Two sources: the live API, and an open benchmark you can run. The hero number is a live rolling average from production traffic via the /v1/global-savings endpoint. The benchmark peak below is from the open-source harness; run it yourself and the numbers will be identical.
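
As a quick check on how a reduction percentage relates to token counts, the arithmetic can be written out. This sketch assumes the figures are computed as one minus the ratio of compressed to original tokens, which matches the 485 → 61 token example in the terminal snippet; the function names are hypothetical.

```python
def reduction_pct(original_tokens, compressed_tokens):
    # Share of tokens removed, as a percentage rounded to one decimal place.
    return round(100 * (1 - compressed_tokens / original_tokens), 1)


def tokens_after(original_tokens, reduction_percent):
    # Tokens remaining after a given percentage reduction.
    return round(original_tokens * (1 - reduction_percent / 100))


peak = reduction_pct(485, 61)      # the terminal example: 87.4
typical = tokens_after(1000, 46)   # a 1,000-token response at the 46% average: 540
```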

87.4%
Benchmark peak — large-document workloads
Peak on long-form documents (API specs, codebases, research papers). Live production average is 46% across all workload types. Both numbers are real; the difference is workload mix.
View public benchmarks
150+
MCP tools
Claude, GPT, Gemini, Codex.
<90ms
Pipeline latency
Ingest → compress → return, p95.
5
CLI integrations
Claude Code, Cursor, Gemini CLI, Codex, Windsurf.
Start today

Start free.

1,000 compressions/month, all 150+ tools, no credit card.