Now in public beta

Route every LLM call.
One key.

GateML sits between your app and every AI provider — OpenAI, Anthropic, Google. Automatic fallback, real-time cost tracking, and prompt versioning. Change one line of code.

one-line change

    # Before
    client = OpenAI(api_key="sk-proj-...")

    # After: all other code stays the same
    client = OpenAI(api_key="gml-sk-live_...", base_url="https://api.gateml.io/v1")

Compatible with: OpenAI SDK, LangChain, LlamaIndex, Anthropic SDK, Vercel AI SDK

The AI stack tax
every team is paying.

Every company is now building AI features. But the moment you go beyond a simple API call, you hit the same wall. Teams are solving it with duct tape: spreadsheets, Notion docs, random logging. It's the same chaos backend engineering had before observability tools existed.

  • Prompts are unversioned
  • No observability in production
  • Costs spiral until the bill hits
  • Testing is manual eyeballing
  • Locked into one provider's SDK
  • Multiple accounts & invoices
✗ Without GateML
Your App → OpenAI (own account), Anthropic (own account), Google (own account)
3 accounts · 3 invoices · 3 key stores
0 unified observability

vs

✓ With GateML
Your App → GateML Gateway (observe · route · version · test) → OpenAI, Anthropic, Google
1 account · 1 invoice · 1 key
Full observability across all providers

Provider accounts
  ✗ Without GateML: One account per provider (OpenAI, Anthropic, Google). Three sign-ups, three billing dashboards.
  ✓ With GateML: One GateML key. We route to every provider.
Billing
  ✗ Without GateML: Multiple invoices each month. Token costs arrive as a surprise.
  ✓ With GateML: One invoice. Real-time cost tracked per request.
Switching models
  ✗ Without GateML: New SDK, code changes, a PR, a deploy.
  ✓ With GateML: Change a model in the dashboard. No code.
Provider outages
  ✗ Without GateML: Your app returns errors. Users notice.
  ✓ With GateML: Auto-fallback to the next model. Invisible to users.
Prompt changes
  ✗ Without GateML: Edit code → PR → review → deploy → hope.
  ✓ With GateML: Version in the dashboard. Roll back in one click.
Debugging bad outputs
  ✗ Without GateML: Add more logging, redeploy, grep in prod.
  ✓ With GateML: Request inspector with inputs, outputs, tokens, and latency all in one place.
Testing output quality
  ✗ Without GateML: Manually read a sample and guess.
  ✓ With GateML: Assertion-based eval suite. Run it in CI before you ship.
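
To give a feel for what an assertion-based check could look like in CI, here is a minimal sketch in plain Python. It is not GateML's eval API: `fake_model`, `check_output`, and the assertion format are hypothetical names chosen for illustration, and `fake_model` stands in for a real call routed through the gateway.

```python
import json
from typing import Callable

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned reply."""
    return '{"sentiment": "positive", "confidence": 0.93}'

def is_json(text: str) -> bool:
    """True if the text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Each assertion is a (description, predicate) pair evaluated on the raw output.
Assertion = tuple[str, Callable[[str], bool]]

ASSERTIONS: list[Assertion] = [
    ("is valid JSON", is_json),
    ("has a sentiment field", lambda out: '"sentiment"' in out),
    ("stays under 200 characters", lambda out: len(out) < 200),
]

def check_output(output: str, assertions: list[Assertion]) -> list[str]:
    """Return the description of every assertion that failed."""
    return [name for name, predicate in assertions if not predicate(output)]

failures = check_output(fake_model("Classify: I love this product"), ASSERTIONS)
if failures:
    # In CI, a non-zero exit here would block the deploy.
    raise SystemExit(f"eval failed: {failures}")
```

Keeping each check as a named predicate means a failing run reports which expectation broke, which is easier to act on than a bare pass/fail.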
“It's the same chaos backend engineering had before observability tools existed — every team building the same duct-tape solution from scratch.”
GateML is to LLMs what Datadog is to infrastructure.

Everything your AI stack needs

From your first prototype to millions of requests, GateML grows with you.

Smart Routing
Route between GPT-4o, Claude, and Gemini with a simple config. No code changes. Rules live in the dashboard.
Automatic Fallback
On rate limits or 5xx errors, GateML retries with the next model, using exponential backoff. Your users never see a failure.
Real-time Observability
Every request logged. Track latency, tokens, costs, and error rates in real time. Drill down into any request.
Prompt Versioning
Version your prompts like code. Diff versions, compare outputs, and roll back in one click without a deploy.
Eval Testing
Write assertions against your prompts. Run them in CI before you ship. Catch regressions before your users do.
Test Mode
Test keys return synthetic responses — no real LLM calls, no charges. Develop and integrate with confidence.
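
For readers curious how fallback with exponential backoff behaves, the sketch below shows the general pattern in client-side Python. It illustrates the technique only and is not GateML's implementation: `ProviderError`, `call_with_fallback`, and the 0.5 second base delay are assumptions, and the two "models" are local stubs.

```python
import time
from typing import Callable, Sequence

class ProviderError(Exception):
    """Hypothetical error carrying an HTTP-style status code."""
    def __init__(self, status: int):
        super().__init__(f"provider returned {status}")
        self.status = status

def is_retryable(status: int) -> bool:
    """Rate limits (429) and server errors (5xx) are worth a fallback."""
    return status == 429 or 500 <= status <= 599

def call_with_fallback(
    models: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, str]:
    """Try each (name, call) pair in order, doubling the wait between attempts."""
    last_error = None
    for attempt, (name, call) in enumerate(models):
        try:
            return name, call(prompt)
        except ProviderError as err:
            if not is_retryable(err.status):
                raise  # e.g. 400 or 401: another model will not fix a bad request
            last_error = err
            if attempt < len(models) - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise RuntimeError("all models failed") from last_error

# Stub providers: the first is rate limited, the second answers.
def rate_limited(prompt: str) -> str:
    raise ProviderError(429)

def healthy(prompt: str) -> str:
    return f"echo: {prompt}"

delays: list[float] = []  # record waits instead of actually sleeping
used, reply = call_with_fallback(
    [("gpt-4o", rate_limited), ("claude", healthy)], "Hello!", sleep=delays.append
)
```

Passing `sleep` as a parameter keeps the example fast to run and easy to test; a gateway would apply the same idea on the server side so that callers see a single successful response.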

Up and running in 3 minutes

No infrastructure to manage. No new SDK to learn. Just a base URL change.

1. Sign up free
   Create an account and get your test and live API keys in 30 seconds. No credit card required.
2. Add your provider keys
   Paste your OpenAI, Anthropic, or Google API keys. We store them encrypted with AES-256.
3. Change one line
   Point your existing code at api.gateml.io/v1. Everything else (models, messages, streaming) stays the same.
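
If you would rather flip the base URL through configuration than edit code, one possible approach is sketched below. The environment variable name `GATEML_API_KEY` and the helper `client_kwargs` are assumptions made for this example, not part of GateML; `OPENAI_API_KEY` is the variable the OpenAI SDK conventionally reads.

```python
import os
from typing import Mapping

GATEML_BASE_URL = "https://api.gateml.io/v1"

def client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for OpenAI(...), preferring the gateway when a GateML key is set."""
    gateml_key = env.get("GATEML_API_KEY")  # hypothetical variable name
    if gateml_key:
        return {"api_key": gateml_key, "base_url": GATEML_BASE_URL}
    # No gateway key configured: call the provider directly, as before.
    return {"api_key": env["OPENAI_API_KEY"]}

# Usage (requires the openai package):
#   from openai import OpenAI
#   client = OpenAI(**client_kwargs())
```

With this shape, moving an environment onto the gateway (or back off it) is a matter of setting or unsetting one variable.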

Works with your language

Official SDKs for every major stack. Or use any OpenAI-compatible library — just swap the base URL.

install

    $ pip install gateml openai

    from gateml import GateML

    # Drop-in replacement for OpenAI()
    client = GateML(api_key="gml-sk-live_...")

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

    # Or use your existing OpenAI code unchanged:
    from openai import OpenAI
    from gateml import get_config

    client = OpenAI(**get_config(api_key="gml-sk-live_..."))

Simple, transparent pricing

Start free with real LLM access. Scale as you grow. No surprise bills.

Billing: Monthly or Annual (save 32%)
Starter
For developers and side projects.
Free forever
  • 1,000 live requests / month
  • 5 requests / minute (burst)
  • Unlimited test mode requests
  • BYOK — bring any OpenAI / Anthropic / Google key
  • Managed Keys: STANDARD models (Haiku, GPT-4o mini, Flash)
  • 100K managed tokens / month · 50 managed req / day
  • Basic observability dashboard
  • 1 API key pair · 7-day log retention
  • BYOK overage at $0.002 / req (routing fee only)
Get started

No credit card required

Enterprise
Unlimited scale, SLA, and dedicated support.
Custom
  • Unlimited requests · 500 req / min burst
  • Managed Keys: ALL models (Opus, GPT-4 Turbo, o1)
  • Unlimited managed tokens / month
  • Custom rate limits & SLA
  • SSO / SAML
  • Dedicated Slack channel
  • Custom contract & invoicing

Response within 1 business day

Managed Keys
Skip provider key setup entirely. Enable GateML Managed Keys and GateML routes through its own accounts, billed per token at provider cost + 20% markup. Model access scales with your plan: STANDARD models on Starter, PREMIUM on Pro, all models on Enterprise. Monthly token budgets apply. BYOK always bypasses tier limits: your own keys work with any model on any plan.

Pay-as-you-go
Exhausted your monthly request quota on BYOK calls? Enable pay-as-you-go and keep routing at $0.0020/req (Starter) or $0.0010/req (Pro). This is a platform routing fee only; you continue to pay your provider directly for tokens as normal. Managed Key calls are never subject to this fee; they follow their own per-token billing.
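
To make the two billing paths concrete, here is a small worked calculation using only the figures quoted on this page. The function names are hypothetical and the request counts are made-up inputs; actual invoices may round or aggregate differently.

```python
from decimal import Decimal

# Figures quoted on this page.
ROUTING_FEE = {"starter": Decimal("0.0020"), "pro": Decimal("0.0010")}  # USD per BYOK request over quota
MANAGED_MARKUP = Decimal("0.20")  # Managed Keys: provider cost + 20%

def byok_overage(plan: str, requests: int, quota: int) -> Decimal:
    """Routing fee for BYOK requests beyond the monthly quota (tokens are billed by the provider)."""
    extra = max(0, requests - quota)
    return extra * ROUTING_FEE[plan]

def managed_cost(provider_cost: Decimal) -> Decimal:
    """Managed Keys charge: provider token cost plus the markup, with no routing fee."""
    return provider_cost * (1 + MANAGED_MARKUP)

# Starter includes 1,000 live requests per month, so 1,500 requests means
# 500 extra requests at $0.0020 each, i.e. $1 in routing fees.
starter_fee = byok_overage("starter", 1_500, 1_000)

# $10 of provider token cost through Managed Keys is billed at $12.
managed_bill = managed_cost(Decimal("10.00"))

print(f"BYOK overage: ${starter_fee:.2f}, Managed Keys: ${managed_bill:.2f}")
```

`Decimal` is used instead of floats so that small per-request fees add up exactly.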
