Now in public beta

Route every LLM call.
One key.

GateML sits between your app and every AI provider — OpenAI, Anthropic, Google. Automatic fallback, real-time cost tracking, and prompt versioning. Change one line of code.

Get your API key free →
Read the docs

one-line change

from openai import OpenAI

# Before
client = OpenAI(api_key="sk-proj-...")

# After: all other code stays the same
client = OpenAI(api_key="gml-sk-live_...", base_url="https://api.gateml.io/v1")

Compatible with: OpenAI SDK · LangChain · LlamaIndex · Anthropic SDK · Vercel AI SDK

The Problem

The AI stack tax
every team is paying.

Every company is now building AI features. But the moment you go beyond a simple API call, you hit the same wall. Teams are solving it with duct tape: spreadsheets, Notion docs, random logging. It's the same chaos backend engineering had before observability tools existed.

✗ Prompts are unversioned
✗ No observability in production
✗ Costs spiral until the bill hits
✗ Testing is manual eyeballing
✗ Locked into one provider's SDK
✗ Multiple accounts & invoices

✗ Without GateML

Your App → OpenAI (own account) · Anthropic (own account) · Google (own account)

3 accounts · 3 invoices · 3 key stores
0 unified observability

✓ With GateML

Your App → GateML Gateway (observe · route · version · test) → OpenAI · Anthropic · Google

1 account · 1 invoice · 1 key
full observability across all providers

Provider accounts

✗ Without GateML: One account per provider (OpenAI, Anthropic, Google). Three sign-ups, three billing dashboards.
✓ With GateML: One GateML key. We route to every provider.

Billing

✗ Without GateML: Multiple invoices each month. Token costs arrive as a surprise.
✓ With GateML: One invoice. Real-time cost tracked per request.

Switching models

✗ Without GateML: New SDK, code changes, a PR, a deploy.
✓ With GateML: Change a model in the dashboard. No code.

Provider outages

✗ Without GateML: Your app returns errors. Users notice.
✓ With GateML: Auto-fallback to the next model. Invisible to users.

Prompt changes

✗ Without GateML: Edit code → PR → review → deploy → hope.
✓ With GateML: Version in the dashboard. Roll back in one click.

Debugging bad outputs

✗ Without GateML: Add more logging, redeploy, grep in prod.
✓ With GateML: Request inspector with inputs, outputs, tokens, and latency in one place.

Testing output quality

✗ Without GateML: Manually read a sample and guess.
✓ With GateML: Assertion-based eval suite. Run it in CI before you ship.

“It's the same chaos backend engineering had before observability tools existed — every team building the same duct-tape solution from scratch.”

GateML is to LLMs what Datadog is to infrastructure.

Features

Everything your AI stack needs

From your first prototype to millions of requests, GateML grows with you.

Smart Routing

Route between GPT-4o, Claude, and Gemini with a simple config. No code changes. Rules live in the dashboard.

Automatic Fallback

On rate-limits or 5xx errors, GateML retries the next model with exponential backoff. Your users never see a failure.
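As a rough illustration of the fallback behavior described above, here is a minimal sketch in Python. It is not GateML's implementation: `RetryableError`, `call_with_fallback`, the `send` callable, the model labels, and the 0.5-second base delay are all hypothetical names and values chosen for the example.

```python
import time


class RetryableError(Exception):
    """Stand-in for a rate-limit (HTTP 429) or 5xx response from a provider."""


def call_with_fallback(models, send, base_delay=0.5, sleep=time.sleep):
    """Try each model in order, waiting longer before each fallback attempt.

    `send(model)` is a hypothetical callable that performs the request and
    raises RetryableError on a rate limit or server error.
    """
    last_error = None
    for attempt, model in enumerate(models):
        try:
            return model, send(model)
        except RetryableError as err:
            last_error = err
            if attempt < len(models) - 1:
                # Exponential backoff: 0.5s, 1s, 2s, ... before the next model.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError("every model in the fallback chain failed") from last_error


# Demo with a fake sender: the first two models fail, the third answers.
def fake_send(model):
    if model != "model-c":
        raise RetryableError(f"{model} unavailable")
    return "Hello!"


delays = []  # record the waits instead of actually sleeping
used_model, reply = call_with_fallback(
    ["model-a", "model-b", "model-c"], fake_send, sleep=delays.append
)
print(used_model, reply, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A production gateway would likely also cap the total delay and add jitter.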

Real-time Observability

Every request logged. Track latency, tokens, costs, and error rates in real time. Drill down into any request.
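To make per-request cost tracking concrete, here is a small sketch of the underlying arithmetic. The `RequestLog` fields, the `example-model` name, and the per-million-token prices are assumptions for illustration only. Real provider prices differ and change over time, and this is not GateML's log schema.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens (not real provider pricing).
PRICES = {"example-model": {"input": 2.50, "output": 10.00}}


@dataclass
class RequestLog:
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float


def request_cost(log, prices=PRICES):
    """Cost of one request: tokens times the per-token price for each direction."""
    price = prices[log.model]
    return (
        log.input_tokens * price["input"] + log.output_tokens * price["output"]
    ) / 1_000_000


logs = [
    RequestLog("example-model", 1200, 300, 840.0),
    RequestLog("example-model", 800, 200, 610.0),
]
total_cost = sum(request_cost(log) for log in logs)
avg_latency = sum(log.latency_ms for log in logs) / len(logs)
print(f"total=${total_cost:.4f} avg_latency={avg_latency:.0f}ms")
```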

Prompt Versioning

Version your prompts like code. Diff versions, compare outputs, and roll back in one click without a deploy.
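The idea of diffing and rolling back prompt versions can be sketched with Python's standard `difflib` module. The in-memory `versions` list, the `diff_versions` and `roll_back` helpers, and the sample prompts are hypothetical; the dashboard's actual storage and API are not described here.

```python
import difflib

# Hypothetical prompt history, oldest first.
versions = [
    "You are a helpful assistant.\nAnswer briefly.\n",
    "You are a helpful assistant.\nAnswer in at most two sentences.\n",
]
active = len(versions) - 1  # index of the version currently served


def diff_versions(old, new):
    """Return a unified diff between two stored prompt versions."""
    return "".join(
        difflib.unified_diff(
            versions[old].splitlines(keepends=True),
            versions[new].splitlines(keepends=True),
            fromfile=f"v{old + 1}",
            tofile=f"v{new + 1}",
        )
    )


def roll_back(current):
    """Point the active version at its predecessor; history is kept intact."""
    return max(0, current - 1)


diff_text = diff_versions(0, 1)
print(diff_text)
active = roll_back(active)
```

Because a rollback only moves a pointer, no prompt text is lost and no application deploy is involved in this model.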

Eval Testing

Write assertions against your prompts. Run them in CI before you ship. Catch regressions before your users do.
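An assertion-based eval can be as simple as a list of named checks run against model output. The sketch below is one possible shape, not GateML's eval format: `run_evals`, the case structure, and `fake_generate` (a stand-in for a real model call) are all assumptions.

```python
def run_evals(generate, cases):
    """Run every assertion for every case; return a list of (input, check name) failures."""
    failures = []
    for case in cases:
        output = generate(case["input"])
        for name, check in case["assertions"]:
            if not check(output):
                failures.append((case["input"], name))
    return failures


def fake_generate(prompt):
    """Stand-in for a real model call so the sketch runs offline."""
    return "Paris is the capital of France."


cases = [
    {
        "input": "What is the capital of France?",
        "assertions": [
            ("mentions Paris", lambda out: "Paris" in out),
            ("under 100 characters", lambda out: len(out) < 100),
        ],
    }
]

failures = run_evals(fake_generate, cases)
print("failures:", failures)
```

In CI, a wrapper script could exit with a non-zero status whenever `failures` is non-empty, which blocks the merge on a regression.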

Test Mode

Test keys return synthetic responses — no real LLM calls, no charges. Develop and integrate with confidence.

How it works

Up and running in 3 minutes

No infrastructure to manage. No new SDK to learn. Just a base URL change.

Create an account and get your test and live API keys in 30 seconds. No credit card required.

Add your provider keys

Paste your OpenAI, Anthropic, or Google API keys. We store them encrypted with AES-256.

Change one line

Point your existing code at api.gateml.io/v1. Everything else — models, messages, streaming — stays the same.

SDKs

Works with your language

Official SDKs for every major stack. Or use any OpenAI-compatible library — just swap the base URL.

install

$ pip install gateml openai

from gateml import GateML

# Drop-in replacement for OpenAI()
client = GateML(api_key="gml-sk-live_...")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# Or use your existing OpenAI code unchanged:
from openai import OpenAI
from gateml import get_config

client = OpenAI(**get_config(api_key="gml-sk-live_..."))

Pricing

Simple, transparent pricing

Start free with real LLM access. Scale as you grow. No surprise bills.

Monthly · Annual (save 32%)

Starter

For developers and side projects.

Free forever

1,000 live requests / month
5 requests / minute (burst)
Unlimited test mode requests
BYOK — bring any OpenAI / Anthropic / Google key
Managed Keys: STANDARD models (Haiku, GPT-4o mini, Flash)
100K managed tokens / month · 50 managed req / day
Basic observability dashboard
1 API key pair · 7-day log retention
BYOK overage at $0.002 / req (routing fee only)

Get started

No credit card required

Pro

For teams running AI in production.

$19 / mo

30,000 live requests / month
60 requests / minute (burst)
BYOK — all providers, all models, unrestricted
Managed Keys: PREMIUM models (GPT-4o, Sonnet, Gemini Pro)
5M managed tokens / month · 2,000 managed req / day
Full observability + cost tracking
Prompt library + version diffs
Eval testing suite · Fallback chain config
5 API key pairs · 90-day log retention
BYOK overage at $0.001 / req (routing fee only)
Email support

Start free trial

7-day free trial · cancel anytime

Enterprise

Unlimited scale, SLA, and dedicated support.

Custom

Unlimited requests · 500 req / min burst
Managed Keys: ALL models (Opus, GPT-4 Turbo, o1)
Unlimited managed tokens / month
Custom rate limits & SLA
SSO / SAML
Dedicated Slack channel
Custom contract & invoicing

Response within 1 business day

Managed Keys

Skip provider key setup entirely. Enable GateML Managed Keys and GateML routes through its own accounts, billed per token at provider cost + 20% markup. Model access scales with your plan: STANDARD models on Starter, PREMIUM on Pro, all models on Enterprise. Monthly token budgets apply. BYOK always bypasses tier limits: your own keys work with any model on any plan.
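As a worked example of the Managed Keys markup, the sketch below applies the 20% figure from the text to a provider cost. The `managed_cost` helper and the $10.00 provider cost are hypothetical; rounding rules and invoice granularity are not specified here.

```python
from decimal import Decimal

MARKUP = Decimal("0.20")  # from the pricing text: provider cost + 20%


def managed_cost(provider_cost_usd):
    """Billed amount for Managed Key usage: provider cost plus the markup."""
    return Decimal(provider_cost_usd) * (1 + MARKUP)


# Hypothetical month: the provider would have charged $10.00 for the tokens used.
billed = managed_cost("10.00")
print(f"${billed:.2f}")
```

`Decimal` is used instead of floats so that currency amounts do not pick up binary rounding error.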

Pay-as-you-go

Exhausted your monthly request quota on BYOK calls? Enable pay-as-you-go and keep routing at $0.002/req (Starter) or $0.001/req (Pro). This is a platform routing fee only: you continue to pay your provider directly for tokens as normal. Managed Key calls are never subject to this fee; they follow their own per-token billing.
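The overage arithmetic can be sketched as follows, using the quotas and routing fees listed on the plan cards. The `routing_overage` helper is hypothetical, and the sketch assumes every counted request is a BYOK call. How mixed BYOK and Managed Key traffic counts against the quota is not specified here.

```python
from decimal import Decimal

# Included live requests and BYOK overage fee per request, from the plan cards.
PLANS = {
    "starter": {"included": 1_000, "overage_fee": Decimal("0.002")},
    "pro": {"included": 30_000, "overage_fee": Decimal("0.001")},
}


def routing_overage(plan, byok_requests):
    """Routing fee in USD for BYOK requests beyond the plan's monthly quota."""
    limits = PLANS[plan]
    extra = max(0, byok_requests - limits["included"])
    return extra * limits["overage_fee"]


# Hypothetical month: 45,000 BYOK requests on Pro means 15,000 over quota.
pro_fee = routing_overage("pro", 45_000)
starter_fee = routing_overage("starter", 1_500)
print(f"pro=${pro_fee:.2f} starter=${starter_fee:.2f}")
```

This covers only the platform routing fee. Token costs are still paid to the provider directly and are not part of the calculation.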

