Access Claude models through a gateway API compatible with both Anthropic and OpenAI formats. Drop-in replacement for your existing code.
CheapLM is an API gateway for Claude models. It proxies your requests to upstream providers, handling authentication, credit tracking, and usage logging automatically.
POST /v1/messages works with any Anthropic SDK
POST /v1/chat/completions works with any OpenAI SDK
Dedicated /cursor/v1 endpoints with agent mode
Opus 4.6, Sonnet 4.5, Haiku 3.5, and more
Sign up and create an API key from your dashboard. Keys follow the format:
sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Use the official OpenAI or Anthropic SDK; both are fully compatible.

pip install openai

pip install anthropic

from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

import anthropic
client = anthropic.Anthropic(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(message.content[0].text)

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "sk-YOUR_KEY",
  baseURL: "https://api.cheaplm.xyz/api/v1",
});
const response = await client.chat.completions.create({
  model: "claude-sonnet-4",
  messages: [
    { role: "user", content: "Hello!" },
  ],
});

console.log(response.choices[0].message.content);

curl https://api.cheaplm.xyz/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

curl https://api.cheaplm.xyz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

All API requests require authentication via an API key. You can pass it in either of two ways:
x-api-key: sk-YOUR_KEY

Authorization: Bearer sk-YOUR_KEY

All endpoints are served from https://api.cheaplm.xyz.

| Endpoint | Description |
|---|---|
| /v1/messages | Anthropic Messages API format |
| /v1/chat/completions | OpenAI Chat Completions format |
| /cursor/v1/chat/completions | Cursor IDE with agent mode support |
| /v1/models | List available models (OpenAI format) |
| /v1/models/info | Detailed model information with capabilities |
| /v1/profile | Check account balance and API key info |
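Since either header style works, a small helper can build the right headers for whichever SDK convention you follow. This is an illustrative sketch, not part of any SDK; the function name `auth_headers` is ours:

```python
# Build request headers for CheapLM using either supported auth style.
# The helper name and structure are illustrative, not part of the API.

def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return auth headers in Anthropic ("anthropic") or OpenAI ("bearer") style."""
    if style == "anthropic":
        # Anthropic-style: x-api-key plus the required version header
        return {
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        }
    # OpenAI-style: standard Bearer token
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

print(auth_headers("sk-YOUR_KEY")["Authorization"])  # Bearer sk-YOUR_KEY
```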
| Model | Max Input | Max Output | Cost/1K | Features |
|---|---|---|---|---|
| claude-opus-4.6 | 200K | 32K | 0.22 credits | Thinking, Vision |
| claude-opus-4.5 | 200K | 32K | 0.22 credits | Thinking, Vision |
| claude-opus-4-20250514 | 200K | 32K | 0.22 credits | Thinking, Vision |
| claude-sonnet-4-5-20250514 | 200K | 8K | 0.13 credits | Thinking, Vision |
| claude-sonnet-4-20250514 | 200K | 16K | 0.13 credits | Thinking, Vision |
| claude-haiku-3-5-20241022 | 200K | 8K | 0.04 credits | Vision |
💡 Note: 1 credit ≈ 10,000 tokens. Model aliases like claude-sonnet-4 are automatically normalized.
/v1/messages

Create a message using the Anthropic Messages API format.

Request headers:

x-api-key: sk-YOUR_KEY
anthropic-version: 2023-06-01
Content-Type: application/json

Request body:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    {
      "role": "user",
      "content": "Explain quantum computing."
    }
  ],
  "temperature": 0.7,
  "stream": false
}

Response:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Quantum computing uses quantum bits..."
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}

All Claude models support vision. Send images as base64-encoded data:
{
  "role": "user",
  "content": [
    {"type": "text", "text": "Describe this image:"},
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/png",
        "data": "iVBORw0KGgoAAAANS..."
      }
    }
  ]
}

/v1/chat/completions

Create a chat completion using the OpenAI format. Works with any OpenAI SDK.

Request body:
{
  "model": "claude-sonnet-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7
}

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "claude-sonnet-4-20250514",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35
  }
}

/cursor/v1/chat/completions

Dedicated endpoint for Cursor IDE with agent mode support, tool calling, and task completion detection.
API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/cursor/v1

Example request:

{
  "model": "claude-sonnet-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant."
    },
    {
      "role": "user",
      "content": "Refactor this function to be more efficient"
    }
  ],
  "max_tokens": 2048,
  "temperature": 0.7
}

Enable streaming to receive responses token-by-token using Server-Sent Events (SSE). Set "stream": true in your request body.
from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
stream = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a story."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

curl https://api.cheaplm.xyz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  --no-buffer \
  -d '{
    "model": "claude-sonnet-4",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku."}
    ]
  }'

Install the OpenAI Python SDK:

pip install openai

from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

Install the Anthropic Python SDK:

pip install anthropic

import anthropic
client = anthropic.Anthropic(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(message.content[0].text)

from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
stream = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a story."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Install the OpenAI TypeScript SDK:

npm install openai

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "sk-YOUR_KEY",
  baseURL: "https://api.cheaplm.xyz/api/v1",
});
const response = await client.chat.completions.create({
  model: "claude-sonnet-4",
  messages: [
    { role: "user", content: "Hello!" },
  ],
});

console.log(response.choices[0].message.content);

curl https://api.cheaplm.xyz/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

curl https://api.cheaplm.xyz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

CheapLM works with any tool that supports the Anthropic or OpenAI API format.

API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/cursor/v1

💡 Cursor uses dedicated /cursor/v1 endpoints with agent mode support.
Add to your ~/.continue/docs.yaml:
name: Local Config
version: 1.0.0
schema: v1
models:
  - name: Claude 4.6 Opus
    provider: anthropic
    model: claude-opus-4-6
    apiKey: sk-YOUR_KEY
    apiBase: https://api.cheaplm.xyz/api/v1
    roles:
      - chat
      - edit
      - apply
      - autocomplete
      - embed

API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/api/v1

Set environment variables before running:
export ANTHROPIC_BASE_URL="https://api.cheaplm.xyz"
export ANTHROPIC_API_KEY="sk-YOUR_KEY"
claude

Set environment variables before running:
export ANTHROPIC_API_KEY="sk-YOUR_KEY"
export ANTHROPIC_BASE_URL="https://api.cheaplm.xyz/api/v1"
opencode

API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/api/v1

For any tool that supports the Anthropic API format, set:
Base URL: https://api.cheaplm.xyz/api/v1
API Key: sk-YOUR_KEY

For any tool that supports custom OpenAI endpoints, set:

Base URL: https://api.cheaplm.xyz/api/v1
API Key: sk-YOUR_KEY

💡 Use model names like claude-sonnet-4, claude-opus-4.6, etc.
| Status | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | Malformed request, missing required fields, or invalid model |
| 401 | authentication_error | Missing or invalid API key |
| 402 | insufficient_credits | Not enough credits to complete the request |
| 403 | permission_error | Key is frozen or insufficient balance |
| 429 | rate_limit_error | Rate limit exceeded |
| 500 | api_error | Internal server error |
| 502 | api_error | Upstream API error |
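In application code, these statuses map naturally onto a retry-or-surface decision: 429/500/502 are typically transient, while the 4xx codes point at the request, key, or balance. A minimal sketch of that classification (the function name and the grouping are ours, not part of the API):

```python
# Decide how to react to a CheapLM HTTP error status.
# 429/500/502 are worth retrying; the rest should be surfaced to the caller.

RETRYABLE = {429, 500, 502}

def classify_error(status: int) -> str:
    """Map an HTTP status from the table above to a suggested action."""
    if status in RETRYABLE:
        return "retry"           # transient: rate limit or upstream hiccup
    if status in (401, 402, 403):
        return "check_account"   # key, credits, or permissions problem
    return "fix_request"         # 400 and other client-side errors

print(classify_error(429))  # retry
```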
Example error responses:

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Missing API key. Provide via x-api-key header or Authorization: Bearer."
  }
}

{
  "type": "error",
  "error": {
    "type": "permission_error",
    "message": "Insufficient balance. Please top up your account."
  }
}

| Model | Credits per 1K tokens |
|---|---|
| Claude Opus 4.6 / 4.5 / 4 | 0.22 |
| Claude Sonnet 4.5 / 4 | 0.13 |
| Claude 3.5 Haiku | 0.04 |
💡 Note: 1 credit ≈ 10,000 tokens (varies by model)
| Task | Model | Tokens | Credits Used |
|---|---|---|---|
| Short chat | Haiku 3.5 | 1,000 | 0.04 |
| Code review | Sonnet 4 | 5,000 | 0.65 |
| Complex analysis | Opus 4.6 | 10,000 | 2.20 |
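The example costs above follow directly from the per-model rates via credits = (total_tokens / 1000) × cost per 1K tokens. A small sketch of that arithmetic (the function name and rate-table keys are ours, for illustration):

```python
# Estimate credit cost: credits = (total_tokens / 1000) * cost_per_1k.
# Rates mirror the pricing table above; keys are illustrative aliases.

RATES = {
    "claude-opus-4.6": 0.22,
    "claude-sonnet-4": 0.13,
    "claude-haiku-3-5": 0.04,
}

def estimate_credits(model: str, total_tokens: int) -> float:
    """Return the credits a request of total_tokens would consume."""
    return round(total_tokens / 1000 * RATES[model], 4)

print(estimate_credits("claude-sonnet-4", 5000))  # 0.65
```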
You can check your remaining balance at any time via the /v1/profile endpoint. Credits are deducted per request:

credits_used = (total_tokens / 1000) × cost_per_1k_tokens

curl https://api.cheaplm.xyz/api/v1/profile \
  -H "Authorization: Bearer sk-YOUR_KEY"

Response:
{
  "username": "your_username",
  "credits": 1000.5,
  "key_name": "Production Server"
}

❌ Bad
"I would really appreciate it if you could please help me by writing a Python function that takes a list of numbers as input and returns the sum of all the even numbers in that list."
✅ Good
"Write a Python function that returns the sum of even numbers in a list."
Don't set max_tokens higher than needed. For short answers, use 256 or 512 instead of 4096.
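For instance, a request for a short answer might cap output like this (the model choice and values are illustrative):

```python
# A compact request payload: a small max_tokens keeps latency
# and credit usage down when you only need a short reply.

payload = {
    "model": "claude-haiku-3-5-20241022",
    "max_tokens": 256,  # enough for a short answer; not 4096
    "messages": [
        {"role": "user", "content": "In one sentence, what is an API gateway?"}
    ],
}

print(payload["max_tokens"])  # 256
```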
{
  "system": "You are a code reviewer. Respond with brief, actionable feedback only.",
  "messages": [{"role": "user", "content": "Review this: ..."}]
}

Implement retry logic with exponential backoff:
import time
from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)

def call_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="claude-sonnet-4",
                messages=messages
            )
        except Exception:
            if attempt == max_retries - 1:
                raise
            wait = min(2 ** attempt * 2, 30)
            time.sleep(wait)

Create your account and start building with Claude AI today.