Access Claude models through a gateway API compatible with both Anthropic and OpenAI formats. Drop-in replacement for your existing code.
CheapLM is an API gateway for Claude models. It proxies your requests to upstream providers, handling authentication, credit tracking, and usage logging automatically.
POST /v1/messages works with any Anthropic SDK
POST /v1/chat/completions works with any OpenAI SDK
Dedicated /cursor/v1 endpoints with agent mode
Opus 4.6, Sonnet 4.5, Haiku 3.5, and more
Sign up and create an API key from your dashboard. Keys follow the format:
sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Use the official OpenAI or Anthropic SDK; both are fully compatible.

pip install openai

pip install anthropic

from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

import anthropic
client = anthropic.Anthropic(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(message.content[0].text)

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "sk-YOUR_KEY",
  baseURL: "https://api.cheaplm.xyz/api/v1",
});
const response = await client.chat.completions.create({
  model: "claude-sonnet-4",
  messages: [
    { role: "user", content: "Hello!" },
  ],
});

console.log(response.choices[0].message.content);

curl https://api.cheaplm.xyz/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

curl https://api.cheaplm.xyz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

All API requests require authentication via an API key. You can pass it in either of two ways:
x-api-key: sk-YOUR_KEY

Authorization: Bearer sk-YOUR_KEY

All endpoints are served from https://api.cheaplm.xyz.

| Endpoint | Description |
|---|---|
| /v1/messages | Anthropic Messages API format |
| /v1/chat/completions | OpenAI Chat Completions format |
| /cursor/v1/chat/completions | Cursor IDE with agent mode support |
| /v1/models | List available models (OpenAI format) |
| /v1/models/info | Detailed model information with capabilities |
| /v1/profile | Check account balance and API key info |
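Since either header style works, a small helper can build the right headers for whichever SDK convention you follow. This is an illustrative sketch, not part of any SDK; the function name `auth_headers` is ours:

```python
# Build request headers for CheapLM using either supported auth style.
# The helper name and structure are illustrative, not part of the API.

def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return auth headers in Anthropic ("anthropic") or OpenAI ("bearer") style."""
    if style == "anthropic":
        # Anthropic-style: x-api-key plus the required version header
        return {
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        }
    # OpenAI-style: standard Bearer token
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

print(auth_headers("sk-YOUR_KEY")["Authorization"])  # Bearer sk-YOUR_KEY
```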
| Model | Max Input | Max Output | Cost/1K | Features |
|---|---|---|---|---|
| claude-opus-4.6 | 200K | 32K | 0.22 credits | Thinking, Vision |
| claude-opus-4.5 | 200K | 32K | 0.22 credits | Thinking, Vision |
| claude-opus-4-20250514 | 200K | 32K | 0.22 credits | Thinking, Vision |
| claude-sonnet-4-5-20250514 | 200K | 8K | 0.13 credits | Thinking, Vision |
| claude-sonnet-4-20250514 | 200K | 16K | 0.13 credits | Thinking, Vision |
| claude-haiku-3-5-20241022 | 200K | 8K | 0.04 credits | Vision |
💡 Note: 1 credit ≈ 10,000 tokens. Model aliases like claude-sonnet-4 are automatically normalized.
/v1/messages

Create a message using the Anthropic Messages API format.

Request headers:

x-api-key: sk-YOUR_KEY
anthropic-version: 2023-06-01
Content-Type: application/json

Request body:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    {
      "role": "user",
      "content": "Explain quantum computing."
    }
  ],
  "temperature": 0.7,
  "stream": false
}

Response:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Quantum computing uses quantum bits..."
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}

All Claude models support vision. Send images as base64-encoded data:
{
  "role": "user",
  "content": [
    {"type": "text", "text": "Describe this image:"},
    {
      "type": "image",
      "source": {
        "type": "base64",
        "media_type": "image/png",
        "data": "iVBORw0KGgoAAAANS..."
      }
    }
  ]
}

/v1/chat/completions

Create a chat completion using the OpenAI format. Works with any OpenAI SDK.

Request body:
{
  "model": "claude-sonnet-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7
}

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "claude-sonnet-4-20250514",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35
  }
}

/cursor/v1/chat/completions

Dedicated endpoint for Cursor IDE with agent mode support, tool calling, and task completion detection.
API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/cursor/v1

Example request:

{
  "model": "claude-sonnet-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant."
    },
    {
      "role": "user",
      "content": "Refactor this function to be more efficient"
    }
  ],
  "max_tokens": 2048,
  "temperature": 0.7
}

Enable streaming to receive responses token-by-token using Server-Sent Events (SSE). Set "stream": true in your request body.
from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
stream = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a story."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

curl https://api.cheaplm.xyz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  --no-buffer \
  -d '{
    "model": "claude-sonnet-4",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku."}
    ]
  }'

Install the OpenAI Python SDK:

pip install openai

from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

Install the Anthropic Python SDK:

pip install anthropic

import anthropic
client = anthropic.Anthropic(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(message.content[0].text)

from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)
stream = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a story."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Install the OpenAI TypeScript SDK:

npm install openai

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "sk-YOUR_KEY",
  baseURL: "https://api.cheaplm.xyz/api/v1",
});
const response = await client.chat.completions.create({
  model: "claude-sonnet-4",
  messages: [
    { role: "user", content: "Hello!" },
  ],
});

console.log(response.choices[0].message.content);

curl https://api.cheaplm.xyz/api/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-YOUR_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

curl https://api.cheaplm.xyz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_KEY" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

CheapLM works with any tool that supports the Anthropic or OpenAI API format.

API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/cursor/v1

💡 Cursor uses dedicated /cursor/v1 endpoints with agent mode support.
Add to your ~/.continue/docs.yaml:
name: Local Config
version: 1.0.0
schema: v1
models:
  - name: Claude 4.6 Opus
    provider: anthropic
    model: claude-opus-4-6
    apiKey: sk-YOUR_KEY
    apiBase: https://api.cheaplm.xyz/api/v1
    roles:
      - chat
      - edit
      - apply
      - autocomplete
      - embed

API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/api/v1

Set environment variables before running:
export ANTHROPIC_BASE_URL="https://api.cheaplm.xyz"
export ANTHROPIC_API_KEY="sk-YOUR_KEY"
claude

Set environment variables before running:
export ANTHROPIC_API_KEY="sk-YOUR_KEY"
export ANTHROPIC_BASE_URL="https://api.cheaplm.xyz/api/v1"
opencode

API Key: sk-YOUR_KEY
Base URL: https://api.cheaplm.xyz/api/v1

For any tool that supports the Anthropic API format, set:
Base URL: https://api.cheaplm.xyz/api/v1
API Key: sk-YOUR_KEY

For any tool that supports custom OpenAI endpoints, set:

Base URL: https://api.cheaplm.xyz/api/v1
API Key: sk-YOUR_KEY

💡 Use model names like claude-sonnet-4, claude-opus-4.6, etc.
| Status | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | Malformed request, missing required fields, or invalid model |
| 401 | authentication_error | Missing or invalid API key |
| 402 | insufficient_credits | Not enough credits to complete the request |
| 403 | permission_error | Key is frozen or insufficient balance |
| 429 | rate_limit_error | Rate limit exceeded |
| 500 | api_error | Internal server error |
| 502 | api_error | Upstream API error |
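In application code, these statuses map naturally onto a retry-or-surface decision: 429/500/502 are typically transient, while the 4xx codes point at the request, key, or balance. A minimal sketch of that classification (the function name and the grouping are ours, not part of the API):

```python
# Decide how to react to a CheapLM HTTP error status.
# 429/500/502 are worth retrying; the rest should be surfaced to the caller.

RETRYABLE = {429, 500, 502}

def classify_error(status: int) -> str:
    """Map an HTTP status from the table above to a suggested action."""
    if status in RETRYABLE:
        return "retry"           # transient: rate limit or upstream hiccup
    if status in (401, 402, 403):
        return "check_account"   # key, credits, or permissions problem
    return "fix_request"         # 400 and other client-side errors

print(classify_error(429))  # retry
```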
Example error responses:

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Missing API key. Provide via x-api-key header or Authorization: Bearer."
  }
}

{
  "type": "error",
  "error": {
    "type": "permission_error",
    "message": "Insufficient balance. Please top up your account."
  }
}

| Model | Credits per 1K tokens |
|---|---|
| Claude Opus 4.6 / 4.5 / 4 | 0.22 |
| Claude Sonnet 4.5 / 4 | 0.13 |
| Claude 3.5 Haiku | 0.04 |
💡 Note: 1 credit ≈ 10,000 tokens (varies by model)
| Task | Model | Tokens | Credits Used |
|---|---|---|---|
| Short chat | Haiku 3.5 | 1,000 | 0.04 |
| Code review | Sonnet 4 | 5,000 | 0.65 |
| Complex analysis | Opus 4.6 | 10,000 | 2.20 |
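The example costs above follow directly from the per-model rates via credits = (total_tokens / 1000) × cost per 1K tokens. A small sketch of that arithmetic (the function name and rate-table keys are ours, for illustration):

```python
# Estimate credit cost: credits = (total_tokens / 1000) * cost_per_1k.
# Rates mirror the pricing table above; keys are illustrative aliases.

RATES = {
    "claude-opus-4.6": 0.22,
    "claude-sonnet-4": 0.13,
    "claude-haiku-3-5": 0.04,
}

def estimate_credits(model: str, total_tokens: int) -> float:
    """Return the credits a request of total_tokens would consume."""
    return round(total_tokens / 1000 * RATES[model], 4)

print(estimate_credits("claude-sonnet-4", 5000))  # 0.65
```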
You can check your remaining balance at any time via the /v1/profile endpoint. Credits are deducted per request:

credits_used = (total_tokens / 1000) × cost_per_1k_tokens

curl https://api.cheaplm.xyz/api/v1/profile \
  -H "Authorization: Bearer sk-YOUR_KEY"

Response:
{
  "username": "your_username",
  "credits": 1000.5,
  "key_name": "Production Server"
}

❌ Bad
"I would really appreciate it if you could please help me by writing a Python function that takes a list of numbers as input and returns the sum of all the even numbers in that list."
✅ Good
"Write a Python function that returns the sum of even numbers in a list."
Don't set max_tokens higher than needed. For short answers, use 256 or 512 instead of 4096.
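For instance, a request for a short answer might cap output like this (the model choice and values are illustrative):

```python
# A compact request payload: a small max_tokens keeps latency
# and credit usage down when you only need a short reply.

payload = {
    "model": "claude-haiku-3-5-20241022",
    "max_tokens": 256,  # enough for a short answer; not 4096
    "messages": [
        {"role": "user", "content": "In one sentence, what is an API gateway?"}
    ],
}

print(payload["max_tokens"])  # 256
```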
{
  "system": "You are a code reviewer. Respond with brief, actionable feedback only.",
  "messages": [{"role": "user", "content": "Review this: ..."}]
}

Implement retry logic with exponential backoff:
import time
from openai import OpenAI
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.cheaplm.xyz/api/v1"
)

def call_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="claude-sonnet-4",
                messages=messages
            )
        except Exception:
            if attempt == max_retries - 1:
                raise
            wait = min(2 ** attempt * 2, 30)
            time.sleep(wait)

Create your account and start building with Claude AI today.