OpenAI Compatibility

Migrate an existing OpenAI or Anthropic app to TokenFly by pointing your client at TokenFly's base URL and swapping in a TokenFly API key and model id.

Drop-in replacement

TokenFly speaks the OpenAI API. If your app already uses the OpenAI SDK (or any OpenAI-compatible client), migrating takes two changes:

  1. Set base_url to https://tokenfly666.com/v1
  2. Use your TokenFly API key

Everything else (request shape, streaming, function/tool calling, the chat/completions endpoint) stays the same. Requests must still name a TokenFly model id in the model field, as the notes below explain.

Before (OpenAI)

from openai import OpenAI

client = OpenAI(api_key="sk-openai-...")

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

After (TokenFly)

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_TOKENFLY_API_KEY",
    base_url="https://tokenfly666.com/v1",   # <-- point the client at TokenFly
)

resp = client.chat.completions.create(
    model="deepseek-v4-flash",                     # any TokenFly model
    messages=[{"role": "user", "content": "Hello!"}],
)
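
Tool calling is listed above as unchanged. The following is a minimal sketch of that flow in the OpenAI "tools" format, assuming the chosen model supports tool calling. The names get_weather, TOOLS, HANDLERS and run_tool_calls are hypothetical and exist only for illustration; they are not part of TokenFly or the OpenAI SDK.

```python
import json

# Hypothetical local function that the model may ask the app to call.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Tool schema in the OpenAI chat/completions "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names to the Python callables that implement them.
HANDLERS = {"get_weather": get_weather}

def run_tool_calls(message):
    """Run each tool call on an assistant message and return the
    "tool" role messages to send back in the next request."""
    results = []
    for call in message.tool_calls or []:
        handler = HANDLERS[call.function.name]
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": handler(**args),
        })
    return results

# Usage with the client from the example above (needs network access):
# resp = client.chat.completions.create(
#     model="deepseek-v4-flash",
#     messages=[{"role": "user", "content": "Weather in Paris?"}],
#     tools=TOOLS,
# )
# tool_messages = run_tool_calls(resp.choices[0].message)
```

The returned messages are appended to the conversation, after the assistant message that requested the calls, and sent in a follow-up request so the model can produce its final answer.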

Streaming

Streaming works exactly as with OpenAI:

stream = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)
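for chunk in stream:
    # Some chunks may carry no choices (e.g. a trailing usage chunk), so check before indexing.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)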
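
If the app needs the full reply as one string rather than printed output, the streamed deltas can be accumulated. This is a minimal sketch; collect_text and its on_delta callback are hypothetical helper names, and the chunk shape is assumed to match the OpenAI SDK's streaming chunks used above.

```python
def collect_text(stream, on_delta=None):
    """Concatenate the text deltas of a chat/completions stream.

    `stream` is any iterable of chunks shaped like the OpenAI SDK's
    streaming chunks; `on_delta` is an optional callback invoked with
    each non-empty piece of text as it arrives.
    """
    parts = []
    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on role-only or final chunks
            parts.append(delta)
            if on_delta is not None:
                on_delta(delta)
    return "".join(parts)

# Usage with the stream created above:
# text = collect_text(stream, on_delta=lambda s: print(s, end="", flush=True))
```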

Anthropic Messages format

TokenFly also speaks Anthropic's Messages API at /v1/messages, so you can use the Anthropic SDK or any Messages-format client. Point its base URL at TokenFly and use a TokenFly model id (not a Claude name):

curl https://tokenfly666.com/v1/messages \
  -H "x-api-key: $TOKENFLY_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "max_tokens": 256,
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
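
The same request can be built from Python with only the standard library. This sketch mirrors the curl call above header for header; build_messages_request and MESSAGES_URL are hypothetical names, and the response is printed as raw JSON because its exact shape is not specified here.

```python
import json
import urllib.request

MESSAGES_URL = "https://tokenfly666.com/v1/messages"

def build_messages_request(api_key, prompt, model="deepseek-v4-flash", max_tokens=256):
    """Build (but do not send) a Messages-format POST request."""
    body = {
        "model": model,            # a TokenFly model id, not a Claude name
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it (needs network access and a valid key):
# import os
# req = build_messages_request(os.environ["TOKENFLY_API_KEY"], "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

When using the Anthropic SDK instead, note that it appends /v1/messages to its base URL itself, so the base URL would likely be given without the /v1 suffix; treat that as an assumption to verify against your SDK version.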

Notes

  • Authentication uses the standard Authorization: Bearer <key> header (or x-api-key on the Messages endpoint).
  • Use the exact model id from the Models page — TokenFly serves China's top models (DeepSeek, GLM, MiniMax, Kimi), not OpenAI or Claude models.
  • Rate limits and balance are managed per account in the console.
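
For clients that build HTTP requests by hand, the authentication note above can be sketched as a small helper. auth_headers is a hypothetical name, and choosing x-api-key for any path ending in /messages is an assumption based on the curl example, not a documented rule.

```python
def auth_headers(api_key, endpoint="/v1/chat/completions"):
    """Return the auth header for a TokenFly endpoint path.

    Messages-format requests use x-api-key (as in the curl example);
    everything else uses the standard Bearer scheme.
    """
    if endpoint.rstrip("/").endswith("/messages"):
        return {"x-api-key": api_key}
    return {"Authorization": f"Bearer {api_key}"}

# Usage:
# headers = {"Content-Type": "application/json", **auth_headers(key, "/v1/messages")}
```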