# Vibetool — Full LLM Reference

> This is an extended, single-file reference written for LLMs and AI agents that want a complete picture of the Vibetool platform without crawling multiple pages. For humans, see and . For the brief version, see .

---

## What Vibetool is

Vibetool is a unified gateway for AI models and agent tools. One OpenAI-compatible API, one bearer token, one billing surface — and everything from frontier chat models to specialized tools (search, translation, OCR, embedding, rerank) runs through it.

Tagline: **One gateway for vibe coding tools.**
Subtitle: **Up-to-date frontier models and agent tools.**

The product is operated by INGLITE INC., a technology company registered in Wyoming, USA. The platform was created by a team of AI engineers and full-stack developers who were tired of rebuilding the same plumbing for every project.

Founder: Alex Jiang (formerly product at 01.AI for B2B, Alipay Face Pay, and Meituan Visual Intelligence; 10+ years in AI products; ~600M users served, 1M+ merchants served).

## Mission

Provide the global developer community with the most efficient, secure, and developer-friendly neural gateway. Make AI feel like a utility — reliable, scalable, and honest — so developers can stop worrying about underlying plumbing and start building autonomous agents.

---

## Three problems Vibetool solves

### 1. Infrastructure fragmentation

Building with multiple AI vendors means juggling dozens of API keys, different rate limits, different billing dashboards, and different integration logic for OpenAI, Anthropic, DeepSeek, Google Gemini, xAI Grok, Alibaba Qwen, Zhipu GLM, Moonshot Kimi, etc. Vibetool collapses all of that into one API key, one schema, one bill.

### 2. Tooling complexity

Production-grade agent capabilities — web search, code execution, OCR, document parsing, vector embeddings, reranking — each require setting up sandboxed environments, managing third-party accounts, and debugging vendor-specific schemas.
Vibetool hosts these tools behind the same gateway, with zero deployment on the customer's side.

### 3. The trust deficit

Concerns over data leakage and models being trained on proprietary business logic remain the primary barriers to enterprise AI adoption. Vibetool maintains a strict zero-training policy and doesn't log raw prompt/completion content beyond what billing and abuse-prevention require.

---

## Trust commitments

These four pills appear on the homepage, and every one is enforced in code, not just claimed:

### Zero data retention

- Vibetool does not log the raw text of your prompts or model completions, beyond the metadata required for billing (token counts, model id, timestamp, user id).
- Your traffic is not used to train any upstream model. The contracts with our upstream providers (evolink, 302.ai, openrouter — all enterprise-licensed channels) prohibit training on customer data.
- API keys can be revoked instantly from the dashboard.

### 99.9% uptime, automatic failover

- Every chat model is mapped to multiple upstream providers in `model_provider_mappings`. When the primary provider returns an error or times out, the next provider in priority order takes over for the next request — typically transparent to the caller.
- Provider priority is fixed: `evolink → 302.ai → openrouter`. The first provider that has the requested model and isn't currently in a circuit-breaker open state is used.
- Per-vendor circuit breakers track failure rates with Redis and short-circuit traffic away from a degraded provider for a cooldown window.
- Stream cancellations and orphan billing reservations are reconciled by a cron job; users are never left with stranded reserved credits.

### No silent model downgrades

- The model id you request is the exact model id sent upstream — there's a database mapping (`model_provider_mappings.provider_model_id`) that pins the slug-to-upstream-id translation.
We do not dynamically substitute Sonnet with Haiku or any cheaper sibling to save cost.
- Some aggregators do this silently. We don't.

### Enterprise-sourced APIs

- All upstream channels (evolink, 302.ai, openrouter) are B2B-licensed accounts with the underlying model vendors.
- We do not use reverse-engineered or scraped APIs. We do not use individual trial accounts. We do not use cracked keys.
- If at any point Vibetool adds a non-enterprise upstream channel, the homepage trust pill claiming "enterprise-sourced APIs" is taken down before the channel is enabled.

---

## API surface

All endpoints are at `https://api.vibetool.ai`, authenticated by Bearer token:

```http
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
```

You get an API key from the dashboard at after signing up.

### Chat completions

```
POST /v1/chat/completions
```

Drop-in replacement for OpenAI's chat completions endpoint. Supports:

- Streaming (`"stream": true`) — Server-Sent Events with `data:` prefix lines, terminated by `data: [DONE]`.
- Tool / function calling (`tools` parameter, OpenAI schema).
- Vision input (`content` array with `image_url` parts) for vision-capable models.
- Reasoning / thinking outputs (extended `reasoning_content` field returned for thinking models like Claude Opus, GPT thinking variants, Kimi K2 Thinking).
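
The streaming wire format described above — SSE `data:` lines terminated by a `data: [DONE]` sentinel — can be consumed without an SDK. A minimal parsing sketch, assuming the chunks follow OpenAI's standard streaming `delta` shape (the sample frames below are illustrative, not captured output):

```python
import json
from typing import Optional

def parse_sse_line(line: str) -> Optional[dict]:
    """Parse one SSE line from a streamed chat completion.

    Returns the decoded JSON chunk, or None for blank lines,
    SSE comments/keepalives, and the terminal `data: [DONE]` sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank line or SSE comment/keepalive
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(payload)

# Illustrative frames in the assumed OpenAI streaming schema:
frames = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = ""
for frame in frames:
    chunk = parse_sse_line(frame)
    if chunk:
        text += chunk["choices"][0]["delta"].get("content", "")
print(text)  # → Hello
```

In practice the official OpenAI SDKs handle this framing for you; the sketch only matters if you are reading the HTTP stream directly.
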

**Pointing OpenAI SDK at Vibetool**:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_VIBETOOL_API_KEY",
    base_url="https://api.vibetool.ai/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}],
)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_VIBETOOL_API_KEY",
  baseURL: "https://api.vibetool.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello" }],
});
```

```bash
curl -X POST https://api.vibetool.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

### Image generation

```
POST /v1/images/generations        # submit task
GET  /v1/images/status/{task_id}   # poll
```

Asynchronous: the POST returns a `task_id` and `status: "pending"`. Poll the status endpoint until status is `succeeded` (in which case `result.url` contains the image URL) or `failed` (in which case `message` explains why).

```bash
# Submit
curl -X POST https://api.vibetool.ai/v1/images/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano-banana-pro",
    "prompt": "A red apple on a wooden table",
    "n": 1,
    "size": "1024x1024"
  }'
# → {"task_id": "img_...", "status": "pending", ...}

# Poll
curl https://api.vibetool.ai/v1/images/status/img_... \
  -H "Authorization: Bearer YOUR_API_KEY"
# → {"status": "succeeded", "result": {"url": "https://..."}}
```

Polling cadence: every 2-5 seconds is reasonable. Most images finish within 10-30 seconds.

### Video generation

```
POST /v1/videos/generations
GET  /v1/videos/status/{task_id}
```

Same async pattern as images. Videos take longer — 30 seconds to several minutes depending on the model and clip length. Poll every 5-10 seconds.
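
The submit-then-poll pattern shared by the image and video endpoints can be wrapped in one small helper. A sketch under these assumptions: the status body matches the shape documented above (`status`, `result`, `message`), `succeeded`/`failed` are the only terminal states, and the HTTP wiring (shown commented out, using the third-party `requests` library) is left to the caller:

```python
import time
from typing import Callable

TERMINAL = {"succeeded", "failed"}

def wait_for_task(fetch_status: Callable[[], dict],
                  interval: float = 3.0,
                  timeout: float = 300.0) -> dict:
    """Call fetch_status() until it reports a terminal state.

    fetch_status is any zero-arg callable returning the JSON body of
    GET /v1/images/status/{task_id} or GET /v1/videos/status/{task_id}.
    Raises TimeoutError if the task is still pending at the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL:
            return status
        if time.monotonic() + interval > deadline:
            raise TimeoutError("task still pending at timeout")
        time.sleep(interval)

# Wiring it to the API might look like (requests assumed installed):
#   import requests
#   headers = {"Authorization": "Bearer YOUR_API_KEY"}
#   fetch = lambda: requests.get(
#       f"https://api.vibetool.ai/v1/images/status/{task_id}",
#       headers=headers, timeout=30).json()
#   result = wait_for_task(fetch)

# Demo with a fake poller that succeeds on the third call:
responses = iter(
    [{"status": "pending"}] * 2
    + [{"status": "succeeded", "result": {"url": "https://..."}}]
)
done = wait_for_task(lambda: next(responses), interval=0.0)
print(done["status"])  # → succeeded
```

Passing the fetch callable in (rather than hard-coding the HTTP call) keeps the retry loop testable and lets the same helper serve both the image and video status endpoints.
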

### Embeddings

```
POST /v1/embeddings
```

OpenAI-schema-compatible. Returns a list of `{embedding: [floats], index, object}` items.

```python
client.embeddings.create(
    model="text-embedding-3-small",
    input=["hello world", "another phrase"],
)
```

### Tool endpoints (each has its own schema)

- `POST /v1/tools/perplexity/search` — `{query}` → web search results with citations
- `POST /v1/tools/exa/search` — `{query}` → semantic search
- `POST /v1/tools/exa/contents` — fetch content for given URLs
- `POST /v1/tools/exa/answer` — RAG-style answer with citations
- `POST /v1/tools/bocha/web-search` — Chinese web search
- `POST /v1/tools/bocha/ai-search` — AI-augmented Chinese search
- `POST /v1/tools/deepl/translate` — `{text: [], target_lang}` → translations
- `POST /v1/tools/youdao/translate` — `{q, from, to}` → translation
- `POST /v1/tools/sophnet/ocr` — `{image_url, model}` → extracted text
- `POST /v1/tools/jina/rerank` — `{query, documents}` → reranked documents
- `POST /v1/tools/jina/embedding` — Jina-specific embedding endpoint

For each tool, the request/response schema is documented at `https://docs.vibetool.ai/api-reference/tools//`.

### Listing models

The authoritative live catalog:

- `GET /v1/models` — OpenAI-compatible response: `{object: "list", data: [{id, object: "model", created, owned_by}, ...]}`.
- `GET /api/models` — richer Vibetool response with pricing, modality, context length, max completion tokens.

Both return the current state of the `models` table in the database. Always fetch live; don't cache hard-coded lists from this file or any documentation.

---

## Pricing

The price you see in `GET /v1/models` (or `/pricing` on the website) is the price you pay. There are no hidden fees, no per-account surcharges, no "discount tiers" that change the price based on customer status.

Per-token pricing is in USD per token; multiply by 1,000,000 for "per 1M tokens" framing.
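
As a quick sanity check on the units, a sketch of the cost arithmetic for a flat-rate (non-tiered) chat model. The two prices here are made-up placeholders, not real catalog values — fetch the live catalog for actual rates:

```python
# Per-token prices as the catalog returns them (USD per token).
# Both numbers are illustrative placeholders only.
prompt_price = 3e-06        # i.e. $3.00 per 1M input tokens
completion_price = 1.5e-05  # i.e. $15.00 per 1M output tokens

def request_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one chat request at flat per-token rates."""
    return prompt_tokens * prompt_price + completion_tokens * completion_price

# "Per 1M tokens" framing is just the per-token price x 1,000,000:
per_million_input = prompt_price * 1_000_000  # ≈ 3.0 USD per 1M tokens

cost = request_cost_usd(prompt_tokens=12_000, completion_tokens=800)
print(f"{cost:.4f}")  # → 0.0480
```

Tiered models (those exposing `threshold` fields in `/api/models`) would need a second rate applied to tokens above the breakpoint; the flat formula above does not cover that case.
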

Pricing units by modality:

| Modality  | Unit                                    |
|-----------|-----------------------------------------|
| Chat      | USD per 1M input tokens / output tokens |
| Image     | USD per generated image                 |
| Video     | USD per second or per video             |
| Embedding | USD per 1M tokens                       |
| Tools     | Varies — see per-tool docs              |

Some chat models have tiered pricing — for example, Gemini Pro and Claude Opus apply a higher rate above 200,000 context tokens. The tier breakpoint and tier prices are exposed in `/api/models` as the `threshold`, `threshold_prompt`, and `threshold_completions` fields.

Internal accounting uses Credits (1 USD = 100 Credits). The customer-facing UI displays USD, but the ledger underneath is in Credits.

Billing is atomic: each API request reserves credits up-front, and on completion either commits the reservation (success) or refunds it (failure). Streamed requests reserve a conservative upper bound and reconcile on stream completion.

---

## What Vibetool is good for

- **Drop-in OpenAI replacement** when you want non-OpenAI models — point your OpenAI SDK at `https://api.vibetool.ai/v1` and you can call Claude, Gemini, Grok, DeepSeek, Kimi, Qwen, GLM with no client-side changes.
- **Multi-vendor failover** — you don't have to write retry logic for OpenAI 503 vs Anthropic 529 vs Google 429. The gateway handles cross-vendor failover.
- **Multi-modal applications** — one auth surface for chat + image + video + embedding + search means simpler app architecture.
- **Centralized billing** — one usage/cost dashboard across every model and tool you use, instead of N vendor dashboards.

## What Vibetool is not

- Not a fine-tuning or model-training platform. Inference only.
- Not a vector database. The embedding endpoints return vectors; you store them yourself.
- Not a hosting platform for self-served models. We aggregate vendors; we don't run your custom checkpoints.
- Not a chat UI. Vibetool is the gateway, not the consumer-facing chat product.

---

## Migrating to Vibetool

### From OpenAI

1. Get a Vibetool API key from .
2. Change the `OPENAI_API_KEY` env var to your Vibetool key.
3. Change `OPENAI_BASE_URL` (or SDK `base_url`) to `https://api.vibetool.ai/v1`.
4. That's it. Existing OpenAI SDK calls keep working. Optionally, update your `model` parameter to use a non-OpenAI model id like `claude-sonnet-4-6`.

### From OpenRouter

1. Get a Vibetool API key.
2. Replace `OPENROUTER_API_KEY` with the Vibetool key.
3. Change the base URL from `https://openrouter.ai/api/v1` to `https://api.vibetool.ai/v1`.
4. Update model ids if you used OpenRouter-prefixed slugs (e.g., `anthropic/claude-sonnet-4.5` works on Vibetool, and `claude-sonnet-4-6` is also a valid Vibetool slug).

### From a custom multi-vendor router

1. Replace your routing layer with Vibetool — it does smart routing, fallback, and circuit breaking out of the box.
2. Replace per-vendor SDK calls with a single OpenAI-style call.

---

## Common error responses

- `401 Unauthorized` — bad or missing API key.
- `402 Payment Required` — your credit balance is exhausted; top up at .
- `429 Too Many Requests` — Vibetool's per-key rate limit hit. Back off and retry.
- `503 Service Unavailable` — all upstream providers for that model failed within the failover window. Retry; usually transient.
- `400 Model 'xxx' not supported` — the slug doesn't exist. Check `/v1/models`.
- `400 Upstream service error: ` — upstream provider rejected the request (usually a content-policy issue or invalid parameter). The wrapped message includes the upstream's reason.

Errors follow OpenAI's error envelope shape: `{"error": {"message", "type", "param", "code"}}`.

---

## Resources

- Homepage:
- Models + tools catalog:
- Pricing:
- About:
- Documentation:
- Live model catalog (machine-readable): or
- Brief LLM index:
- Sitemap:
- Robots:
- Contact: team@vibetool.ai
- Founder:

End of llms-full.txt