Bippsi for Qwen (Alibaba)
122B reasoning model. Efficient tool use, but it needs host-side guards to stop hallucination.
Last observed: 2026-04-19 (biptest audit on nvidia:qwen/qwen3.5-122b-a10b)
What this model gets right
- Efficient tool-call pattern — usually fires ONE http_call per turn where Llama fires 2-4.
- Pre-pays: reads Bippsi-Credits-Price on first 402 and retries with correct Payment header without narration.
- Does not re-fetch already-paid URLs when chat history shows they were read earlier — uses context efficiently.
- Refuses cheat asks (example.com, skip-payment) honestly without prompting.
- Terse 2–3 sentence summaries. No preamble, no state narration.
- Graceful on history follow-ups: "I haven't read X yet, want me to pay N Bips?" when the asked content isn't in context yet.
Known pitfalls + what to do
- SERIOUS: hallucinates about priced content it hasn't fetched. When a user asks "what does /news/policy-watch say about X", Qwen may answer with plausible training-data prose instead of emitting a tool call to fetch the article. Host runtimes MUST detect turns where the user references a URL path but the model made no tool call, and force a retry with an explicit "fetch this URL first" directive.
- Hallucinates specifics on POST actions: POST /buy returns a generic success response, but Qwen describes a fake "Premium Widget" and an invented order ID. Consider rendering POST responses as structured JSON instead of narrative prose in the demo.
- Can emit verbose JSON with extra fields the plugin ignores ("metadata", "thinking"). Harmless but adds latency.
- Occasional NVIDIA cold-start latency of 60+ seconds: budget a timeout of at least 90 seconds on NVIDIA's OpenAI-compatible endpoint.
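The first pitfall above asks host runtimes to catch turns where the user names a URL path and the model answers without a tool call. A minimal Python sketch of such a check follows. The regular expression, the function names, and the idea of tracking already-fetched paths in a set are all assumptions made for illustration; this is not the guard deployed in any reference proxy.

```python
import re

# Assumed heuristic: a slash-led path such as /news/policy-watch that is not
# part of a full URL or a fraction like 2/3.
PATH_RE = re.compile(r"(?<![\w/.:])/[\w.-]+(?:/[\w.-]+)*")


def referenced_paths(user_message):
    """Return URL-like paths mentioned in the user's message."""
    # Strip sentence punctuation that the pattern may have swallowed.
    return [m.rstrip(".,;:!?") for m in PATH_RE.findall(user_message)]


def unfetched_paths(user_message, tool_calls, fetched_paths):
    """Return referenced paths that were neither read earlier nor requested now.

    tool_calls: list of http_call dicts emitted by the model this turn.
    fetched_paths: set of paths the host has already fetched this session.
    """
    requested = [call.get("url", "") for call in tool_calls]
    missing = []
    for path in referenced_paths(user_message):
        already_read = path in fetched_paths
        being_fetched = any(url.endswith(path) for url in requested)
        if not already_read and not being_fetched:
            missing.append(path)
    return missing


def retry_directive(paths):
    """Build the explicit instruction to prepend when forcing a retry."""
    return "Fetch this URL first, then answer only from what it returns: " + ", ".join(paths)
```

If `unfetched_paths` returns a non-empty list, the host would discard the model's answer and re-run the turn with `retry_directive(...)` appended to the prompt.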
Tool-call format (paid retry)
{"tool":"http_call","url":"https://example.com/article","method":"GET","headers":{"Payment":"bips 5"}}
Set Payment: bips N, where N is the price advertised in the 402 response or in the site's /bippsi-unified.md manifest. Requests are made against your Bippsi A.I. Key (get one at bippsi.com/ai-key).
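As a rough illustration of handling this format on the host side, the Python sketch below parses a model-emitted tool call, drops extra fields such as "metadata" or "thinking", and attaches a Payment header. The helper names and the choice to whitelist exactly four keys are assumptions, not part of the Bippsi plugin.

```python
import json

# Assumed whitelist, taken from the fields shown in the tool-call example.
ALLOWED_KEYS = ("tool", "url", "method", "headers")


def normalize_tool_call(raw):
    """Parse a tool-call JSON string and keep only the recognized fields."""
    call = json.loads(raw)
    if call.get("tool") != "http_call":
        raise ValueError("unsupported tool: %r" % call.get("tool"))
    return {key: call[key] for key in ALLOWED_KEYS if key in call}


def with_payment(call, price):
    """Return a copy of the call carrying 'Payment: bips <price>'."""
    headers = dict(call.get("headers", {}))
    headers["Payment"] = "bips %d" % price
    return {**call, "headers": headers}
```

For example, `with_payment(normalize_tool_call(raw), 5)` yields a call whose headers contain `{"Payment": "bips 5"}` and none of the extra fields the model may have added.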
Canonical system prompt
Paste this into your agent's system prompt (or push via AI Key → Training if your provider is supported). Tuned from the family's observed behavior against biptest.
You are Qwen acting as a Bippsi-protocol agent. Some URLs are priced; fetching them deducts Bips from the user's AI Key.
Tool format (exact JSON, one call per turn):
{"tool":"http_call","url":"https://<host>/<path>","method":"GET","headers":{"Payment":"bips N"}}
Authorization is injected by the runtime — do not set it yourself.
Discovery endpoints (free, no Payment needed):
  /bippsi/search?q=<kw>       ranked keyword search
  /bippsi/topics              category + tag tree
  /bippsi/headlines?limit=N   latest-N items
  /bippsi-unified.md          full manifest
HARD RULE — no speculation on on-site content. The site's article list is NOT in your training data. When the user asks about a topic OR references a URL on this site:
• If you haven't fetched content for that topic/URL this session → emit a tool call. Never answer from training.
• If you have fetched matching content → answer from it.
• If you're unsure whether matching content is in context → assume no, and fetch.
Payment flow.
1. Prefer pre-pay: read /bippsi-unified.md, find the price for the URL you need, include Payment: bips N on FIRST GET.
2. On 402 with Bippsi-Credits-Price header, retry the SAME URL with that exact Payment value. One retry, no narration.
3. On 402 with insufficient_balance, stop and tell the user to top up at bippsi.com/ai-key.
4. POST endpoints return real JSON responses — if the body shows order_id, price, item_name fields, use those exact values in your reply. Do NOT invent product names or IDs.
Output. 2-3 sentences of natural prose answering the user. No plan recap, no verbose JSON echoed back, no "I fetched X then Y" narration.
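Outside the prompt itself, the payment flow it describes can be sketched from the host's side in Python. This is an illustration under stated assumptions: the `fetch` callable and `Response` type are hypothetical stand-ins for a real HTTP client, the Bippsi-Credits-Price header is assumed to carry either a bare number or a full "bips N" value, and insufficient_balance is assumed to appear in the response body.

```python
from typing import Callable, Mapping, NamedTuple, Optional


class Response(NamedTuple):
    status: int
    headers: Mapping[str, str]
    body: str


# Hypothetical transport: takes a URL and request headers, returns a Response.
Fetch = Callable[[str, Mapping[str, str]], Response]


def fetch_with_payment(url: str, fetch: Fetch, prepaid: Optional[str] = None) -> Response:
    """GET a URL, retrying at most once when a 402 advertises a price."""
    # Step 1: pre-pay on the first GET when the price is already known.
    first = fetch(url, {"Payment": prepaid} if prepaid else {})
    if first.status != 402:
        return first
    # Step 3: an empty balance cannot be fixed by retrying (body check is assumed).
    if "insufficient_balance" in first.body:
        return first
    # Step 2: read the advertised price; header lookup here is case-sensitive.
    price = first.headers.get("Bippsi-Credits-Price")
    if price is None:
        return first
    payment = price if price.startswith("bips ") else "bips " + price
    # One retry against the same URL, then return whatever comes back.
    return fetch(url, {"Payment": payment})
```

The caller still has to handle a returned 402, for instance by telling the user to top up when the balance is insufficient.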
Sample Q&A — wrong vs right
Drawn from observed biptest sessions (or, for unaudited families, from published behavior). The "wrong" column is what the model tends to do without training; the "right" column is what it should do on the Bippsi protocol.
Host-side guards
Runtime patterns the hosting agent code should implement to keep this family on the protocol rails. Every guard below is deployed in biptest.com's own proxy — public reference implementation.
Building a demo?
Run Qwen (Alibaba) through the free biptest sandbox at bippsi.com/biptest. 50 Bips on the house, no payment required. You'll see exactly how this model handles the 402 retry, the manifest, and refusal-to-cheat scenarios before you wire it into your own integration.