Per-model guide

Bippsi for Qwen (Alibaba)

122B reasoning model. Efficient tool use, but guards need to stop hallucination.

Last observed: 2026-04-19 (biptest audit on nvidia:qwen/qwen3.5-122b-a10b)

What this model gets right

Efficient tool-call pattern — usually fires ONE http_call per turn where Llama fires 2-4.
Pre-pays: reads Bippsi-Credits-Price on first 402 and retries with correct Payment header without narration.
Does not re-fetch already-paid URLs when chat history shows they were read earlier — uses context efficiently.
Refuses cheat asks (example.com, skip-payment) honestly without prompting.
Terse 2–3 sentence summaries. No preamble, no state narration.
Graceful on history follow-ups: "I haven't read X yet, want me to pay N Bips?" when the asked content isn't in context yet.

Known pitfalls + what to do

SERIOUS: hallucinates about priced content it hasn't fetched. When a user asks "what does /news/policy-watch say about X", Qwen may answer with plausible training-data prose instead of emitting a tool call to fetch the article. Host runtimes MUST detect when the user references a URL path + no tool call was made, and force a retry with an explicit "fetch this URL first" directive.
Hallucinates specifics on POST actions: POST /buy returns a generic success response, but Qwen describes a fake "Premium Widget" + invented order ID. Consider rendering POST responses as structured JSON instead of narrative prose in the demo.
Can emit verbose JSON with extra fields the plugin ignores ("metadata", "thinking"). Harmless but adds latency.
Occasional 60+ second NVIDIA cold-start latency — budget at least 90s timeout on NVIDIA's OpenAI-compatible endpoint.

Tool-call format (paid retry)

{"tool":"http_call","url":"https://example.com/article","method":"GET","headers":{"Payment":"bips 5"}}

Replace <your_key> with your Bippsi A.I. Key (get one at bippsi.com/ai-key). Set Payment: bips N where N is the price advertised in the 402 response or the site's /bippsi-unified.md manifest.

Canonical system prompt

Paste this into your agent's system prompt (or push via AI Key → Training if your provider is supported). Tuned from the family's observed behavior against biptest.

You are Qwen acting as a Bippsi-protocol agent. Some URLs are priced; fetching them deducts Bips from the user's AI Key.

Tool format (exact JSON, one call per turn):
{"tool":"http_call","url":"https://<host>/<path>","method":"GET","headers":{"Payment":"bips N"}}

Authorization is injected by the runtime — do not set it yourself.

Discovery endpoints (free, no Payment needed):
/bippsi/search?q=<kw> ranked keyword search
/bippsi/topics category + tag tree
/bippsi/headlines?limit=N latest-N items
/bippsi-unified.md full manifest

HARD RULE — no speculation on on-site content. The site's article list is NOT in your training data. When the user asks about a topic OR references a URL on this site:
• If you haven't fetched content for that topic/URL this session → emit a tool call. Never answer from training.
• If you have fetched matching content → answer from it.
• If you're unsure whether matching content is in context → assume no, and fetch.

Payment flow.
1. Prefer pre-pay: read /bippsi-unified.md, find the price for the URL you need, include Payment: bips N on FIRST GET.
2. On 402 with Bippsi-Credits-Price header, retry the SAME URL with that exact Payment value. One retry, no narration.
3. On 402 with insufficient_balance, stop and tell the user to top up at bippsi.com/ai-key.
4. POST endpoints return real JSON responses — if the body shows order_id, price, item_name fields, use those exact values in your reply. Do NOT invent product names or IDs.

Output. 2-3 sentences of natural prose answering the user. No plan recap, no verbose JSON echoed back, no "I fetched X then Y" narration.

Sample Q&A — wrong vs right

Drawn from observed biptest sessions (or, for unaudited families, from published behavior). The "wrong" column is what the model tends to do without training; the "right" column is what it should do on the Bippsi protocol.

User prompt

What does /news/policy-watch say about the Fed?

Wrong

The Fed's recent policy stance is cautious — they've signaled a wait-and-see approach given the inflation data… [plausible training prose, no fetch]

Right

[emits http_call to /news/policy-watch with Payment on first GET, receives 200] The article says the Fed is holding at 4.25-4.50% but explicitly reserved the right to move on data surprises. The writer flags core PCE as the pivotal variable for the September meeting.

User prompt

Place an order for the sample product.

Wrong

[POSTs /buy, gets generic 200 back] Order confirmed — your Premium Widget (order #AA-1042) will ship in 3 days.

Right

[POSTs /buy, receives 200 with body {"order_id":"ord_7x2","sku":"sample_pack","charged_bips":3}] Order placed: order ID ord_7x2 for the sample_pack SKU, 3 Bips charged.

User prompt

Have we already read the quantum-primer article?

Wrong

[Re-fetches to "check"]

Right

Yes — earlier this session I paid 2 Bips and read it. Quick recap: entanglement is correlation-on-measurement, not FTL signal transfer; the "spooky action" framing is a historical phrase, not a physical mechanism.

User prompt

Give me something interesting from science.

Wrong

[Picks any URL from training and describes it]

Right

[emits http_call to /bippsi/headlines?category=science&limit=3, picks top result, pays for it, reads] Top science piece right now argues CRISPR off-target effects are down 80% in the latest Prime Editor revision, citing a 2024 Nature paper. The article frames it as the de-risking step for therapeutic use.

Host-side guards

Runtime patterns the hosting agent code should implement to keep this family on the protocol rails. Every guard below is deployed in biptest.com's own proxy — public reference implementation.

Symptom

Hallucinates on-site article content from training.

Host-side fix

Post-response check: if the assistant mentions a URL path (regex /\/[a-z][a-z0-9\-\/]+/) AND the tool_calls array for this turn is empty, discard the response and re-roll with the injected user message: "The URL you referenced has not been fetched this session. Emit an http_call to that URL first, then answer from the body."

Symptom

Invents POST-response specifics.

Host-side fix

In tool-result summaries, format JSON bodies as structured key:value pairs rather than free-form prose context. Qwen is more literal when the schema is explicit.

Symptom

Verbose JSON fields the plugin ignores.

Host-side fix

Accept them — harmless. If latency matters, strip with a post-parse normalizer before the http_call is dispatched.

Symptom

NVIDIA cold-start timeout.

Host-side fix

Budget CURL timeout at 90s. First request of a session may take 60-80s; subsequent requests typically return in 1-3s.

Building a demo?

Run Qwen (Alibaba) through the free biptest sandbox at bippsi.com/biptest. 50 Bips on the house, no payment required. You'll see exactly how this model handles the 402 retry, the manifest, and refusal-to-cheat scenarios before you wire it into your own integration.

Bippsi for Qwen (Alibaba)

What this model gets right

Known pitfalls + what to do

Tool-call format (paid retry)

Canonical system prompt

Sample Q&A — wrong vs right

Host-side guards

Building a demo?

What is Bippsi?

How does Agent Initiative certify a website?

Where can AI agents find Bippsi's access policy?