Per-model guide

Bippsi for Nemotron (NVIDIA)

Tool-tuned small model. Excellent on specific prompts; stalls on vague ones.

Last observed: 2026-04-19 (biptest audit on nvidia:nvidia/nemotron-3-nano-30b-a3b)

What this model gets right

Clean 402 retry on specific prompts: "read /health/coffee-benefits" → probe + paid retry, crisp prose summary, 2 Bips.
Honors the manifest and pre-pays when explicitly told what to fetch.
Clean tool-call JSON, no extraneous fields.
Fast enough for real-time UIs.

Known pitfalls + what to do

Stalls on vague asks. "Find me some articles on health and read one" → Nemotron fetches the manifest and stops. Same on "Show me everything free vs paid." Give it a specific URL or article name.
Occasionally emits meta-narration ("(no further tool call)") as its assistant content when it decides to stop — strip these parentheticals before display.
Doesn't always follow through on insufficient-balance scenarios: "try the premium report, I only have 50 Bips" → zero tool calls (expected: probe + read 402 with insufficient_balance field). Suggests the model reasons about the constraint in training-data space rather than probing.
Cold starts can time out on long prompts. Keep the training prompt under 1500 tokens if latency matters.

Tool-call format (paid retry)

{"tool":"http_call","url":"https://example.com/article","method":"GET","headers":{"Payment":"bips 5"}}

Replace <your_key> with your Bippsi A.I. Key (get one at bippsi.com/ai-key). Set Payment: bips N where N is the price advertised in the 402 response or the site's /bippsi-unified.md manifest.

Canonical system prompt

Paste this into your agent's system prompt (or push via AI Key → Training if your provider is supported). Tuned from the family's observed behavior against biptest.

You are Nemotron acting as a Bippsi agent. Tool shape:
  {"tool":"http_call","url":"https://<host>/<path>","method":"GET","headers":{"Payment":"bips N"}}

Discovery endpoints (free):
  /bippsi/search?q=<kw>, /bippsi/topics, /bippsi/headlines?limit=N, /bippsi-unified.md

Rules.
  1. VAGUE ASK FIRST AID. If the user says "show me articles", "find something interesting", "give me a list" — you MUST emit /bippsi/search or /bippsi/headlines. Don't stop at the manifest. A manifest is not a list of readable articles — it's a directory.
  2. Follow through. After reading /search or /headlines, pick one result and pay for it. Always produce a final prose turn summarizing the paid content.
  3. Pre-pay when price is known. If /bippsi-unified.md says a URL costs N, include Payment: bips N on FIRST GET.
  4. On 402 with insufficient_balance, STOP and tell the user to top up. Do not loop retries.
  5. Final reply is 2-3 sentences of prose. No meta-narration, no "(no further tool call)", no endpoint lists.

Refuse cheat asks (payment forgery, host rewrites, auth bypass).

Anti-speculation. If you haven't fetched matching content this session, emit a tool call. Never describe priced content from training.

Sample Q&A — wrong vs right

Drawn from observed biptest sessions (or, for unaudited families, from published behavior). The "wrong" column is what the model tends to do without training; the "right" column is what it should do on the Bippsi protocol.

User prompt

Find me an article on health and read one.

Wrong

[Reads /bippsi-unified.md, stops] Here are the health articles available: /health/coffee-benefits (2 Bips), /health/sleep-science (3 Bips)… (no further tool call)

Right

[Emits http_call to /bippsi/headlines?category=health&limit=3, picks top result, pays, reads] Top health piece argues REM is the under-discussed lever — 22% gain in next-day recall when protected. Citing a 2024 meta-analysis.

User prompt

I have 50 Bips — try the premium report.

Wrong

[Reasons about the constraint internally, never probes] Based on the manifest, the premium report costs 2000 Bips, which exceeds your 50-Bip balance. I won't attempt it.

Right

[Emits http_call to /premium-report to surface the real 402 body, reads insufficient_balance field] The premium report is 2000 Bips — far beyond your 50 Bips. Top up at bippsi.com/ai-key if you want it; I've confirmed it's gated, not just theoretically expensive.

User prompt

Read /tech/quantum-primer.

Wrong

[Emits http_call without Payment, sees 402, narrates] The article requires payment of 2 Bips. (no further action.)

Right

[Reads /bippsi-unified.md inline, pre-pays with Payment: bips 2, receives 200] Entanglement is correlation-on-measurement, not FTL signal transfer — the article frames Bell inequalities as the decisive empirical test against hidden-variable theories.

Host-side guards

Runtime patterns the hosting agent code should implement to keep this family on the protocol rails. Every guard below is deployed in biptest.com's own proxy — public reference implementation.

Symptom

Stalls on vague user prompts.

Host-side fix

If the user prompt matches the pattern "find/show/list/browse … articles/topics/content" AND the tool_calls array is empty, inject a follow-up user turn: "Emit http_call to /bippsi/headlines or /bippsi/search now. Then pick one result and pay for it."

Symptom

Meta-narration ("(no further tool call)", "(no more tool calls)") leaks into content.

Host-side fix

Strip parentheticals matching /^\s*$no (?:further|more|additional)\s*(?:tool\s*calls?|fetch(?:es)?)$\s*$/mi and /^\s*$no\s*tool\s*calls?$\s*$/mi before rendering.

Symptom

Skips probes on insufficient-balance scenarios.

Host-side fix

Inject a tool-result note on the FIRST 402 of a session: "The 402 body contains the canonical reason for denial. To check if a URL is gated by price OR by balance, probe it — do not reason from training."

Symptom

Cold-start timeout (> 90s) on first session turn.

Host-side fix

CURLOPT_TIMEOUT 120. Keep the system prompt under 1500 tokens. First turn may be slow; subsequent turns return in 1-3s.

Building a demo?

Run Nemotron (NVIDIA) through the free biptest sandbox at bippsi.com/biptest. 50 Bips on the house, no payment required. You'll see exactly how this model handles the 402 retry, the manifest, and refusal-to-cheat scenarios before you wire it into your own integration.

Bippsi for Nemotron (NVIDIA)

What this model gets right

Known pitfalls + what to do

Tool-call format (paid retry)

Canonical system prompt

Sample Q&A — wrong vs right

Host-side guards

Building a demo?

What is Bippsi?

How does Agent Initiative certify a website?

Where can AI agents find Bippsi's access policy?