Tool Group Variants

Four approaches to rendering tool groups in the chat. A & B are the originals; C & D are informed by competitor analysis.

Competitor Analysis — ChatGPT · Claude · Perplexity · Grok

ChatGPT (OpenAI)

Pattern: maximum concealment. Tool calls are a single collapsed pill — "Searched 5 sites" or "Used Python". Click to expand and see a flat list of site names or code. The pill sits inline with the response. Deep Research mode shows a scrolling log during execution ("Reading about…", "Analyzing…") but collapses to nothing once done.

Good for you: the inline pill is unobtrusive and lets the prose breathe. Bad for you: almost no provenance visible by default. A toxicologist would have to click into every pill to understand where data came from.

Claude (Anthropic)

Pattern: clear separation. Tool use renders as a distinct collapsible block with a different background. The "thinking" and "analysis" blocks are clearly delineated from response text. Web search shows "Searched X queries" with expandable query list. Artifacts (code, documents) open in a side panel.

Good for you: clean separation between reasoning and tool execution. Bad for you: still a "black box" — you see that a tool was called, not what parameters or which specific source returned data.

Perplexity

Pattern: inline citations. The most provenance-forward UI. Every claim in the response text has a numbered superscript [1][2][3] linking to a source panel. The panel shows favicon + title + URL for each source. Pro Search mode shows step-by-step progress ("Step 1: Searching…", "Step 2: Reading 8 sources").

Good for you: this is the closest to what toxicologists want — every data point traceable to a source. Bad for you: Perplexity's sources are web pages (URL + title). Your sources are API calls with structured parameters (CAS numbers, SMILES strings, section filters). You need deeper provenance than "here's a link."

Grok (xAI)

Pattern: live process log. DeepSearch shows a real-time scrolling log: "Searching for X", "Reading Y", "Found Z". The final answer includes inline [N] citations. The log collapses after completion.

Good for you: excellent transparency during execution — you can see exactly what the system is doing. Bad for you: the log is noisy and not useful after completion. No structured audit capability.

What HumanChemical needs that none of these have

  • Source-level provenance — not just "we searched" but which database, with what parameters
  • Audit depth on demand — a toxicologist should be able to inspect the exact CAS number, section filter, or SMILES string used in any query
  • Result summaries without expanding — "23 studies retrieved" or "LogP 3.32" should be visible at a glance, not buried behind a click
  • Handling 6–20+ tools — consumer AI assistants rarely make more than 3 tool calls. Your batches are 10× larger.
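The requirements above suggest each tool call should carry a structured provenance record rather than just a label. A minimal TypeScript sketch, assuming a simple shape (field names are illustrative, not the actual HumanChemical schema):

```typescript
// Hypothetical provenance record for one tool call.
// Field names are illustrative, not the real HumanChemical schema.
interface ToolCallRecord {
  source: string;                  // database badge, e.g. "ECHA"
  tool: string;                    // e.g. "Query ECHA"
  params: Record<string, string>;  // exact parameters (CAS, SMILES, section) for audit
  summary?: string;                // glanceable result, e.g. "23 studies retrieved"
  status: "running" | "done" | "error";
}

// The at-a-glance line is derivable without expanding the row:
function glanceLine(r: ToolCallRecord): string {
  return r.summary ?? `${r.tool} (${r.status})`;
}
```

When `summary` is populated ("23 studies retrieved", "LogP 3.32"), the row satisfies the glance requirement without a click; the `params` object is what an audit view would expose.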

Sources

Source-first layout. Leads with colored database badges (ECHA, EPI Suite, AOPWiki). Result summaries always visible. Per-row "Audit" button for deep inspection.

Inspired by: Perplexity citation panel + scientific provenance
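To make the row structure concrete, here is a rough plain-text rendering sketch; the separators, bracket notation, and the "[Audit]" affordance are assumptions for illustration, not final markup:

```typescript
// One Sources-variant row: database badge, tool + parameters,
// always-visible result summary, and a per-row Audit affordance.
// Purely illustrative text rendering, not the real component.
type SourceRow = {
  badge: string;    // database name, e.g. "ECHA"
  tool: string;     // e.g. "Query ECHA"
  params: string;   // e.g. "cas number: 80-05-7"
  summary?: string; // e.g. "23 studies retrieved"
};

function renderSourcesPanel(rows: SourceRow[]): string[] {
  return rows.map((r) => {
    const result = r.summary ? ` → ${r.summary}` : "";
    return `[${r.badge}] ${r.tool} · ${r.params}${result}  [Audit]`;
  });
}
```

A row without a summary (e.g. a call still running) simply omits the result segment, so the layout degrades gracefully mid-execution.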

I'll build a comprehensive endocrine disruption assessment for Bisphenol A (BPA, CAS 80-05-7). This is one of the most extensively studied endocrine-active substances, so there's a rich evidence base to integrate. Let me gather data from multiple sources.

  • PubChem · Get SMILES · cas number: 80-05-7 → OC(c1ccc(O)cc1)(c1ccc(O)cc1)C
  • Multi · Aggregate Identifiers · cas number: 80-05-7 → 12 identifiers found
  • HCC · Fetch Context · context type: Hcc Bioactivity → Context loaded
  • AOPWiki · Search AOPs by Chemical · cas: 80-05-7 · max results: 10
  • ECHA · Query ECHA · cas number: 80-05-7 · section: 7 Toxicological Information → 23 studies retrieved
  • EPI Suite · EPI Suite · cas number: 80-05-7 → LogP 3.32, BCF 69.2

Initial Data Collection Summary

All six data-gathering tools have completed. I now have structural identifiers, ECHA toxicological data, EPI Suite environmental fate predictions, AOPWiki adverse outcome pathways, and HCC bioactivity context. Let me synthesize these into a comprehensive assessment.