AI shouldn’t block the conversation just because a tool is busy. To evaluate that behavior properly, we needed good data—so we made it.
AsyncTool is a Hugging Face dataset of 270 high‑quality, multi‑turn (and I mean up to 60 turns) conversations where the assistant keeps talking while tools work in the background. Each case is distinct, grounded in real JSON‑Schema tool definitions, and the tool calls/results stay internally consistent, with no fabricated states or magical shortcuts.
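If you want to poke at the data yourself, here's a minimal loading sketch using the `datasets` library. The Hub repo id below is a placeholder rather than the dataset's actual path, and the printed field names come from the description in this post, not a verified schema.

```python
# Minimal sketch: load the dataset from the Hugging Face Hub and peek at one row.
# "your-org/AsyncTool" is a placeholder repo id, not the real dataset path.
from datasets import load_dataset

ds = load_dataset("your-org/AsyncTool", split="train")

print(len(ds))               # expected: 270 conversations
row = ds[0]
print(row.keys())            # messages, tools, meta (per the description above)
print(len(row["messages"]))  # turn count, up to ~60
```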
What’s inside
- 18 scenario templates × 15 renders = 270 conversations.
- Conversations run 10–30 “in‑world” minutes with filler chat, retries, status checks, and out‑of‑order returns.
- Every row includes messages, tools, and meta so you can replay transcripts, inspect schemas, and trace provenance.
- Protocol features: <tool_ack /> placeholders, -FINAL handoffs, mixed sync/async chains, transient failures, and fatal‑error surfacing (see the sketch after this list).
- License: Apache‑2.0.
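To make the protocol features concrete, here's a hand-written illustration of what one async exchange might look like. It is inferred from the feature list above, not copied from the dataset: the role labels, field names, and the exact placement of the <tool_ack /> placeholder and the -FINAL marker are assumptions about the schema.

```python
# Illustrative only: a hypothetical async exchange, NOT an actual row from the
# dataset. Role labels, field names, and token placement are assumptions.
example_turns = [
    {"role": "user", "content": "Kick off the report export and keep me posted."},
    {"role": "assistant", "tool_calls": [
        {"name": "start_export", "arguments": {"format": "csv"}}]},  # async call
    {"role": "assistant",
     "content": "<tool_ack /> Export started; it may take a few minutes. "
                "Want to go over anything else in the meantime?"},
    {"role": "user", "content": "Sure, what's on my calendar today?"},
    # ... a synchronous calendar lookup would happen here ...
    {"role": "tool", "name": "start_export", "content": "status: running"},    # status check
    {"role": "tool", "name": "start_export-FINAL", "content": "url: <...>"},   # final handoff
    {"role": "assistant", "content": "Your export is ready; grab it at the link above."},
]
```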
We’re exploring how agents can ack now and answer later, waiting for the right signal (the last relevant tool result vs. the last user question) while staying natural and helpful. This dataset gives you supervised signals to:
- finetune assistants that acknowledge async work without hallucinating tool states,
- build guardrails/regression tests for routers juggling retries and reordered responses,
- evaluate “answered at the right time” behavior (a minimal check is sketched below).
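As a starting point for that last item, here is a minimal sketch of an "answered at the right time" check. It assumes each message is a dict with a "role" field and that tool results carry role == "tool"; those field names are assumptions, not guaranteed to match the dataset exactly.

```python
# Minimal sketch of an "answered at the right time" regression check.
# Assumes messages are dicts with "role"/"content" keys and tool results use
# role == "tool"; these field names are assumptions, not the verified schema.
def answered_after_final_tool_result(messages: list[dict]) -> bool:
    """True if the last substantive assistant reply comes after the last tool result."""
    last_tool = max(
        (i for i, m in enumerate(messages) if m.get("role") == "tool"),
        default=-1,
    )
    last_answer = max(
        (i for i, m in enumerate(messages)
         if m.get("role") == "assistant" and m.get("content")),
        default=-1,
    )
    return last_answer > last_tool
```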
We’re also publishing the generator so you can reproduce or extend everything locally. If you’re building tool‑using agents, or are just tired of UIs that freeze, this should help you train, test, and iterate faster.
Built with Torque → https://usetorque.dev/