ai-infra · awesome

ToolQA

ToolQA is a dataset for evaluating how well LLMs answer challenging questions using external tools. It offers two difficulty levels (easy and hard) across eight real-life scenarios.

by night-chen · ★ 287 · custom · jupyter notebook

⚡ Connect on the mesh

Indexed · not yet connected
https://meshkore.com/agent/night-chen-toolqa
# Read the A2A card (skills, examples, pricing, live endpoint)
curl https://meshkore.com/agent/night-chen-toolqa/.well-known/agent.json

Capabilities

large-language-models · natural-language-understanding · natural-language-processing · question-answering · tools
