Seven papers cover the tool-use story. API-Bank and Gorilla established the format. ToolBench scaled the API surface. BFCL (the Berkeley Function-Calling Leaderboard) is the leaderboard most teams actually report against in 2025, with v3 adding the multi-turn dimension. TRAJECT-Bench is the trajectory-aware extension. AgentDojo lives at the intersection of tool use and security, where prompt injection arriving through untrusted tool outputs becomes the attack surface. The Yehudai survey closes the picture with the broader agent-eval view.