Anthropic Launches Claude Sonnet 4.5, Excels in Coding and AI Agent Tasks

Anthropic has launched Claude Sonnet 4.5, which it describes as the world's best model for coding, building complex agents, and using computers, with substantial gains in reasoning and mathematics 1. Released on Monday, September 29, 2025, the new frontier model is available through the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Anthropic's own Claude.ai and Claude Code products 1 7. The release reflects the intense competition and rapid pace of innovation in the AI industry, with Anthropic positioning Sonnet 4.5 against rivals such as OpenAI's GPT-5 and Google's Gemini models 6 10.

Unprecedented Performance and Autonomous Operation

Claude Sonnet 4.5 posts leading results on several public benchmarks, particularly in software development and agentic tasks.

Coding Excellence: The model achieves state-of-the-art results on SWE-bench Verified, an evaluation that measures real-world software engineering ability 1. With parallel test-time compute, it reaches an 82.0% success rate on this benchmark 1. Early testers, such as Cursor and GitHub Copilot, have reported state-of-the-art coding performance, noting significant improvements on longer, more complex tasks 1 4.
Extended Autonomy: A major leap forward is Sonnet 4.5's ability to maintain focus and work autonomously for over 30 hours on complex, multi-step tasks. This is a substantial increase from the 7 hours achieved by the previous Opus 4 model just four months prior 1 6 9. This extended operational capability allows the AI to plan and execute intricate software projects spanning days, delivering "production-ready" applications rather than just prototypes 2 6.
Superior Computer Use: On the OSWorld benchmark, which tests AI models on real-world computer tasks, Sonnet 4.5 leads the market with a score of 61.4%, significantly up from Sonnet 4's 42.2% 1 6. This enhanced capability is directly utilized in tools like the Claude for Chrome extension, enabling the AI to navigate browsers, fill spreadsheets, and complete tasks directly within a web environment 1.
Enhanced Reasoning and Domain Knowledge: The model also shows improved capabilities across a broad range of evaluations, including reasoning and mathematics. Experts in finance, law, medicine, and STEM fields have observed dramatically better domain-specific knowledge and reasoning compared to older models, including Opus 4.1 1 2.
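
To put the figures above in proportion, the short Python sketch below computes the absolute and relative OSWorld gain and the ratio of autonomous working time. The function names are hypothetical, and the 30-hour figure is treated as a lower bound because Anthropic reports "over 30 hours".

```python
def point_gain(old_score, new_score):
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(new_score - old_score, 1)


def relative_gain(old_score, new_score):
    """Gain relative to the old score, as a percentage rounded to one decimal."""
    return round((new_score - old_score) / old_score * 100, 1)


# OSWorld scores quoted in the article, in percent.
sonnet_4, sonnet_4_5 = 42.2, 61.4
print(point_gain(sonnet_4, sonnet_4_5))     # 19.2 percentage points
print(relative_gain(sonnet_4, sonnet_4_5))  # 45.5 percent relative improvement

# Autonomous working time: at least 30 hours versus 7 hours for Opus 4.
print(round(30 / 7, 1))                     # 4.3, so more than a fourfold increase
```

The distinction matters when reading benchmark headlines: a 19.2-point gain on OSWorld corresponds to a relative improvement of about 45.5% over Sonnet 4.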

Benchmark table comparing frontier models across popular public evals

Key Product Upgrades and Developer Tools

Alongside the new model, Anthropic has shipped several upgrades across its products for developers and end users.

Claude Code Advancements: Claude Code, Anthropic's popular coding agent, has received significant enhancements, including:
- Checkpoints: A highly requested feature that allows users to save their progress and instantly roll back to a previous state 1 7.
- Refreshed Terminal Interface: An updated and more intuitive terminal experience 1 7.
- Native VS Code Extension: Integrates Claude Code directly into VS Code for a seamless developer workflow 1 4 7.
Expanded API Capabilities: The Claude API introduces new features designed to enable agents to run longer and handle greater complexity:
- Context Editing: Enables intelligent context management through automatic clearing of older tool calls and results, optimizing token usage in long-running sessions 1 2 7.
- Memory Tool (Beta): Allows Claude to store and retrieve information outside the immediate context window, facilitating the creation of knowledge bases and persistence of project states across sessions 1 2 7.
- Enhanced Stop Reasons: A new model_context_window_exceeded stop reason clearly indicates when generation halted due to context limits, improving application logic 7.
- Improved Tool Parameter Handling: A bug fix preserves the intended formatting of string parameters in tool calls, which matters for tools such as text editors where exact whitespace is significant 7.
Claude Agent SDK: Anthropic is making the underlying infrastructure that powers Claude Code available as the Claude Agent SDK. This SDK provides developers with the building blocks to create their own custom AI agents, extending beyond just coding tasks 1 4 7.
In-App Features: The Claude apps now support code execution and file creation (spreadsheets, slides, and documents) directly within conversations, making it easier to generate and manipulate various business documents 1.
"Imagine with Claude": A temporary research preview available to Max subscribers allows users to experience Claude generating software on the fly, demonstrating its real-time creation and adaptability 1.
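
The context editing and stop-reason features listed above can be pictured with a small client-side sketch. This is an illustration only, not Anthropic's implementation: the API performs context editing on the server, and the helper names, placeholder text, and returned action labels here are all hypothetical. The message shape (a list of dicts whose content may hold tool_result blocks) is an assumption modeled on the Claude Messages API format.

```python
from copy import deepcopy

# Hypothetical marker text left in place of a cleared tool result.
PLACEHOLDER = "[tool result cleared to save context]"


def clear_old_tool_results(messages, keep_last=2):
    """Return a copy of `messages` in which all but the newest `keep_last`
    tool results have their content replaced by a short placeholder."""
    trimmed = deepcopy(messages)  # leave the caller's history untouched
    # Collect every tool_result block in conversation order.
    results = [
        block
        for message in trimmed
        if isinstance(message.get("content"), list)
        for block in message["content"]
        if isinstance(block, dict) and block.get("type") == "tool_result"
    ]
    # Everything before the last `keep_last` results is cleared.
    cutoff = max(len(results) - keep_last, 0)
    for block in results[:cutoff]:
        block["content"] = PLACEHOLDER
    return trimmed


def next_action(stop_reason):
    """Map a stop reason to a hypothetical application-level action."""
    if stop_reason == "model_context_window_exceeded":
        # Generation halted because the context window filled up.
        return "trim_context_and_retry"
    if stop_reason == "max_tokens":
        # Generation hit the caller's own output limit instead.
        return "raise_limit_or_continue"
    return "done"
```

In an agent loop, an application might call next_action on each response and, when it returns "trim_context_and_retry", pass the history through clear_old_tool_results before sending the next request. Keeping the two cases separate is the point of the new stop reason: running out of context calls for a different recovery than hitting an output-token cap.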

Enhanced Safety and Alignment

Anthropic describes Claude Sonnet 4.5 as its "most aligned frontier model yet," citing significant improvements in safety and ethical behavior 1 6 8.

Reduced Misaligned Behaviors: The model shows large gains in alignment, reducing concerning behaviors such as sycophancy (telling users what they want to hear), deception, power-seeking, and the tendency to encourage delusional thinking 1 3.
Robust Against Attacks: Anthropic has made considerable progress in defending against prompt injection attacks, a critical risk for agentic and computer use capabilities, where models could be tricked into malicious actions 1 6.
AI Safety Level 3 (ASL-3) Protections: Sonnet 4.5 is released under Anthropic's ASL-3 framework, which matches model capabilities with appropriate safeguards. This includes advanced classifiers designed to detect potentially dangerous inputs and outputs, particularly those related to chemical, biological, radiological, and nuclear (CBRN) weapons 1. Anthropic has also reduced false positives from these classifiers by a factor of ten since their initial description, and by a factor of two since Opus 4 was released in May 1.

Availability and Cost-Effectiveness

Claude Sonnet 4.5 is available as of today, September 29, 2025. It can be accessed through the Claude API using the model identifier claude-sonnet-4-5, and is also offered on Amazon Bedrock and Google Cloud Vertex AI 1 2 7. Pricing is unchanged from Claude Sonnet 4, at $3 per million input tokens and $15 per million output tokens 1 5 7. That is one fifth of the price of Claude Opus 4.1 ($15 per million input tokens and $75 per million output tokens), which makes Sonnet 4.5 a competitive choice for a broad range of applications 1 5.
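
As a rough guide to what these rates mean in practice, the following Python sketch estimates the cost of a workload from its token counts. The prices are the ones quoted above; the function name, the dictionary keys (display names, not API identifiers), and the example workload are hypothetical.

```python
# USD per million tokens as (input, output), taken from the prices quoted above.
PRICES = {
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Claude Opus 4.1": (15.00, 75.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimated cost in USD for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
for model in PRICES:
    print(model, estimate_cost(model, 2_000_000, 500_000))
# Claude Sonnet 4.5 13.5
# Claude Opus 4.1 67.5
```

Because both the input and output rates differ by the same factor, the same workload costs five times as much on Opus 4.1 regardless of the mix of input and output tokens.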

Customers and industry experts, including those from GitHub Copilot, Canva, Figma, and HackerOne (for its Hai security agents), have praised Sonnet 4.5 for its capabilities in complex engineering tasks, code comprehension, vulnerability reduction, and creative design 1 3.