Vectorfall.com - AI News and Updates
March 9, 2026 by Thomas Karlsson

GPT-5.4 arrives with computer-use agents and 1M-token context

OpenAI is rolling out GPT-5.4 today across ChatGPT, its API, and Codex, positioning it as its most capable and efficient model for professional work. The company is also introducing GPT-5.4 Pro in ChatGPT and the API for users who want maximum performance on complex tasks.

GPT-5.4 is framed as a consolidation release, pulling together recent work in reasoning, coding, and agent-style workflows into a single model. OpenAI says it incorporates the “industry-leading” coding capabilities it previously shipped in GPT-5.3-Codex, while improving how the model operates across tools and software environments commonly used in knowledge work, including spreadsheets, presentations, and documents.

In ChatGPT, the model appears as GPT-5.4 Thinking, and one of the headline changes is a new “preview plan” that outlines how the system intends to proceed on longer or more complex prompts. OpenAI says users can adjust that plan mid-response while the model is working, aiming to reduce the need for repeated back-and-forth to get an answer into the desired shape. The company also says GPT-5.4 Thinking improves deep web research for highly specific questions and maintains context better on tasks that require extended reasoning.

A major part of the release is what OpenAI describes as built-in, state-of-the-art computer-use capabilities in the API and Codex. The company calls GPT-5.4 its first general-purpose model shipped with native computer-use functionality, enabling agents to operate computers and carry out multi-application workflows. GPT-5.4 supports up to 1 million tokens of context, which OpenAI says helps agents plan, execute, and verify work over long time horizons.

OpenAI also points to efficiency gains, saying GPT-5.4 uses “significantly fewer tokens” to solve problems than GPT-5.2, which can translate into lower token consumption and faster responses. On a set of benchmarks it shared, GPT-5.4 posted higher scores than GPT-5.2 across several agent and tool-use evaluations, including GDPval (83.0% win-or-tie versus 70.9% for GPT-5.3-Codex and 71.0% for GPT-5.2), SWE-Bench Pro (57.7% versus 55.6% for GPT-5.2), OSWorld-Verified (75.0% versus 47.3% for GPT-5.2), Toolathlon (54.6% versus 46.3% for GPT-5.2), and BrowseComp (82.7% versus 65.8% for GPT-5.2).

For office-style outputs, OpenAI says it focused on improving spreadsheet, presentation, and document creation and editing. It reports that on an internal spreadsheet modeling benchmark meant to resemble the work of a junior bank analyst, GPT-5.4 averaged 87.5% compared with 68.4% for GPT-5.2. In a separate set of evaluation prompts, OpenAI says human raters preferred GPT-5.4’s presentations over those from GPT-5.2 68.0% of the time, citing cleaner layouts, more visual variety, and more effective use of image generation.

The company also claims progress on factuality. On a set of anonymized prompts where users had flagged factual errors, OpenAI says individual claims from GPT-5.4 were 33% less likely to be false, and full responses were 18% less likely to contain errors, than those from GPT-5.2.

On the developer side, OpenAI highlights computer-control coding and visual interaction. It says GPT-5.4 performs well at writing code to drive computer actions through libraries such as Playwright, and at generating mouse and keyboard commands from screenshots. The model’s behavior can be guided via developer messages, and OpenAI says developers can configure safety behavior by specifying custom confirmation policies to match different risk tolerances.
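To make the idea of a confirmation policy concrete, here is a minimal sketch in Python. It is not OpenAI's API: the `Risk` levels, `needs_confirmation`, and `run_action` are hypothetical names invented for illustration, and the assumption is simply that each proposed action carries a risk level and the developer picks a threshold at which a human must approve.

```python
from enum import IntEnum
from typing import Callable


class Risk(IntEnum):
    """Hypothetical risk levels for a proposed computer action."""
    LOW = 1     # e.g. scrolling or reading the screen
    MEDIUM = 2  # e.g. typing into a form
    HIGH = 3    # e.g. submitting a payment or deleting files


def needs_confirmation(action_risk: Risk, threshold: Risk) -> bool:
    """An action needs approval when its risk meets or exceeds the threshold."""
    return action_risk >= threshold


def run_action(name: str, action_risk: Risk, threshold: Risk,
               confirm: Callable[[str], bool]) -> str:
    """Gate an action behind the policy; `confirm` stands in for a human prompt."""
    if needs_confirmation(action_risk, threshold) and not confirm(name):
        return "skipped"
    return "executed"


# A cautious deployment confirms MEDIUM and above; a permissive one only HIGH.
deny_all = lambda name: False
print(run_action("type_into_form", Risk.MEDIUM, Risk.MEDIUM, deny_all))
print(run_action("type_into_form", Risk.MEDIUM, Risk.HIGH, deny_all))
```

Lowering the threshold trades speed for oversight, which is one plausible reading of "different risk tolerances" in the announcement.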

The release also includes changes aimed at tool-heavy agent systems. OpenAI introduced “tool search” in the API, so models can receive a concise list of available tools plus a search function, then fetch full tool definitions only when needed. OpenAI says this reduces the token overhead of large tool catalogs and improves speed and cost. In an evaluation of 250 tasks from Scale's MCP Atlas benchmark with 36 MCP servers enabled, OpenAI reports that placing servers behind tool search reduced total token usage by 47% while achieving the same accuracy.
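The pattern described here, a short listing up front with full definitions fetched on demand, can be sketched in plain Python. This is an illustration of the general idea only, not the OpenAI API: `ToolCatalog`, its methods, and the example tools are all hypothetical, and the matching is a naive keyword search.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToolCatalog:
    """Hypothetical in-memory tool catalog (illustrative, not a real API)."""
    definitions: dict[str, dict] = field(default_factory=dict)

    def register(self, name: str, description: str, schema: dict) -> None:
        self.definitions[name] = {
            "name": name,
            "description": description,
            "parameters": schema,
        }

    def summary(self) -> list[str]:
        """Concise listing sent with every request: tool names only."""
        return sorted(self.definitions)

    def search(self, query: str, limit: int = 3) -> list[dict]:
        """Return full definitions whose name or description has every query word."""
        words = query.lower().split()
        hits = [
            d for d in self.definitions.values()
            if all(w in f"{d['name']} {d['description']}".lower() for w in words)
        ]
        return hits[:limit]


catalog = ToolCatalog()
catalog.register("calendar_create_event", "Create a calendar event",
                 {"type": "object", "properties": {"title": {"type": "string"}}})
catalog.register("crm_find_contact", "Find a contact in the CRM",
                 {"type": "object", "properties": {"email": {"type": "string"}}})

# Character counts are a rough stand-in for the token overhead of each approach.
upfront = len(json.dumps(catalog.summary()))
everything = len(json.dumps(list(catalog.definitions.values())))
print(upfront, "characters up front versus", everything, "for all definitions")
```

With only two tools the saving is small; the reported 47% reduction came from a much larger catalog spread across 36 MCP servers.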

For availability, OpenAI says GPT-5.4 is rolling out gradually in ChatGPT and Codex, and is available in the API as gpt-5.4, with GPT-5.4 Pro as gpt-5.4-pro. In ChatGPT, GPT-5.4 Thinking is available to Plus, Team, and Pro users, while Enterprise and Edu customers can enable early access via admin settings. GPT-5.4 Pro is available for Pro and Enterprise plans.

OpenAI also published updated API pricing. It lists GPT-5.4 at $2.50 per million input tokens, $0.25 per million cached input tokens, and $15 per million output tokens, compared with GPT-5.2 at $1.75 per million input tokens, $0.175 per million cached input tokens, and $14 per million output tokens. For the Pro tier, it lists gpt-5.4-pro at $30 per million input tokens and $180 per million output tokens, compared with gpt-5.2-pro at $21 per million input tokens and $168 per million output tokens. OpenAI says Batch and Flex are priced at half the standard API rate, while priority processing is available at double the standard rate.
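As a worked example of what these rates mean per request, the following Python sketch computes the cost of a single call from the standard prices quoted above. The token counts are made up for illustration, and the `Rates` class and `request_cost` function are hypothetical helpers, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per one million tokens, as quoted in the article."""
    input: float
    cached_input: float
    output: float


GPT_5_4 = Rates(input=2.50, cached_input=0.25, output=15.00)
GPT_5_2 = Rates(input=1.75, cached_input=0.175, output=14.00)


def request_cost(rates: Rates, input_tokens: int, cached_tokens: int,
                 output_tokens: int, multiplier: float = 1.0) -> float:
    """Cost in USD for one request.

    input_tokens is the uncached portion of the prompt; cached_tokens is
    billed at the cached rate. Use multiplier=0.5 for Batch or Flex and
    multiplier=2.0 for priority processing, per the article.
    """
    total = (input_tokens * rates.input
             + cached_tokens * rates.cached_input
             + output_tokens * rates.output)
    return multiplier * total / 1_000_000


# Illustrative request: 200k uncached input, 800k cached input, 50k output.
for label, rates in (("gpt-5.4", GPT_5_4), ("gpt-5.2", GPT_5_2)):
    print(f"{label}: ${request_cost(rates, 200_000, 800_000, 50_000):.2f}")
# prints "gpt-5.4: $1.45" then "gpt-5.2: $1.19"
```

At identical token counts the newer model costs more per request. Whether a real workload ends up cheaper depends on how much the "significantly fewer tokens" claim holds for that workload, which this sketch does not model.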

On safety, OpenAI says it is treating GPT-5.4 as “high cyber capability” under its Preparedness Framework and deploying it with corresponding safeguards described in its system card documentation. Those measures include an expanded cybersecurity stack, monitoring, trusted access controls, and redirecting or blocking higher-risk requests for some customers. The company also says it has continued research into chain-of-thought monitoring and is releasing an open-source evaluation called “CoT controllability,” reporting that GPT-5.4 Thinking shows low ability to deliberately distort its reasoning to evade monitoring.
