Updates and recent posts about GPT-5.4..

Posts
Description

Link

@faun shared a link, 1 year ago

FAUN.dev()

BenchmarkQED: Automated benchmarking of RAG systems

BenchmarkQEDtakes RAG benchmarking to another level. ImagineLazyGraphRAGsmashing through competition—even when wielding a hefty1M-tokencontext. The only hitch? It occasionally stumbles on direct relevance for local queries. But fear not,AutoQis in its corner, crafting a smorgasbord of synthetic quer.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

The AI 4-Shot Testing Flow

4-Shot Testing Flowfuses AI's lightning-fast knack for spotting issues with the human knack for sniffing out those sneaky, context-heavy bugs. Trim QA time and expenses. While AI tears through broad test execution, human testers sharpen the lens, snagging false positives/negatives before they slip t.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

GenAI Meets SLMs: A New Era for Edge Computing

SLMspower up edge computing with speed and privacy finesse. They master real-time decisions and steal the spotlight in cramped settings like telemedicine andsmart cities. On personal devices, they outdoLLMs—trimming the fat with model distillation and quantization. Equipped withONNXandMediaPipe, the.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

Automate Models Training: An MLOps Pipeline with Tekton and Buildpacks

Tekton plusBuildpacks: your secret weapon for training GPT-2 without Dockerfile headaches. They wrap your code in containers, ensuring both security and performance.Tekton Pipelineslean on Kubernetes tasks to deliver isolation and reproducibility. Together, they transform CI/CD for ML into something.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

God is hungry for Context: First thoughts on o3 pro

OpenAIjust took an axe too3pricing—down 80%. Entero3-prowith its $20/$80 show. They boast a star-studded 64% win rate against o3. Forget Opus;o3-pronails picking the right tools and reading the room, flipping task-specific LLM apps on their heads... read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

How we’re responding to The New York Times’ data demands in order to protect user privacy

OpenAI is challenging a court order stemming from The New York Times' copyright lawsuit, which mandates the indefinite retention of user data from ChatGPT and API services. OpenAI contends this requirement violates user privacy commitments and sets a concerning precedent. While the company complies .. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

FinOps X 2025 Cloud Announcements: AI Agents and Increased FOCUS™ Support

AWSjust decreed its new AI-infusedCost Optimization Hub. This gizmo tackles the chaos of tracking overlapping opportunities among millions of resources. Meanwhile,Google CloudunleashedForecasting Enhancements. They claim their AI now wrangles pesky outliers and wild trends, turning financial crystal.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

Are You Over-Engineering Your Tests? – Think Like a Tester

Over-engineering alert:Automating every last thing? Recipe for disaster. Flaky tests galore! Stick to manual edge cases and sharp, atomic checks instead of drowning in script spaghetti.Abstraction overload ahead!Chasing too much abstraction makes maintenance a headache. Keep tests clean and clear.St.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

DevOps Tools Targeted for Cryptojacking

JINX-0132takes a sneaky approach. It exploits Nomad's initial slip-ups to secretly mine crypto. How? By leveraging GitHub for downloads and dodging those pesky Indicators of Compromise (IOCs). Even big players using Nomad to juggle hundreds of clients aren't safe. A simple misconfiguration and poof—.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

What I’ve Learned from Designing Landing Zones On Google Cloud

Cloud Foundation FabricandFASTmake Google Cloud feel more like a well-oiled machine than a hair-pulling puzzle. They slice through the setup with killer precision, laying down a rock-solid, enterprise-grade foundation. No IAM madness. No network disasters waiting to explode. Just scalable, secure co.. read more

GPT-5.4 is OpenAI’s latest frontier AI model designed to perform complex professional and technical work more reliably. It combines advances in reasoning, coding, tool use, and long-context understanding into a single system capable of handling multi-step workflows across software environments. The model builds on earlier GPT-5 releases while integrating the strong coding capabilities previously introduced with GPT-5.3-Codex.

One of the defining features of GPT-5.4 is its ability to operate as part of agent-style workflows. The model can interact with tools, APIs, and external systems to complete tasks that extend beyond simple text generation. It also introduces native computer-use capabilities, allowing AI agents to operate applications using keyboard and mouse commands, screenshots, and browser automation frameworks such as Playwright.

GPT-5.4 supports context windows of up to one million tokens, enabling it to process and reason over very large documents, long conversations, or complex project contexts. This makes it suitable for tasks such as analyzing codebases, generating technical documentation, working with large spreadsheets, or coordinating long-running workflows. The model also introduces a feature called tool search, which allows it to dynamically retrieve tool definitions only when needed. This reduces token usage and makes it more efficient to work with large ecosystems of tools, including environments with dozens of APIs or MCP servers.

In addition to improved reasoning and automation capabilities, GPT-5.4 focuses on real-world productivity tasks. It performs better at generating and editing spreadsheets, presentations, and documents, and it is designed to maintain stronger context across longer reasoning processes. The model also improves factual accuracy and reduces hallucinations compared with previous versions.

GPT-5.4 is available across OpenAI’s ecosystem, including ChatGPT, the OpenAI API, and Codex. A higher-performance variant, GPT-5.4 Pro, is also available for users and developers who require maximum performance for complex tasks such as advanced research, large-scale automation, and demanding engineering workflows. Together, these capabilities position GPT-5.4 as a model aimed not just at conversation, but at executing real work across software systems.