Updates and recent posts about GPT-5.4.

Link

@faun shared a link, 7 months, 1 week ago

FAUN.dev()

Guardians of the Agents

A new static verification framework wants to make runtime safeguards look lazy. It slaps **mathematical safety proofs** onto LLM-generated workflows *before* they run—no more crossing fingers at execution time. The setup decouples **code from data**, then runs checks with tools like **CodeQL** and .. read more

LLM Evaluation: Practical Tips at Booking.com

Booking.com built Judge-LLM, a framework where strong LLMs evaluate other models against a carefully curated golden dataset. Clear metric definitions, rigorous annotation, and iterative prompt engineering make evaluations more scalable and consistent than relying solely on humans. **The takeaway**:.. read more

The LinkedIn Generative AI Application Tech Stack: Extending to Build AI Agents

LinkedIn tore down its GenAI stack and rebuilt it for scale—with agents, not monoliths. The new setup leans on distributed, gRPC-powered systems. Central skill registry? Check. Message-driven orchestration? Yep. It’s all about pluggable parts that play nice together. They added sync and async modes.. read more

Vibe coding has turned senior devs into ‘AI babysitters,’ but they say it’s worth it

Fastly says 95% of developers spend extra time fixing AI-written code. Senior engineers take the brunt. That overhead has even spawned a new gig: “vibe code cleanup specialist.” (Yes, seriously.) As teams lean harder on AI tools, reliability and security start to slide, unless someone steps in. The re.. read more

Understanding LLMs: Insights from Mechanistic Interpretability

LLMs generate text by predicting the next word using attention to capture context and MLP layers to store learned patterns. Mechanistic interpretability shows these models build circuits of attention and features, and tools like sparse autoencoders and attribution graphs help unpack superposition, r.. read more

GitHub Copilot on autopilot as community complaints persist

GitHub's biggest debates right now? Whether to shut down AI-generated "noise" from Copilot, stuff like auto-written issues and code reviews. No clear answers from GitHub yet. Frustration is piling up. Some devs are ditching the platform altogether, shifting their projects to Codeberg or spinning up self-.. read more

Accelerate serverless testing with LocalStack integration in VS Code IDE

The AWS Toolkit for VS Code now hooks straight into **LocalStack**. Run full end-to-end tests for **serverless workflows**—Lambda, SQS, EventBridge, the whole crew—without bouncing between tools or writing boilerplate. Just deploy to LocalStack from the IDE using the **AWS SAM CLI**. It feels like .. read more

Writing an operating system kernel from scratch

A barebones time-sharing OS kernel, written in Zig, running on RISC-V. It leans on OpenSBI for console I/O and timer interrupts. Threads? Statically allocated, each running in user mode (U-mode). The kernel stays in supervisor mode (S-mode), where it catches system calls and context switches via timer ticks. .. read more

Magical systems thinking

AI now writes over **25% of Google’s** and as much as **90% of Anthropic’s** code. That’s not a trend—it’s a regime change. Still, the mess in large public systems reminds us: clever analysis isn’t enough. Complex systems don’t behave; they misbehave. When the machines are churning out code, the .. read more

Scaling Prometheus: Managing 80M Metrics Smoothly

Flipkart ditched its creaky StatsD + InfluxDB stack for a federated Prometheus setup, built to handle 80M+ time-series metrics without choking. The move leaned into pull-based collection, PromQL's firepower, and hierarchical federation for smarter aggregation and long-haul queries. Why it matters: Prometheus.. read more

GPT-5.4 is OpenAI’s latest frontier AI model designed to perform complex professional and technical work more reliably. It combines advances in reasoning, coding, tool use, and long-context understanding into a single system capable of handling multi-step workflows across software environments. The model builds on earlier GPT-5 releases while integrating the strong coding capabilities previously introduced with GPT-5.3-Codex.

One of the defining features of GPT-5.4 is its ability to operate as part of agent-style workflows. The model can interact with tools, APIs, and external systems to complete tasks that extend beyond simple text generation. It also introduces native computer-use capabilities, allowing AI agents to operate applications using keyboard and mouse commands, screenshots, and browser automation frameworks such as Playwright.
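To make the computer-use idea concrete, here is a minimal sketch of the harness side of such a loop: the model proposes actions, and a small dispatcher executes them and collects screenshots to send back. The action format (`type`, `x`, `y`, `text`), the `Computer` interface, and `FakeComputer` are all hypothetical names invented for this illustration; they are not OpenAI's actual schema.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Computer(Protocol):
    """Hypothetical interface a harness might expose to the model's actions."""

    def click(self, x: int, y: int) -> None: ...
    def type_text(self, text: str) -> None: ...
    def screenshot(self) -> bytes: ...


@dataclass
class FakeComputer:
    """In-memory stand-in that only records what it was asked to do."""

    log: list[str] = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")

    def screenshot(self) -> bytes:
        self.log.append("screenshot")
        return b"fake-image-bytes"


def run_actions(computer: Computer, actions: list[dict]) -> list[bytes]:
    """Execute model-proposed actions in order.

    Returns the screenshots taken, which a real loop would send back
    to the model as its next observation.
    """
    observations: list[bytes] = []
    for action in actions:
        kind = action["type"]  # assumed field name, not an official schema
        if kind == "click":
            computer.click(action["x"], action["y"])
        elif kind == "type":
            computer.type_text(action["text"])
        elif kind == "screenshot":
            observations.append(computer.screenshot())
        else:
            # Reject anything the harness does not explicitly support.
            raise ValueError(f"unsupported action type: {kind!r}")
    return observations
```

In a real harness, the same three methods could plausibly delegate to Playwright's `page.mouse.click`, `page.keyboard.type`, and `page.screenshot`. Keeping the dispatcher separate from the backend makes it possible to test the loop without a browser and to refuse unknown action types outright.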

GPT-5.4 supports context windows of up to one million tokens, enabling it to process and reason over very large documents, long conversations, or complex project contexts. This makes it suitable for tasks such as analyzing codebases, generating technical documentation, working with large spreadsheets, or coordinating long-running workflows.

The model also introduces a feature called tool search, which allows it to dynamically retrieve tool definitions only when needed. This reduces token usage and makes it more efficient to work with large ecosystems of tools, including environments with dozens of APIs or MCP servers.
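The token saving from retrieving tool definitions on demand can be illustrated with a toy registry. The sketch below is not how OpenAI implements tool search. It uses a plain keyword-overlap ranking, a rough "four characters per token" estimate, and made-up tool names and schemas, all of which are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDef:
    name: str
    summary: str  # short description used for matching
    schema: str   # full definition, only sent when the tool is selected


def rough_tokens(text: str) -> int:
    """Crude size estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def search_tools(registry: list[ToolDef], query: str, top_k: int = 2) -> list[ToolDef]:
    """Rank tools by word overlap between the query and each name + summary."""
    query_words = set(query.lower().split())

    def score(tool: ToolDef) -> int:
        text = f"{tool.name} {tool.summary}".lower().replace("_", " ")
        return len(query_words & set(text.split()))

    ranked = sorted(registry, key=score, reverse=True)
    # Drop tools with no overlap at all, even if they fit within top_k.
    return [tool for tool in ranked[:top_k] if score(tool) > 0]


def schema_tokens(tools: list[ToolDef]) -> int:
    """Estimated prompt cost of including the full schema of each tool."""
    return sum(rough_tokens(tool.schema) for tool in tools)


# Hypothetical registry; names and schemas are invented for this example.
REGISTRY = [
    ToolDef("send_email", "send email message to a recipient",
            '{"to": "string", "subject": "string", "body": "string"}'),
    ToolDef("create_issue", "open a new issue in a tracker",
            '{"title": "string", "body": "string", "labels": ["string"]}'),
    ToolDef("query_metrics", "fetch time series metrics from a monitoring backend",
            '{"expression": "string", "start": "string", "end": "string"}'),
    ToolDef("read_spreadsheet", "read cells from a spreadsheet file",
            '{"path": "string", "sheet": "string", "range": "string"}'),
]
```

Calling `search_tools(REGISTRY, "send an email to the team")` selects only `send_email`, so `schema_tokens` on that result is smaller than `schema_tokens(REGISTRY)`, which is the cost of loading every definition up front. The gap widens as the registry grows, which is the effect the paragraph above describes for environments with dozens of APIs or MCP servers.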

In addition to improved reasoning and automation capabilities, GPT-5.4 focuses on real-world productivity tasks. It performs better at generating and editing spreadsheets, presentations, and documents, and it is designed to maintain stronger context across longer reasoning processes. The model also improves factual accuracy and reduces hallucinations compared with previous versions.

GPT-5.4 is available across OpenAI’s ecosystem, including ChatGPT, the OpenAI API, and Codex. A higher-performance variant, GPT-5.4 Pro, is also available for users and developers who require maximum performance for complex tasks such as advanced research, large-scale automation, and demanding engineering workflows. Together, these capabilities position GPT-5.4 as a model aimed not just at conversation, but at executing real work across software systems.