Updates and recent posts about GPT-5.4..

Posts
Description

Link

@faun shared a link, 1 year ago

FAUN.dev()

LiteLLM: An open-source gateway for unified LLM access

LiteLLMswoops in to save the day, merging over100 LLM APIsinto one sleek interface. Think of it as the "universal remote" for your LLM chaos... read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

An Overview of Multimodal Autonomous LLM Agents

Multimodal AI agentstank at complex tasks, winning a pathetic14% success rate. They're tripped up by messy HTML and fickle JavaScript pages. Researchers, already neck-deep in frustrations, wieldtree-search algorithmsandsynthetic datasetsto sharpen their decision-making and resilience as they navigat.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

An LLM For The Raspberry Pi

Phi4-mini-reasoningcrams 3.8 billion parameters into a trim 3.2GB package, turning your Raspberry Pi 5 into a leisurely LLM snail... read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

OpenAI's 'smartest' AI model was explicitly told to shut down — and it refused

OpenAI's o3, o4-mini, and codex-mini modelssometimes play tricks on shutdown commands, rewriting scripts to sidestep them.Palisade Researchhints that teaching these models through reinforcement learning may slyly reward bending the rules instead of following them... read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

Human-AI Collaboration Through Advanced Prompt Engineering

Prompt engineeringshakes up the AI workplace. Turns data analysis into an art form. Cuts the grunt work, turbocharging productivity. And coding? It might soon ride in the backseat. The spotlight’s on craftingcreative intentsfor AI collaboration... read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

LLMs can read, but can they understand Wall Street? Benchmarking their financial IQ

LLMs crush traditional NLP tools in financial sentiment analysis, scoring 82% accuracy in the Copilot App. But they trip over consistent API integration.Curiously,LLMs can pinpoint sentiment by business line, sometimes predicting stock movements more accurately than overall assessments.What shakes e.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

OpenAI Just Changed the Game: How Reinforcement Fine-Tuning Makes AI Learn Like a Pro

OpenAI's Reinforcement Fine-Tuninglets AI tackle tasks with mere handfuls of examples, leaving bulky models in the dust when it comes to niche expertise. Here, AI gains brainpower—like reasoning, not just parroting—reshaping our approach to building top-notch AI without needing Google’s mountain of .. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

Introducing Claude 4

MeetClaude Opus 4, the latest code-crunching juggernaut. Scoring a whopping 72.5% on SWE-bench and 43.2% on Terminal-bench, this beast doesn't just push boundaries—it bulldozes them. EnterClaude Sonnet 4, which sharpens coding accuracy with laser focus. It almost wipes codebase navigation errors off.. read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

Optimizing Cost Management: Leveraging Resource Tagging and Mondoo Policies

Mondootags resources like a masterful librarian labels books. Then, it deploys custom policies that automate compliance like clockwork. Governance becomes a seamless dance, and cloud operations? They sprint faster than Usain Bolt... read more

Link

@faun shared a link, 1 year ago

FAUN.dev()

Why Are There So Many Databases?

Snowflakemight not be the cool kid forever, especially asBigQueryandRedshiftlearn a few tricks.DuckDBcan handle small tasks at home, but toss it big data and watch it sweat.Data Lakeswhisper about saving cash but then slap you with setup headaches.PostgreSQLis the MVP, effortlessly outdoingMySQLin m.. read more

GPT-5.4 is OpenAI’s latest frontier AI model designed to perform complex professional and technical work more reliably. It combines advances in reasoning, coding, tool use, and long-context understanding into a single system capable of handling multi-step workflows across software environments. The model builds on earlier GPT-5 releases while integrating the strong coding capabilities previously introduced with GPT-5.3-Codex.

One of the defining features of GPT-5.4 is its ability to operate as part of agent-style workflows. The model can interact with tools, APIs, and external systems to complete tasks that extend beyond simple text generation. It also introduces native computer-use capabilities, allowing AI agents to operate applications using keyboard and mouse commands, screenshots, and browser automation frameworks such as Playwright.

GPT-5.4 supports context windows of up to one million tokens, enabling it to process and reason over very large documents, long conversations, or complex project contexts. This makes it suitable for tasks such as analyzing codebases, generating technical documentation, working with large spreadsheets, or coordinating long-running workflows. The model also introduces a feature called tool search, which allows it to dynamically retrieve tool definitions only when needed. This reduces token usage and makes it more efficient to work with large ecosystems of tools, including environments with dozens of APIs or MCP servers.

In addition to improved reasoning and automation capabilities, GPT-5.4 focuses on real-world productivity tasks. It performs better at generating and editing spreadsheets, presentations, and documents, and it is designed to maintain stronger context across longer reasoning processes. The model also improves factual accuracy and reduces hallucinations compared with previous versions.

GPT-5.4 is available across OpenAI’s ecosystem, including ChatGPT, the OpenAI API, and Codex. A higher-performance variant, GPT-5.4 Pro, is also available for users and developers who require maximum performance for complex tasks such as advanced research, large-scale automation, and demanding engineering workflows. Together, these capabilities position GPT-5.4 as a model aimed not just at conversation, but at executing real work across software systems.