Updates and recent posts about GPT-5.4..

Posts
Description

Link

@faun shared a link, 11 months ago

FAUN.dev()

Massive study detects AI fingerprints in millions of scientific papers

Study finds 13.5% of 2024 PubMed papers bear LLM fingerprints, showcasing a shift to jazzy "stylistic" verbs over stodgy nouns.Upending stuffy academic norms!.. read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

LLM Evaluation Metrics: The Ultimate LLM Evaluation Guide - Confident AI

Dump BLEU and ROUGE. Let LLM-as-a-judge tools like G-Eval propel you to pinpoint accuracy.The old scorers? They whiff on meaning, like a cat batting at a laser dot.DeepEval? It wrangles bleeding-edge metrics with five lines of neat code.Want a personal touch? G-Eval's got your back. DAG keeps benchm.. read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

Building “Auto-Analyst” — A data analytics AI agentic system

DSPyfuels a modular AI machine, drivingagent chainsto weave tidy analysis scripts. But it’s not all sunshine and roses—hallucination errors like to throw reliability under the bus... read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

MCP — The Missing Link Between AI Models and Your Applications

Model Context Protocol (MCP)tackles the "MxN problem" in AI by creating a universal handshake for tool interactions. It simplifies howLLMstap into external resources. MCP leans onJSON-RPC 2.0for streamlined dialogues, building modular, maintainable, and secure ecosystems that boast reusable and inte.. read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

‘Shit in, shit out’: AI is coming for agriculture, but farmers aren’t convinced

Aussie farmers want "more automation, fewer bells and whistles"—technology should work like a tractor, not act like an app:straightforward, adaptable, and rock-solid... read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

Supabase MCP can leak your entire SQL database

Supabase MCP'saccess can barge right past RLS,spilling SQL databaseswhen faced with sneaky inputs. It's a cautionary tale from the world ofLLM system trifecta attacks... read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

Document Search with NLP: What Actually Works (and Why)

NLP document search trounces old-school keyword hunting. It taps into scalable*vector databasesandsemantic vectorsto grasp meaning, not just parrot words.* Pictureword vector arithmetic: "King - Man + Woman = Queen." It's magic. Searches become lightning-fast and drenched in context... read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

A non-anthropomorphized view of LLMs

CallingLLMssentient or ethical? That's a stretch. Behind the curtain, they're just fancy algorithms dressed up as text wizards. Humans? They're a whole mess of complexity... read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

Context Engineering for Agents

Context engineeringcranks an AI agent up to 11 by juggling memory like a slick OS. It writes, selects, compresses, and isolates—never missing a beat despite those pesky token limits. Nail the context, and you've got a dream team. Slip up, though, and you might trigger chaos, like when ChatGPT went r.. read more

Link

@faun shared a link, 11 months ago

FAUN.dev()

My Honest Advice for Aspiring Machine Learning Engineers

Becoming a machine learning engineer requires dedicatingat least 10 hours per weekto studying outside of everyday responsibilities. This can take a minimum of two years, even with an ideal background, due to the complexity of the required skills. Understanding core algorithms and mastering the funda.. read more

GPT-5.4 is OpenAI’s latest frontier AI model designed to perform complex professional and technical work more reliably. It combines advances in reasoning, coding, tool use, and long-context understanding into a single system capable of handling multi-step workflows across software environments. The model builds on earlier GPT-5 releases while integrating the strong coding capabilities previously introduced with GPT-5.3-Codex.

One of the defining features of GPT-5.4 is its ability to operate as part of agent-style workflows. The model can interact with tools, APIs, and external systems to complete tasks that extend beyond simple text generation. It also introduces native computer-use capabilities, allowing AI agents to operate applications using keyboard and mouse commands, screenshots, and browser automation frameworks such as Playwright.

GPT-5.4 supports context windows of up to one million tokens, enabling it to process and reason over very large documents, long conversations, or complex project contexts. This makes it suitable for tasks such as analyzing codebases, generating technical documentation, working with large spreadsheets, or coordinating long-running workflows. The model also introduces a feature called tool search, which allows it to dynamically retrieve tool definitions only when needed. This reduces token usage and makes it more efficient to work with large ecosystems of tools, including environments with dozens of APIs or MCP servers.

In addition to improved reasoning and automation capabilities, GPT-5.4 focuses on real-world productivity tasks. It performs better at generating and editing spreadsheets, presentations, and documents, and it is designed to maintain stronger context across longer reasoning processes. The model also improves factual accuracy and reduces hallucinations compared with previous versions.

GPT-5.4 is available across OpenAI’s ecosystem, including ChatGPT, the OpenAI API, and Codex. A higher-performance variant, GPT-5.4 Pro, is also available for users and developers who require maximum performance for complex tasks such as advanced research, large-scale automation, and demanding engineering workflows. Together, these capabilities position GPT-5.4 as a model aimed not just at conversation, but at executing real work across software systems.