Updates and recent posts about GPT-5.4.
@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

walrus: ingesting data at memory speeds

Walrus is a lock-free, single-node Write Ahead Log in Rust that rips through a million ops/sec and moves 1 GB/s of write bandwidth - on bare-metal, nothing fancy. It leans on mmap-backed sparse files, atomic counters, and zero-copy reads to get there. Each topic gets its own line of 10MB memory-mapped .. read more

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

Technical Tuesday: 10 best practices for building reliable AI agents in 2025

UiPath just dropped Agent Builder in Studio - a legit development environment for AI agents that can actually handle enterprise chaos. Think production-grade: modular builds, traceable steps, and failure handling that doesn’t flake under pressure. It’s wired for schema-driven prompts, tool versioning, a.. read more

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

Write Deep Learning Code Locally and Run on GPUs Instantly

Modal cuts the drama out of deep learning ops. Devs write Python like usual, then fire off training, eval, and serving scripts to serverless GPUs - zero cluster wrangling. It handles data blobs, image builds, and orchestration. You focus on tuning with libraries like Unsloth, or serving via vLLM... read more  

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

The RAG Obituary: Killed by Agents, Buried by Context Windows

Agent-based setups are starting to edge out old-school RAG. As LLMs snag multi-million-token context windows and better task chops, the need for chunking, embeddings, and reranking starts to fade. Claude Code, for example, skips all that - with direct file access and smart navigation instead. Retrie.. read more  

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

Serverless RL: Faster, Cheaper and More Flexible RL Training

Serverless RL is a new product available through a collaboration between CoreWeave, Weights & Biases, and OpenPipe. It offers fast training, lower costs, and simple model deployment. It saves time with no infra setup, faster feedback loops, and easier entry into RL training... read more

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

How LogSeam Searches 500 Million Logs per second

LogSeam rips through 500M log searches/sec and pushes 1.5+ TB/s throughput using Tigris’ geo-distributed object storage. It slashes log volume by 100× with Parquet + Zstandard compression. Then it spins up compute on the fly, right where the data lives - no long-running infrastructure, no laggy reads... read more

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

Ansible Service Module: Start, Stop, & Manage Services

The Ansible service module handles Linux and Windows without choking on init system quirks. One playbook can start, stop, enable, or restart anything - no matter the OS. Idempotent, so you don’t have to babysit state. Clean and repeatable. Bonus: it’s great for wrangling fleets. Think: coordinating servi.. read more

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

How AWS S3 serves 1 petabyte per second on top of slow HDDs

AWS S3 doesn’t need fancy hardware. It wrings performance out of cheap HDDs, log-structured merge trees, and erasure coding. The trick? Shard everything. Hit it in parallel. Randomized placement dodges hotspots. Hedged requests race the slowest links. And when things get lopsided, S3 rebalances - constant.. read more
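
To make the "hedged requests" idea concrete, here is a minimal Python sketch of the general technique: send a request to one replica, and if it has not answered within a short delay, send the same request to another and take whichever finishes first. This is an illustration only, not S3's implementation; `hedged_call`, `hedge_after_s`, and the two fake replicas are hypothetical names invented for the example.

```python
import concurrent.futures
import time


def hedged_call(replicas, hedge_after_s):
    """Call replicas one at a time, adding a backup call whenever the
    calls already in flight have not answered within hedge_after_s.

    Assumes `replicas` is a non-empty list of zero-argument callables.
    Simplification: the first call to finish wins, even if it raised.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    try:
        pending = set()
        for replica in replicas:
            pending.add(pool.submit(replica))
            done, pending = concurrent.futures.wait(
                pending,
                timeout=hedge_after_s,
                return_when=concurrent.futures.FIRST_COMPLETED,
            )
            if done:
                return next(iter(done)).result()
        # Every replica is now in flight: block until any one finishes.
        done, _ = concurrent.futures.wait(
            pending, return_when=concurrent.futures.FIRST_COMPLETED
        )
        return next(iter(done)).result()
    finally:
        # Do not block on the slower calls; they finish in the background.
        pool.shutdown(wait=False)


def slow_replica():
    time.sleep(0.5)  # simulates a slow disk or a congested link
    return "slow"


def fast_replica():
    time.sleep(0.01)
    return "fast"


# The slow replica is tried first; the hedge kicks in after 50 ms.
result = hedged_call([slow_replica, fast_replica], hedge_after_s=0.05)
print(result)
```

The hedge delay is the main design knob: a short delay trims tail latency but sends more duplicate requests, while a long delay does the opposite.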

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

Seven Years of Firecracker

AWS is putting Firecracker microVMs to work in two fresh stacks: AgentCore, the new base layer for AI agents, and Aurora DSQL, a serverless, PostgreSQL-compatible database it just rolled out. AgentCore gives each agent session its own microVM. More isolation, less cross-talk - solid for multistep LLM wo.. read more

@faun shared a link, 6 months, 2 weeks ago
FAUN.dev()

Automated GitHub Self-Hosted Runner Cleanup: Lambda Functions and Auto Scaling Lifecycle Hooks

When an EC2 instance in an Auto Scaling Group shuts down, event-driven plumbing kicks in. A lifecycle hook catches the scale-in, fires off an SNS notification, and triggers a Lambda. That Lambda calls the GitHub API to yank the self-hosted runner before the instance dies. No dangling runners. No manual.. read more
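
As a rough sketch of the first half of that flow, the Python below pulls the terminating instance out of an SNS-wrapped lifecycle notification and builds the GitHub REST URL for deleting a repository-level runner. It only parses and formats; the actual HTTP call, authentication, and the mapping from instance ID to runner ID are left out, and the helper names, sample values, and `octo-org/octo-repo` are hypothetical. This is not the linked article's code.

```python
import json

TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"


def terminating_instances(event):
    """Return lifecycle details for each terminating instance found in an
    SNS-triggered Lambda event."""
    found = []
    for record in event.get("Records", []):
        # SNS delivers the lifecycle notification as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        if message.get("LifecycleTransition") != TERMINATING:
            continue  # skip test notifications and launch hooks
        found.append({
            "instance_id": message["EC2InstanceId"],
            "hook_name": message["LifecycleHookName"],
            "group_name": message["AutoScalingGroupName"],
            "action_token": message["LifecycleActionToken"],
        })
    return found


def runner_delete_url(owner, repo, runner_id):
    """URL for a DELETE request that removes a repository-level runner."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/runners/{runner_id}"
    )


# Made-up sample event in the shape Lambda receives from SNS.
sample_event = {
    "Records": [{
        "Sns": {
            "Message": json.dumps({
                "LifecycleTransition": TERMINATING,
                "EC2InstanceId": "i-0123456789abcdef0",
                "LifecycleHookName": "runner-cleanup",
                "AutoScalingGroupName": "runners-asg",
                "LifecycleActionToken": "example-token",
            })
        }
    }]
}

print(terminating_instances(sample_event))
print(runner_delete_url("octo-org", "octo-repo", 42))
```

A full handler would also need to tell Auto Scaling to continue the lifecycle action once the runner is gone, so the instance is not held until the hook times out.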

GPT-5.4 is OpenAI’s latest frontier AI model designed to perform complex professional and technical work more reliably. It combines advances in reasoning, coding, tool use, and long-context understanding into a single system capable of handling multi-step workflows across software environments. The model builds on earlier GPT-5 releases while integrating the strong coding capabilities previously introduced with GPT-5.3-Codex.

One of the defining features of GPT-5.4 is its ability to operate as part of agent-style workflows. The model can interact with tools, APIs, and external systems to complete tasks that extend beyond simple text generation. It also introduces native computer-use capabilities, allowing AI agents to operate applications using keyboard and mouse commands, screenshots, and browser automation frameworks such as Playwright.

GPT-5.4 supports context windows of up to one million tokens, enabling it to process and reason over very large documents, long conversations, or complex project contexts. This makes it suitable for tasks such as analyzing codebases, generating technical documentation, working with large spreadsheets, or coordinating long-running workflows. The model also introduces a feature called tool search, which allows it to dynamically retrieve tool definitions only when needed. This reduces token usage and makes it more efficient to work with large ecosystems of tools, including environments with dozens of APIs or MCP servers.
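
The idea behind tool search can be sketched in a few lines of Python: keep only a cheap index of tool names in context, and load full definitions for the handful of tools a query matches. This is a conceptual illustration, not OpenAI's API or implementation; `ToolDefinition`, `ToolRegistry`, the keyword matching, and the three sample tools are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDefinition:
    """Full definition of a tool; the schema is the costly part."""
    name: str
    description: str
    parameters_schema: str  # e.g. a JSON Schema string, often long


class ToolRegistry:
    """Hypothetical registry: names are listed up front, while full
    definitions are returned only for tools matched by a search."""

    def __init__(self, tools):
        self._tools = {tool.name: tool for tool in tools}

    def summaries(self):
        # Cheap index: tool names only, no schemas.
        return sorted(self._tools)

    def search(self, query, limit=3):
        # Naive keyword match over name and description; a real system
        # would likely use a more robust retrieval method.
        terms = query.lower().split()
        scored = []
        for tool in self._tools.values():
            haystack = f"{tool.name} {tool.description}".lower()
            score = sum(term in haystack for term in terms)
            if score:
                scored.append((score, tool.name))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [self._tools[name] for _, name in scored[:limit]]


tools = [
    ToolDefinition(
        "create_invoice",
        "Create a billing invoice for a customer",
        '{"type": "object", "properties": {"customer_id": {"type": "string"}}}',
    ),
    ToolDefinition(
        "list_deployments",
        "List recent service deployments",
        '{"type": "object", "properties": {"service": {"type": "string"}}}',
    ),
    ToolDefinition(
        "send_email",
        "Send an email message",
        '{"type": "object", "properties": {"to": {"type": "string"}}}',
    ),
]
registry = ToolRegistry(tools)

matches = registry.search("billing invoice")
eager_chars = sum(len(tool.parameters_schema) for tool in tools)
lazy_chars = sum(len(tool.parameters_schema) for tool in matches)

print(registry.summaries())
print([tool.name for tool in matches])
print(lazy_chars, "of", eager_chars, "schema characters loaded")
```

With three tools the saving is small, but the gap between loading every schema and loading only the matched ones grows with the size of the tool ecosystem.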

In addition to improved reasoning and automation capabilities, GPT-5.4 focuses on real-world productivity tasks. It performs better at generating and editing spreadsheets, presentations, and documents, and it is designed to maintain stronger context across longer reasoning processes. The model also improves factual accuracy and reduces hallucinations compared with previous versions.

GPT-5.4 is available across OpenAI’s ecosystem, including ChatGPT, the OpenAI API, and Codex. A higher-performance variant, GPT-5.4 Pro, is also available for users and developers working on the most demanding tasks, such as advanced research, large-scale automation, and complex engineering workflows. Together, these capabilities position GPT-5.4 as a model aimed not just at conversation, but at executing real work across software systems.