Posts from @that-devops-guy
Link
@faun shared a link, 7 months, 2 weeks ago
FAUN.dev()

Automatically Evaluating AI Coding Assistants with Each Git Commit · TensorZero

TensorZero transforms developer lives by nabbing feedback from Cursor's LLM inferences. It dives into the details with tree edit distance (TED) to dissect code. Over in a different corner, Claude 3.7 Sonnet schools GPT-4.1 when it comes to personalized coding. Who knew? Not all AI flexes equally...


Google Cloud donates A2A to Linux Foundation - Google Developers Blog

Introducing Agent2Agent, and brace yourself for the heavyweights: AWS, Cisco, Google, and a few more are in on it. Their mission? Crafting the universal lingo for AI agents. It's called the A2A protocol. Finally, they're smashing the silos holding AI back...


Meta Hires OpenAI Researchers to Boost AI Capabilities

Meta cranks up its AI antics. They've snagged former OpenAI whiz kids, snatched 49% of Scale AI, and roped in enough nuclear energy to keep their data hubs humming all night long...


A non-anthropomorphized view of LLMs

Calling LLMs sentient or ethical? That's a stretch. Behind the curtain, they're just fancy algorithms dressed up as text wizards. Humans? They're a whole mess of complexity...


Building “Auto-Analyst” — A data analytics AI agentic system

DSPy fuels a modular AI machine, driving agent chains to weave tidy analysis scripts. But it's not all sunshine and roses: hallucination errors like to throw reliability under the bus...


MCP — The Missing Link Between AI Models and Your Applications

Model Context Protocol (MCP) tackles the "MxN problem" in AI by creating a universal handshake for tool interactions. It simplifies how LLMs tap into external resources. MCP leans on JSON-RPC 2.0 for streamlined dialogues, building modular, maintainable, and secure ecosystems that boast reusable and inte...
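
For readers who want the gist in code, here is a minimal sketch of the two ideas in that teaser. It is an illustration under stated assumptions, not the linked article's code: the helper names and the `get_weather` tool are hypothetical, and the `tools/call` method with `name`/`arguments` parameters follows the MCP specification as commonly documented. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`) come from JSON-RPC 2.0 itself.

```python
import json
from itertools import count

# Hypothetical helpers for illustration only.
_request_ids = count(1)


def integrations_without_protocol(num_models: int, num_tools: int) -> int:
    """Every model needs a bespoke adapter for every tool: M x N."""
    return num_models * num_tools


def integrations_with_protocol(num_models: int, num_tools: int) -> int:
    """Each model and each tool implements the shared protocol once: M + N."""
    return num_models + num_tools


def make_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }


def make_result(request: dict, result: dict) -> dict:
    """Build the matching JSON-RPC 2.0 success response (same id as the request)."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


if __name__ == "__main__":
    print(integrations_without_protocol(5, 20), "adapters without a shared protocol")
    print(integrations_with_protocol(5, 20), "implementations with one")

    # "get_weather" is a made-up tool name; the arguments are illustrative.
    request = make_request(
        "tools/call",
        {"name": "get_weather", "arguments": {"city": "Sydney"}},
    )
    print(json.dumps(request, indent=2))
```

The point of the shared envelope is that a client only has to know how to build and parse this one message shape, whatever tool sits behind the server.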


Supabase MCP can leak your entire SQL database

Supabase MCP's access can barge right past RLS, spilling SQL databases when faced with sneaky inputs. It's a cautionary tale from the world of LLM system trifecta attacks...


‘Shit in, shit out’: AI is coming for agriculture, but farmers aren’t convinced

Aussie farmers want "more automation, fewer bells and whistles". Technology should work like a tractor, not act like an app: straightforward, adaptable, and rock-solid...


Massive study detects AI fingerprints in millions of scientific papers

Study finds 13.5% of 2024 PubMed papers bear LLM fingerprints, showcasing a shift to jazzy "stylistic" verbs over stodgy nouns. Upending stuffy academic norms!


LLM Evaluation Metrics: The Ultimate LLM Evaluation Guide - Confident AI

Dump BLEU and ROUGE. Let LLM-as-a-judge tools like G-Eval propel you to pinpoint accuracy. The old scorers? They whiff on meaning, like a cat batting at a laser dot. DeepEval? It wrangles bleeding-edge metrics with five lines of neat code. Want a personal touch? G-Eval's got your back. DAG keeps benchm...
