Slurm

Slurm is an open-source workload manager and job scheduler for Linux clusters, providing resource allocation, job execution, and queue management for large-scale high-performance computing environmen…

Content

Updates and recent posts about Slurm.

@varbear shared a link, 4 months, 2 weeks ago

FAUN.dev()

Use Python for Scripting!

Shell scripts love to break across macOS and Linux. Blame all the GNU vs BSD quirks: sed, date, readlink, take your pick. The mess adds up fast, especially in build pipelines and CI systems. This post makes the case for a cleaner way: Python 3. Standard library. Predictable behavior. Same results whethe.. read more

@varbear shared a link, 4 months, 2 weeks ago

FAUN.dev()

How Reddit Migrated Comments Functionality from Python to Go

Reddit successfully migrated its monolithic, high-traffic Comments service from legacy Python to modern Go microservices with zero user disruption. This was achieved by using a "tap compare" for reads and isolated "sister datastores" for writes, ensuring safe verification of the new code against pro.. read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

Why Kubernetes Won: Perfect Timing & Developer Culture

Kubernetes won big because the stars aligned, DevOps took off, Docker exploded, and enterprises finally stopped side-eyeing open source. Then came the institutional tailwind: CNCF pushed hard, GCP bet big, and the rest followed. Kubernetes isn't just tech. It's a new operating model, built in the op.. read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

An In-Depth Look at Istio Ambient Mode with Calico

Tigera just wired Istio Ambient Mode into Calico. That means you get sidecarless service mesh, think mTLS, L4/L7 policy, and observability, without stuffing every pod with a sidecar. It’s all handled by lean zTunnel and Waypoint proxies. Ports stay visible, so Calico and Istio policies play nice. No rewr.. read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

Kubernetes Made Simple: A Guide for JVM Developers

A sharp walkthrough for JVM devs shipping a Kotlin Spring Boot app on Kubernetes. It covers the full deployment arc, packaging with Docker, wiring up Deployment and Service manifests, and managing config with ConfigMaps and Secrets. There's a clean PostgreSQL integration baked in. It even gets into header-base.. read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

The “Inception” of Kubernetes: A Deep Dive into vCluster Architecture and Benefits

vCluster, a CNCF sandbox project, spins up real-deal Kubernetes control planes inside pods. Each lives in its own namespace but behaves like a full cluster, admin access, CRDs, Helm, the works. It reuses the host’s worker nodes using a syncer that routes vCluster workloads onto the real thing... read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

Compose to Kubernetes to Cloud With Kanvas

Docker just dropped Kanvas, a new visual toy for building multi-cloud Kubernetes setups, without drowning in YAML. It bolts onto Docker Desktop and runs on Meshery. Drag and drop services into a topology, then bring them to life across AWS, GCP, or Azure. Mix in policy-driven validation and real-time mut.. read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

A Deep Dive into Kubernetes Headless Service

Headless Service is a powerful Kubernetes feature enabling direct pod-to-pod communication for stateful applications and precise service discovery without traditional load balancing. No automatic load balancing, pod IP changes, and special use cases make it ideal for specific scenarios, not general workloads... read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

How to Troubleshoot Common Kubernetes Errors

A fresh Kubernetes troubleshooting guide lays out real-world tactics for tracking down 12 common cluster headaches. Think: kubectl sleuthing, poking through system logs, scraping observability metrics, and jumping into debug containers. The guide breaks down how AIOps is stepping in, digesting event data.. read more

@kaptain shared a link, 4 months, 2 weeks ago

FAUN.dev()

Kubernetes 1.35 - New security features

Kubernetes 1.35 is done with legacy baggage. cgroups v1? Deprecated. Image pull credentials? Now re-verified by default, no more freeloading. kubectl SPDY API upgrades? Locked down. You’ll need create permissions just to speak the protocol. Expect breakage if your workflows leaned on old assumptions. U.. read more

Slurm Workload Manager is an open-source, fault-tolerant, and highly scalable cluster management and scheduling system widely used in high-performance computing (HPC). Designed to operate without kernel modifications, Slurm coordinates thousands of compute nodes by allocating resources, launching and monitoring jobs, and managing contention through its flexible scheduling queue.

At its core, Slurm uses a centralized controller (slurmctld) to track cluster state and assign work, while lightweight daemons (slurmd) on each node execute tasks and communicate hierarchically for fault tolerance. Optional components extend this core: slurmdbd adds accounting, and slurmrestd exposes a REST API. A rich set of commands, such as srun, squeue, scancel, and sinfo, gives users and administrators full visibility and control.
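As a rough illustration of how the command-line tools can be scripted, the Python sketch below builds an squeue invocation and parses its output into dictionaries. The helper names (`squeue_command`, `parse_squeue`, `list_jobs`) and the pipe-separated output layout are assumptions made for this example, not part of Slurm. The squeue options used (`--noheader`, `--format`, `--user`) and the format fields `%i`, `%T`, and `%j` (job ID, state, name) are standard, but `list_jobs` only works on a host where Slurm is installed.

```python
import subprocess


def squeue_command(user=None):
    """Build an squeue command that prints 'jobid|state|name' with no header."""
    cmd = ["squeue", "--noheader", "--format=%i|%T|%j"]
    if user:
        # Restrict the listing to one user's jobs.
        cmd.append(f"--user={user}")
    return cmd


def parse_squeue(output):
    """Turn 'jobid|state|name' lines into a list of dicts."""
    jobs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice so a '|' inside the job name is preserved.
        job_id, state, name = line.split("|", 2)
        jobs.append({"id": job_id, "state": state, "name": name})
    return jobs


def list_jobs(user=None):
    """Run squeue and parse the result (requires a working Slurm install)."""
    result = subprocess.run(
        squeue_command(user), capture_output=True, text=True, check=True
    )
    return parse_squeue(result.stdout)
```

The parser can be tried without a cluster by feeding it a hand-written sample such as `parse_squeue("1001|RUNNING|train\n1002|PENDING|eval\n")`; the job IDs and names there are made up.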

Slurm’s modular plugin architecture supports nearly every aspect of cluster operation, including authentication, MPI integration, container runtimes, resource limits, energy accounting, topology-aware scheduling, preemption, and GPU management via Generic Resources (GRES). Nodes are organized into partitions, enabling sophisticated policies for job size, priority, fairness, oversubscription, reservation, and resource exclusivity.
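To make partitions and GRES a little more concrete, here is a minimal Python sketch that renders a batch script requesting a partition and a number of GPUs. The function `render_batch_script`, the partition name `gpu`, and the command `python train.py` are hypothetical; the `#SBATCH` options shown (`--job-name`, `--partition`, `--nodes`, `--time`, `--gres`) are standard sbatch options, though `--gres=gpu:N` only works on clusters whose administrators have configured a GRES named `gpu`.

```python
def render_batch_script(job_name, partition, command,
                        nodes=1, gpus=0, time_limit="01:00:00"):
    """Render a Slurm batch script as a string (nothing is submitted)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus > 0:
        # Request GPUs per node through Generic Resources (GRES).
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    # Launch the workload as a job step inside the allocation.
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: two GPUs on a partition assumed to be named "gpu".
    print(render_batch_script("train", "gpu", "python train.py", gpus=2))
```

The rendered text could be saved to a file and passed to sbatch; whether the job is accepted depends on the limits and policies configured for the chosen partition.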

Widely adopted across academia, research labs, and enterprise HPC environments, Slurm serves as the backbone for many of the world’s top supercomputers, offering a battle-tested, flexible, and highly configurable framework for large-scale distributed computing.