Content

Updates and recent posts about Slurm.
Link
@faun shared a link, 7 months, 1 week ago
FAUN.dev()

Microservices Are a Tax Your Startup Probably Can’t Afford

Premature microservices are like planting seeds in concrete. They'll stall your startup's momentum. A monolith is your friend here: simple, reliable, with the vast realm of open-source at your disposal. A crisp monorepo tightens team synergy and sidesteps the quagmire of complexity, unlike those headach..

Link
@faun shared a link, 7 months, 1 week ago
FAUN.dev()

Major Updates to VS Code Docker: Introducing Container Tools

Docker transforms into Container Tools, handing developers the keys to tool customization and runtime selection. A pivotal shift for those who dwell in the land of containers...

Link
@faun shared a link, 7 months, 1 week ago
FAUN.dev()

The Kubernetes Gateway API through beginner’s eyes

Gateway API, the sassy heir to Ingress, juggles L4 & L7 protocols like it was born for it. Tosses out those annoying, vendor-specific annotations to clean up Kubernetes networking. On a whim, I swapped an external cronjob for a Kubernetes CronJob, because tinkering is a blast, and, let's face it, automa..

Story
@laura_garcia shared a post, 7 months, 1 week ago
Software Developer, RELIANOID

Women in STEM

🚺✨ The rise of women in STEM is inspiring change, and nowhere is this more evident than in Cybersecurity. Despite making up only 24% of the workforce, women are increasingly leading the charge in securing our digital world. RELIANOID is proud to champion gender diversity in the cybersecurity sector...

Blog women and girls in STEM and Cybersecurity RELIANOID
Link
@anjali shared a link, 7 months, 1 week ago
Customer Marketing Manager, Last9

CloudWatch vs OpenTelemetry: Choosing What Fits Your Stack

CloudWatch vs OpenTelemetry: Understand the trade-offs and choose the observability approach that fits your team's architecture and workflows.

otel
Link
@anjali shared a link, 7 months, 1 week ago
Customer Marketing Manager, Last9

OpenTelemetry PHP: A Detailed Implementation Guide

Learn how to set up OpenTelemetry PHP to collect traces, metrics, and logs from your PHP apps and improve observability across your stack.

logging
Story
@laura_garcia shared a post, 7 months, 1 week ago
Software Developer, RELIANOID

Hack Space Con 2025

Mark your calendars for Hack Space Con 2025 – where cybersecurity meets space technology! Taking place from May 11-15 at the Kennedy Space Center & Radisson Resort at the Port in Cape Canaveral, this event unites cybersecurity professionals, ethical hackers, and space tech enthusiasts. Don’t miss th..

HACKSPACECON2025 florida RELIANOID.
Link
@anjali shared a link, 7 months, 1 week ago
Customer Marketing Manager, Last9

The Complete Guide to Observing RabbitMQ

Learn how to monitor, troubleshoot, and improve RabbitMQ performance with the right metrics, tools, and observability practices.

rabbit
Link
@anjali shared a link, 7 months, 2 weeks ago
Customer Marketing Manager, Last9

Kubernetes Alerting That Won’t Burn You Out

A practical guide to Kubernetes alerting—cut the noise, catch what matters, and avoid those unnecessary 3AM wake-up calls.

kubernetes
Link
@anjali shared a link, 7 months, 2 weeks ago
Customer Marketing Manager, Last9

Essential Python Monitoring Techniques You Need to Know

Learn the key techniques to monitor Python performance, catch bottlenecks early, and keep your applications fast and reliable at scale.

Python Logging Best Practices: The Ultimate Guide

Slurm Workload Manager is an open-source, fault-tolerant, and highly scalable cluster management and scheduling system widely used in high-performance computing (HPC). Designed to operate without kernel modifications, Slurm coordinates thousands of compute nodes by allocating resources, launching and monitoring jobs, and managing contention through its flexible scheduling queue.

At its core, Slurm uses a centralized controller (slurmctld) to track cluster state and assign work, while lightweight daemons (slurmd) on each node execute tasks and communicate hierarchically for fault tolerance. Optional components like slurmdbd and slurmrestd extend Slurm with accounting and REST APIs. A rich set of commands—such as srun, squeue, scancel, and sinfo—gives users and administrators full visibility and control.
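
As a rough illustration of how the user-facing commands can be scripted, the sketch below builds an `squeue` call and parses its output in Python. The `Job` class and the helper names are hypothetical, chosen only for this example; the `--noheader`, `--format`, and `--user` options and the `%i`, `%P`, `%T`, `%u` format fields come from the `squeue` man page. The sample text is made up so the parser can be tried without a cluster.

```python
import subprocess
from dataclasses import dataclass

# squeue format fields: %i = job ID, %P = partition, %T = job state, %u = user.
SQUEUE_FORMAT = "%i|%P|%T|%u"


@dataclass
class Job:
    """Hypothetical container for one line of squeue output."""
    job_id: str
    partition: str
    state: str
    user: str


def squeue_command(user=None):
    """Build the argument list for a header-less, pipe-delimited squeue call."""
    cmd = ["squeue", "--noheader", f"--format={SQUEUE_FORMAT}"]
    if user:
        cmd.append(f"--user={user}")
    return cmd


def parse_squeue(text):
    """Turn pipe-delimited squeue output into Job records, skipping odd lines."""
    jobs = []
    for line in text.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 4:
            continue  # blank or unexpected line
        jobs.append(Job(*fields))
    return jobs


def list_jobs(user=None):
    """Run squeue (requires a Slurm client on PATH) and parse the result."""
    result = subprocess.run(
        squeue_command(user), capture_output=True, text=True, check=True
    )
    return parse_squeue(result.stdout)


if __name__ == "__main__":
    # Made-up sample output, so this runs without a Slurm installation.
    sample = "1001|gpu|RUNNING|alice\n1002|batch|PENDING|bob\n"
    for job in parse_squeue(sample):
        print(job.job_id, job.partition, job.state, job.user)
```

On a machine with Slurm installed, `list_jobs("alice")` would query the controller through `squeue`; elsewhere, `parse_squeue` can be exercised on captured output alone.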

Slurm’s modular plugin architecture supports nearly every aspect of cluster operation, including authentication, MPI integration, container runtimes, resource limits, energy accounting, topology-aware scheduling, preemption, and GPU management via Generic Resources (GRES). Nodes are organized into partitions, enabling sophisticated policies for job size, priority, fairness, oversubscription, reservation, and resource exclusivity.
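
To make partitions and GRES a little more concrete, here is a minimal sketch that renders a batch script requesting a partition and, optionally, GPUs. The function name, the partition name `gpu`, and the command are assumptions made for illustration; the `#SBATCH` options used (`--job-name`, `--partition`, `--nodes`, `--time`, `--gres`) are standard `sbatch` options, and whether a given request is accepted depends on how the cluster is configured.

```python
def render_batch_script(job_name, partition, command,
                        nodes=1, gpus_per_node=0, time_limit="00:10:00"):
    """Render a small sbatch script as a string (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus_per_node > 0:
        # GPUs are requested through Generic Resources (GRES), counted per node.
        lines.append(f"#SBATCH --gres=gpu:{gpus_per_node}")
    lines.append("")
    # srun launches the command inside the allocation that sbatch obtains.
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "gpu" is an assumed partition name; real names are site-specific.
    print(render_batch_script("train", "gpu", "python train.py",
                              gpus_per_node=2))
```

The resulting text could be written to a file and submitted with `sbatch`; running `sinfo` first shows which partitions a site actually defines.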

Widely adopted across academia, research labs, and enterprise HPC environments, Slurm serves as the backbone for many of the world’s top supercomputers, offering a battle-tested, flexible, and highly configurable framework for large-scale distributed computing.