Atlantis, a tool for planning and applying Terraform changes, was taking up to 30 minutes to restart. The cause was a safe Kubernetes default that turned into a bottleneck once the persistent volume used by Atlantis grew to millions of files. After investigation, a one-line change to fsGroupChangePolicy cut restart time to about 30 seconds, saving roughly 50 hours of blocked engineering time per month.
Why this matters: Kubernetes safe defaults can become bottlenecks at scale. Audit fsGroupChangePolicy and PV permission settings on large stateful workloads.
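
For readers who want to see what such a change can look like, here is a minimal sketch of a pod-level securityContext. By default, fsGroupChangePolicy is Always, which makes the kubelet recursively change ownership and permissions of every file on the volume each time it is mounted. The other supported value, OnRootMismatch, skips the recursive pass when the volume's root directory already has the expected ownership and permissions. The summary does not state which value was used, so treating OnRootMismatch as the fix is an assumption, and the workload name, group ID, image, and mount path below are hypothetical placeholders.

```yaml
# Illustrative StatefulSet excerpt (not the original team's manifest).
# All names, IDs, and paths are assumptions made for this sketch.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: atlantis                      # hypothetical workload name
spec:
  serviceName: atlantis
  replicas: 1
  selector:
    matchLabels:
      app: atlantis
  template:
    metadata:
      labels:
        app: atlantis
    spec:
      securityContext:
        fsGroup: 1000                 # assumed group ID for volume ownership
        # The one-line change: the default is "Always", which walks the
        # whole volume on every mount. "OnRootMismatch" only does the
        # recursive change when the volume root does not already match.
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: atlantis
          image: example.com/atlantis:placeholder   # placeholder image
          volumeMounts:
            - name: atlantis-data
              mountPath: /atlantis-data             # assumed mount path
  volumeClaimTemplates:
    - metadata:
        name: atlantis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi             # placeholder size
```

When auditing other stateful workloads, note that the field only takes effect where fsGroup is set and the volume type supports fsGroup-based ownership management; it has no effect on ephemeral volume types such as secret, configMap, and emptyDir.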










