Instrumentation with Prometheus in Practice
Summaries: High-Accuracy Quantiles with Limitations
Like histograms, summaries track the count and sum of observed values. They can also (optionally) expose precomputed quantiles, and this is where they differ from histograms.
Unlike histograms, summaries do not store raw bucketed data, which means their quantile estimations are calculated on the client side (in the instrumented application) rather than on the server side (in Prometheus). This has several implications that limit their usefulness in distributed systems.
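One well-known implication is that quantiles precomputed on separate instances cannot be combined into a correct overall quantile. The sketch below illustrates this with made-up durations for two hypothetical instances; the `median` helper and the sample data are assumptions for illustration and are not part of any Prometheus library.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of xs (the mean of the two middle values
// when len(xs) is even). It sorts a copy and assumes xs is non-empty.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

func main() {
	// Hypothetical request durations (seconds) seen by two instances.
	instanceA := []float64{0.1, 0.1, 0.1}
	instanceB := []float64{1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0}

	// All a server can do with two precomputed medians is combine them,
	// for example by averaging.
	avgOfMedians := (median(instanceA) + median(instanceB)) / 2

	// What we actually want: the median over every observation.
	all := append(append([]float64(nil), instanceA...), instanceB...)
	trueMedian := median(all)

	fmt.Printf("average of per-instance medians: %.2f\n", avgOfMedians)
	fmt.Printf("median of all observations:      %.2f\n", trueMedian)
}
```

With this sample data the averaged value is 0.55 while the real median is 1.00, because the busier instance dominates the combined distribution.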
So instead of exposing buckets, as a histogram does:
[...]
duration_seconds_bucket{le="0.1"} 5
duration_seconds_bucket{le="0.2"} 15
duration_seconds_bucket{le="0.5"} 30
duration_seconds_count 42
duration_seconds_sum 21.5
[...]
Summaries directly provide quantile estimates:
[...]
duration_seconds{quantile="0.3"} 0.25
duration_seconds{quantile="0.5"} 0.5
duration_seconds{quantile="0.9"} 0.75
duration_seconds_count 42
duration_seconds_sum 21.5
[...]
If the application has been recording observations for the last x minutes, these values mean that during that time:
- 30% of the recorded durations were less than or equal to 0.25 seconds.
- 50% of the recorded durations were less than or equal to 0.5 seconds.
- 90% of the recorded durations were less than or equal to 0.75 seconds.
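The reading of the three quantiles above can be checked with a small sketch. The durations below are hypothetical values chosen to match the sample output, and `fractionAtOrBelow` is an illustrative helper, not a client library function. Real summaries report estimates with a configurable error margin, so the match is not always this exact.

```go
package main

import "fmt"

// fractionAtOrBelow returns the share of observations that are <= threshold.
func fractionAtOrBelow(observations []float64, threshold float64) float64 {
	if len(observations) == 0 {
		return 0
	}
	n := 0
	for _, v := range observations {
		if v <= threshold {
			n++
		}
	}
	return float64(n) / float64(len(observations))
}

func main() {
	// Hypothetical durations in seconds, chosen to match the quantiles above.
	durations := []float64{0.1, 0.2, 0.25, 0.3, 0.5, 0.6, 0.7, 0.75, 0.75, 0.9}

	// Each threshold is the value reported for one quantile label.
	for _, threshold := range []float64{0.25, 0.5, 0.75} {
		share := 100 * fractionAtOrBelow(durations, threshold)
		fmt.Printf("<= %.2fs: %.0f%%\n", threshold, share)
	}
}
```

For this data set the three lines report 30%, 50%, and 90%, which is what the `quantile="0.3"`, `quantile="0.5"`, and `quantile="0.9"` samples express.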
These values describe the distribution of durations observed within that time window, not since the application started. In practice, Prometheus client libraries implement the window with rotating buffers. For example, the Go client library uses the following defaults:
- Observations stay relevant for 10 minutes (DefMaxAge).
- The window is spread over 5 age buckets (DefAgeBuckets), so the oldest bucket is discarded every 2 minutes.
- Together, they hold about 10 minutes of observations.
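The rotation described above can be modelled with a simplified sketch. It assumes that every observation is written to all buckets, that the oldest bucket (the head) is the one read from, and that rotation clears the head and moves to the next bucket. The `window` type and its methods are hypothetical, the bucket count of 3 is only illustrative, and rotation is triggered by hand rather than by a timer. A real client library keeps quantile estimation streams rather than raw values.

```go
package main

import "fmt"

// window is a simplified, hypothetical model of a summary's sliding window.
type window struct {
	buckets [][]float64 // one slice of raw observations per age bucket
	head    int         // index of the oldest bucket, which is the one read from
}

func newWindow(ageBuckets int) *window {
	return &window{buckets: make([][]float64, ageBuckets)}
}

// observe records v in every bucket.
func (w *window) observe(v float64) {
	for i := range w.buckets {
		w.buckets[i] = append(w.buckets[i], v)
	}
}

// rotate discards the head bucket's data and makes the next bucket the head.
// A real implementation would do this once per rotation interval.
func (w *window) rotate() {
	w.buckets[w.head] = nil
	w.head = (w.head + 1) % len(w.buckets)
}

// visible returns the observations that would feed the quantile estimates.
func (w *window) visible() []float64 {
	return w.buckets[w.head]
}

func main() {
	w := newWindow(3) // illustrative bucket count
	w.observe(1)
	w.rotate()
	w.observe(2)
	w.rotate()
	w.observe(3)
	fmt.Println(w.visible()) // all three observations are still in the window

	w.rotate()
	fmt.Println(w.visible()) // the first observation has aged out
}
```

In this model an observation disappears after as many rotations as there are buckets, so the window length is the bucket count multiplied by the rotation interval.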
// Default values for SummaryOpts.
const (
// DefMaxAge is the default duration for which observations stay
// relevant.
DefMaxAge time.