Configuring Prometheus storage retention

How can you control how much history Prometheus keeps?
Prometheus stores time series and their samples on disk. Given that disk space is a finite resource, you want some limit on how much of it Prometheus will use. Historically this was done with the --storage.tsdb.retention flag, which specifies the time range that Prometheus will keep available. This is a minimum, so it'll keep an entire block if some of it is still within the retention window. If you know your ingestion rate in samples per second, then you can multiply it by the typical bytes per sample (around 1.5, or 2 to be safe) and the retention time in seconds to get an idea of how much disk space will be used.
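As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name and the example ingestion rate of 100,000 samples per second are made up for illustration; only the 2 bytes per sample figure and the 15d retention come from this post.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_disk_bytes(samples_per_second, retention_days, bytes_per_sample=2.0):
    """Rough disk usage estimate: ingestion rate * bytes per sample * retention.

    bytes_per_sample defaults to the "2 to be safe" figure; 1.5 is more typical.
    """
    retention_seconds = retention_days * SECONDS_PER_DAY
    return samples_per_second * bytes_per_sample * retention_seconds


if __name__ == "__main__":
    # Hypothetical server ingesting 100k samples/s with 15d retention.
    estimate = estimate_disk_bytes(100_000, 15)
    print(f"{estimate / 1e9:.1f} GB")  # prints "259.2 GB"
```

This only covers the blocks themselves, so treat the result as a starting point rather than a guarantee.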
As of Prometheus 2.7 and 2.8 there are new flags and options. 2.7 introduced an option for size-based retention with --storage.tsdb.retention.size, that is, specifying a maximum amount of disk space used by blocks. This is still somewhat best-effort though, as it does not (yet) include the space taken by the WAL or by blocks being populated by compaction. To be safe, allow space for the WAL and one maximum-size block (whose time range is the smaller of 10% of the retention time and a month).
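The "smaller of 10% of the retention time and a month" rule of thumb can be sketched in Python as below. This is only an illustration of the rule as stated here: the function name is hypothetical, a month is assumed to be 31 days, and the block ranges Prometheus actually picks may be rounded differently.

```python
from datetime import timedelta

# Assumption: "a month" is taken as 31 days for this sketch.
ASSUMED_MONTH = timedelta(days=31)


def max_block_duration(retention):
    """Time range of the largest block: min(10% of retention, one month)."""
    return min(retention / 10, ASSUMED_MONTH)


if __name__ == "__main__":
    for days in (15, 90, 365):
        print(f"{days}d retention -> largest block spans {max_block_duration(timedelta(days=days))}")
```

Combine the time range of that largest block with your ingestion rate to estimate how much headroom to leave on top of the configured size limit, plus whatever your WAL typically occupies.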
How do all these flags interact though? The behaviour was made more obvious in 2.8. There are three flags: --storage.tsdb.retention.size, --storage.tsdb.retention.time, and --storage.tsdb.retention. --storage.tsdb.retention is deprecated, superseded by --storage.tsdb.retention.time for clarity. This table summarises how they work together:
--storage.tsdb.retention.size  --storage.tsdb.retention.time  --storage.tsdb.retention  Result
Not set                        Not set                        Not set                   Default 15d retention applies.
Not set                        20d                            Not set                   20d retention applies.
Not set                        Not set                        10d                       10d retention applies.
Not set                        20d                            10d                       20d retention applies.
1TB                            Not set                        Not set                   1TB size retention applies, no time limit.
1TB                            20d                            Not set                   1TB size and 20d time retention apply, whichever happens first.
1TB                            20d                            10d                       1TB size and 20d time retention apply, whichever happens first.

As you can see, --storage.tsdb.retention.time overrides the deprecated --storage.tsdb.retention, and both time-based and size-based retention can be in force at once.
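To make the interaction concrete, here is a small Python model of the rules in the table. It is a sketch of the behaviour described in this post, not Prometheus's own code: the function and type names are hypothetical, and values are kept as plain strings rather than parsed.

```python
from typing import NamedTuple, Optional

DEFAULT_RETENTION_TIME = "15d"


class EffectiveRetention(NamedTuple):
    time: Optional[str]  # None means no time limit
    size: Optional[str]  # None means no size limit


def effective_retention(size=None, time=None, deprecated_retention=None):
    """Model which limits apply, given the three flags (None = not set)."""
    # --storage.tsdb.retention.time wins over the deprecated --storage.tsdb.retention.
    chosen_time = time if time is not None else deprecated_retention
    # The 15d default only kicks in when no flag is set at all.
    if chosen_time is None and size is None:
        chosen_time = DEFAULT_RETENTION_TIME
    return EffectiveRetention(time=chosen_time, size=size)


if __name__ == "__main__":
    print(effective_retention())
    print(effective_retention(time="20d", deprecated_retention="10d"))
    print(effective_retention(size="1TB"))
    print(effective_retention(size="1TB", time="20d", deprecated_retention="10d"))
```

The table has no row for a size limit combined with only the deprecated flag; this sketch assumes both would apply in that case, which is an extrapolation rather than something stated above.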
One common confusion I'd also like to cover is that Prometheus storage retention is not limited to 15 days. That's merely a default that was chosen many years ago, two storage backends back. As the above shows, you can set it higher than that, or even have no time limit apply at all.
