> If you have finite storage, you need to do something.
Sure.
> A very simple rule is to just drop everything older than a threshold.
Yep, makes sense.
> (Or another simple rule: you keep x GiB of information, and drop the oldest logs until you fit into that size limit.)
Same thing, it seems like.
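Right, both rules reduce to the same eviction loop, just with a different predicate. A minimal sketch (entry shapes and names are illustrative, not any particular log system's API):

```python
from collections import deque

def enforce_size_limit(logs, limit_bytes):
    """Drop the oldest entries until the total size fits the budget.

    `logs` is a list of (timestamp, size_bytes) tuples, oldest first.
    Returns the surviving entries.
    """
    kept = deque(logs)
    total = sum(size for _, size in kept)
    while kept and total > limit_bytes:
        _, size = kept.popleft()  # evict the oldest entry first
        total -= size
    return list(kept)

def enforce_age_limit(logs, now, max_age):
    """The fixed-threshold rule: same idea, predicate is age instead of size."""
    return [(ts, size) for ts, size in logs if now - ts <= max_age]
```

Either way the result is deterministic: a log line is present if and only if it is young enough (or the store is under budget), which is what makes these rules easy to reason about.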
> A slightly more complicated rule is: drop your logs with increasing probability as they age.
This doesn't really make sense to me. Logs aren't probabilistic. If something happened on 2024-11-01T12:01:01Z, and I have logs for that timestamp, then I should be able to see that thing which happened at that time.
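For what it's worth, the rule the quoted text describes would look something like this: each compaction pass keeps an entry with a probability that decays with its age. This is a hypothetical sketch of that idea (the half-life parameter and function names are mine, not from the original), and it illustrates the objection above: whether a given timestamp survives becomes a matter of chance.

```python
import random

def keep_probability(age_seconds, half_life_seconds):
    """Probability of retaining an entry, decaying exponentially with age.

    An entry at age 0 is always kept; at one half-life it survives a
    compaction pass with probability 0.5. Parameters are illustrative.
    """
    return 0.5 ** (age_seconds / half_life_seconds)

def compact(logs, now, half_life_seconds, rng=random.random):
    """One compaction pass over (timestamp, line) entries.

    Each entry independently survives with age-dependent probability,
    so older entries thin out gradually rather than vanishing at a cutoff.
    """
    return [
        (ts, line) for ts, line in logs
        if rng() < keep_probability(now - ts, half_life_seconds)
    ]
```

The appeal, presumably, is a smoother trade-off than a hard cutoff: recent history stays dense while old history thins out instead of disappearing all at once. The cost is exactly the one raised above: you can no longer guarantee that a specific old log line is still there.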
> Sampling metrics also loses information.
I mean, literally not, right? Metrics are explicitly the pillar of observability which can aggregate over time without loss of information. You never need to sample metric data; you just aggregate at whatever layers exist in your system. You can roll up metric data from whatever input granularity D1 to whatever coarser output granularity, e.g. 10·D1, and that "loses" information in some sense, I guess, but the information isn't really lost, it's just made more coarse, or less specific. It's not in any way the same as sampling of e.g. log data, which literally deletes information. Right?
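A minimal sketch of that roll-up, assuming sum-type metric points (counters) at granularity D1; the function name and point shape are illustrative:

```python
from collections import defaultdict

def rollup(points, d1, factor):
    """Aggregate (timestamp, value) sum-points from granularity d1 to factor*d1.

    Sums are exact under aggregation: the total over any coarse bucket
    equals the sum of the totals over its fine buckets. What is lost is
    only the ability to attribute values to the finer sub-intervals.
    """
    coarse = factor * d1
    buckets = defaultdict(float)
    for ts, value in points:
        # Align each point to the start of its coarse bucket.
        buckets[(ts // coarse) * coarse] += value
    return sorted(buckets.items())
```

This works as stated for sums and counts; averages need sum and count carried separately so they can be recombined exactly at the coarser granularity.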