A heapdump will be captured when the usage crosses the threshold.
Staying above the threshold won't trigger another heapdump.
If the usage goes down, then back up, that is considered another
"episode" to be captured in a heapdump.
This feature is driven by three parameters:
* HeapdumpDir: the directory where the watchdog will write the heapdump.
It will be created if it doesn't exist upon initialization. An error when
creating the dir will not prevent watchdog initialization; it will just
disable the heapdump capture feature.
If zero-valued, the feature is disabled. Heapdumps will be written to path:
<HeapdumpDir>/<RFC3339Nano formatted timestamp>.heap.
* HeapdumpMaxCaptures: sets the maximum number of heapdumps a process will
generate. This limits the number of episodes that will be captured, in case
the utilization climbs repeatedly over the threshold. By default, it is 10.
* HeapdumpThreshold: sets the utilization threshold that will trigger a
heap dump to be taken automatically. A zero value disables this feature.
By default, it is disabled.
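
The episode semantics described above amount to edge-triggered capture: a dump is taken only on the transition from below the threshold to above it, and only while the capture budget lasts. The following is a minimal sketch of that logic, not the library's actual implementation. The `heapdumper` type and its methods are hypothetical names, treating usage equal to the threshold as "above" is an assumption, and so is the use of `pprof.WriteHeapProfile` to produce the dump.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/pprof"
	"time"
)

// heapdumper is a hypothetical helper that mirrors the three parameters.
type heapdumper struct {
	dir         string  // HeapdumpDir; empty disables the feature
	threshold   float64 // HeapdumpThreshold as a fraction; 0 disables the feature
	maxCaptures int     // HeapdumpMaxCaptures
	captures    int     // episodes captured so far
	above       bool    // true while usage has stayed at or above the threshold
}

// shouldCapture reports whether this usage sample starts a new episode
// that is still within the capture budget.
func (h *heapdumper) shouldCapture(usage float64) bool {
	if h.threshold == 0 || h.dir == "" {
		return false // feature disabled
	}
	if usage < h.threshold {
		h.above = false // usage dropped: the next crossing is a new episode
		return false
	}
	wasAbove := h.above
	h.above = true
	if wasAbove || h.captures >= h.maxCaptures {
		return false // same episode, or budget exhausted
	}
	h.captures++
	return true
}

// write stores a heap profile at <dir>/<RFC3339Nano timestamp>.heap.
// Assumption: the dump is a pprof heap profile.
func (h *heapdumper) write(now time.Time) (string, error) {
	path := filepath.Join(h.dir, now.Format(time.RFC3339Nano)+".heap")
	f, err := os.Create(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	return path, pprof.WriteHeapProfile(f)
}

func main() {
	h := &heapdumper{dir: os.TempDir(), threshold: 0.8, maxCaptures: 10}
	for _, u := range []float64{0.5, 0.85, 0.9, 0.6, 0.95} {
		fmt.Printf("usage=%.2f capture=%v\n", u, h.shouldCapture(u))
	}
}
```

In this sketch, the samples 0.85 and 0.95 each start an episode, while 0.9 belongs to the episode opened by 0.85 and does not trigger a second dump.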
This commit introduces the cgroup-driven watchdog. It can be
initialized by calling watchdog.CgroupDriven().
This watchdog infers the limit from the process' cgroup, which
is either derived from /proc/self/cgroup, or from the root
cgroup if the PID is 1 (i.e. when running in a container).
Tests have been added/refactored to accommodate running locally
and in a Docker container.
Certain test cases now must be isolated from one another, to
prevent side-effects from dirty Go runtimes. A Makefile has been
introduced to run all tests.
This commit introduces a major rewrite of go-watchdog.
* HeapDriven and SystemDriven are now distinct run modes.
* ProcessDriven, a work-in-progress run mode that uses cgroups.
* Policies are now stateless, pure and greatly simplified.
* Policies now return the next utilization at which GC
should run. The watchdog enforces that value differently
depending on the run mode.
* The heap-driven run mode adjusts GOGC dynamically. This
places the responsibility on the Go runtime to honour the
trigger point, and results in more robust logic that is not
vulnerable to very quick bursts within sampling periods.
* The heap-driven run mode is no longer polling (interval-driven).
Instead, it relies entirely on GC signals.
* The Silence and Emergency features of the watermark policy
have been removed. If utilization is above the last watermark,
the policy will request immediate GC.
* Data races have been removed.
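
To make the policy and heap-driven bullets above concrete, the following sketch shows a stateless watermark policy that returns the next utilization at which GC should run, plus a helper that converts that target into a GOGC value. It is a sketch under stated assumptions, not the library's code. The names `nextWatermark` and `gogcFor` are hypothetical, and the GOGC arithmetic uses the approximation that the runtime's next heap goal is live heap * (1 + GOGC/100), ignoring other inputs the runtime may account for.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// nextWatermark is pure: it depends only on its arguments. Given ascending
// watermark fractions of limit, it returns the first watermark (in bytes)
// above the current usage. ok is false when usage is at or above the last
// watermark, meaning GC should run immediately.
func nextWatermark(used, limit uint64, watermarks []float64) (next uint64, ok bool) {
	for _, w := range watermarks {
		t := uint64(float64(limit) * w)
		if used < t {
			return t, true
		}
	}
	return 0, false
}

// gogcFor returns a GOGC percentage so that the next GC triggers roughly
// when the heap reaches target bytes.
// Assumption: heap goal ~= live * (1 + GOGC/100).
func gogcFor(live, target uint64) int {
	if live == 0 || target <= live {
		return 1 // smallest positive value: collect as soon as possible
	}
	pct := int((target - live) * 100 / live)
	if pct < 1 {
		return 1
	}
	return pct
}

func main() {
	const limit = 1 << 30          // hypothetical 1 GiB heap limit
	live := uint64(600 << 20)      // hypothetical 600 MiB live heap
	marks := []float64{0.5, 0.75, 0.9}

	if next, ok := nextWatermark(live, limit, marks); ok {
		pct := gogcFor(live, next)
		debug.SetGCPercent(pct) // the runtime now honours the trigger point
		fmt.Printf("next GC at %d bytes, GOGC=%d\n", next, pct)
	} else {
		runtime.GC() // above the last watermark: immediate GC
	}
}
```

Because the runtime itself enforces the trigger point once GOGC is set, a burst of allocation between two samples still triggers GC at the intended heap size, which is the robustness argument made in the list above.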