mirror of
https://github.com/waku-org/nwaku.git
synced 2025-01-18 19:01:24 +00:00
429 lines
13 KiB
Markdown
429 lines
13 KiB
Markdown
# nim-metrics
|
|
|
|
[![Build Status (Travis)](https://img.shields.io/travis/status-im/nim-metrics/master.svg?label=Linux%20/%20macOS "Linux/macOS build status (Travis)")](https://travis-ci.org/status-im/nim-metrics)
|
|
[![Windows build status (Appveyor)](https://img.shields.io/appveyor/ci/nimbus/nim-metrics/master.svg?label=Windows "Windows build status (Appveyor)")](https://ci.appveyor.com/project/nimbus/nim-metrics)
|
|
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
|
|
[![License: Apache](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
|
|
![Stability: experimental](https://img.shields.io/badge/stability-experimental-orange.svg)
|
|
|
|
## Introduction
|
|
|
|
Nim metrics client library supporting the [Prometheus](https://prometheus.io/)
|
|
monitoring toolkit, [StatsD](https://github.com/statsd/statsd/wiki) and
|
|
[Carbon](https://graphite.readthedocs.io/en/latest/feeding-carbon.html).
|
|
Designed to be thread-safe and efficient, it's disabled by default so libraries
|
|
can use it without any overhead for those library users not interested in
|
|
metrics.
|
|
|
|
## Installation
|
|
|
|
You can install the development version of the library through Nimble with the
|
|
following command:
|
|
```
|
|
nimble install https://github.com/status-im/nim-metrics@#master
|
|
```
|
|
|
|
## Usage
|
|
|
|
To enable metrics, compile your code with `-d:metrics --threads:on`.
|
|
|
|
To avoid depending on PCRE, compile with `-d:withoutPCRE`.
|
|
|
|
## Architectural overview
|
|
|
|
`Collector` objects holding various `Metric` objects are registered in one or
|
|
more `Registry` objects. There is a default registry being used for the most
|
|
common case.
|
|
|
|
Metric values are `float64`, but the API also accepts `int64` parameters which
|
|
are then cast to `float64`.
|
|
|
|
By starting an HTTP server, custom metrics (and some default ones) can be
|
|
pulled by Prometheus. By specifying backends, those same custom metrics will be
|
|
pushed to StatsD or Carbon servers, as soon as they are modified. They can also
|
|
be serialised to strings for some quick and dirty logging. Integration with the
|
|
[Chronicles](https://github.com/status-im/nim-chronicles) logging library is
|
|
available in a separate module.
|
|
|
|
That HTTP server used for pulling is running in its own thread. Metric pushing
|
|
also uses a dedicated thread for networking, in order to minimise the overhead.
|
|
|
|
## Collector types
|
|
|
|
### Counter
|
|
|
|
A counter's value can only be incremented.
|
|
|
|
```nim
|
|
# Declare a variable `myCounter` holding a `Counter` object with a `Metric`
|
|
# having the same name as the variable. The help string is mandatory. The initial
|
|
# value is 0 and it's automatically added to `defaultRegistry`.
|
|
declareCounter myCounter, "an example counter"
|
|
|
|
# increment it by 1
|
|
myCounter.inc()
|
|
|
|
# increment it by 10
|
|
myCounter.inc(10)
|
|
|
|
# count all exceptions in a block
|
|
someCounter.countExceptions:
|
|
foo()
|
|
|
|
# or just an exception type
|
|
otherCounter.countExceptions(ValueError):
|
|
bar()
|
|
|
|
# do you need a variable that's being exported from the module?
|
|
declarePublicCounter seenPeers, "number of seen peers"
|
|
# it's the equivalent of `var seenPeers* = ...`
|
|
|
|
# want to avoid declaring a variable, giving it a help string, or anything else for that matter?
|
|
counter("one_off_counter").inc()
|
|
# What this does is generate a {.global.} var, so as long as you use the same
|
|
# string, you're using the same counter. Using strings instead of identifiers
|
|
# skips any compiler protection in case of typos, so this API is not recommended
|
|
# for serious use.
|
|
```
|
|
|
|
### Gauge
|
|
|
|
Gauges can be incremented, decremented or set to a given value.
|
|
|
|
```nim
|
|
declareGauge myGauge, "an example gauge" # or `declarePublicGauge` to export it
|
|
myGauge.inc(4.5)
|
|
myGauge.dec(2)
|
|
myGauge.set(10)
|
|
|
|
myGauge.setToCurrentTime() # Unix timestamp in seconds
|
|
|
|
myGauge.trackInProgress:
|
|
# myGauge is incremented at the start of the block (a `myGauge.inc()` is being inserted here)
|
|
foo()
|
|
# and decremented at the end (`myGauge.dec()`)
|
|
|
|
# set the gauge to the runtime of a block, in seconds
|
|
myGauge.time:
|
|
bar()
|
|
|
|
# alternative, unrecommended API
|
|
gauge("one_off_gauge").set(42)
|
|
```
|
|
|
|
### Summary
|
|
|
|
Summaries sample observations and provide a total count and the sum of all observed values.
|
|
|
|
```nim
|
|
declareSummary mySummary, "an example summary" # or `declarePublicSummary` to export it
|
|
mySummary.observe(10)
|
|
mySummary.observe(0.5)
|
|
echo mySummary
|
|
```
|
|
|
|
This will print out:
|
|
|
|
```text
|
|
# HELP mySummary an example summary
|
|
# TYPE mySummary summary
|
|
mySummary_sum 10.5 1569332171696
|
|
mySummary_count 2.0 1569332171696
|
|
mySummary_created 1569332171.0
|
|
```
|
|
|
|
```nim
|
|
# observe the execution duration of a block, in seconds
|
|
mySummary.time:
|
|
foo()
|
|
|
|
# alternative, unrecommended API
|
|
summary("one_off_summary").observe(10)
|
|
```
|
|
|
|
### Histogram
|
|
|
|
These cumulative histograms store the count and total sum of observed values,
|
|
just like summaries. Further more, they place the observed values in
|
|
configurable buckets and provide per-bucket counts.
|
|
|
|
Note that an observed value will be counted in all buckets that have a size greater or equal to it.
|
|
|
|
```nim
|
|
declareHistogram myHistogram, "an example histogram" # or `declarePublicHistogram` to export it
|
|
# This uses the default bucket sizes: [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0,
|
|
# 2.5, 5.0, 7.5, 10.0, Inf]
|
|
|
|
# You can customise the buckets:
|
|
declareHistogram withCustomBuckets, "custom buckets", buckets = [0.0, 1.0, 2.0]
|
|
# if you leave out the "Inf" bucket, it's added for you
|
|
withCustomBuckets.observe(0.5)
|
|
withCustomBuckets.observe(1)
|
|
withCustomBuckets.observe(1.5)
|
|
withCustomBuckets.observe(3.7)
|
|
echo withCustomBuckets
|
|
```
|
|
|
|
This will print out:
|
|
|
|
```text
|
|
# HELP withCustomBuckets custom buckets
|
|
# TYPE withCustomBuckets histogram
|
|
withCustomBuckets_sum 6.7 1569334493506
|
|
withCustomBuckets_count 4.0 1569334493506
|
|
withCustomBuckets_created 1569334493.0
|
|
withCustomBuckets_bucket{le="0.0"} 0.0
|
|
withCustomBuckets_bucket{le="1.0"} 2.0 1569334493506
|
|
withCustomBuckets_bucket{le="2.0"} 3.0 1569334493506
|
|
withCustomBuckets_bucket{le="+Inf"} 4.0 1569334493506
|
|
```
|
|
|
|
```nim
|
|
# observe the execution duration of a block, in seconds
|
|
myHistogram.time:
|
|
foo()
|
|
|
|
# alternative, unrecommended API
|
|
histogram("one_off_histogram").observe(10)
|
|
```
|
|
|
|
## Labels
|
|
|
|
Metric labels are supported for the Prometheus backend, as a way to add extra
|
|
dimensions corresponding to each combination of metric name and label values.
|
|
This can quickly get out of hand, as you can guess, so don't go overboard with
|
|
this feature. (See also the [relevant warnings in Prometheus' docs](https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels).)
|
|
|
|
You declare label names when defining the collector and label values each time
|
|
you update it:
|
|
|
|
```nim
|
|
declareCounter lCounter, "example counter with labels", ["foo", "bar"]
|
|
lCounter.inc(labelValues = ["1", "a"]) # the label values must be strings
|
|
lCounter.inc(labelValues = ["2", "b"])
|
|
# How many metrics are now in this collector? Two, because we used two sets of label values:
|
|
echo lCounter
|
|
```
|
|
|
|
```text
|
|
# HELP lCounter example counter with labels
|
|
# TYPE lCounter counter
|
|
lCounter_total{foo="1",bar="a"} 1.0 1569340503703
|
|
lCounter_created{foo="1",bar="a"} 1569340503.0
|
|
lCounter_total{foo="2",bar="b"} 1.0 1569340503703
|
|
lCounter_created{foo="2",bar="b"} 1569340503.0
|
|
```
|
|
|
|
(OK, there are four metrics in total, because each one gets a `*_created` buddy.)
|
|
|
|
So if you must use labels, make sure there's a finite and small number of
|
|
possible label values being set.
|
|
|
|
## Metric name and label name validation
|
|
|
|
We use Prometheus standards for that, so metric names must comply with the
|
|
`^[a-zA-Z_:][a-zA-Z0-9_:]*$` regex while label names have to comply with
|
|
`^[a-zA-Z_][a-zA-Z0-9_]*$`.
|
|
|
|
In the examples you've seen so far, all collectors declared with
|
|
`declare<CollectorType>` had more stringent naming rules, since their names were
|
|
also identifiers for Nim variables - which can't have colons in them.
|
|
|
|
To overcome this, without relying on the discouraged alternative API, use the `name` parameter:
|
|
|
|
```nim
|
|
declareCounter cCounter, "counter with colons in name", name = "foo:bar:baz"
|
|
cCounter.inc()
|
|
echo cCounter
|
|
```
|
|
|
|
```text
|
|
# HELP foo:bar:baz counter with colons in name
|
|
# TYPE foo:bar:baz counter
|
|
foo:bar:baz_total 1.0 1569341756504
|
|
foo:bar:baz_created 1569341756.0
|
|
```
|
|
|
|
## Logging
|
|
|
|
Metrics are not logs, but you might want to log them nonetheless. The `$`
|
|
procedure is defined for collectors and registries, so you can just use the
|
|
built-in string serialisation to print them:
|
|
|
|
```nim
|
|
echo myCounter, myGauge
|
|
echo defaultRegistry
|
|
```
|
|
|
|
Integration with [Chronicles](https://github.com/status-im/nim-chronicles) is available in a separate module:
|
|
|
|
```nim
|
|
import chronicles, metrics, metrics/chronicles_support
|
|
|
|
# ...
|
|
|
|
info "myCounter", myCounter
|
|
debug "default registry", defaultRegistry
|
|
```
|
|
|
|
## Testing
|
|
|
|
When testing, you might want to isolate some collectors by registering them
|
|
into a custom registry:
|
|
|
|
```nim
|
|
var myRegistry = newRegistry()
|
|
declareCounter myCounter, "help", registry = myRegistry
|
|
echo myRegistry
|
|
|
|
# this means that `myCounter` is no longer registered in `defaultRegistry`
|
|
echo defaultRegistry
|
|
```
|
|
|
|
These unoptimised (read "very inefficient") `value()` and `valueByName()`
|
|
procedures for accessing metric values should only be used inside test suites:
|
|
|
|
```nim
|
|
suite "counter":
|
|
test "basic":
|
|
declareCounter myCounter, "help"
|
|
check myCounter.value == 0
|
|
myCounter.inc()
|
|
check myCounter.value == 1
|
|
|
|
declareSummary cSummary, "summary with colons in name", name = "foo:bar:baz"
|
|
cSummary.observe(10)
|
|
check cSummary.valueByName("foo:bar:baz_count") == 1
|
|
check cSummary.valueByName("foo:bar:baz_sum") == 10
|
|
```
|
|
|
|
## Prometheus endpoint
|
|
|
|
Start an HTTP server listening on 127.0.0.1:8000 from which the Prometheus
|
|
daemon can pull the metrics from all collectors in `defaultRegistry`, plus the
|
|
default ones:
|
|
|
|
```nim
|
|
startHttpServer()
|
|
```
|
|
|
|
Or set your own address and port to listen to:
|
|
|
|
```nim
|
|
import metrics, net
|
|
|
|
startHttpServer("127.0.0.1", Port(8000))
|
|
```
|
|
|
|
The HTTP server will run in its own thread. You can open
|
|
"http://127.0.0.1:8000/metrics" in a browser to see what Prometheus sees.
|
|
|
|
Default metrics available (see also [the relevant Prometheus docs](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors)):
|
|
|
|
```text
|
|
process_cpu_seconds_total
|
|
process_open_fds
|
|
process_max_fds
|
|
process_virtual_memory_bytes
|
|
process_resident_memory_bytes
|
|
process_start_time_seconds
|
|
nim_gc_mem_bytes
|
|
nim_gc_mem_occupied_bytes
|
|
nim_gc_heap_instance_occupied_bytes[type_name]
|
|
```
|
|
|
|
The `process_*` metrics are only available on Linux, for now.
|
|
|
|
`nim_gc_heap_instance_occupied_bytes` is only available when compiling with
|
|
`-d:nimTypeNames` and holds the top 5 instance types, in reverse order of their
|
|
total heap usage, at the time the metric is created.
|
|
|
|
Screenshot of [Grafana showing data from Prometheus that pulls it from Nimbus which uses nim-metrics](https://github.com/status-im/nimbus/tree/devel#metric-visualisation):
|
|
|
|
![Grafana screenshot](https://i.imgur.com/AdtavDA.png)
|
|
|
|
## StatsD
|
|
|
|
Add a [StatsD](https://github.com/statsd/statsd/wiki) export backend where
|
|
metric updates will be pushed as soon as they are created:
|
|
|
|
```nim
|
|
import metrics, net
|
|
|
|
when defined(metrics):
|
|
addExportBackend(
|
|
metricProtocol = STATSD,
|
|
netProtocol = UDP,
|
|
address = "127.0.0.1",
|
|
port = Port(8125)
|
|
)
|
|
|
|
declareCounter myCounter, "some counter"
|
|
myCounter.inc()
|
|
|
|
# When we incremented the counter, the corresponding data was sent over the wire to the StatsD daemon.
|
|
```
|
|
|
|
The only supported collector types are counters and gauges. There's a dedicated
|
|
thread that does the networking part. When you update these collectors, data is
|
|
sent over a channel to that thread. If the channel's buffer is full, the data
|
|
is silently dropped. Same for an unreachable backend or any other networking
|
|
error. Reconnections are tried automatically and there's one socket per backend
|
|
being reused.
|
|
|
|
All the complexity is hidden from the API user and additional latency is kept
|
|
to a minimum. Exported metrics are treated like disposable data and dropped at
|
|
the first sign of trouble.
|
|
|
|
Counters support an additional parameter just for StatsD: `sampleRate`. This
|
|
allows sending just a percentage of the increments to the StatsD daemon.
|
|
Nothing else changes on the client side.
|
|
|
|
```nim
|
|
declareCounter sCounter, "counter with a sample rate set", sampleRate = 0.1
|
|
sCounter.inc()
|
|
|
|
# Now only 10% (on average) of this counter's updates will be sent over the
|
|
# wire. We throw a dice when the time comes, using a simple PRNG. We also
|
|
# inform the StatsD daemon about this rate, so it can adjust its estimated value
|
|
# accordingly.
|
|
```
|
|
|
|
## Carbon
|
|
|
|
Add a [Carbon](https://graphite.readthedocs.io/en/latest/feeding-carbon.html)
|
|
export backend where metric updates will be pushed as soon as they are created:
|
|
|
|
```nim
|
|
import metrics, net
|
|
|
|
when defined(metrics):
|
|
addExportBackend(
|
|
metricProtocol = CARBON,
|
|
netProtocol = TCP,
|
|
address = "127.0.0.1",
|
|
port = Port(2003)
|
|
)
|
|
```
|
|
|
|
The implementation is very similar to the StatsD metric exporting described above.
|
|
|
|
You may add as many export backends as you want, but deleting them from the
|
|
`exportBackends` global variable is unsupported.
|
|
|
|
## Contributing
|
|
|
|
When submitting pull requests, please add test cases for any new features or
|
|
fixes and make sure `nimble test` is still able to execute the entire test
|
|
suite successfully.
|
|
|
|
## License
|
|
|
|
Licensed and distributed under either of
|
|
|
|
* MIT license: [LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT
|
|
* Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
|
|
|
|
at your option. These files may not be copied, modified, or distributed except according to those terms.
|
|
|