mirror of
https://github.com/waku-org/nwaku.git
synced 2025-01-16 01:45:40 +00:00
docs: add monitoring guide for prometheus and grafana (#1168)
* docs: add monitoring guide for prometheus and grafana * docs: address comments
This commit is contained in:
parent
6420669c47
commit
269bb4944b
111
docs/operators/how-to/monitor.md
Normal file
111
docs/operators/how-to/monitor.md
Normal file
@ -0,0 +1,111 @@
|
||||
# Monitor nwaku using Prometheus and Grafana
|
||||
|
||||
## Prerequisites
|
||||
|
||||
1. A running nwaku instance with HTTP metrics server enabled (i.e. with `--metrics-server:true`)
|
||||
2. [Prometheus](https://prometheus.io/) and [Grafana](https://grafana.com/) installed
|
||||
|
||||
### Installing Prometheus
|
||||
|
||||
Prometheus can be installed by downloading and extracting
|
||||
the latest release for your system distribution from the [Prometheus download page](https://prometheus.io/download/).
|
||||
|
||||
For example, on a DebianOS distribution you could run
|
||||
|
||||
```bash
|
||||
wget https://github.com/prometheus/prometheus/releases/download/v2.38.0/prometheus-2.38.0.linux-amd64.tar.gz
|
||||
tar xvfz prometheus-2.38.0.linux-amd64.tar.gz
|
||||
```
|
||||
|
||||
For more advanced installations,
|
||||
Prometheus has a handy [Getting Started](https://prometheus.io/docs/prometheus/latest/getting_started/) page to guide you through the process.
|
||||
There are also many third party guides on installing Prometheus for specific distributions,
|
||||
such as [this old but still relevant one](https://www.digitalocean.com/community/tutorials/how-to-install-prometheus-on-ubuntu-16-04) from DigitalOcean.
|
||||
We also suggest running Prometheus as a service,
|
||||
as explained by [this guide](https://www.devopsschool.com/blog/how-to-run-prometheus-server-as-a-service/).
|
||||
Bear in mind that we'll be creating our own `prometheus.yml` configuration file later on when you encounter this in any of the guides.
|
||||
|
||||
### Installing Grafana
|
||||
|
||||
Follow the [installation instructions](https://grafana.com/docs/grafana/latest/setup-grafana/installation/) appropriate to your distribution to install Grafana.
|
||||
The stable version of the Grafana Enterprise Edition is the free, recommended edition to install.
|
||||
|
||||
## Configure Prometheus
|
||||
|
||||
1. Create a file called `prometheus.yml` with the following content:
|
||||
|
||||
```yml
|
||||
global:
|
||||
scrape_interval: 15s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: 'prometheus'
|
||||
scrape_interval: 5s
|
||||
static_configs:
|
||||
- targets: ['localhost:9090']
|
||||
- job_name: 'nwaku'
|
||||
scrape_interval: 1s
|
||||
static_configs:
|
||||
- targets: ['localhost:<nwaku_port>']
|
||||
```
|
||||
|
||||
Replace `<nwaku_port>` with the metrics HTTP server port of your running nwaku instance.
|
||||
For default configurations metrics are reported on port `8008` of the `localhost`.
|
||||
If you've used `--ports-shift`, or explicitly set the metrics port using `--metrics-server-port`, this port will be different from the default.
|
||||
It's possible to extract the metrics server port from the startup logs of the nwaku node.
|
||||
Look for a log with the format below and substitute `nwaku_port` with the value reported after `serverPort=`:
|
||||
|
||||
```
|
||||
INF 2022-09-16 12:14:12.739+01:00 Metrics HTTP server started topics="wakunode.setup.metrics" tid=6243 file=wakunode2_setup_metrics.nim:29 serverIp=127.0.0.1 serverPort=8009
|
||||
```
|
||||
|
||||
2. Start Prometheus using the config file you created above:
|
||||
|
||||
```bash
|
||||
./path/to/prometheus --config.file=/path/to/prometheus.yml &
|
||||
```
|
||||
|
||||
3. Verify that Prometheus is running correctly.
|
||||
|
||||
Once Prometheus is running, it exposes by default a management console on port `9090`.
|
||||
If you are running Prometheus locally, for example,
|
||||
you can visit http://localhost:9090/ in a browser to view basic info about the running instance.
|
||||
http://localhost:9090/targets shows the state of the different metrics server endpoints that we configured in `prometheus.yml`.
|
||||
In our case we'd expect Prometheus to successfully scrape metrics off two endpoints,
|
||||
the running nwaku instance and Prometheus itself.
|
||||
|
||||
## Configure Grafana
|
||||
|
||||
1. Start the Grafana server, if it's not running already after installation.
|
||||
|
||||
```bash
|
||||
sudo systemctl start grafana-server
|
||||
```
|
||||
|
||||
2. Open Grafana in your browser.
|
||||
|
||||
Grafana exposes its interface by default on port `3000`.
|
||||
For example, if you are running Grafana locally,
|
||||
you can find it by navigating to http://localhost:3000/.
|
||||
If you are prompted for a username and password,
|
||||
the default is `admin` in both cases.
|
||||
|
||||
3. Set Prometheus as your data source.
|
||||
|
||||
[These instructions](https://grafana.com/docs/grafana/latest/datasources/add-a-data-source/) describe how to add a new data source.
|
||||
The default values for setting up a Prometheus data source should be sufficient.
|
||||
|
||||
4. Create a new dashboard or import an existing one.
|
||||
|
||||
You can now visualize metrics off your running nwaku instance by [creating a new dashboard and adding panels](https://grafana.com/docs/grafana/latest/dashboards/add-organize-panels/) for the metric(s) of your choice.
|
||||
To get you started,
|
||||
we have published a [basic monitoring dashboard for a single nwaku node](https://github.com/status-im/nwaku/blob/d4e899fba77389d20ca19c73a9443501039cdef2/metrics/waku-single-node-dashboard.json)
|
||||
which you can [import to your Grafana instance](https://grafana.com/docs/grafana/latest/dashboards/manage-dashboards/#import-a-dashboard).
|
||||
|
||||
5. Happy monitoring!
|
||||
|
||||
Some of the most important metrics to keep an eye on include:
|
||||
- `libp2p_peers` as an indication of how many peers your node is connected to,
|
||||
- `waku_node_messages_total` to view the total amount of network traffic relayed by your node and
|
||||
- `waku_node_errors` as a rough indication of basic operating errors logged by the node.
|
||||
|
Loading…
x
Reference in New Issue
Block a user