Waku fleet: management & monitoring
Background
Status currently maintains two fleets for nwaku nodes,
the waku.test fleet and the waku.sandbox (sandbox) fleet.
They'll be referred to as test and sandbox in this document.
Status fleet nodes and addresses can be viewed here.
Fleet overview
At the time of writing this, each fleet consists of three waku nodes,
with a websockify WebSocket-to-TCP bridge for each node.
Waku peers can choose to connect either directly to a node's TCP endpoint
or the bridged WebSocket depending on their own supported transports.
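The transport choice described above can be sketched as a small helper. This is only an illustration: the addresses, ports and the preference for TCP over WebSocket are assumptions, not values taken from the fleets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeEndpoints:
    """The two ways of reaching one fleet node (addresses are placeholders)."""
    tcp: str        # direct TCP endpoint of the node
    websocket: str  # endpoint of the websockify bridge in front of the node


def pick_endpoint(node: NodeEndpoints, supported_transports: set[str]) -> str:
    """Return an endpoint the peer can use, given the transports it supports.

    Assumption: a peer that supports both prefers the direct TCP endpoint.
    """
    if "tcp" in supported_transports:
        return node.tcp
    if "websocket" in supported_transports:
        return node.websocket
    raise ValueError("peer supports neither TCP nor WebSocket")


# Hypothetical addresses, for illustration only.
EXAMPLE_NODE = NodeEndpoints(
    tcp="/dns4/node-01.example.invalid/tcp/30303",
    websocket="/dns4/node-01.example.invalid/tcp/443/wss",
)

# A browser-based peer would typically only support WebSocket.
print(pick_endpoint(EXAMPLE_NODE, {"websocket"}))
```

The real node addresses should be taken from the Status fleets listing rather than from this sketch.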
The sandbox fleet also has a deployed chat2bridge,
which serves as a bridge between the Waku toy-chat and Matterbridge.
The chat2bridge is currently deployed to the node-01.do-ams3 datacentre
and configured to bridge toy-chat messages to the #waku channel
on the Vac Discord Server.
Fleet deployment rationale
The test fleet is automatically updated after every commit to the nwaku repository master branch
and is therefore the most up-to-date representation of Waku development.
It is suitable for testing new features before they're rolled out to the (more) stable sandbox fleet.
In general only the latest release of nwaku is deployed to the sandbox fleet.
It requires manual updating and should therefore be more stable than test.
See the section on Jenkins below for more on the deployment process.
Related repos
The infra-docs repo contains the most comprehensive overview of Status infrastructure.
This is a private repository.
Feel free to contact someone in the team to request access.
The infra-nim-waku repo contains the infrastructure definitions for Waku nodes implemented in Nim.
Monitoring and management
The rest of this document highlights some infra services of specific interest to Waku fleet monitoring and management:
- Consul to view the health status of Waku nodes.
- Kibana to view and filter logs.
- Grafana to view and filter metrics.
- Jenkins to configure and deploy new builds to the fleets.
1. Consul for health checks
Consul provides a useful high-level view of the health of the nwaku fleets.
It aggregates the results of various monitoring checks
and shows the health status of the node itself, the RPC API, the exposed WebSocket and metrics.
The datacentre can be changed in the upper left-hand corner.
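The same health information is also available through Consul's HTTP API, whose `/v1/health/service/<service>` endpoint accepts a `dc` query parameter for the datacentre. The sketch below builds such a URL and reduces a response to the worst check status per node. The base URL and service name are hypothetical placeholders, and whether the API is reachable for the fleets is an assumption.

```python
from urllib.parse import quote, urlencode

# Consul check statuses ordered from best to worst.
SEVERITY = {"passing": 0, "warning": 1, "critical": 2}


def _severity(status):
    # Unknown statuses are treated as the worst case, to be on the safe side.
    return SEVERITY.get(status, 2)


def health_url(base_url, service, datacentre):
    """Build the URL of Consul's service health endpoint for one datacentre."""
    query = urlencode({"dc": datacentre})
    return f"{base_url.rstrip('/')}/v1/health/service/{quote(service, safe='')}?{query}"


def worst_status_per_node(entries):
    """Map each node name to the worst status among its checks."""
    summary = {}
    for entry in entries:
        node = entry["Node"]["Node"]
        for check in entry.get("Checks", []):
            current = summary.get(node, "passing")
            if _severity(check["Status"]) > _severity(current):
                summary[node] = check["Status"]
        summary.setdefault(node, "passing")
    return summary


# Hypothetical response fragment in the shape Consul returns.
SAMPLE = [
    {"Node": {"Node": "node-01"}, "Checks": [
        {"Name": "rpc", "Status": "passing"},
        {"Name": "websocket", "Status": "critical"},
    ]},
    {"Node": {"Node": "node-02"}, "Checks": [
        {"Name": "rpc", "Status": "passing"},
    ]},
]

# "example-waku" is a placeholder service name, not the real one.
print(health_url("https://consul.example.invalid", "example-waku", "do-ams3"))
print(worst_status_per_node(SAMPLE))
```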
2. Kibana for logs
Kibana is a powerful visualisation tool for Elasticsearch data.
For Waku fleets it can be used to retrieve, filter and view the logs for all deployed services.
For example, to view the latest logs for sandbox,
Kibana can be opened in "Discover" mode with an active filter for fleet: waku.sandbox.
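For readers who prefer to query Elasticsearch directly, the filter above roughly corresponds to the search body sketched below. The `fleet` field comes from the filter shown in this document; the `@timestamp` sort field and the use of `match_phrase` are assumptions about how the logs are indexed.

```python
import json


def latest_logs_query(fleet, size=50):
    """Build an Elasticsearch search body for the newest log lines of one fleet."""
    return {
        "size": size,
        # Assumption: log documents carry the conventional @timestamp field.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Filter context: restricts results without affecting scoring.
                "filter": [{"match_phrase": {"fleet": fleet}}],
            }
        },
    }


print(json.dumps(latest_logs_query("waku.sandbox"), indent=2))
```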
3. Grafana for metrics
The Nim-Waku Grafana dashboard displays live and historical metrics for Waku nodes.
The default view includes metrics from both fleets,
though it's possible to filter by Hostname, Fleet name or Data Center.
The time range can also be configured -
by default the latest metrics will be shown.
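Dashboard filters of this kind usually end up as label matchers in the underlying Prometheus queries. A minimal sketch of building such a selector follows; the metric name and the label names `fleet` and `instance` are hypothetical and may not match the labels the dashboard actually uses.

```python
def selector(metric, **labels):
    """Build a PromQL instant vector selector with exact-match label filters."""
    if not labels:
        return metric
    parts = []
    for name, value in sorted(labels.items()):
        # Escape characters that would otherwise terminate or break the string.
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        parts.append(name + '="' + escaped + '"')
    return metric + "{" + ",".join(parts) + "}"


# Hypothetical metric and label names, for illustration only.
print(selector("example_connected_peers", fleet="waku.test"))
print(selector("example_connected_peers", fleet="waku.sandbox", instance="node-01"))
```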
The dashboard itself includes an "At a glance" summary with an overview of the latest connected peers, total messages, CPU usage, reported errors, etc. The "General" collection contains a more in-depth look at node, libp2p and performance-related metrics. This is followed by separate panel collections showing per-protocol metrics.
A copy of the Nim-Waku fleets dashboard is maintained in the nwaku repo.
From time to time certain Prometheus queries may fail,
often when the underlying metrics are renamed.
Please report any broken panels via our Discord channels or by creating an issue in nwaku.
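One way to spot a renamed metric before reporting a broken panel is to compare the metric names a panel expects with those a node currently exports. The sketch below reads names from the `# TYPE` lines of the Prometheus text exposition format; the metric names are made up, and metrics exported without a `# TYPE` line would not be detected.

```python
def exported_metric_names(exposition_text):
    """Collect metric names declared by '# TYPE <name> <type>' lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE":
            names.add(parts[2])
    return names


def missing_metrics(expected, exposition_text):
    """Return the expected metric names that the node does not export."""
    exported = exported_metric_names(exposition_text)
    return sorted(name for name in expected if name not in exported)


# Hypothetical scrape output, for illustration only.
SAMPLE_SCRAPE = """\
# HELP example_peers Number of connected peers.
# TYPE example_peers gauge
example_peers 12
# TYPE example_messages_total counter
example_messages_total 340
"""

# "example_messages" stands in for a metric that has since been renamed.
print(missing_metrics(["example_peers", "example_messages"], SAMPLE_SCRAPE))
```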
4. Jenkins for deployment
The nim-waku jobs on Jenkins are configured to deploy nwaku builds to the fleets.
- deploy-waku-test is triggered automatically after every commit to the nwaku master branch.
- deploy-waku-sandbox must be triggered manually. Usually this job is only built after a tagged release in nwaku.
Each job can be manually triggered using the "Build with Parameters" option. Options under "Configure" include the build triggers, build target and branches to build. These should only be changed with care.
See Continuous Integration docs for more.
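"Build with Parameters" also has a counterpart in Jenkins' remote access API, where a POST request to a job's `buildWithParameters` URL starts a parameterised build. The sketch below only constructs that URL. The Jenkins host, the folder layout `nim-waku/deploy-waku-test` and the `BRANCH` parameter are assumptions, and a real request would also need authentication.

```python
from urllib.parse import quote, urlencode


def build_with_parameters_url(base_url, job_path, params=None):
    """Build the remote-trigger URL for a (possibly nested) Jenkins job.

    Each path segment of a nested job is prefixed with 'job/' in the URL.
    """
    segments = "/".join(
        "job/" + quote(part, safe="") for part in job_path.split("/") if part
    )
    url = f"{base_url.rstrip('/')}/{segments}/buildWithParameters"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical host, job path and parameter name, for illustration only.
print(build_with_parameters_url(
    "https://ci.example.invalid",
    "nim-waku/deploy-waku-test",
    {"BRANCH": "master"},
))
```

As with the "Configure" options, triggering deployments this way should only be done with care.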
Quick links
- chat2bridge
- Consul for do-ams3
- Consul for ac-cn-hongkong-c
- Consul for gc-us-central1-a
- Grafana Nim-Waku dashboard
- infra-docs repo
- infra-waku repo
- Jenkins jobs for nim-waku
- Jenkins deploy-waku-sandbox manual trigger
- Jenkins deploy-waku-test manual trigger
- Kibana logs for sandbox
- Kibana logs for test
- Status fleets
- Status fleets - Table
- Websockify