Add information EIP: Common Prometheus metrics (#2159)

* Add common metrics EIP.

* Fix spelling error.

* Assign EIP number, added more information on what Prometheus does.

* Add link to prometheus website for further info.

* Fix link.

* Fix discussion link.

* Switch to standards track - interface instead of informational.

* Add motivation.
Adrian Sutton 2019-07-02 19:44:19 +10:00 committed by Alex Beregszaszi
parent c2d4e98474
commit 69a78cca16
1 changed files with 61 additions and 0 deletions

EIPS/eip-2159.md Normal file

@ -0,0 +1,61 @@
---
eip: 2159
title: Common Prometheus Metrics Names for Clients
author: Adrian Sutton (@ajsutton)
discussions-to: https://ethereum-magicians.org/t/common-chain-metrics/3415
status: Draft
type: Standards Track
category: Interface
created: 2019-07-01
---
<!--You can leave these HTML comments in your merged EIP and delete the visible duplicate text guides, they will not appear and may be helpful to refer to if you edit it again. This is the suggested template for new EIPs. Note that an EIP number will be assigned by an editor. When opening a pull request to submit your EIP, please use an abbreviated title in the filename, `eip-draft_title_abbrev.md`. The title should be 44 characters or less.-->
## Simple Summary
<!--"If you can't explain it simply, you don't understand it well enough." Provide a simplified and layman-accessible explanation of the EIP.-->
Standardized names for common metrics for Ethereum clients to use with [Prometheus](https://prometheus.io), a widely used monitoring and alerting solution.
## Abstract
<!--A short (~200 word) description of the technical issue being addressed.-->
Many Ethereum clients expose a range of metrics in a format compatible with Prometheus to allow operators to monitor the client's behaviour and performance and raise alerts if the chain isn't progressing or there are other indications of errors.
While the majority of these metrics are highly client-specific, reporting on internal implementation details of the client, some are applicable to all clients.
By standardizing the naming and format of these common metrics, operators are able to monitor the operation of multiple clients in a single dashboard or alerting configuration.
## Motivation
<!--The motivation is critical for EIPs that want to change the Ethereum protocol. It should clearly explain why the existing protocol specification is inadequate to address the problem that the EIP solves. EIP submissions without sufficient motivation may be rejected outright.-->
Using common names and meanings for metrics which apply to all clients allows node operators to monitor clusters of nodes running heterogeneous clients with a single dashboard and alerting configuration.
Currently there are no agreed names or meanings, leaving client developers to invent their own, which makes it difficult to monitor a heterogeneous cluster.
## Specification
<!--The technical specification should describe the syntax and semantics of any new feature. The specification should be detailed enough to allow competing, interoperable implementations for any of the current Ethereum platforms (go-ethereum, parity, cpp-ethereum, ethereumj, ethereumjs, and [others](https://github.com/ethereum/wiki/wiki/Clients)).-->
The table below defines metrics which may be captured by Ethereum clients which expose metrics to Prometheus. Clients may expose additional metrics; however, these should not use the `ethereum_` prefix.
| Name | Metric type | Definition | JSON-RPC Equivalent |
|----------------------------------|-------------|-------------------------------------------------------------------|---------------------------------------------------------------------|
| ethereum_blockchain_height | Gauge | The current height of the canonical chain | `eth_blockNumber` |
| ethereum_best_known_block_number | Gauge | The estimated highest block available | `highestBlock` of `eth_syncing` or `eth_blockNumber` if not syncing |
| ethereum_peer_count | Gauge | The current number of peers connected | `net_peerCount` |
| ethereum_peer_limit | Gauge | The maximum number of peers this node allows to connect | No equivalent |
Note that `ethereum_best_known_block_number` always has a value. When the `eth_syncing` JSON-RPC method would return `false`, the current chain height is used.
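
The table and the note above can be illustrated with a minimal sketch. It assumes the client already holds the raw JSON-RPC results (hex quantity strings, and either `false` or an object for `eth_syncing`) and renders the gauges in the Prometheus text exposition format. The function names `best_known_block_number` and `render_metrics` are hypothetical and not part of this EIP, which deliberately does not prescribe how metrics are exposed.

```python
from typing import Dict, Union

# Metric names and definitions are taken from the table above; all are gauges.
COMMON_METRICS = {
    "ethereum_blockchain_height": "The current height of the canonical chain",
    "ethereum_best_known_block_number": "The estimated highest block available",
    "ethereum_peer_count": "The current number of peers connected",
    "ethereum_peer_limit": "The maximum number of peers this node allows to connect",
}


def best_known_block_number(
    eth_syncing: Union[bool, Dict[str, str]], eth_block_number: str
) -> int:
    """Return a value for ethereum_best_known_block_number.

    Falls back to the current chain height when eth_syncing is false,
    so the metric always has a value.
    """
    if eth_syncing is False:
        return int(eth_block_number, 16)
    return int(eth_syncing["highestBlock"], 16)


def render_metrics(values: Dict[str, int]) -> str:
    """Render the supplied gauges in the Prometheus text exposition format."""
    lines = []
    for name, help_text in COMMON_METRICS.items():
        if name not in values:
            continue  # the metrics are optional ("may be captured")
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {values[name]}")
    return "".join(line + "\n" for line in lines)
```

For example, `best_known_block_number(False, "0x10")` yields the current height `16`, while a syncing node reporting `"highestBlock": "0x20"` yields `32`.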
## Rationale
<!--The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made. It should describe alternate designs that were considered and related work, e.g. how the feature is supported in other languages. The rationale may also provide evidence of consensus within the community, and should discuss important objections or concerns raised during discussion.-->
The defined metrics are independent of Ethereum client implementation but provide sufficient information to create an overview dashboard to support monitoring a group of Ethereum nodes.
There is a similar, though more prescriptive, specification for [beacon chain client metrics](https://github.com/ethereum/eth2.0-metrics/blob/master/metrics.md).
The specific details of how to expose the metrics have been omitted as there is variance in existing implementations and standardising this does not provide any significant benefit.
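
As a rough illustration of the overview such a dashboard could derive, the sketch below computes two per-node figures from the four metrics. The function name `node_overview` and the output keys `blocks_behind` and `peer_utilization` are assumptions made for this example only; they are not defined by this EIP.

```python
from typing import Dict


def node_overview(metrics: Dict[str, float]) -> Dict[str, float]:
    """Derive dashboard-style figures from one node's scraped metric values."""
    height = metrics["ethereum_blockchain_height"]
    best = metrics["ethereum_best_known_block_number"]
    peers = metrics["ethereum_peer_count"]
    limit = metrics["ethereum_peer_limit"]
    return {
        # How far the node's canonical chain trails the best block it knows of.
        "blocks_behind": max(best - height, 0),
        # Fraction of the allowed peer slots currently in use.
        "peer_utilization": peers / limit if limit > 0 else 0.0,
    }
```

Because every client reports the same names with the same meanings, the same calculation applies unchanged to each node in a heterogeneous cluster.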
## Backwards Compatibility
<!--All EIPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The EIP must explain how the author proposes to deal with these incompatibilities. EIP submissions without a sufficient backwards compatibility treatise may be rejected outright.-->
This is *not* a consensus-affecting change.
Clients may already be publishing these metrics using different names, and changing to the new form may break existing alerts or dashboards. Clients that want to avoid this incompatibility can expose the metrics under both the old and new names.
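
One way a client might publish both names during a transition is sketched below. The legacy names under the `exampleclient_` prefix are invented for illustration and do not correspond to any real client's metrics; `with_legacy_aliases` is likewise a hypothetical helper.

```python
from typing import Dict

# Hypothetical mapping from the standardized names to a client's existing names.
LEGACY_ALIASES = {
    "ethereum_blockchain_height": "exampleclient_chain_height",
    "ethereum_peer_count": "exampleclient_peers",
}


def with_legacy_aliases(
    values: Dict[str, int], aliases: Dict[str, str] = LEGACY_ALIASES
) -> Dict[str, int]:
    """Return the metric values keyed by both the new and the legacy names."""
    combined = dict(values)  # copy so the caller's mapping is left untouched
    for new_name, old_name in aliases.items():
        if new_name in values:
            combined[old_name] = values[new_name]
    return combined
```

Existing alerts keep reading the old names while new dashboards can adopt the `ethereum_` names; the aliases can be dropped once operators have migrated.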
## Implementation
<!--The implementations must be completed before any EIP is given status "Final", but it need not be completed before the EIP is accepted. While there is merit to the approach of reaching consensus on the specification and rationale before writing code, the principle of "rough consensus and running code" is still useful when it comes to resolving many discussions of API details.-->
These metrics are currently captured by [Pantheon](https://pegasys.tech) under `pantheon_` prefixed names. They will be renamed to match this EIP in the future.
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).