diff --git a/website/content/docs/connect/observability/index.mdx b/website/content/docs/connect/observability/index.mdx index ab5d91c387..1ba9c16850 100644 --- a/website/content/docs/connect/observability/index.mdx +++ b/website/content/docs/connect/observability/index.mdx @@ -24,10 +24,9 @@ metrics destination and service protocol you may want to enable [configuration entries](/docs/agent/options#config_entries) and [centralized service configuration](/docs/agent/options#enable_central_service_config). -If you -are using Kubernetes, the Helm chart can simplify much of the necessary -configuration, which you can learn about in the [observability -tutorial](https://learn.hashicorp.com/tutorials/consul/kubernetes-layer7-observability). +### Kubernetes +If you are using Kubernetes, the Helm chart can simplify much of the configuration needed to enable observability. See +our [Kubernetes observability docs](/docs/k8s/connect/observability/metrics) for more information. ### Metrics Destination diff --git a/website/content/docs/connect/observability/ui-visualization.mdx b/website/content/docs/connect/observability/ui-visualization.mdx index b678d4da37..fcd2ff46e2 100644 --- a/website/content/docs/connect/observability/ui-visualization.mdx +++ b/website/content/docs/connect/observability/ui-visualization.mdx @@ -33,6 +33,10 @@ Consul has built-in support for overlaying metrics from a be supported using a new and experimental JavaScript API. See [Custom Metrics Providers](#custom-metrics-providers). +## Kubernetes +If running Consul in Kubernetes, the Helm chart can automatically configure Consul's UI to display topology +visualizations. See our [Kubernetes observability docs](/docs/k8s/connect/observability/metrics) for more information. + ## Configuring the UI To Display Metrics To configure Consul's UI to fetch metrics there are two required configuration settings. 
@@ -70,6 +74,8 @@ ui_config { } ``` +Similarly, to configure the UI on Kubernetes, use this [reference](/docs/k8s/connect/observability/metrics). + ## Configuring Dashboard URLs Since Consul's visualization is intended as an overview of your mesh and not a diff --git a/website/content/docs/k8s/connect/index.mdx b/website/content/docs/k8s/connect/index.mdx index ec891e23aa..21d5aa4d2c 100644 --- a/website/content/docs/k8s/connect/index.mdx +++ b/website/content/docs/k8s/connect/index.mdx @@ -275,7 +275,7 @@ Annotations can be used to configure the injection behavior. consul.hashicorp.com/service-meta-bar: baz ``` -- `consul.hashicorp.com/sidecar-proxy-` - override default resource settings for +- `consul.hashicorp.com/sidecar-proxy-` - Override default resource settings for the sidecar proxy container. The defaults are set in Helm config via the [`connectInject.sidecarProxy.resources`](/docs/k8s/helm#v-connectinject-sidecarproxy-resources) key. @@ -284,6 +284,14 @@ Annotations can be used to configure the injection behavior. - `consul.hashicorp.com/sidecar-proxy-memory-limit` - Override the default memory limit. - `consul.hashicorp.com/sidecar-proxy-memory-request` - Override the default memory request. +- `consul.hashicorp.com/enable-metrics` - Override the default Helm value [`connectInject.metrics.defaultEnabled`](/docs/k8s/helm#v-connectinject-metrics-defaultenabled). +- `consul.hashicorp.com/enable-metrics-merging` - Override the default Helm value [`connectInject.metrics.defaultEnableMerging`](/docs/k8s/helm#v-connectinject-metrics-defaultenablemerging). +- `consul.hashicorp.com/merged-metrics-port` - Override the default Helm value [`connectInject.metrics.defaultMergedMetricsPort`](/docs/k8s/helm#v-connectinject-metrics-defaultmergedmetricsport). +- `consul.hashicorp.com/prometheus-scrape-port` - Override the default Helm value [`connectInject.metrics.defaultPrometheusScrapePort`](/docs/k8s/helm#v-connectinject-metrics-defaultprometheusscrapeport). 
+- `consul.hashicorp.com/prometheus-scrape-path` - Override the default Helm value [`connectInject.metrics.defaultPrometheusScrapePath`](/docs/k8s/helm#v-connectinject-metrics-defaultprometheusscrapepath). +- `consul.hashicorp.com/service-metrics-port` - Set the port where the Connect service exposes metrics. +- `consul.hashicorp.com/service-metrics-path` - Set the path where the Connect service exposes metrics. + ### Deployments, StatefulSets, etc. The annotations for configuring Connect must be on the pod specification. diff --git a/website/content/docs/k8s/connect/observability/metrics.mdx b/website/content/docs/k8s/connect/observability/metrics.mdx new file mode 100644 index 0000000000..ba6df8bd64 --- /dev/null +++ b/website/content/docs/k8s/connect/observability/metrics.mdx @@ -0,0 +1,146 @@ +--- +layout: docs +page_title: Metrics +sidebar_title: Metrics +description: Metrics for Consul on Kubernetes +--- + +# Metrics + +Consul on Kubernetes integrates with Prometheus and Grafana to provide metrics for Consul Service Mesh. The metrics +available are: + +* Connect Service metrics +* Sidecar proxy metrics +* Consul agent metrics +* Ingress, Terminating and Mesh Gateway metrics + +Specific sidecar proxy metrics can also be seen in the Consul UI Topology Visualization view. This section documents how to enable each of these. + +-> **Note:** Metrics will be supported in Consul-helm >= `0.31.0` and consul-k8s >= `0.25.0`. However, enabling the [metrics merging feature](#connect-service-and-sidecar-metrics-with-metrics-merging) with Helm value (`defaultEnableMerging`) or +annotation (`consul.hashicorp.com/enable-metrics-merging`) can only be used with Consul `1.10.0-alpha1` and above. The +other metrics configuration can still be used before Consul `1.10.0-alpha1`. + +## Connect Service and Sidecar Metrics with Metrics Merging + +Prometheus annotations are used to instruct Prometheus to scrape metrics from Pods. 
Prometheus annotations only support +scraping from one endpoint on a Pod, so Consul on Kubernetes supports metrics merging whereby service metrics and +sidecar proxy metrics are merged into one endpoint. If there are no service metrics, it also supports just scraping the +sidecar proxy metrics. + +The diagram below illustrates how the metrics integration works when merging is enabled: + +[![Metrics Merging Architecture](/img/metrics-arch.png)](/img/metrics-arch.png) + +Connect service metrics can be configured with the Helm values nested under [`connectInject.metrics`](/docs/k8s/helm#v-connectinject-metrics). + +Metrics and metrics merging can be enabled by default for all connect-injected Pods with the following Helm values: +```yaml +connectInject: + metrics: + defaultEnabled: true # by default, this inherits from the value global.metrics.enabled + defaultEnableMerging: true +``` +They can also be overridden on a per-Pod basis using the annotations `consul.hashicorp.com/enable-metrics` and +`consul.hashicorp.com/enable-metrics-merging`. + +~> In most cases, the default settings will be sufficient. If you are encountering issues with colliding ports or service +metrics not being merged, you may need to change the defaults. + +The Prometheus annotations configure the endpoint to scrape the metrics from. As shown in the diagram, the annotations point to a listener on `0.0.0.0:20200` on the Envoy sidecar. 
This listener and the corresponding Prometheus annotations can be configured with the following Helm values (or overridden on a per-Pod basis with Consul annotations `consul.hashicorp.com/prometheus-scrape-port` and `consul.hashicorp.com/prometheus-scrape-path`): +```yaml +connectInject: + metrics: + defaultPrometheusScrapePort: 20200 + defaultPrometheusScrapePath: "/metrics" +``` +Those Helm values will result in the following Prometheus annotations being automatically added to the Pod for scraping: +```yaml +metadata: + annotations: + prometheus.io/scrape: "true" + prometheus.io/path: "/metrics" + prometheus.io/port: "20200" +``` + +When metrics alone are enabled, the listener in the diagram on `0.0.0.0:20200` would point directly at the sidecar +metrics endpoint, rather than the merged metrics endpoint. The Prometheus scraping annotations would stay the same. + +When metrics and metrics merging are *both* enabled, metrics are combined from the service and the sidecar proxy, and +exposed via a local server on the Consul sidecar for scraping. This endpoint is called the merged metrics endpoint and +defaults to `127.0.0.1:20100/stats/prometheus`. The listener will target the merged metrics endpoint in the above case. +It can be configured with the following Helm values (or overridden on a per-Pod basis with +`consul.hashicorp.com/merged-metrics-port`): +```yaml +connectInject: + metrics: + defaultMergedMetricsPort: 20100 +``` + +The endpoint to scrape service metrics from can be configured only on a per-Pod basis via the Pod annotations `consul.hashicorp.com/service-metrics-port` and `consul.hashicorp.com/service-metrics-path`. If these are not configured, the service metrics port will default to the port used to register the service with Consul (`consul.hashicorp.com/connect-service-port`), which in turn defaults to the first port on the first container of the Pod. The service metrics path will default to `/metrics`. 
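The per-Pod overrides described above could be combined roughly as in the following sketch. The Deployment name, labels, image, and the ports `8080` and `9090` are hypothetical values chosen for illustration; only the `consul.hashicorp.com/*` annotation keys come from this document, and `consul.hashicorp.com/connect-inject` is assumed to be the usual opt-in annotation for Connect injection.

```yaml
# Minimal sketch: a Deployment whose Pod template opts in to metrics and
# metrics merging, and tells Consul where the service exposes its metrics.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      annotations:
        # Annotations go on the Pod template, not on the Deployment itself.
        consul.hashicorp.com/connect-inject: "true"
        consul.hashicorp.com/enable-metrics: "true"
        consul.hashicorp.com/enable-metrics-merging: "true"
        # Assumed: the application serves Prometheus metrics on 9090 at /metrics.
        consul.hashicorp.com/service-metrics-port: "9090"
        consul.hashicorp.com/service-metrics-path: "/metrics"
    spec:
      containers:
        - name: example-app
          image: example/app:latest   # hypothetical image
          ports:
            - containerPort: 8080     # service port (hypothetical)
            - containerPort: 9090     # metrics port (hypothetical)
```

Annotation values are quoted because Kubernetes requires annotation values to be strings. If the two `service-metrics-*` annotations are left out, the defaults described above apply.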
+ +## Consul Agent Metrics + +Metrics from the Consul server and client Pods can be scraped via Prometheus by setting the field `global.metrics.enableAgentMetrics` to `true`. Additionally, the metrics retention time on the agents can be configured with +the field `global.metrics.agentMetricsRetentionTime`, which expects a duration and defaults to `"1m"`. This value must be greater than `"0m"` for the Consul servers and clients to emit metrics at all. As the Prometheus deployment currently does not support scraping TLS endpoints, agent metrics are *unsupported* when TLS is enabled. + +```yaml +global: + metrics: + enabled: true + enableAgentMetrics: true + agentMetricsRetentionTime: "1m" +``` + +## Gateway Metrics + +Metrics from the Consul gateways, namely the Ingress Gateways, Terminating Gateways, and Mesh Gateways, can be scraped +via Prometheus by setting the field `global.metrics.enableGatewayMetrics` to `true`. The gateways emit standard Envoy proxy +metrics. To ensure that the metrics are not exposed to the public internet (as Mesh and Ingress gateways can have public +IPs), their metrics endpoints are exposed on the Pod IP of the respective gateway instance, rather than on all +interfaces on `0.0.0.0`. + +```yaml +global: + metrics: + enabled: true + enableGatewayMetrics: true +``` + +## Metrics in the UI Topology Visualization + +Consul’s built-in UI has a topology visualization for services that are part of the Consul Service Mesh. The topology visualization can fetch basic metrics from a metrics provider for each service and display those metrics as part of the [topology visualization](/docs/connect/observability/ui-visualization). + +The diagram below illustrates how the UI displays service metrics for a sample application: + +[![UI Topology View](/img/ui-topology.png)](/img/ui-topology.png) + +The topology view is configured under `ui.metrics`. 
This will enable the Consul UI to query the provider specified by +`ui.metrics.provider` at the URL of the Prometheus server `ui.metrics.baseURL` to display sidecar proxy metrics for the +service. The UI will display some specific sidecar proxy Prometheus metrics when `ui.metrics.enabled` is `true` and +`ui.enabled` is `true`. The value of `ui.metrics.enabled` defaults to `"-"`, which means it will inherit from the value of +`global.metrics.enabled`. + +```yaml +ui: + enabled: true + metrics: + enabled: true # by default, this inherits from the value global.metrics.enabled + provider: "prometheus" + baseURL: http://prometheus-server +``` + +## Deploying Prometheus and Grafana (_for demo and non-production use-cases only_) + +The Helm chart contains demo manifests for Prometheus and Grafana. They can be installed with Helm via `prometheus.enabled` and `grafana.enabled`. These manifests are based on the community manifests for Prometheus and Grafana. +They are designed for quick bootstrapping in trial and demo use cases, not for production use. + +Prometheus and Grafana are installed in the same namespace as Consul and are installed +and uninstalled along with the Consul installation. + +```yaml +prometheus: + enabled: true +grafana: + enabled: true +``` diff --git a/website/content/docs/k8s/helm.mdx b/website/content/docs/k8s/helm.mdx index e23ddcda09..32611c938b 100644 --- a/website/content/docs/k8s/helm.mdx +++ b/website/content/docs/k8s/helm.mdx @@ -216,6 +216,24 @@ and consider if they're appropriate for your deployment. `-federation` (if setting `global.name`), otherwise `-consul-federation`. Requires consul-k8s 0.15.0+. + - `metrics` ((#v-global-metrics)) - Configures metrics for Consul service mesh + + - `enabled` ((#v-global-metrics-enabled)) (`boolean: false`) - Configures the Helm chart’s components + to expose Prometheus metrics for the Consul service mesh. 
By default + this includes gateway metrics and sidecar metrics. + + - `enableAgentMetrics` ((#v-global-metrics-enableagentmetrics)) (`boolean: false`) - Configures Consul agent metrics. Only applicable if + `global.metrics.enabled` is true. + + - `agentMetricsRetentionTime` ((#v-global-metrics-agentmetricsretentiontime)) (`string: 1m`) - Configures the retention time for metrics in Consul clients and + servers. This must be greater than 0 for Consul clients and servers + to expose any metrics at all. + Only applicable if `global.metrics.enabled` is true. + + - `enableGatewayMetrics` ((#v-global-metrics-enablegatewaymetrics)) (`boolean: true`) - If true, mesh, terminating, and ingress gateways will expose their + Envoy metrics on port `20200` at the `/metrics` path and all gateway pods + will have Prometheus scrape annotations. Only applicable if `global.metrics.enabled` is true. + - `consulSidecarContainer` ((#v-global-consulsidecarcontainer)) (`map`) - The consul sidecar ensures the Consul services are always registered with their local Consul clients and is used by the ingress/terminating/mesh gateways as well as with every Connect-injected service. @@ -278,7 +296,7 @@ and consider if they're appropriate for your deployment. enable `server.exposeGossipAndRPCPorts` and `client.exposeGossipPorts`, that will configure the LAN gossip ports on the servers and clients to be hostPorts, so if you are running clients and servers on the same node the - ports will conflict if they are both 8301. When you enable + ports will conflict if they are both 8301. When you enable `server.exposeGossipAndRPCPorts` and `client.exposeGossipPorts`, you must change this from the default to an unused port on the host, e.g. 9301. By default the LAN gossip port is 8301 and configured as a containerPort on @@ -806,6 +824,18 @@ and consider if they're appropriate for your deployment. 
'annotation-key': annotation-value ``` + - `metrics` ((#v-ui-metrics)) - Configurations for displaying metrics in the UI. + + - `enabled` ((#v-ui-metrics-enabled)) (`boolean: global.metrics.enabled`) - Enable displaying metrics in the UI. The default value of "-" + inherits from the value of `global.metrics.enabled`. + + - `provider` ((#v-ui-metrics-provider)) (`string: prometheus`) - Provider for metrics. See + https://www.consul.io/docs/agent/options#ui_config_metrics_provider for details. + This value is only used if `ui.enabled` is set to true. + + - `baseURL` ((#v-ui-metrics-baseurl)) (`string: http://prometheus-server`) - baseURL is the URL of the Prometheus server, usually the service URL. + This value is only used if `ui.enabled` is set to true. + - `syncCatalog` ((#v-synccatalog)) - Configure the catalog sync process to sync K8S with Consul services. This can run bidirectional (default) or unidirectionally (Consul to K8S or K8S to Consul only). @@ -982,6 +1012,42 @@ and consider if they're appropriate for your deployment. - `reconcilePeriod` ((#v-connectinject-healthchecks-reconcileperiod)) (`string: 1m`) - If `healthChecks.enabled` is set to `true`, `reconcilePeriod` defines how often a full state reconcile is done after the initial reconcile at startup is completed. + - `metrics` ((#v-connectinject-metrics)) - Configures metrics for Consul Connect services. All values are overridable + via annotations on a per-pod basis. + + - `defaultEnabled` ((#v-connectinject-metrics-defaultenabled)) (`string: -`) - If true, the connect-injector will automatically + add Prometheus annotations to connect-injected pods. It will also + add a listener on the Envoy sidecar to expose metrics. The exposed + metrics will depend on whether metrics merging is enabled: + - If metrics merging is enabled: + the Consul sidecar will run a merged metrics server + combining Envoy sidecar and Connect service metrics, + i.e. if your service exposes its own Prometheus metrics. 
+ - If metrics merging is disabled: + the listener will just expose Envoy sidecar metrics. + This will inherit from `global.metrics.enabled`. + + - `defaultEnableMerging` ((#v-connectinject-metrics-defaultenablemerging)) (`boolean: false`) - Configures the Consul sidecar to run a merged metrics server + to combine and serve both Envoy and Connect service metrics. + + - `defaultMergedMetricsPort` ((#v-connectinject-metrics-defaultmergedmetricsport)) (`integer: 20100`) - Configures the port on which the Consul sidecar will listen to return + combined metrics. This port only needs to be changed if it conflicts with + the application's ports. + + - `defaultPrometheusScrapePort` ((#v-connectinject-metrics-defaultprometheusscrapeport)) (`integer: 20200`) - Configures the port Prometheus will scrape metrics from, by configuring + the Pod annotation `prometheus.io/port` and the corresponding listener in + the Envoy sidecar. + NOTE: This is *not* the port that your application exposes metrics on. + That can be configured with the + `consul.hashicorp.com/service-metrics-port` annotation. + + - `defaultPrometheusScrapePath` ((#v-connectinject-metrics-defaultprometheusscrapepath)) (`string: /metrics`) - Configures the path Prometheus will scrape metrics from, by configuring the pod + annotation `prometheus.io/path` and the corresponding handler in the Envoy + sidecar. + NOTE: This is *not* the path that your application exposes metrics on. + That can be configured with the + `consul.hashicorp.com/service-metrics-path` annotation. + - `cleanupController` ((#v-connectinject-cleanupcontroller)) - Cleanup controller cleans up Consul service instances that remain registered despite their pods no longer running. This could happen if the pod's `preStop` hook failed to execute for some reason. @@ -1462,6 +1528,16 @@ and consider if they're appropriate for your deployment. 
- `name` ((#v-terminatinggateways-gateways-name)) (`string: terminating-gateway`) +- `prometheus` ((#v-prometheus)) - Configures a demo Prometheus installation. + + - `enabled` ((#v-prometheus-enabled)) (`boolean: false`) - When true, the Helm chart will install a demo Prometheus server instance + alongside Consul. + +- `grafana` ((#v-grafana)) - Configures a demo Grafana installation. + + - `enabled` ((#v-grafana-enabled)) (`boolean: false`) - When true, the Helm chart will install a demo Grafana instance + alongside Consul. + - `tests` ((#v-tests)) - Control whether a test Pod manifest is generated when running helm template. When using helm install, the test Pod is not submitted to the cluster so this is only useful when running helm template. diff --git a/website/data/docs-navigation.js b/website/data/docs-navigation.js index f7e7db5ccf..3f575073e3 100644 --- a/website/data/docs-navigation.js +++ b/website/data/docs-navigation.js @@ -178,6 +178,7 @@ export default [ 'connect-ca-provider', 'ambassador', 'health', + { category: 'observability', content: ['metrics'], name: "Observability" }, ], }, 'service-sync', diff --git a/website/public/img/metrics-arch.png b/website/public/img/metrics-arch.png new file mode 100644 index 0000000000..71dce8914e Binary files /dev/null and b/website/public/img/metrics-arch.png differ diff --git a/website/public/img/ui-topology.png b/website/public/img/ui-topology.png new file mode 100644 index 0000000000..5ccce62f1e Binary files /dev/null and b/website/public/img/ui-topology.png differ