From 1857f736690b3d86ec0cfa7bb526a74d44d6f1ce Mon Sep 17 00:00:00 2001 From: John Landa Date: Thu, 29 Feb 2024 18:40:19 -0700 Subject: [PATCH] Johnlanda/fault injection docs (#20713) * fault injection docs * Add link to the fault injection docs from nav * Fix formatting * Update enterprise docs * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/enterprise/index.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/enterprise/index.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak 
<104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update website/content/docs/connect/manage-traffic/fault-injection.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Update docs-nav-data.json add fault injection to nav * Update docs-nav-data.json * Update docs-nav-data.json * Update docs-nav-data.json * Update v1_18_x.mdx * Update v1_4_x.mdx * Update v1_4_x.mdx --------- Co-authored-by: David Yu Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> --- .../manage-traffic/fault-injection.mdx | 204 ++++++++++++++++++ website/content/docs/enterprise/index.mdx | 8 +- .../docs/release-notes/consul-k8s/v1_4_x.mdx | 16 +- .../docs/release-notes/consul/v1_18_x.mdx | 2 +- website/data/docs-nav-data.json | 8 + 5 files changed, 227 insertions(+), 11 deletions(-) create mode 100644 website/content/docs/connect/manage-traffic/fault-injection.mdx diff --git a/website/content/docs/connect/manage-traffic/fault-injection.mdx b/website/content/docs/connect/manage-traffic/fault-injection.mdx new file mode 100644 index 0000000000..2b81328466 --- /dev/null +++ b/website/content/docs/connect/manage-traffic/fault-injection.mdx @@ -0,0 +1,204 @@ +--- +layout: docs +page_title: Fault injection +description: 
Learn how to use fault injection in a Consul service mesh to inject artificial latency, response codes, or bandwidth limits for application testing. +--- + +# Fault Injection + +This topic describes the process to inject different types of faults into the services in Consul's service mesh to help you test your network's resilience. + + This feature is available in Consul Enterprise. + +## Introduction + +Consul allows you to configure fault injection filters to alter the responses from an upstream service. By injecting HTTP and gRPC statuses, bandwidth limits, or delays, users can test the resilience of their system to different unexpected issues. + +Consul applies fault injection filters to an upstream service as part of its service defaults. For example, to simulate that an upstream returned a 500 HTTP status code 50 percent of the time, add an `Abort` configuration to the Envoy extensions block in the service default configuration for the upstream service. + +The fault injection filters may be used individually, or in a combination of the three types by adding multiple blocks to the `Arguments.Config` section of the configuration. + +## Requirements + +Consul Enterprise v1.18.0 or later + +## Inject delays + +Specify the fault injection behavior in the [service defaults configuration entry](/consul/docs/connect/config-entries/service-defaults). + + + + +To inject a delay fault when Consul runs on virtual machines, configure the following parameters: + +1. `Arguments.Config.Delay.Duration`: The delay in milliseconds to use when injecting a delay type fault. +1. `Arguments.Config.Delay.Percentage`: The percentage of responses to inject the delay behavior into. + +The following example configures the default behavior for a service named `billing`. This configuration injects a 100 ms delay into 50 percent of responses from the billing service. 
+ +```hcl +Kind = "service-defaults" +Name = "billing" +EnvoyExtensions = [ +{ + Name = "builtin/fault-injection" + Arguments = { + Config = { + Delay = { + Duration = 100 + Percentage = 50 + } + } + } +} +] +``` + + + + +To inject a delay fault when Consul runs on Kubernetes, configure the following parameters: + +1. `spec.arguments.config.delay.duration`: The delay in milliseconds to use when injecting a delay type fault. +1. `spec.arguments.config.delay.percentage`: The percentage of responses to inject the delay behavior into. + +The following example configures the default behavior for a service named `billing`. This configuration injects a 100 ms delay into 50 percent of responses from the billing service. + +```yaml +kind: ServiceDefaults +metadata: + name: billing +spec: + envoyExtensions: + - name: "builtin/fault-injection" + arguments: + config: + delay: + duration: 100 + percentage: 50 +``` + + + + + +Refer to the [service defaults configuration entry reference](/consul/docs/connect/config-entries/service-defaults) for additional specifications and example configurations. + +## Inject statuses + + + +To inject a status code when aborting a response, configure the following parameters: + +1. `Arguments.Config.Abort.HttpStatus`: The HTTP status code to inject into responses when injecting an abort type fault. You may specify either an HTTP status or a gRPC status but not both. +1. `Arguments.Config.Abort.GrpcStatus`: The gRPC status to inject into responses when injecting an abort type fault. You may specify either an HTTP status or a gRPC status but not both. +1. `Arguments.Config.Abort.Percentage`: The percentage of responses to inject the abort behavior into. + +The following example configures the default behavior for a service named `billing`. This configuration injects an HTTP status code of 503 into 5 percent of responses from the billing service. 
+ +```hcl +Kind = "service-defaults" +Name = "billing" +EnvoyExtensions = [ +{ + Name = "builtin/fault-injection" + Arguments = { + Config = { + Abort = { + HttpStatus = 503 + Percentage = 5 + } + } + } +} +] +``` + + + + +To inject a status code when aborting a response, configure the following parameters: + +1. `spec.arguments.config.abort.httpStatus`: The HTTP status code to inject into responses when injecting an abort type fault. You may specify either an HTTP status or a gRPC status but not both. +1. `spec.arguments.config.abort.grpcStatus`: The gRPC status to inject into responses when injecting an abort type fault. You may specify either an HTTP status or a gRPC status but not both. +1. `spec.arguments.config.abort.percentage`: The percentage of responses to inject the abort behavior into. + +The following example configures the default behavior for a service named `billing`. This configuration injects an HTTP status code of 503 into 5 percent of responses from the billing service. + +```yaml +kind: ServiceDefaults +metadata: + name: billing +spec: + envoyExtensions: + - name: "builtin/fault-injection" + arguments: + config: + abort: + httpStatus: 503 + percentage: 5 +``` + + + + +Refer to the [service defaults configuration entry reference](/consul/docs/connect/config-entries/service-defaults) for additional specifications and example configurations. + +## Inject bandwidth limit + + + + +To inject a rate limit into a response, configure the following parameters: + +1. `Arguments.Config.Bandwidth.Limit`: The amount of data allowed through the filter when applied. This value is specified in KiB/s. When the limit is exceeded, requests, operations, or connections will be subjected to rate limiting. +1. `Arguments.Config.Bandwidth.Percentage`: The percentage of responses to inject the bandwidth limit behavior into. + +The following example configures the default behavior for a service named `billing`. 
This configuration limits bandwidth to 100 KiB/s for 50 percent of responses from the billing service. + +```hcl +Kind = "service-defaults" +Name = "billing" +EnvoyExtensions = [ +{ + Name = "builtin/fault-injection" + Arguments = { + Config = { + Bandwidth = { + Limit = 100 + Percentage = 50 + } + } + } +} +] +``` + + + + +To inject a rate limit into a response, configure the following parameters: + +1. `spec.arguments.config.bandwidth.limit`: The amount of data allowed through the filter when applied. This value is specified in KiB/s. When the limit is exceeded, requests, operations, or connections will be subjected to rate limiting. +1. `spec.arguments.config.bandwidth.percentage`: The percentage of responses to inject the bandwidth limit behavior into. + +The following example configures the default behavior for a service named `billing`. This configuration limits bandwidth to 100 KiB/s for 50 percent of responses from the billing service. + +```yaml +kind: ServiceDefaults +metadata: + name: billing +spec: + envoyExtensions: + - name: "builtin/fault-injection" + arguments: + config: + bandwidth: + limit: 100 + percentage: 50 +``` + + + + +Refer to the [service defaults configuration entry reference](/consul/docs/connect/config-entries/service-defaults) for additional specifications and example configurations. diff --git a/website/content/docs/enterprise/index.mdx b/website/content/docs/enterprise/index.mdx index c1964bf3dd..6427926b17 100644 --- a/website/content/docs/enterprise/index.mdx +++ b/website/content/docs/enterprise/index.mdx @@ -27,8 +27,9 @@ The following features are [available in several forms of Consul Enterprise](#co - [Automated Backups](/consul/docs/enterprise/backups): Configure the automatic backup of Consul state. 
- [Redundancy Zones](/consul/docs/enterprise/redundancy): Deploy backup voting Consul servers to efficiently improve Consul fault tolerance - [Server request rate limits per source IP](/consul/docs/agent/limits/usage/limit-request-rates-from-ips): Limit gRPC and RPC traffic to servers for source IP addresses. -- [Traffic rate limiting for services](/consul/docs/connect/manage-traffic/limit-request-rates): Limit the rate of HTTP requests a service receives per service instance. +- [Traffic rate limiting for services](/consul/docs/connect/manage-traffic/limit-request-rates): Limit the rate of HTTP requests a service receives per service instance. - [Locality-aware routing](/consul/docs/connect/manage-traffic/route-to-local-upstreams): Prioritize upstream services in the same region and zone as the downstream service. +- [Fault injection](/consul/docs/connect/manage-traffic/fault-injection): Explore the resiliency of downstream services in response to problems with an upstream service, such as errors, latency, or response rate limits. 
### Scalability @@ -99,6 +100,7 @@ Available Enterprise features per Consul form and license include: | [Automated Server Upgrades](/consul/docs/enterprise/upgrades) | All tiers | Yes | Yes | | [Consul-Terraform-Sync Enterprise](/consul/docs/nia/enterprise) | All tiers | Yes | Yes | | [Enhanced Read Scalability](/consul/docs/enterprise/read-scale) | No | Yes | With Global Visibility, Routing, and Scale module | +| [Fault injection](/consul/docs/connect/manage-traffic/fault-injection) | Yes | Yes | No | | [FIPS 140-2 Compliance](/consul/docs/enterprise/fips) | No | Yes | No | | [JWT verification for API gateways](/consul/docs/connect/gateways/api-gateway/secure-traffic/verify-jwts-vms) | Yes | Yes | Yes | | [Locality-aware routing](/consul/docs/connect/manage-traffic/route-to-local-upstreams) | Yes | Yes | Yes | @@ -112,7 +114,6 @@ Available Enterprise features per Consul form and license include: | [Server request rate limits per source IP](/consul/docs/agent/limits/usage/limit-request-rates-from-ips) | All tiers | Yes | With Governance and Policy module | | [Traffic rate limiting for services](/consul/docs/connect/manage-traffic/limit-request-rates) | Yes | Yes | Yes | - [HashiCorp Cloud Platform (HCP) Consul]: https://cloud.hashicorp.com/products/consul [Consul Enterprise]: https://www.hashicorp.com/products/consul/ @@ -131,6 +132,7 @@ Consul Enterprise feature availability can change depending on your server and c | [Automated Server Backups](/consul/docs/enterprise/backups) | ✅ | ✅ | ✅ | | [Automated Server Upgrades](/consul/docs/enterprise/upgrades) | ✅ | ✅ | ✅ | | [Enhanced Read Scalability](/consul/docs/enterprise/read-scale) | ✅ | ✅ | ✅ | +| [Fault injection](/consul/docs/connect/manage-traffic/fault-injection) | ✅ | ✅ | ✅ | | [FIPS 140-2 Compliance](/consul/docs/enterprise/fips) | ✅ | ✅ | ✅ | | [JWT verification for API gateways](/consul/docs/connect/gateways/api-gateway/secure-traffic/verify-jwts-vms) | ✅ | ✅ | ❌ | | [Locality-aware 
routing](/consul/docs/connect/manage-traffic/route-to-local-upstreams) | ✅ | ✅ | ✅ | @@ -155,6 +157,7 @@ Consul Enterprise feature availability can change depending on your server and c | [Automated Server Backups](/consul/docs/enterprise/backups) | ✅ | ✅ | ✅ | | [Automated Server Upgrades](/consul/docs/enterprise/upgrades) | ❌ | ❌ | ❌ | | [Enhanced Read Scalability](/consul/docs/enterprise/read-scale) | ❌ | ❌ | ❌ | +| [Fault injection](/consul/docs/connect/manage-traffic/fault-injection) | ✅ | ✅ | ✅ | | [FIPS 140-2 Compliance](/consul/docs/enterprise/fips) | ✅ | ✅ | ✅ | | [JWT verification for API gateways](/consul/docs/connect/gateways/api-gateway/secure-traffic/verify-jwts-k8s) | ✅ | ✅ | ❌ | | [Locality-aware routing](/consul/docs/connect/manage-traffic/route-to-local-upstreams) | ✅ | ✅ | ✅ | @@ -178,6 +181,7 @@ Consul Enterprise feature availability can change depending on your server and c | [Automated Server Backups](/consul/docs/enterprise/backups) | ✅ | ✅ | ✅ | | [Automated Server Upgrades](/consul/docs/enterprise/upgrades) | ✅ | ✅ | ✅ | | [Enhanced Read Scalability](/consul/docs/enterprise/read-scale) | ❌ | ❌ | ❌ | +| [Fault injection](/consul/docs/connect/manage-traffic/fault-injection) | ✅ | ✅ | ✅ | | [FIPS 140-2 Compliance](/consul/docs/enterprise/fips) | ❌ | ❌ | ❌ | | [JWT verification for API gateways](/consul/docs/connect/gateways/api-gateway/secure-traffic/verify-jwts-vms) | ✅ | ✅ | ❌ | | [Locality-aware routing](/consul/docs/connect/manage-traffic/route-to-local-upstreams) | ✅ | ✅ | ✅ | diff --git a/website/content/docs/release-notes/consul-k8s/v1_4_x.mdx b/website/content/docs/release-notes/consul-k8s/v1_4_x.mdx index 724981a676..8d72391677 100644 --- a/website/content/docs/release-notes/consul-k8s/v1_4_x.mdx +++ b/website/content/docs/release-notes/consul-k8s/v1_4_x.mdx @@ -11,19 +11,19 @@ We are pleased to announce the following Consul updates. 
## Release highlights -**Consul catalog v2 API updates:** Additional improvements were made to the v2 catalog API, which supports [registering a service with multiple ports or selecting a workload using multiple services on Kubernetes deployments](/consul/docs/k8s/multiport). End user facing changes include the ability to use a service's ports in xRoute configurations as opposed to the workload port for reference in the parentRef stanza of the xRoute configuration, and the change in Traffic Permissions to only allow `ACTION_DENY` for Consul Enterprise as it will enable future governance related workflows. +- **Consul catalog v2 API updates:** Additional improvements were made to the v2 catalog API, which supports [registering a service with multiple ports or selecting a workload using multiple services on Kubernetes deployments](/consul/docs/k8s/multiport). End user facing changes include the ability to use a service's ports in xRoute configurations as opposed to the workload port for reference in the parentRef stanza of the xRoute configuration, and the change in Traffic Permissions to only allow `ACTION_DENY` for Consul Enterprise as it will enable future governance related workflows. -The v2 catalog API is currently available for Consul on Kubernetes deployments only. Refer to [Consul v2 architecture](/consul/docs/architecture/v2) for more information. + The v2 catalog API is currently available for Consul on Kubernetes deployments only. Refer to [Consul v2 architecture](/consul/docs/architecture/v2) for more information. -**Consul Long Term Support (LTS) (Enterprise):** Consul Enterprise users can now receive Long Term Support for approximately 2 years on some Consul releases, starting with Consul Enterprise v1.15. During the LTS window, eligible fixes are provided through a new minor release on the affected LTS release branch. 
+- **Consul Long Term Support (LTS) (Enterprise):** Consul Enterprise users can now receive Long Term Support for approximately 2 years on some Consul releases, starting with Consul Enterprise v1.15. During the LTS window, eligible fixes are provided through a new minor release on the affected LTS release branch. -An LTS release is planned for every 3 releases, and it is maintained until it is 6 releases from the latest major release. For example, Consul Enterprise v1.15.x, v1.18x, and v1.21.x are the next three planned releases. The LTS period for Consul Enterprise v1.15.x ends when Consul Enterprise v1.21.0 releases. + An LTS release is planned for every 3 releases, and it is maintained until it is 6 releases from the latest major release. For example, Consul Enterprise v1.15.x, v1.18.x, and v1.21.x are the next three planned releases. The LTS period for Consul Enterprise v1.15.x ends when Consul Enterprise v1.21.0 releases. -For more information, refer to [Consul Enterprise Long Term Support](/consul/docs/enterprise/long-term-support). + For more information, refer to [Consul Enterprise Long Term Support](/consul/docs/enterprise/long-term-support). -**Fault injection (Enterprise):** For Enterprise users, Consul’s service mesh now supports managing traffic with fault injection behavior that is defined in an upstream service’s service defaults CRD. Supported fault injection behavior includes delaying, aborting, and rate limiting traffic between services. You can use fault injection to test your network’s resilience under different conditions. +- **Fault injection (Enterprise):** For Enterprise users, Consul’s service mesh now supports managing traffic with fault injection behavior that is defined in an upstream service’s service defaults CRD. Supported fault injection behavior includes delaying, aborting, and rate limiting traffic between services. You can use fault injection to test your network’s resilience under different conditions. 
-For more information, refer to [fault injection](/consul/docs/manage-traffic/fault-injection). + For more information, refer to [fault injection](/consul/docs/connect/manage-traffic/fault-injection). ## Supported software @@ -47,4 +47,4 @@ The changelogs for this major release version and any maintenance versions are l These links take you to the changelogs on the GitHub website. -- [1.4.0](https://github.com/hashicorp/consul-k8s/releases/tag/v1.4.0) \ No newline at end of file +- [1.4.0](https://github.com/hashicorp/consul-k8s/releases/tag/v1.4.0) diff --git a/website/content/docs/release-notes/consul/v1_18_x.mdx b/website/content/docs/release-notes/consul/v1_18_x.mdx index 33866bdc12..80fae2a34a 100644 --- a/website/content/docs/release-notes/consul/v1_18_x.mdx +++ b/website/content/docs/release-notes/consul/v1_18_x.mdx @@ -25,7 +25,7 @@ We are pleased to announce the following Consul updates. - **Fault injection (Enterprise):** For Enterprise users, Consul’s service mesh now supports managing traffic with fault injection behavior that is defined in an upstream service’s service defaults configuration entry or CRD. Supported fault injection behavior includes delaying, aborting, and rate limiting traffic between services. You can use fault injection to test your network’s resilience under different conditions. - For more information, refer to [fault injection](/consul/docs/manage-traffic/fault-injection). + For more information, refer to [fault injection](/consul/docs/connect/manage-traffic/fault-injection). - **Consul admin partition support for Nomad:** Nomad now supports configurations for a Consul datacenter’s admin partition when scheduling applications. Admin partitions are a Consul Enterprise feature that enable multi-tenancy for Consul servers so that several teams in your organization can use a single secure Consul datacenter. 
Admin partitions also enable cluster peering as a strategy for extending your service mesh east-west across regions in a single cloud provider or across multiple cloud providers. diff --git a/website/data/docs-nav-data.json b/website/data/docs-nav-data.json index 02aad8632f..97a1e87df5 100644 --- a/website/data/docs-nav-data.json +++ b/website/data/docs-nav-data.json @@ -676,6 +676,10 @@ "title": "Limit request rates to services", "path": "connect/manage-traffic/limit-request-rates" }, + { + "title": "Fault Injection", + "path": "connect/manage-traffic/fault-injection" + }, { "title": "Failover", "routes": [ @@ -1861,6 +1865,10 @@ "title": "Enhanced Read Scalability", "path": "enterprise/read-scale" }, + { + "title": "Fault injection", + "href": "/docs/connect/manage-traffic/fault-injection" + }, { "title": "FIPS", "path": "enterprise/fips"