Commit Graph

127 Commits

Author SHA1 Message Date
Derek Menteer 3e8ec8d18e
Fix SAN matching on terminating gateways (#20417)
Fixes issue: hashicorp/consul#20360

A regression was introduced in hashicorp/consul#19954 where the SAN validation
matching was reduced from 4 potential types down to just the URI.

Terminating gateways will need to match on many fields depending on user
configuration, since they make egress calls outside of the cluster. Having more
than one matcher behaves like an OR operation, where any match is sufficient to
pass the certificate validation. To maintain backwards compatibility with the
old untyped `match_subject_alt_names` Envoy behavior, we should match on all 4
enum types.

https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/transport_sockets/tls/v3/common.proto#enum-extensions-transport-sockets-tls-v3-subjectaltnamematcher-santype
2024-01-31 12:17:45 -06:00
John Murret d925e4b812
NET-6946 / NET-6941 - Replace usage of deprecated Envoy fields envoy.config.route.v3.HeaderMatcher.safe_regex_match and envoy.type.matcher.v3.RegexMatcher.google_re2 (#20013)
* NET-6946 - Replace usage of deprecated Envoy field envoy.config.route.v3.HeaderMatcher.safe_regex_match

* removing unrelated changes

* update golden files

* do not set engine type
2024-01-03 09:53:39 -07:00
John Murret 90cd56c5c3
NET-4774 - replace usage of deprecated Envoy field match_subject_alt_names (#19954) 2023-12-22 18:34:44 +00:00
Nitya Dhanushkodi 9975b8bd73
[NET-5455] Allow disabling request and idle timeouts with negative values in service router and service resolver (#19992)
* add coverage for testing these timeouts
2023-12-19 15:36:07 -08:00
Derek Menteer dfab5ade50
Fix ClusterLoadAssignment timeouts dropping endpoints. (#19871)
When a large number of upstreams are configured on a single envoy
proxy, there was a chance that it would timeout when waiting for
ClusterLoadAssignments. While this doesn't always immediately cause
issues, consul-dataplane instances appear to consistently drop
endpoints from their configurations after an xDS connection is
re-established (the server dies, random disconnect, etc).

This commit adds an `xds_fetch_timeout_ms` config to service registrations
so that users can set the value higher for large instances that have
many upstreams. The timeout can be disabled by setting a value of `0`.

This configuration was introduced to reduce the risk of causing a
breaking change for users if there is ever a scenario where endpoints
would never be received. Rather than just always blocking indefinitely
or for a significantly longer period of time, this config will affect
only the service instance associated with it.
2023-12-11 09:25:11 -06:00
John Murret 780e91688d
Migrate remaining individual resource tests for service mesh to TestAllResourcesFromSnapshot (#19583)
* migrate expose checks and paths  tests to resources_test.go

* fix failing expose paths tests

* fix the way endpoint resources get created to make expose tests pass.

* remove endpoint resources that are already inlined on local_app clusters

* renaiming and comments

* migrate remaining service mesh tests to resources_test.go

* cleanup

* update proxystateconverter to skip ading alpn to clusters and listener filterto match v1 behavior
2023-11-09 20:08:37 +00:00
John Murret f5bf256425
Migrate individual resource tests for API Gateway to TestAllResourcesFromSnapshot (#19584)
migrate individual api gateway tests to resources_test.go
2023-11-09 17:01:54 +00:00
John Murret a94fa4c3ed
Migrate individual resource tests for Mesh Gateway to TestAllResourcesFromSnapshot (#19502)
migrate mesh-gateway tests to resources_test.go
2023-11-09 16:39:16 +00:00
John Murret 4aa95f3d1f
Migrate individual resource tests for Ingress Gateway to TestAllResourcesFromSnapshot (#19506)
migrate ingress-gateway tests to resources_test.go
2023-11-09 16:08:07 +00:00
John Murret 2553d6e8b9
Migrate individual resource tests for Terminating Gateway to TestAllResourcesFromSnapshot (#19505)
migrate terminating-gateway tests to resources_test.go
2023-11-09 08:38:33 -07:00
John Murret 903ff7fccb
Migrate individual resource tests for custom configuration to TestAllResourcesFromSnapshot (#19512)
* Configure TestAllResourcesFromSnapshot to run V2 tests

* migrate custom configuration tests to resources_test.go
2023-11-08 10:34:23 -07:00
John Murret 09f73d1abf
Migrate individual resource tests for expose paths and checks to TestAllResourcesFromSnapshot (#19513)
* migrate expose checks and paths  tests to resources_test.go

* fix failing expose paths tests
2023-11-08 14:24:27 +00:00
John Murret 7bc2581c81
Migrate individual resource tests for Discovery Chains to TestAllResourcesFromSnapshot (#19508)
migrate disco chain tests to resources_test.go
2023-11-08 01:34:42 +00:00
Andrew Stucki e414cbee4a
Use strict DNS for mesh gateways with hostnames (#19268)
* Use strict DNS for mesh gateways with hostnames

* Add changelog
2023-10-24 15:04:14 -04:00
Blake Covarrubias 019c62e1ba
xds: Use downstream protocol when connecting to local app (#18573)
Configure Envoy to use the same HTTP protocol version used by the
downstream caller when forwarding requests to a local application that
is configured with the protocol set to either `http2` or `grpc`.

This allows upstream applications that support both HTTP/1.1 and
HTTP/2 on a single port to receive requests using either protocol. This
is beneficial when the application primarily communicates using HTTP/2,
but also needs to support HTTP/1.1, such as to respond to Kubernetes
HTTP readiness/liveness probes.

Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
2023-09-19 14:32:28 -07:00
R.B. Boyer a69e901660
xds: update golden tests to be deterministic (#18707) 2023-09-11 11:40:19 -05:00
sarahalsmiller e235c8be3c
NET-5115 Add retry + timeout filters for api-gateway (#18324)
* squash, implement retry/timeout in consul core

* update tests
2023-08-08 16:39:46 -05:00
Dan Stough 284e3bdb54
[OSS] test: xds coverage for routes (#18369)
test: xds coverage for routes
2023-08-03 15:03:02 -04:00
Dan Stough 8e3a1ddeb6
[OSS] Improve xDS Code Coverage - Endpoints and Misc (#18222)
test: improve xDS endpoints code coverage
2023-07-21 17:48:25 -04:00
Dan Stough 2793761702
[OSS] Improve xDS Code Coverage - Clusters (#18165)
test: improve xDS cluster code coverage
2023-07-20 18:02:21 -04:00
Ronald 18bc04165c
Improve XDS test coverage: JWT auth edition (#18183)
* Improve XDS test coverage: JWT auth edition

more tests

* test: xds coverage for jwt listeners

---------

Co-authored-by: DanStough <dan.stough@hashicorp.com>
2023-07-19 17:19:00 -04:00
Dan Stough 33d898b857
[OSS] test: improve xDS listener code coverage (#18138)
test: improve xDS listener code coverage
2023-07-17 13:49:40 -04:00
Dan Stough d935c7b466
[OSS] gRPC Blocking Queries (#17426)
* feat: initial grpc blocking queries

* changelog and docs update
2023-05-23 17:29:10 -04:00
Connor 0789661ce5
Rename hcp-metrics-collector to consul-telemetry-collector (#17327)
* Rename hcp-metrics-collector to consul-telemetry-collector

* Fix docs

* Fix doc comment

---------

Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
2023-05-16 14:36:05 -04:00
Semir Patel 5eaeb7b8e5
Support Envoy's MaxEjectionPercent and BaseEjectionTime config entries for passive health checks (#15979)
* Add MaxEjectionPercent to config entry

* Add BaseEjectionTime to config entry

* Add MaxEjectionPercent and BaseEjectionTime to protobufs

* Add MaxEjectionPercent and BaseEjectionTime to api

* Fix integration test breakage

* Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream confings

* Website docs for MaxEjectionPercent and BaseEjection time

* Add `make docs` to browse docs at http://localhost:3000

* Changelog entry

* so that is the difference between consul-docker and dev-docker

* blah

* update proto funcs

* update proto

---------

Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>
2023-04-26 15:59:48 -07:00
Derek Menteer 2236975011
Change partition for peers in discovery chain targets (#16769)
This commit swaps the partition field to the local partition for
discovery chains targeting peers. Prior to this change, peer upstreams
would always use a value of default regardless of which partition they
exist in. This caused several issues in xds / proxycfg because of id
mismatches.

Some prior fixes were made to deal with one-off id mismatches that this
PR also cleans up, since they are no longer needed.
2023-03-24 15:40:19 -05:00
Andrew Stucki 501b87fd31
[API Gateway] Fix invalid cluster causing gateway programming delay (#16661)
* Add test for http routes

* Add fix

* Fix tests

* Add changelog entry

* Refactor and fix flaky tests
2023-03-17 13:31:04 -04:00
Eric Haberkorn 57e034b746
fix confusing spiffe ids in golden tests (#16643) 2023-03-15 14:30:36 -04:00
Ashvitha f95ffe0355
Allow HCP metrics collection for Envoy proxies
Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>

Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory
where a unix socket will be created with the name
`<namespace>_<proxy_id>.sock` to forward Envoy metrics.

If set, this will configure:
- In bootstrap configuration a local stats_sink and static cluster.
  These will forward metrics to a loopback listener sent over xDS.

- A dynamic listener listening at the socket path that the previously
  defined static cluster is sending metrics to.

- A dynamic cluster that will forward traffic received at this listener
  to the hcp-metrics-collector service.


Reasons for having a static cluster pointing at a dynamic listener:
- We want to secure the metrics stream using TLS, but the stats sink can
  only be defined in bootstrap config. With dynamic listeners/clusters
  we can use the proxy's leaf certificate issued by the Connect CA,
  which isn't available at bootstrap time.

- We want to intelligently route to the HCP collector. Configuring its
  addreess at bootstrap time limits our flexibility routing-wise. More
  on this below.

Reasons for defining the collector as an upstream in `proxycfg`:
- The HCP collector will be deployed as a mesh service.

- Certificate management is taken care of, as mentioned above.

- Service discovery and routing logic is automatically taken care of,
  meaning that no code changes are required in the xds package.

- Custom routing rules can be added for the collector using discovery
  chain config entries. Initially the collector is expected to be
  deployed to each admin partition, but in the future could be deployed
  centrally in the default partition. These config entries could even be
  managed by HCP itself.
2023-03-10 13:52:54 -07:00
Eric Haberkorn 595131fca9
Refactor the disco chain -> xds logic (#16392) 2023-02-23 11:32:32 -05:00
Andrew Stucki b3ddd4d24e
Inline API Gateway TLS cert code (#16295)
* Include secret type when building resources from config snapshot

* First pass at generating envoy secrets from api-gateway snapshot

* Update comments for xDS update order

* Add secret type + corresponding golden files to existing tests

* Initialize test helpers for testing api-gateway resource generation

* Generate golden files for new api-gateway xDS resource test

* Support ADS for TLS certificates on api-gateway

* Configure TLS on api-gateway listeners

* Inline TLS cert code

* update tests

* Add SNI support so we can have multiple certificates

* Remove commented out section from helper

* regen deep-copy

* Add tcp tls test

---------

Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
2023-02-17 12:46:03 -05:00
Eric Haberkorn 8d923c1789
Add the Lua Envoy extension (#15906) 2023-01-06 12:13:40 -05:00
cskh 04bf24c8c1
feat(ingress-gateway): support outlier detection of upstream service for ingress gateway (#15614)
* feat(ingress-gateway): support outlier detection of upstream service for ingress gateway

* changelog

Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
2022-12-13 11:51:37 -05:00
Derek Menteer 418bd62c44
Fix mesh gateway configuration with proxy-defaults (#15186)
* Fix mesh gateway proxy-defaults not affecting upstreams.

* Clarify distinction with upstream settings

Top-level mesh gateway mode in proxy-defaults and service-defaults gets
merged into NodeService.Proxy.MeshGateway, and only gets merged with
the mode attached to an an upstream in proxycfg/xds.

* Fix mgw mode usage for peered upstreams

There were a couple issues with how mgw mode was being handled for
peered upstreams.

For starters, mesh gateway mode from proxy-defaults
and the top-level of service-defaults gets stored in
NodeService.Proxy.MeshGateway, but the upstream watch for peered data
was only considering the mesh gateway config attached in
NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway
mode via global proxy-defaults or service-defaults on the downstream
would not have an effect.

Separately, transparent proxy watches for peered upstreams didn't
consider mesh gateway mode at all.

This commit addresses the first issue by ensuring that we overlay the
upstream config for peered upstreams as we do for non-peered. The second
issue is addressed by re-using setupWatchesForPeeredUpstream when
handling transparent proxy updates.

Note that for transparent proxies we do not yet support mesh gateway
mode per upstream, so the NodeService.Proxy.MeshGateway mode is used.

* Fix upstream mesh gateway mode handling in xds

This commit ensures that when determining the mesh gateway mode for
peered upstreams we consider the NodeService.Proxy.MeshGateway config as
a baseline.

In absense of this change, setting a mesh gateway mode via
proxy-defaults or the top-level of service-defaults will not have an
effect for peered upstreams.

* Merge service/proxy defaults in cfg resolver

Previously the mesh gateway mode for connect proxies would be
merged at three points:

1. On servers, in ComputeResolvedServiceConfig.
2. On clients, in MergeServiceConfig.
3. On clients, in proxycfg/xds.

The first merge returns a ServiceConfigResponse where there is a
top-level MeshGateway config from proxy/service-defaults, along with
per-upstream config.

The second merge combines per-upstream config specified at the service
instance with per-upstream config specified centrally.

The third merge combines the NodeService.Proxy.MeshGateway
config containing proxy/service-defaults data with the per-upstream
mode. This third merge is easy to miss, which led to peered upstreams
not considering the mesh gateway mode from proxy-defaults.

This commit removes the third merge, and ensures that all mesh gateway
config is available at the upstream. This way proxycfg/xds do not need
to do additional overlays.

* Ensure that proxy-defaults is considered in wc

Upstream defaults become a synthetic Upstream definition under a
wildcard key "*". Now that proxycfg/xds expect Upstream definitions to
have the final MeshGateway values, this commit ensures that values from
proxy-defaults/service-defaults are the default for this synthetic
upstream.

* Add changelog.

Co-authored-by: freddygv <freddy@hashicorp.com>
2022-11-09 10:14:29 -06:00
Eric Haberkorn 1bdad89026
fix bug that resulted in generating Envoy configs that use CDS with an EDS configuration (#15140) 2022-10-25 14:49:57 -04:00
Luke Kysow d3aa2bd9c5
ingress-gateways: don't log error when registering gateway (#15001)
* ingress-gateways: don't log error when registering gateway

Previously, when an ingress gateway was registered without a
corresponding ingress gateway config entry, an error was logged
because the watch on the config entry returned a nil result.
This is expected so don't log an error.
2022-10-25 10:55:44 -07:00
Kyle Havlovitz aaf892a383 Extend tcp keepalive settings to work for terminating gateways as well 2022-10-14 17:05:46 -07:00
Kyle Havlovitz 2c569f6b9c Update docs and add tcp_keepalive_probes setting 2022-10-14 17:05:46 -07:00
Kyle Havlovitz 2242d1ec4a Add TCP keepalive settings to proxy config for mesh gateways 2022-10-14 17:05:46 -07:00
DanStough 77ab28c5c7 feat: xDS updates for peerings control plane through mesh gw 2022-10-07 08:46:42 -06:00
Eric Haberkorn 1633cf20ea
Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic (#14817)
Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic
2022-10-06 09:54:14 -04:00
Freddy d9fe3578ac
Merge pull request #14734 from hashicorp/NET-643-update-mesh-gateway-envoy-config-for-inbound-peering-control-plane-traffic 2022-10-03 12:54:11 -06:00
freddygv b15d41534f Update xds generation for peering over mesh gws
This commit adds the xDS resources needed for INBOUND traffic from peer
clusters:

- 1 filter chain for all inbound peering requests.
- 1 cluster for all inbound peering requests.
- 1 endpoint per voting server with the gRPC TLS port configured.

There is one filter chain and cluster because unlike with WAN
federation, peer clusters will not attempt to dial individual servers.
Peer clusters will only dial the local mesh gateway addresses.
2022-10-03 12:42:27 -06:00
cskh 69f40df548
feat(ingress gateway: support configuring limits in ingress-gateway c… (#14749)
* feat(ingress gateway: support configuring limits in ingress-gateway config entry

- a new Defaults field with max_connections, max_pending_connections, max_requests
  is added to ingress gateway config entry
- new field max_connections, max_pending_connections, max_requests in
  individual services to overwrite the value in Default
- added unit test and integration test
- updated doc

Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
2022-09-28 14:56:46 -04:00
Eric Haberkorn 6570d5f004
Enable outbound peered requests to go through local mesh gateway (#14763) 2022-09-27 09:49:28 -04:00
Eric Haberkorn aa8268e50c
Implement Cluster Peering Redirects (#14445)
implement cluster peering redirects
2022-09-09 13:58:28 -04:00
malizz b3ac8f48ca
Add additional parameters to envoy passive health check config (#14238)
* draft commit

* add changelog, update test

* remove extra param

* fix test

* update type to account for nil value

* add test for custom passive health check

* update comments and tests

* update description in docs

* fix missing commas
2022-09-01 09:59:11 -07:00
Eric Haberkorn 3726a0ab7a
Finish up cluster peering failover (#14396) 2022-08-30 11:46:34 -04:00
Eric Haberkorn 72f90754ae
Update max_ejection_percent on outlier detection for peered clusters to 100% (#14373)
We can't trust health checks on peered services when service resolvers,
splitters and routers are used.
2022-08-29 13:46:41 -04:00
Eric Haberkorn ebd5513d4b
Refactor failover code to use Envoy's aggregate clusters (#14178) 2022-08-12 14:30:46 -04:00