consul

Commit Graph

Author	SHA1	Message	Date
Nathan Coleman	9af713ff17	[NET-5772] Make tcp external service registered on terminating gw reachable from peered cluster (#19881 ) * Include SNI + root PEMs from peered cluster on terminating gw filter chain This allows an external service registered on a terminating gateway to be exported to and reachable from a peered cluster * Abstract existing logic into re-usable function * Regenerate golden files w/ new listener logic * Add changelog entry * Use peering bundles that are stable across test runs	2024-04-03 12:38:09 -04:00
Derek Menteer	0ac8ae6c3b	Fix xDS deadlock due to syncLoop termination. (#20867 ) * Fix xDS deadlock due to syncLoop termination. This fixes an issue where agentless xDS streams can deadlock permanently until a server is restarted. When this issue occurs, no new proxies are able to successfully connect to the server. Effectively, the trigger for this deadlock stems from the following return statement: https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L199-L202 When this happens, the entire `syncLoop()` terminates and stops consuming from the following channel: https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L182-L192 Which results in the `ConfigSource.cleanup()` function never receiving a response and holding a mutex indefinitely: https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L241-L247 Because this mutex is shared, it effectively deadlocks the server's ability to process new xDS streams. ---- The fix to this issue involves removing the `chan chan struct{}` used like an RPC-over-channels pattern and replacing it with two distinct channels: + `stopSyncLoopCh` - indicates that the `syncLoop()` should terminate soon. + `syncLoopDoneCh` - indicates that the `syncLoop()` has terminated. Splitting these two concepts out and deferring a `close(syncLoopDoneCh)` in the `syncLoop()` function ensures that the deadlock above should no longer occur. We also now evict xDS connections of all proxies for the corresponding `syncLoop()` whenever it encounters an irrecoverable error. This is done by hoisting the new `syncLoopDoneCh` upwards so that it's visible to the xDS delta processing. Prior to this fix, the behavior was to simply orphan them so they would never receive catalog-registration or service-defaults updates. * Add changelog.	2024-03-15 13:57:11 -05:00
Derek Menteer	eabff257d7	Various bug-fixes and improvements (#20866 ) * Shuffle the list of servers returned by `pbserverdiscovery.WatchServers`. This randomizes the list of servers to help reduce the chance of clients all connecting to the same server simultaneously. Consul-dataplane is one such client that does not randomize its own list of servers. * Fix potential goroutine leak in xDS recv loop. This commit ensures that the goroutine which receives xDS messages from proxies will not block forever if the stream's context is cancelled but the `processDelta()` function never consumes the message (due to being terminated). * Add changelog.	2024-03-15 13:10:48 -05:00
sarahalsmiller	262f435800	NET-6821 Disable Terminating Gateway Auto Host Header Rewrite (#20802 ) * disable terminating gateway auto host rewrite * add changelog * clean up unneeded additional snapshot fields * add new field to docs * squash * fix test	2024-03-12 15:37:20 -05:00
skpratt	57bad0df85	add traffic permissions excludes and tests (#20453 ) * add traffic permissions tests * review fixes * Update internal/mesh/internal/controllers/sidecarproxy/builder/local_app.go Co-authored-by: John Landa <jonathanlanda@gmail.com> --------- Co-authored-by: John Landa <jonathanlanda@gmail.com>	2024-02-07 20:21:44 +00:00
Chris S. Kim	b6f10bc58f	Skip filter chain created by permissive mtls (#20406 )	2024-01-31 16:39:12 -05:00
Derek Menteer	3e8ec8d18e	Fix SAN matching on terminating gateways (#20417 ) Fixes issue: hashicorp/consul#20360 A regression was introduced in hashicorp/consul#19954 where the SAN validation matching was reduced from 4 potential types down to just the URI. Terminating gateways will need to match on many fields depending on user configuration, since they make egress calls outside of the cluster. Having more than one matcher behaves like an OR operation, where any match is sufficient to pass the certificate validation. To maintain backwards compatibility with the old untyped `match_subject_alt_names` Envoy behavior, we should match on all 4 enum types. https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/transport_sockets/tls/v3/common.proto#enum-extensions-transport-sockets-tls-v3-subjectaltnamematcher-santype	2024-01-31 12:17:45 -06:00
Matt Keeler	34a32d4ce5	Remove V2 PeerName field from pbresource.Tenancy (#19865 ) The peer name will eventually show up elsewhere in the resource. For now though this rips it out of where we don’t want it to be.	2024-01-29 15:08:31 -05:00
skpratt	0abf8f8426	Net 5092/internal l7 traffic permissions (#20276 ) * wire up L7 Traffic Permissions * testing * update comment	2024-01-23 20:07:58 -06:00
Lord-Y	758ddf84e9	Case sensitive route match (#19647 ) Add case insensitive param on service route match This commit adds in a new feature that allows service routers to specify that paths and path prefixes should ignore upper / lower casing when matching URLs. Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>	2024-01-22 09:23:24 -06:00
John Murret	7a410d7c5b	NET-6945 - Replace usage of deprecated Envoy field envoy.config.core.v3.HeaderValueOption.append (#20078 ) * NET-6945 - Replace usage of deprecated Envoy field envoy.config.core.v3.HeaderValueOption.append * update proto for v2 and then update xds v2 logic * add changelog * Update 20078.txt to be consistent with existing changelog entries * swap enum values to match Envoy.	2024-01-04 00:36:25 +00:00
John Murret	d925e4b812	NET-6946 / NET-6941 - Replace usage of deprecated Envoy fields envoy.config.route.v3.HeaderMatcher.safe_regex_match and envoy.type.matcher.v3.RegexMatcher.google_re2 (#20013 ) * NET-6946 - Replace usage of deprecated Envoy field envoy.config.route.v3.HeaderMatcher.safe_regex_match * removing unrelated changes * update golden files * do not set engine type	2024-01-03 09:53:39 -07:00
John Murret	2f335113f8	NET-6943 - Replace usage of deprecated Envoy field envoy.config.router.v3.WeightedCluster.total_weight. (#20011 )	2023-12-22 19:49:44 +00:00
John Murret	90cd56c5c3	NET-4774 - replace usage of deprecated Envoy field match_subject_alt_names (#19954 )	2023-12-22 18:34:44 +00:00
John Murret	21ea5c92fd	NET-6944 - Replace usage of deprecated Envoy field envoy.extensions.filters.http.lua.v3.Lua.inline_code (#20012 )	2023-12-22 17:20:41 +00:00
Nitya Dhanushkodi	9975b8bd73	[NET-5455] Allow disabling request and idle timeouts with negative values in service router and service resolver (#19992 ) * add coverage for testing these timeouts	2023-12-19 15:36:07 -08:00
Derek Menteer	dfab5ade50	Fix ClusterLoadAssignment timeouts dropping endpoints. (#19871 ) When a large number of upstreams are configured on a single envoy proxy, there was a chance that it would timeout when waiting for ClusterLoadAssignments. While this doesn't always immediately cause issues, consul-dataplane instances appear to consistently drop endpoints from their configurations after an xDS connection is re-established (the server dies, random disconnect, etc). This commit adds an `xds_fetch_timeout_ms` config to service registrations so that users can set the value higher for large instances that have many upstreams. The timeout can be disabled by setting a value of `0`. This configuration was introduced to reduce the risk of causing a breaking change for users if there is ever a scenario where endpoints would never be received. Rather than just always blocking indefinitely or for a significantly longer period of time, this config will affect only the service instance associated with it.	2023-12-11 09:25:11 -06:00
Derek Menteer	0ac958f27b	Fix xDS missing endpoint race condition. (#19866 ) This fixes the following race condition: - Send update endpoints - Send update cluster - Recv ACK endpoints - Recv ACK cluster Prior to this fix, it would have resulted in the endpoints NOT existing in Envoy. This occurred because the cluster update implicitly clears the endpoints in Envoy, but we would never re-send the endpoint data to compensate for the loss, because we would incorrectly ACK the invalid old endpoint hash. Since the endpoint's hash did not actually change, they would not be resent. The fix for this is to effectively clear out the invalid pending ACKs for child resources whenever the parent changes. This ensures that we do not store the child's hash as accepted when the race occurs. An escape-hatch environment variable `XDS_PROTOCOL_LEGACY_CHILD_RESEND` was added so that users can revert back to the old legacy behavior in the event that this produces unknown side-effects. Visit the following thread for some extra context on why certainty around these race conditions is difficult: https://github.com/envoyproxy/envoy/issues/13009 This bug report and fix was mostly implemented by @ksmiley with some minor tweaks. Co-authored-by: Keith Smiley <ksmiley@salesforce.com>	2023-12-08 11:37:12 -06:00
Thomas Eckert	8125a32a4e	Add CE version of Gateway Upstream Disambiguation (#19860 ) * Add CE version of gateway-upstream-disambiguation * Use NamespaceOrDefault and PartitionOrDefault * Add Changelog entry * Remove the unneeded reassignment * Use c.ID()	2023-12-07 17:56:14 -05:00
Dhia Ayachi	d93f7f730d	parse config protocol on write to optimize disco-chain compilation (#19829 ) * parse config protocol on write to optimize disco-chain compilation * add changelog	2023-12-07 13:46:46 -05:00
John Murret	780e91688d	Migrate remaining individual resource tests for service mesh to TestAllResourcesFromSnapshot (#19583 ) * migrate expose checks and paths tests to resources_test.go * fix failing expose paths tests * fix the way endpoint resources get created to make expose tests pass. * remove endpoint resources that are already inlined on local_app clusters * renaming and comments * migrate remaining service mesh tests to resources_test.go * cleanup * update proxystateconverter to skip adding alpn to clusters and listener filter to match v1 behavior	2023-11-09 20:08:37 +00:00
John Murret	f5bf256425	Migrate individual resource tests for API Gateway to TestAllResourcesFromSnapshot (#19584 ) migrate individual api gateway tests to resources_test.go	2023-11-09 17:01:54 +00:00
John Murret	a94fa4c3ed	Migrate individual resource tests for Mesh Gateway to TestAllResourcesFromSnapshot (#19502 ) migrate mesh-gateway tests to resources_test.go	2023-11-09 16:39:16 +00:00
John Murret	4aa95f3d1f	Migrate individual resource tests for Ingress Gateway to TestAllResourcesFromSnapshot (#19506 ) migrate ingress-gateway tests to resources_test.go	2023-11-09 16:08:07 +00:00
John Murret	2553d6e8b9	Migrate individual resource tests for Terminating Gateway to TestAllResourcesFromSnapshot (#19505 ) migrate terminating-gateway tests to resources_test.go	2023-11-09 08:38:33 -07:00
John Murret	7de0b45ba4	Fix xds v2 from creating envoy endpoint resources when already inlined in the cluster (#19580 ) * migrate expose checks and paths tests to resources_test.go * fix failing expose paths tests * fix the way endpoint resources get created to make expose tests pass. * wip * remove endpoint resources that are already inlined on local_app clusters * renaming and comments	2023-11-08 22:18:51 +00:00
John Murret	5aff19f9bc	Migrate individual resource tests for JWT Provider to TestAllResourcesFromSnapshot (#19511 ) migrate jwt provider tests to resources_test.go	2023-11-08 14:34:40 -07:00
John Murret	903ff7fccb	Migrate individual resource tests for custom configuration to TestAllResourcesFromSnapshot (#19512 ) * Configure TestAllResourcesFromSnapshot to run V2 tests * migrate custom configuration tests to resources_test.go	2023-11-08 10:34:23 -07:00
John Murret	09f73d1abf	Migrate individual resource tests for expose paths and checks to TestAllResourcesFromSnapshot (#19513 ) * migrate expose checks and paths tests to resources_test.go * fix failing expose paths tests	2023-11-08 14:24:27 +00:00
John Murret	7bc2581c81	Migrate individual resource tests for Discovery Chains to TestAllResourcesFromSnapshot (#19508 ) migrate disco chain tests to resources_test.go	2023-11-08 01:34:42 +00:00
John Murret	f115cdb1d5	NET-6385 - Static routes that are inlined in listener filters are also created as a resource. (#19459 ) * cover all protocols in local_app golden tests * fix xds tests * updating latest * fix broken test * add sorting of routers to TestBuildLocalApp to get rid of the flaking * cover all protocols in local_app golden tests * cover all protocols in local_app golden tests * cover all protocols in local_app golden tests * process envoy resource by walking the map. use a map rather than array for envoy resource to prevent duplication. * cleanup. doc strings. * update to latest * fix broken test * update tests after adding sorting of routers in local_app builder tests * do not make endpoints for local_app * fix catalog destinations only by creating clusters for any cluster not already created by walking the graph. * Configure TestAllResourcesFromSnapshot to run V2 tests * wip * fix processing of failover groups * add endpoints and clusters for any clusters that were not created from walking the listener -> path * fix xds v2 golden files for clusters to include failover group clusters	2023-11-07 08:00:08 -07:00
John Murret	74daaa5043	XDS V1 should not make runs for TCP Disco Chains. (#19496 ) * XDS V1 should not make runs for TCP Disco Chains. * update TestEnvoyExtenderWithSnapshot	2023-11-03 14:53:17 -06:00
John Murret	f0cf8f2f40	NET-6294 - v1 Agentless proxycfg datasource errors after v2 changes (#19365 )	2023-10-27 14:06:38 -06:00
Michael Zalimeni	a7803bd829	[NET-6305] xds: Ensure v2 route match and protocol are populated for gRPC (#19343 ) * xds: Ensure v2 route match is populated for gRPC Similar to HTTP, ensure that route match config (which is required by Envoy) is populated when default values are used. Because the default matches generated for gRPC contain a single empty `GRPCRouteMatch`, and that proto does not directly support prefix-based config, an interpretation of the empty struct is needed to generate the same output that the `HTTPRouteMatch` is explicitly configured to provide in internal/mesh/internal/controllers/routes/generate.go. * xds: Ensure protocol set for gRPC resources Add explicit protocol in `ProxyStateTemplate` builders and validate it is always set on clusters. This ensures that HTTP filters and `http2_protocol_options` are populated in all the necessary places for gRPC traffic and prevents future unintended omissions of non-TCP protocols. Co-authored-by: John Murret <john.murret@hashicorp.com> --------- Co-authored-by: John Murret <john.murret@hashicorp.com>	2023-10-25 17:43:58 +00:00
Andrew Stucki	e414cbee4a	Use strict DNS for mesh gateways with hostnames (#19268 ) * Use strict DNS for mesh gateways with hostnames * Add changelog	2023-10-24 15:04:14 -04:00
Michael Zalimeni	5e517c5980	[NET-6221] Ensure LB policy set for locality-aware routing (CE) (#19283 ) Ensure LB policy set for locality-aware routing (CE) `overprovisioningFactor` should be overridden with the expected value (100,000) when there are multiple endpoint groups. Update code and tests to enforce this. This is an Enterprise feature. This commit represents the CE portions of the change; tests are added in the corresponding `consul-enterprise` change.	2023-10-19 10:13:27 -04:00
John Maguire	b78465b491	[NET-5810] CE changes for multiple virtual hosts (#19246 ) CE changes for multiple virtual hosts	2023-10-17 15:08:04 +00:00
Thomas Eckert	76c60fdfac	Golden File Tests for TermGW w/ Cluster Peering (#19096 ) Add intention to create golden file for terminating gateway peered trust bundle	2023-10-13 11:56:58 -04:00
Nitya Dhanushkodi	95d9b2c7e4	[NET-4931] xdsv2, sidecarproxycontroller, l4 trafficpermissions: support L7 (#19185 ) * xdsv2: support l7 by adding xfcc policy/headers, tweaking routes, and make a bunch of listeners l7 tests pass * sidecarproxycontroller: add l7 local app support * trafficpermissions: make l4 traffic permissions work on l7 workloads * rename route name field for consistency with l4 cluster name field * resolve conflicts and rebase * fix: ensure route name is used in l7 destination route name as well. previously it was only in the route names themselves, now the route name and l7 destination route name line up	2023-10-12 23:45:45 +00:00
John Maguire	7a323c492b	[NET-5457] Golden Files for Multiple Virtual Hosts (#19131 ) * Add new golden file tests * Update with latest deterministic code	2023-10-11 18:11:29 +00:00
John Maguire	8bebfc147d	[NET-5457] Fix CE code for jwt multiple virtual hosts bug (#19123 ) * Fix CE code for jwt multiple virtual hosts bug * Fix struct definition * fix bug with always appending route to jwt config * Update comment to be correct * Update comment	2023-10-10 16:25:36 -04:00
Chris S. Kim	92ce814693	Remove old build tags (#19128 )	2023-10-10 10:58:06 -04:00
Thomas Eckert	342306c312	Allow connections through Terminating Gateways from peered clusters NET-3463 (#18959 ) * Add InboundPeerTrustBundle maps to Terminating Gateway * Add notify and cancelation of watch for inbound peer trust bundles * Pass peer trust bundles to the RBAC creation function * Regenerate Golden Files * add changelog, also adds another spot that needed peeredTrustBundles * Add basic test for terminating gateway with peer trust bundle * Add intention to cluster peered golden test * rerun codegen * update changelog * really update the changelog --------- Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com>	2023-10-05 21:54:23 +00:00
Eric Haberkorn	f2b7b4591a	Fix Traffic Permissions Default Deny (#19028 ) Whenever a traffic permission exists for a given workload identity, turn on default deny. Previously, this only worked at the port level.	2023-10-04 09:58:28 -04:00
sarahalsmiller	9addd9ed7c	[NET-5788] Fix needed for JWTAuth in Consul Enterprise (#19038 ) change needed for fix in consul-enterprise	2023-10-03 09:48:50 -05:00
Nitya Dhanushkodi	9a48266712	remove log (#19029 )	2023-09-29 16:11:50 -07:00
Eric Haberkorn	7ce6ebaeb3	Handle Traffic Permissions With Empty Sources Properly (#19024 ) Fix issues with empty sources * Validate that each permission on traffic permissions resources has at least one source. * Don't construct RBAC policies when there aren't any principals. This resulted in Envoy rejecting xDS updates with a validation error. ``` error= \| rpc error: code = Internal desc = Error adding/updating listener(s) public_listener: Proto constraint validation failed (RBACValidationError.Rules: embedded message failed validation \| caused by RBACValidationError.Policies[consul-intentions-layer4-1]: embedded message failed validation \| caused by PolicyValidationError.Principals: value must contain at least 1 item(s)): rules { ```	2023-09-28 15:11:59 -04:00
Iryna Shustava	e6b724d062	catalog,mesh,auth: Move resource types to the proto-public module (#18935 )	2023-09-22 15:50:56 -06:00
Iryna Shustava	d88888ee8b	catalog,mesh,auth: Bump versions to v2beta1 (#18930 )	2023-09-22 10:51:15 -06:00
Nitya Dhanushkodi	0a11499588	net-5689 fix disabling panic threshold logic (#18958 )	2023-09-21 15:52:30 -07:00
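
The two-channel shutdown described in commit 0ac8ae6c3b can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Consul's actual code: only the names `stopSyncLoopCh`, `syncLoopDoneCh`, `syncLoop()`, and `cleanup()` come from the commit message, while the `syncer` type, the `work` channel, the `process` helper, and the `stopOnce` guard are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// syncer is a hypothetical stand-in for the component that owns syncLoop().
type syncer struct {
	stopSyncLoopCh chan struct{} // closed to signal that syncLoop() should terminate soon
	syncLoopDoneCh chan struct{} // closed by syncLoop() once it has terminated
	stopOnce       sync.Once     // hypothetical guard so cleanup() may be called repeatedly
	work           chan int      // hypothetical stream of updates to process
}

func newSyncer() *syncer {
	s := &syncer{
		stopSyncLoopCh: make(chan struct{}),
		syncLoopDoneCh: make(chan struct{}),
		work:           make(chan int),
	}
	go s.syncLoop()
	return s
}

// process is a hypothetical unit of work; negative input models an
// irrecoverable error.
func process(n int) error {
	if n < 0 {
		return errors.New("irrecoverable error")
	}
	return nil
}

func (s *syncer) syncLoop() {
	// The deferred close runs on every exit path, including the early
	// return on error, so waiters are always released.
	defer close(s.syncLoopDoneCh)
	for {
		select {
		case <-s.stopSyncLoopCh:
			return
		case n := <-s.work:
			if err := process(n); err != nil {
				return // the loop exits on its own
			}
		}
	}
}

// cleanup asks the loop to stop and waits for it. Because syncLoopDoneCh
// is closed on any exit, the wait cannot block forever even if the loop
// already terminated because of an error.
func (s *syncer) cleanup() {
	s.stopOnce.Do(func() { close(s.stopSyncLoopCh) })
	<-s.syncLoopDoneCh
}

func main() {
	s := newSyncer()
	s.work <- -1       // trigger the error path so the loop exits by itself
	<-s.syncLoopDoneCh // other observers can also watch for termination
	s.cleanup()        // returns promptly instead of deadlocking
	fmt.Println("cleanup returned")
}
```

In this sketch, any goroutine that can see `syncLoopDoneCh` learns that the loop has terminated, which mirrors the commit's note about hoisting the channel so that xDS delta processing can react to the termination.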


591 Commits