consul

Commit Graph

Author	SHA1	Message	Date
Derek Menteer	0ac8ae6c3b	Fix xDS deadlock due to syncLoop termination. (#20867 ) * Fix xDS deadlock due to syncLoop termination. This fixes an issue where agentless xDS streams can deadlock permanently until a server is restarted. When this issue occurs, no new proxies are able to successfully connect to the server. Effectively, the trigger for this deadlock stems from the following return statement: https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L199-L202 When this happens, the entire `syncLoop()` terminates and stops consuming from the following channel: https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L182-L192 Which results in the `ConfigSource.cleanup()` function never receiving a response and holding a mutex indefinitely: https://github.com/hashicorp/consul/blob/v1.18.0/agent/proxycfg-sources/catalog/config_source.go#L241-L247 Because this mutex is shared, it effectively deadlocks the server's ability to process new xDS streams. ---- The fix to this issue involves removing the `chan chan struct{}` used like an RPC-over-channels pattern and replacing it with two distinct channels: + `stopSyncLoopCh` - indicates that the `syncLoop()` should terminate soon. + `syncLoopDoneCh` - indicates that the `syncLoop()` has terminated. Splitting these two concepts out and deferring a `close(syncLoopDoneCh)` in the `syncLoop()` function ensures that the deadlock above should no longer occur. We also now evict xDS connections of all proxies for the corresponding `syncLoop()` whenever it encounters an irrecoverable error. This is done by hoisting the new `syncLoopDoneCh` upwards so that it's visible to the xDS delta processing. Prior to this fix, the behavior was to simply orphan them so they would never receive catalog-registration or service-defaults updates. * Add changelog.	2024-03-15 13:57:11 -05:00
sarahalsmiller	262f435800	NET-6821 Disable Terminating Gateway Auto Host Header Rewrite (#20802 ) * disable terminating gateway auto host rewrite * add changelog * clean up unneeded additional snapshot fields * add new field to docs * squash * fix test	2024-03-12 15:37:20 -05:00
Dhia Ayachi	d641998641	Fix to not create a watch to `Internal.ServiceDump` when mesh gateway is not used (#20168 ) This adds a fix to properly verify the gateway mode before creating a watch specific to mesh gateways. This watch has a high performance cost and is not needed when mesh gateways are not used. This also adds an optimization to only return the nodes when watching the Internal.ServiceDump RPC to avoid unnecessary disco chain compilation, as watches in proxy config only need the nodes.	2024-01-18 16:44:53 -06:00
Nitya Dhanushkodi	9975b8bd73	[NET-5455] Allow disabling request and idle timeouts with negative values in service router and service resolver (#19992 ) * add coverage for testing these timeouts	2023-12-19 15:36:07 -08:00
Derek Menteer	dfab5ade50	Fix ClusterLoadAssignment timeouts dropping endpoints. (#19871 ) When a large number of upstreams are configured on a single envoy proxy, there was a chance that it would timeout when waiting for ClusterLoadAssignments. While this doesn't always immediately cause issues, consul-dataplane instances appear to consistently drop endpoints from their configurations after an xDS connection is re-established (the server dies, random disconnect, etc). This commit adds an `xds_fetch_timeout_ms` config to service registrations so that users can set the value higher for large instances that have many upstreams. The timeout can be disabled by setting a value of `0`. This configuration was introduced to reduce the risk of causing a breaking change for users if there is ever a scenario where endpoints would never be received. Rather than just always blocking indefinitely or for a significantly longer period of time, this config will affect only the service instance associated with it.	2023-12-11 09:25:11 -06:00
Dhia Ayachi	d93f7f730d	parse config protocol on write to optimize disco-chain compilation (#19829 ) * parse config protocol on write to optimize disco-chain compilation * add changelog	2023-12-07 13:46:46 -05:00
John Murret	4aa95f3d1f	Migrate individual resource tests for Ingress Gateway to TestAllResourcesFromSnapshot (#19506 ) migrate ingress-gateway tests to resources_test.go	2023-11-09 16:08:07 +00:00
John Murret	caaff73337	add DeliverLatest as common function for use by Manager and ProxyTracker Open (#19564 ) Open add DeliverLatest as common function for use by Manager and ProxyTracker	2023-11-07 23:03:37 +00:00
Chris S. Kim	92ce814693	Remove old build tags (#19128 )	2023-10-10 10:58:06 -04:00
Thomas Eckert	342306c312	Allow connections through Terminating Gateways from peered clusters NET-3463 (#18959 ) * Add InboundPeerTrustBundle maps to Terminating Gateway * Add notify and cancelation of watch for inbound peer trust bundles * Pass peer trust bundles to the RBAC creation function * Regenerate Golden Files * add changelog, also adds another spot that needed peeredTrustBundles * Add basic test for terminating gateway with peer trust bundle * Add intention to cluster peered golden test * rerun codegen * update changelog * really update the changelog --------- Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com>	2023-10-05 21:54:23 +00:00
Dhia Ayachi	b1688ad856	Run copyright after running deep-copy as part of the Makefile/CI (#18741 ) * execute copyright headers after performing deep-copy generation. * fix copyright install * Apply suggestions from code review Co-authored-by: Semir Patel <semir.patel@hashicorp.com> * Apply suggestions from code review Co-authored-by: Semir Patel <semir.patel@hashicorp.com> * rename steps to match codegen naming * remove copywrite install category --------- Co-authored-by: Semir Patel <semir.patel@hashicorp.com>	2023-09-11 13:50:52 -04:00
Derek Menteer	a698142325	Add extra logging for mesh health endpoints. (#18647 )	2023-09-01 12:29:09 -05:00
John Maguire	9876923e23	Add the plumbing for APIGW JWT work (#18609 ) * Add the plumbing for APIGW JWT work * Remove unneeded import * Add deep equal function for HTTPMatch * Added plumbing for status conditions * Remove unneeded comment * Fix comments * Add calls in xds listener for apigateway to setup listener jwt auth	2023-08-31 12:23:59 -04:00
Ashwin Venkatesh	797e42dc24	Watch the ProxyTracker from xDS controller (#18611 )	2023-08-29 14:39:29 -07:00
John Murret	0e606504bc	NET-4944 - wire up controllers with proxy tracker (#18603 ) Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>	2023-08-29 09:15:34 -06:00
John Murret	051f250edb	NET-5338 - NET-5338 - Run a v2 mode xds server (#18579 ) * NET-5338 - NET-5338 - Run a v2 mode xds server * fix linting	2023-08-24 16:44:14 -06:00
John Maguire	59ab57f350	NET-5147: Added placeholder structs for JWT functionality (#18575 ) * Added placeholder structs for JWT functionality * Added watches for CE vs ENT * Add license header * Undo plumbing work * Add context arg	2023-08-24 15:07:14 -04:00
Semir Patel	53e28a4963	OSS -> CE (community edition) changes (#18517 )	2023-08-22 09:46:03 -05:00
hashicorp-copywrite[bot]	5fb9df1640	[COMPLIANCE] License changes (#18443 ) * Adding explicit MPL license for sub-package This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository. * Adding explicit MPL license for sub-package This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository. * Updating the license from MPL to Business Source License Going forward, this project will be licensed under the Business Source License v1.1. Please see our blog post for more details at <Blog URL>, FAQ at www.hashicorp.com/licensing-faq, and details of the license at www.hashicorp.com/bsl. * add missing license headers * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 * Update copyright file headers to BUSL-1.1 --------- Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>	2023-08-11 09:12:13 -04:00
Dan Stough	284e3bdb54	[OSS] test: xds coverage for routes (#18369 ) test: xds coverage for routes	2023-08-03 15:03:02 -04:00
cui fliter	18a5edd232	docs: Fix some comments (#17118 ) Signed-off-by: cui fliter <imcusg@gmail.com>	2023-07-31 10:56:09 -07:00
Nathan Coleman	5caa0ae3f5	api-gateway: subscribe to bound-api-gateway only after receiving api-gateway (#18291 ) * api-gateway: subscribe to bound-api-gateway only after receiving api-gateway This fixes a race condition due to our dependency on having the listener(s) from the api-gateway config entry in order to fully and properly process the resources on the bound-api-gateway config entry. * Apply suggestions from code review * Add changelog entry	2023-07-26 16:02:04 -04:00
Dan Stough	8e3a1ddeb6	[OSS] Improve xDS Code Coverage - Endpoints and Misc (#18222 ) test: improve xDS endpoints code coverage	2023-07-21 17:48:25 -04:00
Dan Stough	2793761702	[OSS] Improve xDS Code Coverage - Clusters (#18165 ) test: improve xDS cluster code coverage	2023-07-20 18:02:21 -04:00
Dan Stough	33d898b857	[OSS] test: improve xDS listener code coverage (#18138 ) test: improve xDS listener code coverage	2023-07-17 13:49:40 -04:00
Dan Stough	1b08626358	[OSS] Fix initial_fetch_timeout to wait for all xDS resources (#18024 ) * fix(connect): set initial_fetch_time to wait indefinitely * changelog * PR feedback 1	2023-07-10 17:08:06 -04:00
Ronald	80394278b8	Expose JWKS cluster config through JWTProviderConfigEntry (#17978 ) * Expose JWKS cluster config through JWTProviderConfigEntry * fix typos, rename trustedCa to trustedCA	2023-07-04 09:12:06 -04:00
Eric Haberkorn	a3ba559149	Make locality aware routing xDS changes (#17826 )	2023-06-21 12:39:53 -04:00
R.B. Boyer	72f991d8d3	agent: remove agent cache dependency from service mesh leaf certificate management (#17075 ) * agent: remove agent cache dependency from service mesh leaf certificate management This extracts the leaf cert management from within the agent cache. This code was produced by the following process: 1. All tests in agent/cache, agent/cache-types, agent/auto-config, agent/consul/servercert were run at each stage. - The tests in agent matching .Leaf were run at each stage. - The tests in agent/leafcert were run at each stage after they existed. 2. The former leaf cert Fetch implementation was extracted into a new package behind a "fake RPC" endpoint to make it look almost like all other cache type internals. 3. The old cache type was shimmed to use the fake RPC endpoint and generally cleaned up. 4. I selectively duplicated all of Get/Notify/NotifyCallback/Prepopulate from the agent/cache.Cache implementation over into the new package. This was renamed as leafcert.Manager. - Code that was irrelevant to the leaf cert type was deleted (inlining blocking=true, refresh=false) 5. Everything that used the leaf cert cache type (including proxycfg stuff) was shifted to use the leafcert.Manager instead. 6. agent/cache-types tests were moved and gently replumbed to execute as-is against a leafcert.Manager. 7. Inspired by some of the locking changes from derek's branch I split the fat lock into N+1 locks. 8. The waiter chan struct{} was eventually replaced with a singleflight.Group around cache updates, which was likely the biggest net structural change. 9. The awkward two layers or logic produced as a byproduct of marrying the agent cache management code with the leaf cert type code was slowly coalesced and flattened to remove confusion. 10. The .Leaf tests from the agent package were copied and made to work directly against a leafcert.Manager to increase direct coverage. I have done a best effort attempt to port the previous leaf-cert cache type's tests over in spirit, as well as to take the e2e-ish tests in the agent package with Leaf in the test name and copy those into the agent/leafcert package to get more direct coverage, rather than coverage tangled up in the agent logic. There is no net-new test coverage, just coverage that was pushed around from elsewhere.	2023-06-13 10:54:45 -05:00
R.B. Boyer	ec347ef01d	sort some imports that are wonky between oss and ent (#17637 )	2023-06-09 11:30:56 -05:00
Andrew Stucki	9a4f503b2b	[API Gateway] Fix trust domain for external peered services in synthesis code (#17609 ) * [API Gateway] Fix trust domain for external peered services in synthesis code * Add changelog	2023-06-08 12:18:17 -04:00
Michael Zalimeni	ad03a5d0f2	Avoid panic applying TProxy Envoy extensions (#17537 ) When UpstreamEnvoyExtender was introduced, some code was left duplicated between it and BasicEnvoyExtender. One path in that code panics when a TProxy listener patch is attempted due to no upstream data in RuntimeConfig matching the local service (which would only happen in rare cases). Instead, we can remove the special handling of upstream VIPs from BasicEnvoyExtender entirely, greatly simplifying the listener filter patch code and avoiding the panic. UpstreamEnvoyExtender, which needs this code to function, is modified to ensure a panic does not occur. This also fixes a second regression in which the Lua extension was not applied to TProxy outbound listeners.	2023-06-01 13:04:39 -04:00
sarahalsmiller	b147323fb0	xds: Remove APIGateway ToIngress function (#17453 ) * xds generation for routes api gateway * Update gateway.go * move buildHttpRoute into xds package * Update agent/consul/discoverychain/gateway.go * remove unneeded function * convert http route code to only run for http protocol to future proof code path * Update agent/consul/discoverychain/gateway.go Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com> * fix tests, clean up http check logic * clean up todo * Fix casing in docstring * Fix import block, adjust docstrings * Rename func * Consolidate docstring onto single line * Remove ToIngress() conversion for APIGW, which generates its own xDS now * update name and comment * use constant value * use constant * rename readyUpstreams to readyListeners to better communicate what that function is doing --------- Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>	2023-05-25 15:16:37 +00:00
Dan Stough	d935c7b466	[OSS] gRPC Blocking Queries (#17426 ) * feat: initial grpc blocking queries * changelog and docs update	2023-05-23 17:29:10 -04:00
sarahalsmiller	e2a81aa8bd	xds: generate listeners directly from API gateway snapshot (#17398 ) * API Gateway XDS Primitives, endpoints and clusters (#17002) * XDS primitive generation for endpoints and clusters Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * server_test * deleted extra file * add missing parents to test --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Routes for API Gateway (#17158) * XDS primitive generation for endpoints and clusters Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * server_test * deleted extra file * add missing parents to test * checkpoint * delete extra file * httproute flattening code * linting issue * so close on this, calling for tonight * unit test passing * add in header manip to virtual host * upstream rebuild commented out * Use consistent upstream name whether or not we're rebuilding * Start working through route naming logic * Fix typos in test descriptions * Simplify route naming logic * Simplify RebuildHTTPRouteUpstream * Merge additional compiled discovery chains instead of overwriting * Use correct chain for flattened route, clean up + add TODOs * Remove empty conditional branch * Restore previous variable declaration Limit the scope of this PR * Clean up, improve TODO * add logging, clean up todos * clean up function --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * checkpoint, skeleton, tests not passing * checkpoint * endpoints xds cluster configuration * resources test fix * fix reversion in resources_test * checkpoint * Update agent/proxycfg/api_gateway.go Co-authored-by: John Maguire <john.maguire@hashicorp.com> * unit tests passing * gofmt * add deterministic sorting to appease the unit test gods * remove panic * Find ready upstream matching listener instead of first in list * Clean up, improve TODO * Modify getReadyUpstreams to filter upstreams by listener (#17410) Each listener would previously have all upstreams from any route that bound to the listener. This is problematic when a route bound to one listener also binds to other listeners and so includes upstreams for multiple listeners. The list for a given listener would then wind up including upstreams for other listeners. * clean up todos, references to api gateway in listeners_ingress * merge in Nathan's fix * Update agent/consul/discoverychain/gateway.go * cleanup current todos, remove snapshot manipulation from generation code * Update agent/structs/config_entry_gateways.go Co-authored-by: Thomas Eckert <teckert@hashicorp.com> * Update agent/consul/discoverychain/gateway.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Update agent/consul/discoverychain/gateway.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Update agent/proxycfg/snapshot.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * clarified header comment for FlattenHTTPRoute, changed RebuildHTTPRouteUpstream to BuildHTTPRouteUpstream * simplify cert logic * Delete scratch * revert route related changes in listener PR * Update agent/consul/discoverychain/gateway.go * Update agent/proxycfg/snapshot.go * clean up uneeded extra lines in endpoints --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: John Maguire <john.maguire@hashicorp.com> Co-authored-by: Thomas Eckert <teckert@hashicorp.com>	2023-05-22 17:36:29 -04:00
Ronald	113202d541	JWT Authentication with service intentions: xds package update (#17414 ) * JWT Authentication with service intentions: update xds package to translate config to envoy	2023-05-19 18:14:16 -04:00
sarahalsmiller	134aac7c26	xds: generate endpoints directly from API gateway snapshot (#17390 ) * endpoints xds cluster configuration * resources test fix * fix reversion in resources_test * Update agent/proxycfg/api_gateway.go Co-authored-by: John Maguire <john.maguire@hashicorp.com> * gofmt * Modify getReadyUpstreams to filter upstreams by listener (#17410) Each listener would previously have all upstreams from any route that bound to the listener. This is problematic when a route bound to one listener also binds to other listeners and so includes upstreams for multiple listeners. The list for a given listener would then wind up including upstreams for other listeners. * Update agent/proxycfg/api_gateway.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Restore import blocking * Skip to next route if route has no upstreams * cleanup * change set from bool to empty struct --------- Co-authored-by: John Maguire <john.maguire@hashicorp.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>	2023-05-19 18:50:59 +00:00
Kyle Havlovitz	2904d0a431	Pull virtual IPs for filter chains from discovery chains (#17375 )	2023-05-17 11:18:39 -07:00
Connor	0789661ce5	Rename hcp-metrics-collector to consul-telemetry-collector (#17327 ) * Rename hcp-metrics-collector to consul-telemetry-collector * Fix docs * Fix doc comment --------- Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>	2023-05-16 14:36:05 -04:00
Freddy	7c3e9cd862	Hash namespace+proxy ID when creating socket path (#17204 ) UNIX domain socket paths are limited to 104-108 characters, depending on the OS. This limit was quite easy to exceed when testing the feature on Kubernetes, due to how proxy IDs encode the Pod ID eg: metrics-collector-59467bcb9b-fkkzl-hcp-metrics-collector-sidecar-proxy To ensure we stay under that character limit this commit makes a couple changes: - Use a b64 encoded SHA1 hash of the namespace + proxy ID to create a short and deterministic socket file name. - Add validation to proxy registrations and proxy-defaults to enforce a limit on the socket directory length.	2023-05-09 12:20:26 -06:00
Derek Menteer	4f6da20fe5	Fix multiple issues related to proxycfg health queries. (#17241 ) Fix multiple issues related to proxycfg health queries. 1. The datacenter was not being provided to a proxycfg query, which resulted in bypassing agentless query optimizations and using the normal API instead. 2. The health rpc endpoint would return a zero index when insufficient ACLs were detected. This would result in the agent cache performing an infinite loop of queries in rapid succession without backoff.	2023-05-09 12:37:58 -05:00
Semir Patel	5eaeb7b8e5	Support Envoy's MaxEjectionPercent and BaseEjectionTime config entries for passive health checks (#15979 ) * Add MaxEjectionPercent to config entry * Add BaseEjectionTime to config entry * Add MaxEjectionPercent and BaseEjectionTime to protobufs * Add MaxEjectionPercent and BaseEjectionTime to api * Fix integration test breakage * Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream configs * Website docs for MaxEjectionPercent and BaseEjectionTime * Add `make docs` to browse docs at http://localhost:3000 * Changelog entry * so that is the difference between consul-docker and dev-docker * blah * update proto funcs * update proto --------- Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>	2023-04-26 15:59:48 -07:00
Eric Haberkorn	b1fae05983	Add sameness groups to service intentions. (#17064 )	2023-04-20 12:16:04 -04:00
Eric Haberkorn	44b39240a8	move enterprise test cases out of open source (#16985 )	2023-04-13 09:07:06 -04:00
cskh	a319953576	docs: add envoy to the proxycfg diagram (#16834 ) * docs: add envoy to the proxycfg diagram	2023-04-04 09:42:42 -04:00
Eric Haberkorn	a6d69adcf5	Add default resolvers to disco chains based on the default sameness group (#16837 )	2023-03-31 14:35:56 -04:00
Eric Haberkorn	0d1d2fc4c9	add order by locality failover to Consul enterprise (#16791 )	2023-03-30 10:08:38 -04:00
Ronald	94ec4eb2f4	copyright headers for agent folder (#16704 ) * copyright headers for agent folder * Ignore test data files * fix proto files and remove headers in agent/uiserver folder * ignore deep-copy files	2023-03-28 14:39:22 -04:00
Derek Menteer	2236975011	Change partition for peers in discovery chain targets (#16769 ) This commit swaps the partition field to the local partition for discovery chains targeting peers. Prior to this change, peer upstreams would always use a value of default regardless of which partition they exist in. This caused several issues in xds / proxycfg because of id mismatches. Some prior fixes were made to deal with one-off id mismatches that this PR also cleans up, since they are no longer needed.	2023-03-24 15:40:19 -05:00
Eric Haberkorn	495ad4c7ef	add enterprise xds tests (#16738 )	2023-03-22 14:56:18 -04:00
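
Commit 0ac8ae6c3b above describes replacing a `chan chan struct{}` handshake with two channels, `stopSyncLoopCh` and `syncLoopDoneCh`, and deferring `close(syncLoopDoneCh)` inside `syncLoop()`. The following is a minimal sketch of that pattern, not the Consul implementation: the `syncer` type, the `work` channel, and the `earlyExitThenStop` helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// syncer is a hypothetical stand-in for the state that owns a sync loop.
type syncer struct {
	stopSyncLoopCh chan struct{} // closed to ask syncLoop to terminate soon
	syncLoopDoneCh chan struct{} // closed by syncLoop once it has terminated
	stopOnce       sync.Once
}

func newSyncer() *syncer {
	return &syncer{
		stopSyncLoopCh: make(chan struct{}),
		syncLoopDoneCh: make(chan struct{}),
	}
}

// syncLoop handles work items until it is asked to stop or an item fails.
func (s *syncer) syncLoop(work <-chan error) {
	// The deferred close runs on every exit path, including the early
	// return on error, so waiters are always released.
	defer close(s.syncLoopDoneCh)
	for {
		select {
		case <-s.stopSyncLoopCh:
			return
		case err := <-work:
			if err != nil {
				return // treated as irrecoverable in this sketch
			}
		}
	}
}

// stop requests termination and waits for it. It cannot block forever
// when syncLoop has already exited, because syncLoopDoneCh is closed.
func (s *syncer) stop() {
	s.stopOnce.Do(func() { close(s.stopSyncLoopCh) })
	<-s.syncLoopDoneCh
}

// earlyExitThenStop makes syncLoop exit on an error, then reports whether
// stop() still returns within one second.
func earlyExitThenStop() bool {
	s := newSyncer()
	work := make(chan error)
	go s.syncLoop(work)
	work <- errors.New("irrecoverable") // syncLoop returns on its own

	stopped := make(chan struct{})
	go func() {
		s.stop()
		close(stopped)
	}()
	select {
	case <-stopped:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("stop returned after early exit:", earlyExitThenStop())
}
```

Because a receive on a closed channel never blocks, any number of callers (including, per the commit message, code that evicts xDS connections) can wait on `syncLoopDoneCh` without coordinating with the loop itself.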


366 Commits
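
Commit 7c3e9cd862 in the table above shortens UNIX domain socket paths by naming the socket file after a base64-encoded SHA1 hash of the namespace and proxy ID. A rough sketch of that idea follows, under stated assumptions: the function names, the `/` separator between namespace and proxy ID, the URL-safe unpadded base64 alphabet, and the `.sock` suffix are all illustrative choices, not details taken from the Consul source.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"path/filepath"
)

// socketFileName derives a short, deterministic file name from the
// namespace and proxy ID. The separator and suffix are assumptions.
func socketFileName(namespace, proxyID string) string {
	sum := sha1.Sum([]byte(namespace + "/" + proxyID))
	// The URL-safe alphabet avoids '/' characters, which are not valid
	// inside a file name. A 20-byte digest encodes to 27 characters.
	return base64.RawURLEncoding.EncodeToString(sum[:]) + ".sock"
}

// socketPath joins the (hypothetical) socket directory and the hashed name.
func socketPath(dir, namespace, proxyID string) string {
	return filepath.Join(dir, socketFileName(namespace, proxyID))
}

func main() {
	// The long proxy ID is the example quoted in the commit message.
	id := "metrics-collector-59467bcb9b-fkkzl-hcp-metrics-collector-sidecar-proxy"
	p := socketPath("/tmp/consul-sockets", "default", id)
	fmt.Println(p, len(p))
}
```

The file name has a fixed length regardless of how long the proxy ID is, so only the directory length still needs validating, which matches the second change the commit message lists.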