consul

Commit Graph

Author	SHA1	Message	Date
Kyle Havlovitz	dde5c524ad	connect: strip port from DNS SANs for ingress gateway leaf cert (#15320 ) * connect: strip port from DNS SANs for ingress gateway leaf cert * connect: format DNS SANs in CreateCSR * connect: Test wildcard case when formatting SANs	2022-11-14 10:27:03 -08:00
Derek Menteer	931cec42b3	Prevent serving TLS via ports.grpc (#15339 ) Prevent serving TLS via ports.grpc We remove the ability to run the ports.grpc in TLS mode to avoid confusion and to simplify configuration. This breaking change ensures that any user currently using ports.grpc in an encrypted mode will receive an error message indicating that ports.grpc_tls must be explicitly used. The suggested action for these users is to simply swap their ports.grpc to ports.grpc_tls in the configuration file. If both ports are defined, or if the user has not configured TLS for grpc, then the error message will not be printed.	2022-11-11 14:29:22 -06:00
Dan Stough	626249fbf5	[OSS] fix: wait and try longer to peer through mesh gw (#15328 )	2022-11-10 13:54:00 -05:00
Kyle Schochenmaier	bf0f61a878	removes ioutil usage everywhere which was deprecated in go1.16 (#15297 ) * update go version to 1.18 for api and sdk, go mod tidy * removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward. Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-11-10 10:26:01 -06:00
malizz	b51f0e25e9	update ACLs for cluster peering (#15317 ) * update ACLs for cluster peering * add changelog * Update .changelog/15317.txt Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com> Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>	2022-11-09 13:02:58 -08:00
malizz	b9a9e1219c	update config defaults, add docs (#15302 ) * update config defaults, add docs * update grpc tls port for non-default values * add changelog * Update website/content/docs/upgrading/upgrade-specific.mdx Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> * Update website/content/docs/agent/config/config-files.mdx Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> * update logic for setting grpc tls port value * move default config to default.go, update changelog * update docs * Fix config tests. * Fix linter error. * Fix ConnectCA tests. * Cleanup markdown on upgrade notes. Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>	2022-11-09 09:29:55 -08:00
Eric Haberkorn	c340922991	Log Warnings When Peering With Mesh Gateway Mode None (#15304 ) warn when mesh gateway mode is set to none for peering	2022-11-09 11:48:58 -05:00
Derek Menteer	418bd62c44	Fix mesh gateway configuration with proxy-defaults (#15186 ) * Fix mesh gateway proxy-defaults not affecting upstreams. * Clarify distinction with upstream settings Top-level mesh gateway mode in proxy-defaults and service-defaults gets merged into NodeService.Proxy.MeshGateway, and only gets merged with the mode attached to an upstream in proxycfg/xds. * Fix mgw mode usage for peered upstreams There were a couple issues with how mgw mode was being handled for peered upstreams. For starters, mesh gateway mode from proxy-defaults and the top-level of service-defaults gets stored in NodeService.Proxy.MeshGateway, but the upstream watch for peered data was only considering the mesh gateway config attached in NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway mode via global proxy-defaults or service-defaults on the downstream would not have an effect. Separately, transparent proxy watches for peered upstreams didn't consider mesh gateway mode at all. This commit addresses the first issue by ensuring that we overlay the upstream config for peered upstreams as we do for non-peered. The second issue is addressed by re-using setupWatchesForPeeredUpstream when handling transparent proxy updates. Note that for transparent proxies we do not yet support mesh gateway mode per upstream, so the NodeService.Proxy.MeshGateway mode is used. * Fix upstream mesh gateway mode handling in xds This commit ensures that when determining the mesh gateway mode for peered upstreams we consider the NodeService.Proxy.MeshGateway config as a baseline. In absence of this change, setting a mesh gateway mode via proxy-defaults or the top-level of service-defaults will not have an effect for peered upstreams. * Merge service/proxy defaults in cfg resolver Previously the mesh gateway mode for connect proxies would be merged at three points: 1. On servers, in ComputeResolvedServiceConfig. 2. On clients, in MergeServiceConfig. 3. On clients, in proxycfg/xds. The first merge returns a ServiceConfigResponse where there is a top-level MeshGateway config from proxy/service-defaults, along with per-upstream config. The second merge combines per-upstream config specified at the service instance with per-upstream config specified centrally. The third merge combines the NodeService.Proxy.MeshGateway config containing proxy/service-defaults data with the per-upstream mode. This third merge is easy to miss, which led to peered upstreams not considering the mesh gateway mode from proxy-defaults. This commit removes the third merge, and ensures that all mesh gateway config is available at the upstream. This way proxycfg/xds do not need to do additional overlays. * Ensure that proxy-defaults is considered in wc Upstream defaults become a synthetic Upstream definition under a wildcard key "*". Now that proxycfg/xds expect Upstream definitions to have the final MeshGateway values, this commit ensures that values from proxy-defaults/service-defaults are the default for this synthetic upstream. Add changelog. Co-authored-by: freddygv <freddy@hashicorp.com>	2022-11-09 10:14:29 -06:00
Dan Upton	7b2d08d461	chore: remove unused argument from MergeNodeServiceWithCentralConfig (#15024 ) Previously, the MergeNodeServiceWithCentralConfig method accepted a ServiceSpecificRequest argument, of which only the Datacenter and QueryOptions fields were used. Digging a little deeper, it turns out these fields were only passed down to the ComputeResolvedServiceConfig method (through the ServiceConfigRequest struct) which didn't actually use them. As such, not all call-sites passed a valid ServiceSpecificRequest so it's safer to remove the argument altogether to prevent future changes from depending on it.	2022-11-09 14:54:57 +00:00
Derek Menteer	b64972d486	Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267 ) Re-add ServerExternalAddresses parameter in GenerateToken endpoint This reverts commit `5e156772f6` and adds extra functionality to support newer peering behaviors.	2022-11-08 14:55:18 -06:00
cskh	a3f57cc5e8	fix(mesh-gateway): remove deregistered service from mesh gateway (#15272 ) * fix(mesh-gateway): remove deregistered service from mesh gateway * changelog Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> Co-authored-by: Evan Culver <eculver@users.noreply.github.com>	2022-11-07 20:30:15 -05:00
Freddy	7f5f7e9cf9	Avoid blocking child type updates on parent ack (#15083 )	2022-11-07 18:10:42 -07:00
Derek Menteer	c064ddf606	Backport test fix from ent. (#15279 )	2022-11-07 12:17:46 -06:00
Chris S. Kim	985a4ee1b1	Update hcp-scada-provider to fix diamond dependency problem with go-msgpack (#15185 )	2022-11-07 11:34:30 -05:00
Eric Haberkorn	1804b58799	Fix a bug in mesh gateway proxycfg where ACL tokens aren't passed. (#15273 )	2022-11-07 10:00:11 -05:00
Dan Stough	553312ef61	fix: persist peering CA updates to dialing clusters (#15243 ) fix: persist peering CA updates to dialing clusters	2022-11-04 12:53:20 -04:00
Derek Menteer	18d6c338f4	Backport tests from ent. (#15260 ) * Backport agent tests. Original commit: 0710b2d12fb51a29cedd1119b5fb086e5c71f632 Original commit: aaedb3c28bfe247266f21013d500147d8decb7cd (partial) * Backport test fix and reduce flaky failures.	2022-11-04 10:19:24 -05:00
Derek Menteer	0834fe349b	Backport test from ENT: "Fix missing test fields" (#15258 ) * Backport test from ENT: "Fix missing test fields" Original Author: Sarah Pratt Original Commit: a5c88bef7a969ea5d06ed898d142ab081ba65c69 * Update with proper linting.	2022-11-04 09:29:16 -05:00
Derek Menteer	f4cb2f82bf	Backport various fixes from ENT. (#15254 ) * Regenerate golden files. * Backport from ENT: "Avoid race" Original commit: 5006c8c858b0e332be95271ef9ba35122453315b Original author: freddygv * Backport from ENT: "chore: fix flake peerstream test" Original commit: b74097e7135eca48cc289798c5739f9ef72c0cc8 Original author: DanStough	2022-11-03 16:34:57 -05:00
malizz	617a5f2dc2	convert stream status time fields to pointers (#15252 )	2022-11-03 11:51:22 -07:00
sarahalsmiller	436160e155	Added check for empty peeringsni in restrictPeeringEndpoints (#15239 ) Add check for empty peeringSNI in restrictPeeringEndpoints Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>	2022-11-02 17:20:52 -05:00
Derek Menteer	bd1019fadb	Prevent peering acceptor from subscribing to addr updates. (#15214 )	2022-11-02 07:55:41 -05:00
Dan Stough	05e93f7569	test: refactor testcontainers and add peering integ tests (#15084 )	2022-11-01 15:03:23 -04:00
Derek Menteer	fa5d87c116	Decrease retry time for failed peering connections.	2022-10-31 14:30:27 -05:00
R.B. Boyer	97b9fcbf48	test: fix flaky TestSubscribeBackend_IntegrationWithServer_DeliversAllMessages test (#15195 ) Allow for some message duplication in subscription events during assertions. I'm pretty sure the subscriptions machinery allows for messages to occasionally be duplicated instead of dropping them, as a once-and-only-once queue is a pipe dream and you have to pick one of the other two options.	2022-10-31 12:10:43 -05:00
Evan Culver	62d4517f9e	connect: Add Envoy 1.24 to integration tests, remove Envoy 1.20 (#15093 )	2022-10-31 10:50:45 -05:00
Derek Menteer	693c8a4706	Allow peering endpoints to bypass verify_incoming.	2022-10-31 09:56:30 -05:00
Derek Menteer	2d4b62be3c	Add tests.	2022-10-31 08:45:00 -05:00
Derek Menteer	1483c94531	Fix peered service protocols using proxy-defaults.	2022-10-31 08:45:00 -05:00
Eric Haberkorn	cf50bdbe20	Fix peering metrics bug (#15178 ) This bug was caused by the peering health metric being set to NaN.	2022-10-28 10:51:12 -04:00
Chris S. Kim	0e176dd6aa	Allow consul debug on non-ACL consul servers (#15155 )	2022-10-27 09:25:18 -04:00
cskh	a9427e1310	fix(peering): nil pointer in calling handleUpdateService (#15160 ) * fix(peering): nil pointer in calling handleUpdateService * changelog	2022-10-26 11:50:34 -04:00
Eric Haberkorn	1bdad89026	fix bug that resulted in generating Envoy configs that use CDS with an EDS configuration (#15140 )	2022-10-25 14:49:57 -04:00
Luke Kysow	d3aa2bd9c5	ingress-gateways: don't log error when registering gateway (#15001 ) * ingress-gateways: don't log error when registering gateway Previously, when an ingress gateway was registered without a corresponding ingress gateway config entry, an error was logged because the watch on the config entry returned a nil result. This is expected so don't log an error.	2022-10-25 10:55:44 -07:00
Luke Kysow	9999672fd7	autoencrypt: helpful error for clients with wrong dc (#14832 ) * autoencrypt: helpful error for clients with wrong dc If clients have set a different datacenter than the servers they're connecting with for autoencrypt, give a helpful error message.	2022-10-25 10:13:41 -07:00
R.B. Boyer	3c44116a8f	cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956 ) This continues the work done in #14908 where a crude solution to prevent a goroutine leak was implemented. The former code would launch a perpetual goroutine family every iteration (+1 +1) and the fixed code simply caused a new goroutine family to first cancel the prior one to prevent the leak (-1 +1 == 0). This PR refactors this code completely to: - make it more understandable - remove the recursion-via-goroutine strangeness - prevent unnecessary RPC fetches when the prior one has errored. The core issue arose from a conflation of the entry.Fetching field to mean both: (1) there is an RPC (blocking query) in flight right now, and (2) there is a goroutine running to manage the RPC fetch retry loop. The problem is that the goroutine-leak-avoidance check would treat Fetching like (2), but within the body of a goroutine it would flip that boolean back to false before the retry sleep. This would cause a new chain of goroutines to launch which #14908 would correct crudely. The refactored code uses a plain for-loop and changes the semantics to track state for "is there a goroutine associated with this cache entry" instead of the former. We use a uint64 unique identity per goroutine instead of a boolean so that any orphaned goroutines can tell when they've been replaced when the expiry loop deletes a cache entry while the goroutine is still running and is later replaced.	2022-10-25 10:27:26 -05:00
R.B. Boyer	da70daba43	test: ensure that all dependencies in a test agent use the test logger (#14996 )	2022-10-24 17:02:38 -05:00
Chris S. Kim	9f0ed81cfd	Remove invalid 1xx HTTP codes These tests started failing in go1.19, presumably due to support for valid 1xx responses being added. https://github.com/golang/go/issues/56346	2022-10-24 16:12:08 -04:00
Chris S. Kim	bde57c0dd0	Regenerate files according to 1.19.2 formatter	2022-10-24 16:12:08 -04:00
cskh	db82ffe503	fix(peering): replicating wan address (#15108 ) * fix(peering): replicating wan address * add changelog * unit test	2022-10-24 15:44:57 -04:00
Iryna Shustava	176abb5ff2	proxycfg: watch service-defaults config entries (#15025 ) To support Destinations on the service-defaults (for tproxy with terminating gateway), we need to now also make servers watch service-defaults config entries.	2022-10-24 12:50:28 -06:00
Chris S. Kim	b236e86030	Move oss-only test to its own file	2022-10-24 14:17:43 -04:00
R.B. Boyer	d04cf25fa8	test: fix flaky TestHealthServiceNodes_NodeMetaFilter by waiting until the streaming subsystem has a valid grpc connection (#15019 ) Also potentially unflakes TestHealthIngressServiceNodes for similar reasons.	2022-10-24 13:09:53 -05:00
R.B. Boyer	300860412c	chore: update golangci-lint to v1.50.1 (#15022 )	2022-10-24 11:48:02 -05:00
Venu Yanamandra	efc813e92d	Update error message when restoring ENT snapshot in OSS (#15066 )	2022-10-24 11:40:26 -04:00
freddygv	d65e60de86	Return forbidden on permission denied This commit updates the establish endpoint to bubble up a 403 status code to callers when the establishment secret from the token is invalid. This is a signal that a new peering token must be generated.	2022-10-20 17:11:49 -06:00
Chris S. Kim	a7ea26192b	Update expected encoding in test go-memdb was updated in v1.3.3 to make integers in indexes sortable, which changed how integers were encoded.	2022-10-20 14:32:42 -04:00
freddygv	6d9be5fb15	Use plain TaggedAddressWAN	2022-10-19 16:32:44 -06:00
freddygv	8d211cc9cc	Add unit test	2022-10-19 16:26:15 -06:00
cskh	058ee4fb84	fix: wan address isn't used by peering token	2022-10-19 16:33:25 -04:00
Nitya Dhanushkodi	5e156772f6	Remove ability to specify external addresses in GenerateToken endpoint (#14930 ) * Reverts "update generate token endpoint to take external addresses (#13844)" This reverts commit `f47319b7c6`.	2022-10-19 09:31:36 -07:00
Kyle Havlovitz	5c3427608b	Merge pull request #15035 from hashicorp/vault-ttl-update-warn Warn instead of returning error when missing intermediate mount tune permissions	2022-10-18 15:41:52 -07:00
cskh	d562d363fc	peering: skip registering duplicate node and check from the peer (#14994 ) * peering: skip register duplicate node and check from the peer * Prebuilt the nodes map and checks map to avoid repeated for loop * use key type to struct: node id, service id, and check id	2022-10-18 16:19:24 -04:00
Chris S. Kim	29a297d3e9	Refactor client RPC timeouts (#14965 ) Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.	2022-10-18 15:05:09 -04:00
Kyle Havlovitz	d122108992	Warn instead of returning an error when intermediate mount tune permission is missing	2022-10-18 12:01:25 -07:00
R.B. Boyer	0cca4c088d	test: possibly fix flake in TestIntentionGetExact (#15021 ) Restructure test setup to be similar to TestAgent_ServerCertificate and see if that's enough to avoid flaking after join.	2022-10-18 10:51:20 -05:00
R.B. Boyer	fe2d41ddad	cache: prevent goroutine leak in agent cache (#14908 ) A bug was discovered in the error handling code for the Agent cache subsystem: 1. NotifyCallback calls notifyBlockingQuery which calls getWithIndex in a loop (which backs off on-error up to 1 minute) 2. getWithIndex calls fetch if there’s no valid entry in the cache 3. fetch starts a goroutine which calls Fetch on the cache-type, waits for a while (again with backoff up to 1 minute for errors) and then calls fetch to trigger a refresh. The end result is that every 1 minute notifyBlockingQuery spawns an ancestry of goroutines that essentially lives forever. This PR ensures that the goroutine started by `fetch` cancels any prior goroutine spawned by the same line for the same key. In isolated testing where a cache type was tweaked to indefinitely error, this patch prevented goroutine counts from skyrocketing.	2022-10-17 14:38:10 -05:00
R.B. Boyer	02a858efa0	ca: fix a masked bug in leaf cert generation that would not be notified of root cert rotation after the first one (#15005 ) In practice this was masked by #14956 and was only uncovered fixing the other bug. go test ./agent -run TestAgentConnectCALeafCert_goodNotLocal would fail when only #14956 was fixed.	2022-10-17 13:24:27 -05:00
Chris S. Kim	3d2dffff16	Merge pull request #13388 from deblasis/feature/health-checks_windows_service Feature: Health checks windows service	2022-10-17 09:26:19 -04:00
Dan Upton	f8b4b41205	proxycfg: fix goroutine leak when service is re-registered (#14988 ) Fixes a bug where we'd leak a goroutine in state.run when the given context was canceled while there was a pending update.	2022-10-17 11:31:10 +01:00
Kyle Havlovitz	aaf892a383	Extend tcp keepalive settings to work for terminating gateways as well	2022-10-14 17:05:46 -07:00
Kyle Havlovitz	2c569f6b9c	Update docs and add tcp_keepalive_probes setting	2022-10-14 17:05:46 -07:00
Kyle Havlovitz	2242d1ec4a	Add TCP keepalive settings to proxy config for mesh gateways	2022-10-14 17:05:46 -07:00
Derek Menteer	2a33d0ff96	Fix issue with incorrect method signature on test.	2022-10-14 11:04:57 -05:00
Freddy	24d0c8801a	Merge pull request #14981 from hashicorp/peering/dial-through-gateways	2022-10-14 09:44:56 -06:00
Dan Upton	328e3ff563	proxycfg: rate-limit delivery of config snapshots (#14960 ) Adds a user-configurable rate limiter to proxycfg snapshot delivery, with a default limit of 250 updates per second. This addresses a problem observed in our load testing of Consul Dataplane where updating a "global" resource such as a wildcard intention or the proxy-defaults config entry could starve the Raft or Memberlist goroutines of CPU time, causing general cluster instability.	2022-10-14 15:52:00 +01:00
Derek Menteer	29ebcf5ff0	Add tests for peering state snapshots / restores.	2022-10-14 09:48:04 -05:00
Derek Menteer	e3ff9912d0	Add test for ExportedServicesForAllPeersByName	2022-10-14 09:48:04 -05:00
Dan Upton	e6b55d1d81	perf: remove expensive reflection from xDS hot path (#14934 ) Replaces the reflection-based implementation of proxycfg's ConfigSnapshot.Clone with code generated by deep-copy. While load testing server-based xDS (for consul-dataplane) we discovered this method is extremely expensive. The ConfigSnapshot struct, directly or indirectly, contains a copy of many of the structs in the agent/structs package, which creates a large graph for copystructure.Copy to traverse at runtime, on every proxy reconfiguration.	2022-10-14 10:26:42 +01:00
freddygv	c77123a2aa	Use split var in tests	2022-10-13 17:12:47 -06:00
freddygv	bf51021c07	Use split wildcard partition name This way OSS avoids passing a non-empty label, which will be rejected in OSS consul.	2022-10-13 16:55:28 -06:00
Freddy	ee4cdc4985	Merge pull request #14935 from hashicorp/fix/alias-leak	2022-10-13 16:31:15 -06:00
freddygv	573aa408a1	Lint	2022-10-13 15:55:55 -06:00
Derek Menteer	0f424e3cdf	Reset wait on ensureServerAddrSubscription	2022-10-13 15:58:26 -05:00
freddygv	96fdd3728a	Fix CA init error code	2022-10-13 14:58:11 -06:00
freddygv	2c99a21596	Update leader routine to maybe use gateways	2022-10-13 14:58:00 -06:00
freddygv	e69bc727ec	Update peering establishment to maybe use gateways When peering through mesh gateways we expect outbound dials to peer servers to flow through the local mesh gateway addresses. Now when establishing a peering we get a list of dial addresses as a ring buffer that includes local mesh gateway addresses if the local DC is configured to peer through mesh gateways. The ring buffer includes the mesh gateway addresses first, but also includes the remote server addresses as a fallback. This fallback is present because it's possible that direct egress from the servers may be allowed. If not allowed then the leader will cycle back to a mesh gateway address through the ring. When attempting to dial the remote servers we retry up to a fixed timeout. If using mesh gateways we also have an initial wait in order to allow for the mesh gateways to configure themselves. Note that if we encounter a permission denied error we do not retry since that error indicates that the secret in the peering token is invalid.	2022-10-13 14:57:55 -06:00
malizz	b0b0cbb8ee	increase protobuf size limit for cluster peering (#14976 )	2022-10-13 13:46:51 -07:00
Derek Menteer	4e140c98bc	Address PR comments.	2022-10-13 14:11:02 -05:00
Derek Menteer	1e394da400	Disallow peering to the same cluster.	2022-10-13 14:11:02 -05:00
Derek Menteer	8742fbe14f	Prevent consul peer-exports by discovery chain.	2022-10-13 12:45:09 -05:00
Derek Menteer	f366edcb8d	Prevent the "consul" service from being exported.	2022-10-13 12:45:09 -05:00
Derek Menteer	caa1396255	Add remote peer partition and datacenter info.	2022-10-13 10:37:41 -05:00
Dan Upton	cbb4a030c4	xds: properly merge central config for "agentless" services (#14962 )	2022-10-13 12:04:59 +01:00
Dan Upton	0af9f16343	bug: fix goroutine leaks caused by incorrect usage of `WatchCh` (#14916 ) memdb's `WatchCh` method creates a goroutine that will publish to the returned channel when the watchset is triggered or the given context is canceled. Although this is called out in its godoc comment, it's not obvious that this method creates a goroutine whose lifecycle you need to manage. In the xDS capacity controller, we were calling `WatchCh` on each iteration of the control loop, meaning the number of goroutines would grow on each autopilot event until there was catalog churn. In the catalog config source, we were calling `WatchCh` with the background context, meaning that the goroutine would keep running after the sync loop had terminated.	2022-10-13 12:04:27 +01:00
Hans Hasselberg	0d5935ab83	adding configuration option cloud.scada_address (#14936 ) * adding scada_address * config tests * add changelog entry	2022-10-13 11:31:28 +02:00
Paul Glass	bcda205f88	Add consul.xds.server.streamStart metric (#14957 ) This adds a new consul.xds.server.streamStart metric to measure the time taken to first generate xDS resources after an xDS stream is opened.	2022-10-12 14:17:58 -05:00
Riddhi Shah	345191a0df	Service http checks data source for agentless proxies (#14924 ) Adds another datasource for proxycfg.HTTPChecks, for use on server agents. Typically these checks are performed by local client agents and there is no equivalent of this in agentless (where servers configure consul-dataplane proxies). Hence, the data source is mostly a no-op on servers but in the case where the service is present within the local state, it delegates to the cache data source.	2022-10-12 07:49:56 -07:00
Freddy	9ca8bb8ec4	Merge pull request #14958 from hashicorp/peering/nonce	2022-10-12 08:18:15 -06:00
freddygv	1b46b35041	Actually track nonce in test	2022-10-12 07:50:17 -06:00
Derek Menteer	f330438a45	Fix incorrect backoff-wait logic.	2022-10-12 08:01:10 -05:00
freddygv	7f9a5d0f58	Add basic nonce management This commit adds a monotonically increasing nonce to include in peering replication response messages. Every ack/nack from the peer handling a response will include this nonce, allowing the ack/nack to be correlated with a specific resource. At the moment nothing is done with the nonce when it is received. In the future we may want to add functionality such as retries on NACKs, depending on the class of error.	2022-10-11 19:02:04 -06:00
Paul Glass	d17af23641	gRPC server metrics (#14922 ) * Move stats.go from grpc-internal to grpc-middleware * Update grpc server metrics with server type label * Add stats test to grpc-external * Remove global metrics instance from grpc server tests	2022-10-11 17:00:32 -05:00
cskh	e0356e1502	fix(peering): add missing grpc_tls_port for server address reconciliation (#14944 )	2022-10-11 10:56:29 -04:00
freddygv	f4cc4577ca	Fix alias check leak Previously when an alias check was removed it would not be stopped nor cleaned up from the associated aliasChecks map. This means that any time an alias check was deregistered we would leak a goroutine for CheckAlias.run() because the stopCh would never be closed. This issue mostly affects service mesh deployments on platforms where the client agent is mostly static but proxy services come and go regularly, since by default sidecars are registered with an alias check.	2022-10-10 16:42:29 -06:00
James Oulman	b8bd7a3058	Configure Envoy alpn_protocols based on service protocol (#14356 ) * Configure Envoy alpn_protocols based on service protocol * define alpnProtocols in a more standard way * http2 protocol should be h2 only * formatting * add test for getAlpnProtocol() * create changelog entry * change scope is connect-proxy * ignore errors on ParseProxyConfig; fixes linter * add tests for grpc and http2 public listeners * remove newlines from PR * Add alpn_protocol configuration for ingress gateway * Guard against nil tlsContext * add ingress gateway w/ TLS tests for gRPC and HTTP2 * getAlpnProtocols: add TCP protocol test * add tests for ingress gateway with grpc/http2 and per-listener TLS config * add tests for ingress gateway with grpc/http2 and per-listener TLS config * add Gateway level TLS config with mixed protocol listeners to validate ALPN * update changelog to include ingress-gateway * add http/1.1 to http2 ALPN * go fmt * fix test on custom-trace-listener	2022-10-10 13:13:56 -07:00
freddygv	bf72df7b0e	Fixup test	2022-10-10 13:20:14 -06:00
Chris S. Kim	4f4112662e	Fix nil pointer	2022-10-10 13:20:14 -06:00
Chris S. Kim	b0a4c5c563	Include stream-related information in peering endpoints	2022-10-10 13:20:14 -06:00
Paul Glass	c0c187f1c5	Merge central config for GetEnvoyBootstrapParams (#14869 ) This fixes GetEnvoyBootstrapParams to merge in proxy-defaults and service-defaults. Co-authored-by: Dan Upton <daniel@floppy.co>	2022-10-10 12:40:27 -05:00
Freddy	4abad02abd	Merge pull request #14796 from hashicorp/peering/use-connect-ca	2022-10-07 10:37:37 -06:00
freddygv	7d4da6eb22	Fixup test	2022-10-07 09:34:16 -06:00
freddygv	3034df6a5c	Require Connect and TLS to generate peering tokens By requiring Connect and a gRPC TLS listener we can automatically configure TLS for all peering control-plane traffic.	2022-10-07 09:06:29 -06:00
freddygv	fac3ddc857	Use internal server certificate for peering TLS A previous commit introduced an internally-managed server certificate to use for peering-related purposes. Now the peering token has been updated to match that behavior: - The server name matches the structure of the server cert - The CA PEMs correspond to the Connect CA Note that if Connect is disabled, and by extension the Connect CA, we fall back to the previous behavior of returning the manually configured certs and local server SNI. Several tests were updated to use the gRPC TLS port since they enable Connect by default. This means that the peering token will embed the Connect CA, and the dialer will expect a TLS listener.	2022-10-07 09:05:32 -06:00
freddygv	5f97223822	Simplify mgw watch mgmt	2022-10-07 08:54:37 -06:00
freddygv	d54db25421	Use existing query options to build ctx	2022-10-07 08:46:53 -06:00
DanStough	77ab28c5c7	feat: xDS updates for peerings control plane through mesh gw	2022-10-07 08:46:42 -06:00
Eric Haberkorn	1633cf20ea	Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic (#14817 ) Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic	2022-10-06 09:54:14 -04:00
cskh	c1b5f34fb7	fix: missing UDP field in checkType (#14885 ) * fix: missing UDP field in checkType * Add changelog * Update doc	2022-10-05 15:57:21 -04:00
Derek Menteer	a279d2d329	Fix explicit tproxy listeners with discovery chains. (#14751 ) Fix explicit tproxy listeners with discovery chains.	2022-10-05 14:38:25 -05:00
Alex Oskotsky	13da2c5fad	Add the ability to retry on reset connection to service-routers (#12890 )	2022-10-05 13:06:44 -04:00
John Murret	79a541fd7d	Upgrade serf to v0.10.1 and memberlist to v0.5.0 to get memberlist size metrics and broadcast queue depth metric (#14873 ) * updating to serf v0.10.1 and memberlist v0.5.0 to get memberlist size metrics and memberlist broadcast queue depth metric * update changelog * update changelog * correcting changelog * adding "QueueCheckInterval" for memberlist to test * updating integration test containers to grab latest api	2022-10-04 17:51:37 -06:00
Evan Culver	a3be5a5a82	connect: Bump Envoy 1.20 to 1.20.7, 1.21 to 1.21.5 and 1.22 to 1.22.5 (#14831 )	2022-10-04 13:15:01 -07:00
Eric Haberkorn	1b565444be	Rename `PeerName` to `Peer` on prepared queries and exported services (#14854 )	2022-10-04 14:46:15 -04:00
Freddy	d9fe3578ac	Merge pull request #14734 from hashicorp/NET-643-update-mesh-gateway-envoy-config-for-inbound-peering-control-plane-traffic	2022-10-03 12:54:11 -06:00
freddygv	b15d41534f	Update xds generation for peering over mesh gws This commit adds the xDS resources needed for INBOUND traffic from peer clusters: - 1 filter chain for all inbound peering requests. - 1 cluster for all inbound peering requests. - 1 endpoint per voting server with the gRPC TLS port configured. There is one filter chain and cluster because unlike with WAN federation, peer clusters will not attempt to dial individual servers. Peer clusters will only dial the local mesh gateway addresses.	2022-10-03 12:42:27 -06:00
freddygv	a8c4d6bc55	Share mgw addrs in peering stream if needed This commit adds handling so that the replication stream considers whether the user intends to peer through mesh gateways. The subscription will return server or mesh gateway addresses depending on the mesh configuration setting. These watches can be updated at runtime by modifying the mesh config entry.	2022-10-03 11:42:20 -06:00
freddygv	4ff9d475b0	Return mesh gateway addrs if peering through mgw	2022-10-03 11:35:10 -06:00
chappie	ad7295e5d9	Merge pull request #14811 from hashicorp/chappie/dns Add DNS gRPC proxying support	2022-10-03 08:02:48 -07:00
Chris Chapman	d7b5351b66	Making suggested comments	2022-09-30 15:03:33 -07:00
Chris Chapman	46bea72212	Making suggested changes	2022-09-30 14:51:12 -07:00
Chris Chapman	a05563b788	Update comment	2022-09-30 09:35:01 -07:00
DanStough	7f8971d77f	chore: fix flakey scada provider test	2022-09-30 11:56:40 -04:00
Chris Chapman	81e267171b	Bind a dns mux handler to gRPC proxy	2022-09-29 21:44:45 -07:00
Chris Chapman	7bc9cad180	Adding grpc handler for dns proxy	2022-09-29 21:19:51 -07:00
Eric Haberkorn	80e51ff907	Add exported services event to cluster peering replication. (#14797 )	2022-09-29 15:37:19 -04:00
Ashwin Venkatesh	4ba260958c	bug: watch local mesh gateways in non-default partitions with agentless (#14799 )	2022-09-29 13:19:04 -04:00
cskh	69f40df548	feat(ingress gateway: support configuring limits in ingress-gateway c… (#14749 ) * feat(ingress gateway: support configuring limits in ingress-gateway config entry - a new Defaults field with max_connections, max_pending_connections, max_requests is added to ingress gateway config entry - new field max_connections, max_pending_connections, max_requests in individual services to overwrite the value in Default - added unit test and integration test - updated doc Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-09-28 14:56:46 -04:00
malizz	84b0f408fa	Support Stale Queries for Trust Bundle Lookups (#14724 ) * initial commit * add tags, add conversations * add test for query options utility functions * update previous tests * fix test * don't error out on empty context * add changelog * update decode config	2022-09-28 09:56:59 -07:00
Eric Haberkorn	6570d5f004	Enable outbound peered requests to go through local mesh gateway (#14763 )	2022-09-27 09:49:28 -04:00
Nick Ethier	1c1b0994b8	add HCP integration component (#14723 ) * add HCP integration * lint: use non-deprecated logging interface	2022-09-26 14:58:15 -04:00
Derek Menteer	aa4709ab74	Add envoy connection balancing. (#14616 ) Add envoy connection balancing config.	2022-09-26 11:29:06 -05:00
Chris S. Kim	2203cdc4db	Add new internal endpoint to list exported services to a peer	2022-09-23 09:43:56 -04:00
freddygv	d818d7b096	Manage local server watches depending on mesh cfg Routing peering control plane traffic through mesh gateways can be enabled or disabled at runtime with the mesh config entry. This commit updates proxycfg to add or cancel watches for local servers depending on this central config. Note that WAN federation over mesh gateways is determined by a service metadata flag, and any updates to the gateway service registration will force the creation of a new snapshot. If enabled, WAN-fed over mesh gateways will trigger a local server watch on initialize(). Because of this we will only add/remove server watches if WAN federation over mesh gateways is disabled.	2022-09-22 19:32:10 -06:00
Alessandro De Blasis	461b42ed48	fix(check): added missing OSService props	2022-09-21 13:10:21 +01:00
Alessandro De Blasis	5719fd6560	fix(checks): os_service OK message in output	2022-09-21 09:27:33 +01:00
Alessandro De Blasis	f440966a38	fix(checks): os_service lifecycle bugfix	2022-09-21 09:26:47 +01:00
Alessandro De Blasis	fc0dd92dcf	fix(agent): uninitialized map panic error	2022-09-21 09:25:54 +01:00
malizz	1a0aa38a82	increase the size of txn to support vault (#14599 ) * increase the size of txn to support vault * add test, revert change to acl endpoint * add changelog * update test, add passing test case * Update .changelog/14599.txt Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-09-19 09:07:19 -07:00
freddygv	5fbb26525b	Add awareness of server mode to TLS configurator Previously the TLS configurator would default to presenting auto TLS certificates as client certificates. Server agents should not have this behavior and should instead present the manually configured certs. The autoTLS certs for servers are exclusively used for peering and should not be used as the default for outbound communication.	2022-09-16 17:57:10 -06:00
freddygv	f30bc96239	Test fixes - Pulls in CLI test fix from main - Updates psutils to fix TestAgent_Host on M1 Mac	2022-09-16 17:57:10 -06:00
freddygv	02d3ce1039	Add server certificate manager This certificate manager will request a leaf certificate for server agents and then keep them up to date.	2022-09-16 17:57:10 -06:00
freddygv	0e5131bd33	Generate ACL token for server management This commit introduces a new ACL token used for internal server management purposes. It has a few key properties: - It has unlimited permissions. - It is persisted through Raft as System Metadata rather than in the ACL tokens table. This is to avoid users seeing or modifying it. - It is re-generated on leadership establishment.	2022-09-16 17:54:34 -06:00
freddygv	0ea3353537	Add handling in agent cache for server leaf certs	2022-09-16 17:54:34 -06:00
Kyle Havlovitz	0d9ae52643	Merge pull request #14598 from hashicorp/root-removal-fix connect/ca: Don't discard old roots on primaryInitialize	2022-09-15 14:36:01 -07:00
Kyle Havlovitz	6105a7fd9f	connect/ca: don't discard old roots on primaryInitialize	2022-09-15 12:59:09 -07:00
Gabriel Santos	e53af28bd7	Middleware: `RequestRecorder` reports calls below 1ms as decimal value (#12905 ) * Typos * Test failing * Convert values <1ms to decimal * Fix test * Update docs and test error msg * Applied suggested changes to test case * Changelog file and suggested changes * Update .changelog/12905.txt Co-authored-by: Chris S. Kim <kisunji92@gmail.com> * suggested change - start duration with microseconds instead of nanoseconds * fix error * suggested change - floats Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <kisunji92@gmail.com>	2022-09-15 13:04:37 -04:00
Daniel Graña	8c98172f53	[BUGFIX] Do not use interval as timeout (#14619 ) Do not use interval as timeout	2022-09-15 12:39:48 -04:00
Evan Culver	d0416f593c	connect: Bump latest Envoy to 1.23.1 in test matrix (#14573 )	2022-09-14 13:20:16 -07:00
DanStough	485e1b5d4e	fix(peering): generate token metrics only for leader	2022-09-14 11:37:30 -04:00


4858 Commits