consul

Commit Graph

Author	SHA1	Message	Date
Dan Upton	328e3ff563	proxycfg: rate-limit delivery of config snapshots (#14960 ) Adds a user-configurable rate limiter to proxycfg snapshot delivery, with a default limit of 250 updates per second. This addresses a problem observed in our load testing of Consul Dataplane where updating a "global" resource such as a wildcard intention or the proxy-defaults config entry could starve the Raft or Memberlist goroutines of CPU time, causing general cluster instability.	2022-10-14 15:52:00 +01:00
Derek Menteer	29ebcf5ff0	Add tests for peering state snapshots / restores.	2022-10-14 09:48:04 -05:00
Derek Menteer	e3ff9912d0	Add test for ExportedServicesForAllPeersByName	2022-10-14 09:48:04 -05:00
Dan Upton	e6b55d1d81	perf: remove expensive reflection from xDS hot path (#14934 ) Replaces the reflection-based implementation of proxycfg's ConfigSnapshot.Clone with code generated by deep-copy. While load testing server-based xDS (for consul-dataplane) we discovered this method is extremely expensive. The ConfigSnapshot struct, directly or indirectly, contains a copy of many of the structs in the agent/structs package, which creates a large graph for copystructure.Copy to traverse at runtime, on every proxy reconfiguration.	2022-10-14 10:26:42 +01:00
freddygv	c77123a2aa	Use split var in tests	2022-10-13 17:12:47 -06:00
freddygv	bf51021c07	Use split wildcard partition name This way OSS avoids passing a non-empty label, which will be rejected in OSS consul.	2022-10-13 16:55:28 -06:00
Freddy	ee4cdc4985	Merge pull request #14935 from hashicorp/fix/alias-leak	2022-10-13 16:31:15 -06:00
freddygv	573aa408a1	Lint	2022-10-13 15:55:55 -06:00
Derek Menteer	0f424e3cdf	Reset wait on ensureServerAddrSubscription	2022-10-13 15:58:26 -05:00
freddygv	96fdd3728a	Fix CA init error code	2022-10-13 14:58:11 -06:00
freddygv	2c99a21596	Update leader routine to maybe use gateways	2022-10-13 14:58:00 -06:00
freddygv	e69bc727ec	Update peering establishment to maybe use gateways When peering through mesh gateways we expect outbound dials to peer servers to flow through the local mesh gateway addresses. Now when establishing a peering we get a list of dial addresses as a ring buffer that includes local mesh gateway addresses if the local DC is configured to peer through mesh gateways. The ring buffer includes the mesh gateway addresses first, but also includes the remote server addresses as a fallback. This fallback is present because it's possible that direct egress from the servers may be allowed. If not allowed then the leader will cycle back to a mesh gateway address through the ring. When attempting to dial the remote servers we retry up to a fixed timeout. If using mesh gateways we also have an initial wait in order to allow for the mesh gateways to configure themselves. Note that if we encounter a permission denied error we do not retry since that error indicates that the secret in the peering token is invalid.	2022-10-13 14:57:55 -06:00
malizz	b0b0cbb8ee	increase protobuf size limit for cluster peering (#14976 )	2022-10-13 13:46:51 -07:00
Derek Menteer	4e140c98bc	Address PR comments.	2022-10-13 14:11:02 -05:00
Derek Menteer	1e394da400	Disallow peering to the same cluster.	2022-10-13 14:11:02 -05:00
Derek Menteer	8742fbe14f	Prevent consul peer-exports by discovery chain.	2022-10-13 12:45:09 -05:00
Derek Menteer	f366edcb8d	Prevent the "consul" service from being exported.	2022-10-13 12:45:09 -05:00
Derek Menteer	caa1396255	Add remote peer partition and datacenter info.	2022-10-13 10:37:41 -05:00
Dan Upton	cbb4a030c4	xds: properly merge central config for "agentless" services (#14962 )	2022-10-13 12:04:59 +01:00
Dan Upton	0af9f16343	bug: fix goroutine leaks caused by incorrect usage of `WatchCh` (#14916 ) memdb's `WatchCh` method creates a goroutine that will publish to the returned channel when the watchset is triggered or the given context is canceled. Although this is called out in its godoc comment, it's not obvious that this method creates a goroutine whose lifecycle you need to manage. In the xDS capacity controller, we were calling `WatchCh` on each iteration of the control loop, meaning the number of goroutines would grow on each autopilot event until there was catalog churn. In the catalog config source, we were calling `WatchCh` with the background context, meaning that the goroutine would keep running after the sync loop had terminated.	2022-10-13 12:04:27 +01:00
Hans Hasselberg	0d5935ab83	adding configuration option cloud.scada_address (#14936 ) * adding scada_address * config tests * add changelog entry	2022-10-13 11:31:28 +02:00
Paul Glass	bcda205f88	Add consul.xds.server.streamStart metric (#14957 ) This adds a new consul.xds.server.streamStart metric to measure the time taken to first generate xDS resources after an xDS stream is opened.	2022-10-12 14:17:58 -05:00
Riddhi Shah	345191a0df	Service http checks data source for agentless proxies (#14924 ) Adds another datasource for proxycfg.HTTPChecks, for use on server agents. Typically these checks are performed by local client agents and there is no equivalent of this in agentless (where servers configure consul-dataplane proxies). Hence, the data source is mostly a no-op on servers but in the case where the service is present within the local state, it delegates to the cache data source.	2022-10-12 07:49:56 -07:00
Freddy	9ca8bb8ec4	Merge pull request #14958 from hashicorp/peering/nonce	2022-10-12 08:18:15 -06:00
freddygv	1b46b35041	Actually track nonce in test	2022-10-12 07:50:17 -06:00
Derek Menteer	f330438a45	Fix incorrect backoff-wait logic.	2022-10-12 08:01:10 -05:00
freddygv	7f9a5d0f58	Add basic nonce management This commit adds a monotonically increasing nonce to include in peering replication response messages. Every ack/nack from the peer handling a response will include this nonce, allowing the ack/nack to be correlated with a specific resource. At the moment nothing is done with the nonce when it is received. In the future we may want to add functionality such as retries on NACKs, depending on the class of error.	2022-10-11 19:02:04 -06:00
Paul Glass	d17af23641	gRPC server metrics (#14922 ) * Move stats.go from grpc-internal to grpc-middleware * Update grpc server metrics with server type label * Add stats test to grpc-external * Remove global metrics instance from grpc server tests	2022-10-11 17:00:32 -05:00
cskh	e0356e1502	fix(peering): add missing grpc_tls_port for server address reconciliation (#14944 )	2022-10-11 10:56:29 -04:00
freddygv	f4cc4577ca	Fix alias check leak Previously when an alias check was removed it would not be stopped nor cleaned up from the associated aliasChecks map. This means that any time an alias check was deregistered we would leak a goroutine for CheckAlias.run() because the stopCh would never be closed. This issue mostly affects service mesh deployments on platforms where the client agent is mostly static but proxy services come and go regularly, since by default sidecars are registered with an alias check.	2022-10-10 16:42:29 -06:00
James Oulman	b8bd7a3058	Configure Envoy alpn_protocols based on service protocol (#14356 ) * Configure Envoy alpn_protocols based on service protocol * define alpnProtocols in a more standard way * http2 protocol should be h2 only * formatting * add test for getAlpnProtocol() * create changelog entry * change scope is connect-proxy * ignore errors on ParseProxyConfig; fixes linter * add tests for grpc and http2 public listeners * remove newlines from PR * Add alpn_protocol configuration for ingress gateway * Guard against nil tlsContext * add ingress gateway w/ TLS tests for gRPC and HTTP2 * getAlpnProtocols: add TCP protocol test * add tests for ingress gateway with grpc/http2 and per-listener TLS config * add tests for ingress gateway with grpc/http2 and per-listener TLS config * add Gateway level TLS config with mixed protocol listeners to validate ALPN * update changelog to include ingress-gateway * add http/1.1 to http2 ALPN * go fmt * fix test on custom-trace-listener	2022-10-10 13:13:56 -07:00
freddygv	bf72df7b0e	Fixup test	2022-10-10 13:20:14 -06:00
Chris S. Kim	4f4112662e	Fix nil pointer	2022-10-10 13:20:14 -06:00
Chris S. Kim	b0a4c5c563	Include stream-related information in peering endpoints	2022-10-10 13:20:14 -06:00
Paul Glass	c0c187f1c5	Merge central config for GetEnvoyBootstrapParams (#14869 ) This fixes GetEnvoyBootstrapParams to merge in proxy-defaults and service-defaults. Co-authored-by: Dan Upton <daniel@floppy.co>	2022-10-10 12:40:27 -05:00
Freddy	4abad02abd	Merge pull request #14796 from hashicorp/peering/use-connect-ca	2022-10-07 10:37:37 -06:00
freddygv	7d4da6eb22	Fixup test	2022-10-07 09:34:16 -06:00
freddygv	3034df6a5c	Require Connect and TLS to generate peering tokens By requiring Connect and a gRPC TLS listener we can automatically configure TLS for all peering control-plane traffic.	2022-10-07 09:06:29 -06:00
freddygv	fac3ddc857	Use internal server certificate for peering TLS A previous commit introduced an internally-managed server certificate to use for peering-related purposes. Now the peering token has been updated to match that behavior: - The server name matches the structure of the server cert - The CA PEMs correspond to the Connect CA Note that if Connect is disabled, and by extension the Connect CA, we fall back to the previous behavior of returning the manually configured certs and local server SNI. Several tests were updated to use the gRPC TLS port since they enable Connect by default. This means that the peering token will embed the Connect CA, and the dialer will expect a TLS listener.	2022-10-07 09:05:32 -06:00
freddygv	5f97223822	Simplify mgw watch mgmt	2022-10-07 08:54:37 -06:00
freddygv	d54db25421	Use existing query options to build ctx	2022-10-07 08:46:53 -06:00
DanStough	77ab28c5c7	feat: xDS updates for peerings control plane through mesh gw	2022-10-07 08:46:42 -06:00
Eric Haberkorn	1633cf20ea	Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic (#14817 ) Make the mesh gateway changes to allow `local` mode for cluster peering data plane traffic	2022-10-06 09:54:14 -04:00
cskh	c1b5f34fb7	fix: missing UDP field in checkType (#14885 ) * fix: missing UDP field in checkType * Add changelog * Update doc	2022-10-05 15:57:21 -04:00
Derek Menteer	a279d2d329	Fix explicit tproxy listeners with discovery chains. (#14751 ) Fix explicit tproxy listeners with discovery chains.	2022-10-05 14:38:25 -05:00
Alex Oskotsky	13da2c5fad	Add the ability to retry on reset connection to service-routers (#12890 )	2022-10-05 13:06:44 -04:00
John Murret	79a541fd7d	Upgrade serf to v0.10.1 and memberlist to v0.5.0 to get memberlist size metrics and broadcast queue depth metric (#14873 ) * updating to serf v0.10.1 and memberlist v0.5.0 to get memberlist size metrics and memberlist broadcast queue depth metric * update changelog * update changelog * correcting changelog * adding "QueueCheckInterval" for memberlist to test * updating integration test containers to grab latest api	2022-10-04 17:51:37 -06:00
Evan Culver	a3be5a5a82	connect: Bump Envoy 1.20 to 1.20.7, 1.21 to 1.21.5 and 1.22 to 1.22.5 (#14831 )	2022-10-04 13:15:01 -07:00
Eric Haberkorn	1b565444be	Rename `PeerName` to `Peer` on prepared queries and exported services (#14854 )	2022-10-04 14:46:15 -04:00
Freddy	d9fe3578ac	Merge pull request #14734 from hashicorp/NET-643-update-mesh-gateway-envoy-config-for-inbound-peering-control-plane-traffic	2022-10-03 12:54:11 -06:00
freddygv	b15d41534f	Update xds generation for peering over mesh gws This commit adds the xDS resources needed for INBOUND traffic from peer clusters: - 1 filter chain for all inbound peering requests. - 1 cluster for all inbound peering requests. - 1 endpoint per voting server with the gRPC TLS port configured. There is one filter chain and cluster because unlike with WAN federation, peer clusters will not attempt to dial individual servers. Peer clusters will only dial the local mesh gateway addresses.	2022-10-03 12:42:27 -06:00
freddygv	a8c4d6bc55	Share mgw addrs in peering stream if needed This commit adds handling so that the replication stream considers whether the user intends to peer through mesh gateways. The subscription will return server or mesh gateway addresses depending on the mesh configuration setting. These watches can be updated at runtime by modifying the mesh config entry.	2022-10-03 11:42:20 -06:00
freddygv	4ff9d475b0	Return mesh gateway addrs if peering through mgw	2022-10-03 11:35:10 -06:00
chappie	ad7295e5d9	Merge pull request #14811 from hashicorp/chappie/dns Add DNS gRPC proxying support	2022-10-03 08:02:48 -07:00
Chris Chapman	d7b5351b66	Making suggested comments	2022-09-30 15:03:33 -07:00
Chris Chapman	46bea72212	Making suggested changes	2022-09-30 14:51:12 -07:00
Chris Chapman	a05563b788	Update comment	2022-09-30 09:35:01 -07:00
DanStough	7f8971d77f	chore: fix flakey scada provider test	2022-09-30 11:56:40 -04:00
Chris Chapman	81e267171b	Bind a dns mux handler to gRPC proxy	2022-09-29 21:44:45 -07:00
Chris Chapman	7bc9cad180	Adding grpc handler for dns proxy	2022-09-29 21:19:51 -07:00
Eric Haberkorn	80e51ff907	Add exported services event to cluster peering replication. (#14797 )	2022-09-29 15:37:19 -04:00
Ashwin Venkatesh	4ba260958c	bug: watch local mesh gateways in non-default partitions with agentless (#14799 )	2022-09-29 13:19:04 -04:00
cskh	69f40df548	feat(ingress gateway: support configuring limits in ingress-gateway c… (#14749 ) * feat(ingress gateway: support configuring limits in ingress-gateway config entry - a new Defaults field with max_connections, max_pending_connections, max_requests is added to ingress gateway config entry - new field max_connections, max_pending_connections, max_requests in individual services to overwrite the value in Default - added unit test and integration test - updated doc Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com>	2022-09-28 14:56:46 -04:00
malizz	84b0f408fa	Support Stale Queries for Trust Bundle Lookups (#14724 ) * initial commit * add tags, add conversations * add test for query options utility functions * update previous tests * fix test * don't error out on empty context * add changelog * update decode config	2022-09-28 09:56:59 -07:00
Eric Haberkorn	6570d5f004	Enable outbound peered requests to go through local mesh gateway (#14763 )	2022-09-27 09:49:28 -04:00
Nick Ethier	1c1b0994b8	add HCP integration component (#14723 ) * add HCP integration * lint: use non-deprecated logging interface	2022-09-26 14:58:15 -04:00
Derek Menteer	aa4709ab74	Add envoy connection balancing. (#14616 ) Add envoy connection balancing config.	2022-09-26 11:29:06 -05:00
Chris S. Kim	2203cdc4db	Add new internal endpoint to list exported services to a peer	2022-09-23 09:43:56 -04:00
freddygv	d818d7b096	Manage local server watches depending on mesh cfg Routing peering control plane traffic through mesh gateways can be enabled or disabled at runtime with the mesh config entry. This commit updates proxycfg to add or cancel watches for local servers depending on this central config. Note that WAN federation over mesh gateways is determined by a service metadata flag, and any updates to the gateway service registration will force the creation of a new snapshot. If enabled, WAN-fed over mesh gateways will trigger a local server watch on initialize(). Because of this we will only add/remove server watches if WAN federation over mesh gateways is disabled.	2022-09-22 19:32:10 -06:00
Alessandro De Blasis	461b42ed48	fix(check): added missing OSService props	2022-09-21 13:10:21 +01:00
Alessandro De Blasis	5719fd6560	fix(checks): os_service OK message in output	2022-09-21 09:27:33 +01:00
Alessandro De Blasis	f440966a38	fix(checks): os_service lifecycle bugfix	2022-09-21 09:26:47 +01:00
Alessandro De Blasis	fc0dd92dcf	fix(agent): uninitialized map panic error	2022-09-21 09:25:54 +01:00
malizz	1a0aa38a82	increase the size of txn to support vault (#14599 ) * increase the size of txn to support vault * add test, revert change to acl endpoint * add changelog * update test, add passing test case * Update .changelog/14599.txt Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-09-19 09:07:19 -07:00
freddygv	5fbb26525b	Add awareness of server mode to TLS configurator Previously the TLS configurator would default to presenting auto TLS certificates as client certificates. Server agents should not have this behavior and should instead present the manually configured certs. The autoTLS certs for servers are exclusively used for peering and should not be used as the default for outbound communication.	2022-09-16 17:57:10 -06:00
freddygv	f30bc96239	Test fixes - Pulls in CLI test fix from main - Updates psutils to fix TestAgent_Host on M1 Mac	2022-09-16 17:57:10 -06:00
freddygv	02d3ce1039	Add server certificate manager This certificate manager will request a leaf certificate for server agents and then keep them up to date.	2022-09-16 17:57:10 -06:00
freddygv	0e5131bd33	Generate ACL token for server management This commit introduces a new ACL token used for internal server management purposes. It has a few key properties: - It has unlimited permissions. - It is persisted through Raft as System Metadata rather than in the ACL tokens table. This is to avoid users seeing or modifying it. - It is re-generated on leadership establishment.	2022-09-16 17:54:34 -06:00
freddygv	0ea3353537	Add handling in agent cache for server leaf certs	2022-09-16 17:54:34 -06:00
Kyle Havlovitz	0d9ae52643	Merge pull request #14598 from hashicorp/root-removal-fix connect/ca: Don't discard old roots on primaryInitialize	2022-09-15 14:36:01 -07:00
Kyle Havlovitz	6105a7fd9f	connect/ca: don't discard old roots on primaryInitialize	2022-09-15 12:59:09 -07:00
Gabriel Santos	e53af28bd7	Middleware: `RequestRecorder` reports calls below 1ms as decimal value (#12905 ) * Typos * Test failing * Convert values <1ms to decimal * Fix test * Update docs and test error msg * Applied suggested changes to test case * Changelog file and suggested changes * Update .changelog/12905.txt Co-authored-by: Chris S. Kim <kisunji92@gmail.com> * suggested change - start duration with microseconds instead of nanoseconds * fix error * suggested change - floats Co-authored-by: alex <8968914+acpana@users.noreply.github.com> Co-authored-by: Chris S. Kim <kisunji92@gmail.com>	2022-09-15 13:04:37 -04:00
Daniel Graña	8c98172f53	[BUGFIX] Do not use interval as timeout (#14619 ) Do not use interval as timeout	2022-09-15 12:39:48 -04:00
Evan Culver	d0416f593c	connect: Bump latest Envoy to 1.23.1 in test matrix (#14573 )	2022-09-14 13:20:16 -07:00
DanStough	485e1b5d4e	fix(peering): generate token metrics only for leader	2022-09-14 11:37:30 -04:00
DanStough	2a2debee64	feat(peering): validate server name conflicts on establish	2022-09-14 11:37:30 -04:00
Kyle Havlovitz	60cee76746	Merge pull request #14516 from hashicorp/ca-ttl-fixes Fix inconsistent TTL behavior in CA providers	2022-09-13 16:07:36 -07:00
Kyle Havlovitz	d67bccd210	Update intermediate pki mount/role when reconfiguring Vault provider	2022-09-13 15:42:26 -07:00
Kyle Havlovitz	f46955101a	connect/ca: Clarify behavior around IntermediateCertTTL in CA config	2022-09-13 15:42:26 -07:00
DanStough	0150e88200	feat: add PeerThroughMeshGateways to mesh config	2022-09-13 17:19:54 -04:00
Derek Menteer	0aa13733a0	Add CSR check for number of URIs. (#14579 ) Add CSR check for number of URIs.	2022-09-13 14:21:47 -05:00
Derek Menteer	db83ff4fa6	Add input validation for auto-config JWT authorization checks.	2022-09-13 11:16:36 -05:00
cskh	f22685b969	Config-entry: Support proxy config in service-defaults (#14395 ) * Config-entry: Support proxy config in service-defaults * Update website/content/docs/connect/config-entries/service-defaults.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2022-09-12 10:41:58 -04:00
Eric Haberkorn	aa8268e50c	Implement Cluster Peering Redirects (#14445 ) implement cluster peering redirects	2022-09-09 13:58:28 -04:00
skpratt	b761589340	add non-double-prefixed metrics (#14193 )	2022-09-09 12:13:43 -05:00
skpratt	19f79aa9a6	PR #14057 follow up fix: service id parsing from sidecar id (#14541 ) * fix service id parsing from sidecar id * simplify suffix trimming	2022-09-09 09:47:10 -05:00
Dan Upton	1c2c975b0b	xDS Load Balancing (#14397 ) Prior to #13244, connect proxies and gateways could only be configured by an xDS session served by the local client agent. In an upcoming release, it will be possible to deploy a Consul service mesh without client agents. In this model, xDS sessions will be handled by the servers themselves, which necessitates load-balancing to prevent a single server from receiving a disproportionate amount of load and becoming overwhelmed. This introduces a simple form of load-balancing where Consul will attempt to achieve an even spread of load (xDS sessions) between all healthy servers. It does so by implementing a concurrent session limiter (limiter.SessionLimiter) and adjusting the limit according to autopilot state and proxy service registrations in the catalog. If a server is already over capacity (i.e. the session limit is lowered), Consul will begin draining sessions to rebalance the load. This will result in the client receiving a `RESOURCE_EXHAUSTED` status code. It is the client's responsibility to observe this response and reconnect to a different server. Users of the gRPC client connection brokered by the consul-server-connection-manager library will get this for free. The rate at which Consul will drain sessions to rebalance load is scaled dynamically based on the number of proxies in the catalog.	2022-09-09 15:02:01 +01:00
Derek Menteer	f7c884f0af	Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports	2022-09-08 14:53:08 -05:00
Derek Menteer	bfe7c5e8af	Remove rebuilding grpc server.	2022-09-08 13:45:44 -05:00
Derek Menteer	80d31458e5	Various cleanups.	2022-09-08 10:51:50 -05:00
Chris S. Kim	03df6c3ac6	Reuse http.DefaultTransport in UIMetricsProxy (#14521 ) http.Transport keeps a pool of connections and should be reused when possible. We instantiate a new http.DefaultTransport for every metrics request, making large numbers of concurrent requests inefficiently spin up new connections instead of reusing open ones.	2022-09-08 11:02:05 -04:00
Chris S. Kim	1c4a6eef4f	Merge pull request #14285 from hashicorp/NET-638-push-server-address-updates-to-the-peer peering: Subscribe to server address changes and push updates to peers	2022-09-07 09:30:45 -04:00
skpratt	3bf1edfb3f	move port and default check logic to locked step (#14057 )	2022-09-06 19:35:31 -05:00
Freddy	f4dfd42e0a	Add SpiffeID for Consul server agents (#14485 ) Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com> By adding a SpiffeID for server agents, servers can now request a leaf certificate from the Connect CA. This new Spiffe ID has a key property: servers are identified by their datacenter name and trust domain. All servers that share these attributes will share a ServerURI. The aim is to use these certificates to verify the server name of ANY server in a Consul datacenter.	2022-09-06 17:58:13 -06:00
Daniel Upton	8c46e48e0d	proxycfg-glue: server-local implementation of IntentionUpstreamsDestination This is the OSS portion of enterprise PR 2463. Generalises the serverIntentionUpstreams type to support matching on a service or destination.	2022-09-06 23:27:25 +01:00
Daniel Upton	f8dba7e9ac	proxycfg-glue: server-local implementation of InternalServiceDump This is the OSS portion of enterprise PR 2489. This PR introduces a server-local implementation of the proxycfg.InternalServiceDump interface that sources data from a blocking query against the server's state store. For simplicity, it only implements the subset of the Internal.ServiceDump RPC handler actually used by proxycfg - as such the result type has been changed to IndexedCheckServiceNodes to avoid confusion.	2022-09-06 23:27:25 +01:00
Daniel Upton	a31738f76f	proxycfg-glue: server-local implementation of ResolvedServiceConfig This is the OSS portion of enterprise PR 2460. Introduces a server-local implementation of the proxycfg.ResolvedServiceConfig interface that sources data from a blocking query against the server's state store. It moves the service config resolution logic into the agent/configentry package so that it can be used in both the RPC handler and data source. I've also done a little re-arranging and adding comments to call out data sources for which there is to be no server-local equivalent.	2022-09-06 23:27:25 +01:00
Derek Menteer	bf769daae4	Merge branch 'main' of github.com:hashicorp/consul into derekm/split-grpc-ports	2022-09-06 10:51:04 -05:00
Derek Menteer	02ae66bda8	Add kv txn get-not-exists operation.	2022-09-06 10:28:59 -05:00
Chris S. Kim	953808e899	PR feedback on terminated state checking	2022-09-06 10:28:20 -04:00
Chris S. Kim	ddb9375cb6	Add testcase for parsing grpc_port	2022-09-06 10:17:44 -04:00
Kyle Havlovitz	d97ccccdd5	Merge pull request #14429 from hashicorp/ca-prune-intermediates Prune old expired intermediate certs when appending a new one	2022-09-02 15:34:33 -07:00
cskh	0f7d4efac3	fix(txn api): missing proxy config in registering proxy service (#14471 ) * fix(txn api): missing proxy config in registering proxy service	2022-09-02 14:28:05 -04:00
Chris S. Kim	ec36755cc0	Properly assert for ServerAddresses replication request	2022-09-02 11:44:54 -04:00
Chris S. Kim	d1d9dbff8e	Fix terminate not returning early	2022-09-02 11:44:38 -04:00
Derek Menteer	f64771c707	Address PR comments.	2022-09-01 16:54:24 -05:00
Kyle Havlovitz	0c2fb7252d	Prune intermediates before appending new one	2022-09-01 14:24:30 -07:00
Luke Kysow	81d7cc41dc	Use proxy address for default check (#14433 ) When a sidecar proxy is registered, a check is automatically added. Previously, the address this check used was the underlying service's address instead of the proxy's address, even though the check is testing if the proxy is up. This worked in most cases because the proxy ran on the same IP as the underlying service but it's not guaranteed and so the proper default address should be the proxy's address.	2022-09-01 14:03:35 -07:00
malizz	f1054dada9	fix TestProxyConfigEntry (#14435 )	2022-09-01 11:37:47 -07:00
malizz	b3ac8f48ca	Add additional parameters to envoy passive health check config (#14238 ) * draft commit * add changelog, update test * remove extra param * fix test * update type to account for nil value * add test for custom passive health check * update comments and tests * update description in docs * fix missing commas	2022-09-01 09:59:11 -07:00
Chris S. Kim	f2b147e575	Add Internal.ServiceDump support for querying by PeerName	2022-09-01 10:32:59 -04:00
Chris S. Kim	e62f830fa8	Merge pull request #13998 from jorgemarey/f-new-tracing-envoy Add new envoy tracing configuration	2022-09-01 08:57:23 -04:00
Derek Menteer	cf7f24a6ec	Change serf-tag references to field references.	2022-08-31 16:38:42 -05:00
malizz	a80e0bcd00	validate args before deleting proxy defaults (#14290 ) * validate args before deleting proxy defaults * add changelog * validate name when normalizing proxy defaults * add test for proxyConfigEntry * add comments	2022-08-31 13:03:38 -07:00
Kyle Havlovitz	113454645d	Prune old expired intermediate certs when appending a new one	2022-08-31 11:41:58 -07:00
Alessandro De Blasis	60c7c831c6	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service	2022-08-30 18:49:20 +01:00
Eric Haberkorn	3726a0ab7a	Finish up cluster peering failover (#14396 )	2022-08-30 11:46:34 -04:00
Chris S. Kim	560d410c6d	Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer # Conflicts: # agent/grpc-external/services/peerstream/stream_test.go	2022-08-30 11:09:25 -04:00
Jorge Marey	3f3bb8831e	Fix typos. Add test. Add documentation	2022-08-30 16:59:02 +02:00
Jorge Marey	ed7b34128f	Add new tracing configuration	2022-08-30 16:59:02 +02:00
Freddy	97d1db759f	Merge pull request #13496 from maxb/fix-kv_entries-metric	2022-08-29 15:35:11 -06:00
Freddy	829a2a8722	Merge pull request #14364 from hashicorp/peering/term-delete	2022-08-29 15:33:18 -06:00
Max Bowsher	decc9231ee	Merge branch 'main' into fix-kv_entries-metric	2022-08-29 22:22:10 +01:00
Chris S. Kim	5010fa5c03	Merge pull request #14371 from hashicorp/kisunji/peering-metrics-update Adjust metrics reporting for peering tracker	2022-08-29 17:16:19 -04:00
Chris S. Kim	74ddf040dd	Add heartbeat timeout grace period when accounting for peering health	2022-08-29 16:32:26 -04:00
Derek Menteer	0ceec9017b	Expose `grpc_tls` via serf for cluster peering.	2022-08-29 13:43:49 -05:00
Derek Menteer	1255a8a20d	Add separate grpc_tls port. To ease the transition for users, the original gRPC port can still operate in a deprecated mode as either plain-text or TLS mode. This behavior should be removed in a future release whenever we no longer support this. The resulting behavior from this commit is: `ports.grpc > 0 && ports.grpc_tls > 0` spawns both plain-text and tls ports. `ports.grpc > 0 && grpc.tls == undefined` spawns a single plain-text port. `ports.grpc > 0 && grpc.tls != undefined` spawns a single tls port (backwards compat mode).	2022-08-29 13:43:43 -05:00
freddygv	310608fb19	Add validation to prevent switching dialing mode This prevents unexpected changes to the output of ShouldDial, which should never change unless a peering is deleted and recreated.	2022-08-29 12:31:13 -06:00
Eric Haberkorn	72f90754ae	Update max_ejection_percent on outlier detection for peered clusters to 100% (#14373 ) We can't trust health checks on peered services when service resolvers, splitters and routers are used.	2022-08-29 13:46:41 -04:00
Alessandro De Blasis	26cc56bc68	fix(agent): removed redundant code in docker check as well	2022-08-29 18:15:59 +01:00
Alessandro De Blasis	c0d647d11e	fix(agent): removed redundant check on prev. running check	2022-08-29 17:53:39 +01:00
Chris S. Kim	def529edd3	Rename test	2022-08-29 10:34:50 -04:00
Chris S. Kim	93271f649c	Fix test	2022-08-29 10:20:30 -04:00
Eric Haberkorn	1099665473	Update the structs and discovery chain for service resolver redirects to cluster peers. (#14366 )	2022-08-29 09:51:32 -04:00
Alessandro De Blasis	f3437eaf05	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2022-08-28 18:09:31 +01:00
Alessandro De Blasis	f634e36811	fix(OSServiceCheck): fixes following code-review	2022-08-28 17:56:30 +01:00
Chris S. Kim	4d97e2f936	Adjust metrics reporting for peering tracker	2022-08-26 17:34:17 -04:00
freddygv	650e48624d	Allow terminated peerings to be deleted Peerings are terminated when a peer decides to delete the peering from their end. Deleting a peering sends a termination message to the peer and triggers them to mark the peering as terminated but does NOT delete the peering itself. This is to prevent peerings from disappearing from both sides just because one side deleted them. Previously the Delete endpoint was skipping the deletion if the peering was not marked as active. However, terminated peerings are also inactive. This PR makes some updates so that peerings marked as terminated can be deleted by users.	2022-08-26 10:52:47 -06:00
Chris S. Kim	937a8ec742	Fix casing	2022-08-26 11:56:26 -04:00
Chris S. Kim	87962b9713	Merge branch 'main' into catalog-service-list-filter	2022-08-26 11:16:06 -04:00
Chris S. Kim	e2fe8b8d65	Fix tests for enterprise	2022-08-26 11:14:02 -04:00
Chris S. Kim	1c43a1a7b4	Merge branch 'main' into NET-638-push-server-address-updates-to-the-peer # Conflicts: # agent/grpc-external/services/peerstream/stream_test.go	2022-08-26 10:43:56 -04:00
Chris S. Kim	6ddcc04613	Replace ring buffer with async version (#14314 ) We need to watch for changes to peerings and update the server addresses which get served by the ring buffer. Also, if there is an active connection for a peer, we are getting up-to-date server addresses from the replication stream and can safely ignore the token's addresses which may be stale.	2022-08-26 10:27:13 -04:00
alex	30ff2e9a35	peering: add peer health metric (#14004 ) Signed-off-by: acpana <8968914+acpana@users.noreply.github.com>	2022-08-25 16:32:59 -07:00
Chris S. Kim	181063cd23	Exit loop when context is cancelled	2022-08-25 11:48:25 -04:00
cskh	41aea65214	Fix: the inboundconnection limit filter should be placed in front of http co… (#14325 ) * fix: the inboundconnection limit should be placed in front of http connection manager Co-authored-by: Freddy <freddygv@users.noreply.github.com>	2022-08-24 14:13:10 -04:00
Chris S. Kim	8c94d1a80c	Update test comment	2022-08-24 13:50:24 -04:00
Chris S. Kim	5f2959329f	Add check for zero-length server addresses	2022-08-24 13:30:52 -04:00
skpratt	919da33331	no-op: refactor usagemetrics tests for clarity and DRY cases (#14313 )	2022-08-24 12:00:09 -05:00
Pablo Ruiz García	1f293e5244	Added new auto_encrypt.grpc_server_tls config option to control AutoTLS enabling of GRPC Server's TLS usage Fix for #14253 Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2022-08-24 12:31:38 -04:00
Dan Upton	3b993f2da7	dataplane: update envoy bootstrap params for consul-dataplane (#14017 ) Contains 2 changes to the GetEnvoyBootstrapParams response to support consul-dataplane. Exposing node_name and node_id: consul-dataplane will support providing either the node_id or node_name in its configuration. Unfortunately, supporting both in the xDS meta adds a fair amount of complexity (partly because most tables are currently indexed on node_name) so for now we're going to return them both from the bootstrap params endpoint, allowing consul-dataplane to exchange a node_id for a node_name (which it will supply in the xDS meta). Properly setting service for gateways: To avoid the need to special case gateways in consul-dataplane, service will now either be the destination service name for connect proxies, or the gateway service name. This means it can be used as-is in Envoy configuration (i.e. as a cluster name or in metric tags).	2022-08-24 12:03:15 +01:00
Daniel Upton	13c04a13af	proxycfg: terminate stream on irrecoverable errors This is the OSS portion of enterprise PR 2339. It improves our handling of "irrecoverable" errors in proxycfg data sources. The canonical example of this is what happens when the ACL token presented by Envoy is deleted/revoked. Previously, the stream would get "stuck" until the xDS server re-checked the token (after 5 minutes) and terminated the stream. Materializers would also sit burning resources retrying something that could never succeed. Now, it is possible for data sources to mark errors as "terminal" which causes the xDS stream to be closed immediately. Similarly, the submatview.Store will evict materializers when it observes they have encountered such an error.	2022-08-23 20:17:49 +01:00
Chris S. Kim	81e965479b	PR feedback to specify Node name in test mock	2022-08-23 11:51:04 -04:00
Eric Haberkorn	58901ad7df	Cluster peering failover disco chain changes (#14296 )	2022-08-23 09:13:43 -04:00
Chris S. Kim	cdc8b0634d	Fix flakes	2022-08-22 14:45:31 -04:00
Chris S. Kim	03e92826aa	Increase heartbeat rate to reduce test flakes	2022-08-22 14:24:05 -04:00
Chris S. Kim	06ba9775ee	Remove check for ResponseNonce	2022-08-22 13:55:01 -04:00
Chris S. Kim	547fb9570e	Add missing mock assertions	2022-08-22 13:55:01 -04:00
Chris S. Kim	adff2eef16	Fix data race newMockSnapshotHandler has an assertion on t.Cleanup which gets called before the event publisher is cancelled. This commit reorders the context.WithCancel so it properly gets cancelled before the assertion is made.	2022-08-22 13:55:01 -04:00
cskh	060531a29a	Fix: add missing ent meta for test (#14289 )	2022-08-22 13:51:04 -04:00
Chris S. Kim	4e40e1d222	Handle server addresses update as client	2022-08-22 13:42:12 -04:00
Chris S. Kim	584d3409c4	Send server addresses on update from server	2022-08-22 13:41:44 -04:00
Chris S. Kim	c9d8ad3939	Add new subscription for server addresses	2022-08-22 13:40:25 -04:00
Chris S. Kim	028b87d51f	Cleanup unused logger	2022-08-22 13:40:23 -04:00
Chris S. Kim	df951bd601	Expose external gRPC port in autopilot The grpc_port was added to a NodeService's meta in `ea58f235f5`	2022-08-22 10:07:00 -04:00
cskh	527ebd068a	fix: missing MaxInboundConnections field in service-defaults config entry (#14072 ) * fix: missing max_inbound_connections field in merge config	2022-08-19 14:11:21 -04:00
cskh	e84e4b8868	Fix: upgrade pkg imdario/merg to prevent merge config panic (#14237 ) * upgrade imdario/merg to prevent merge config panic * test: service definition takes precedence over service-defaults in merged results	2022-08-17 21:14:04 -04:00
James Hartig	f92883bbce	Use the maximum jitter when calculating the timeout The timeout should include the maximum possible jitter since the server will randomly add a jitter to its timeout. If the server's timeout is less than the client's timeout then the client will return an i/o deadline reached error. Before: ``` time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644' rpc error making call: i/o deadline reached real 10m11.469s user 0m0.018s sys 0m0.023s ``` After: ``` time curl 'http://localhost:8500/v1/catalog/service/service?dc=other-dc&stale=&wait=600s&index=15820644' [...] real 10m35.835s user 0m0.021s sys 0m0.021s ```	2022-08-17 10:24:09 -04:00
Eric Haberkorn	1a73b0ca20	Add `Targets` field to service resolver failovers. (#14162 ) This field will be used for cluster peering failover.	2022-08-15 09:20:25 -04:00
Alessandro De Blasis	5dee555888	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2022-08-15 08:26:55 +01:00
Alessandro De Blasis	ab611eabc3	Merge remote-tracking branch 'hashicorp/main' into feature/health-checks_windows_service Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2022-08-15 08:09:56 +01:00
cskh	d46b515b64	fix: missing segment and partition (#14194 )	2022-08-12 15:21:39 -04:00
Eric Haberkorn	ebd5513d4b	Refactor failover code to use Envoy's aggregate clusters (#14178 )	2022-08-12 14:30:46 -04:00
cskh	81931e52c3	feat(telemetry): add labels to serf and memberlist metrics (#14161 ) * feat(telemetry): add labels to serf and memberlist metrics * changelog * doc update Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-08-11 22:09:56 -04:00
Chris S. Kim	4c928cb2f7	Handle breaking change for ServiceVirtualIP restore (#14149 ) Consul 1.13.0 changed ServiceVirtualIP to use PeeredServiceName instead of ServiceName which was a breaking change for those using service mesh and wanted to restore their snapshot after upgrading to 1.13.0. This commit handles existing data with older ServiceName and converts it during restore so that there are no issues when restoring from older snapshots.	2022-08-11 14:47:10 -04:00
Chris S. Kim	3926009405	Add test to verify forwarding	2022-08-11 11:16:02 -04:00
Chris S. Kim	1ef22360c3	Register peerStreamServer internally to enable RPC forwarding	2022-08-11 11:16:02 -04:00
Chris S. Kim	de73171202	Handle wrapped errors in isFailedPreconditionErr	2022-08-11 11:16:02 -04:00
Daniel Kimsey	3c4fa9b468	Add support for filtering the 'List Services' API 1. Create a bexpr filter for performing the filtering 2. Change the state store functions to return the raw (not aggregated) list of ServiceNodes. 3. Move the aggregate service tags by name logic out of the state store functions into a new function called from the RPC endpoint 4. Perform the filtering in the endpoint before aggregation.	2022-08-10 16:52:32 -05:00
cskh	11e7a0d547	fix: shadowed err in retryJoin() (#14112 ) - err value will be used later to surface the error message if r.join() returns any err.	2022-08-10 10:53:57 -04:00
skpratt	79c23a7cd2	Merge pull request #14056 from hashicorp/proxy-register-port-race Refactor sidecar_service method to separate port assignment	2022-08-10 09:46:29 -05:00
skpratt	aa77559819	Merge branch 'main' into proxy-register-port-race	2022-08-10 08:40:45 -05:00
Chris S. Kim	e3046120b3	Close active listeners on error If startListeners successfully created listeners for some of its input addresses but eventually failed, the function would return an error and existing listeners would not be cleaned up.	2022-08-09 12:22:39 -04:00
Chris S. Kim	6311c651de	Add retry in TestAgentConnectCALeafCert_good	2022-08-09 11:20:37 -04:00
Kyle Havlovitz	6938b8c755	Merge pull request #13958 from hashicorp/gateway-wildcard-fix Fix wildcard picking up services it shouldn't for ingress/terminating gateways	2022-08-08 12:54:40 -07:00
Kyle Havlovitz	fe1fcea34f	Add some extra handling for destination deletes	2022-08-08 11:38:13 -07:00
freddygv	d421e18172	Update snapshot test	2022-08-08 09:17:15 -06:00
freddygv	1031ffc3c7	Re-validate existing secrets at state store Previously establishment and pending secrets were only checked at the RPC layer. However, given that these are Check-and-Set transactions we should ensure that the given secrets are still valid when persisting a secret exchange or promotion. Otherwise it would be possible for concurrent requests to overwrite each other.	2022-08-08 09:06:07 -06:00
freddygv	0ea4bfae94	Test fixes	2022-08-08 08:31:47 -06:00
freddygv	c04515a844	Use proto message for each secrets write op Previously there was a field indicating the operation that triggered a secrets write. Now there is a message for each operation and it contains the secret ID being persisted.	2022-08-08 01:41:00 -06:00

4843 Commits