consul

Commit Graph

Author	SHA1	Message	Date
Hans Hasselberg	9cb7adb304	add envoy version 1.12.2 and 1.13.0 to the matrix (#7240 ) * add 1.12.2 * add envoy 1.13.0 * Introduce -envoy-version to get 1.10.0 passing. * update old version and fix consul-exec case * add envoy_version and fix check * Update Envoy CLI tests to account for the 1.13 compatibility changes. Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>	2020-02-10 14:53:04 -05:00
Paschalis Tsilias	a335aa57c5	Expose Envoy's /stats for statsd agents (#7173 ) * Expose Envoy /stats for statsd agents; Add testcases * Remove merge conflict leftover * Add support for prefix instead of path; Fix docstring to mirror these changes * Add new config field to docs; Add testcases to check that /stats/prometheus is exposed as well * Parametrize matchType (prefix or path) and value * Update website/source/docs/connect/proxies/envoy.md Co-Authored-By: Paul Banks <banks@banksco.de> Co-authored-by: Paul Banks <banks@banksco.de>	2020-02-03 17:19:34 +00:00
Matt Keeler	bfc03ec587	Fix a couple bugs regarding intentions with namespaces (#7169 )	2020-01-29 17:30:38 -05:00
Matt Keeler	c09693e545	Updates to Config Entries and Connect for Namespaces (#7116 )	2020-01-24 10:04:58 -05:00
Chris Piraino	f3b54fa535	Allow configuration of upstream connection limits in Envoy (#6829 ) * Adds 'limits' field to the upstream configuration of a connect proxy This allows a user to configure the envoy connect proxy with 'max_connections', 'max_queued_requests', and 'max_concurrent_requests'. These values are defined in the local proxy on a per-service instance basis and should thus NOT be thought of as a global-level or even service-level value.	2019-12-03 14:13:33 -06:00
R.B. Boyer	2011f3d7dc	xds: mesh gateway CDS requests are now allowed to receive an empty CDS reply (#6787 ) This is the rest of the fix for #6543 that was incompletely fixed in #6576.	2019-11-26 15:55:13 -06:00
Paul Banks	87699eca2f	Fix support for RSA CA keys in Connect. (#6638 ) * Allow RSA CA certs for consul and vault providers to correctly sign EC leaf certs. * Ensure key type and bits are populated from CA cert and clean up tests * Add integration test and fix error when initializing secondary CA with RSA key. * Add more tests, fix review feedback * Update docs with key type config and output * Apply suggestions from code review Co-Authored-By: R.B. Boyer <rb@hashicorp.com>	2019-11-01 13:20:26 +00:00
R.B. Boyer	8dcba472a2	xds: tcp services using the discovery chain should not assume RDS during LDS (#6623 ) Previously the logic for configuring RDS during LDS for L7 upstreams was overapplied to TCP proxies resulting in a cluster name of <emptystring> being used incorrectly. Fixes #6621	2019-10-17 16:44:59 -05:00
Freddy	fdd10dd8b8	Expose HTTP-based paths through Connect proxy (#6446 ) Fixes: #5396 This PR adds a proxy configuration stanza called expose. These flags register listeners in Connect sidecar proxies to allow requests to specific HTTP paths from outside of the node. This allows services to protect themselves by only listening on the loopback interface, while still accepting traffic from non Connect-enabled services. Under expose there is a boolean checks flag that would automatically expose all registered HTTP and gRPC check paths. This stanza also accepts a paths list to expose individual paths. The primary use case for this functionality would be to expose paths for third parties like Prometheus or the kubelet. Listeners for requests to exposed paths are configured dynamically at run time. Any time a proxy, or check can be registered, a listener can also be created. In this initial implementation requests to these paths are not authenticated/encrypted.	2019-09-25 20:55:52 -06:00
R.B. Boyer	2cd5a7e542	tests: make envoy integration tests more tolerant of internal retries that may inflate counters (#6539 ) This should remove false positives that look like: cluster.s2.default.primary.*cx_total - expected count: 2, actual count: 3	2019-09-25 09:08:42 -05:00
Pierre Souchay	2f37d68d9b	[BUGFIX][BUILD] When test fail in circle-ci in main, have a proper error message (#6416 ) Since FUNCNAME is not defined when running outside a function, trap does not work and display wrong error message. Example from https://circleci.com/gh/hashicorp/consul/69506 : ``` ⨯ FAIL /home/circleci/project/test/integration/connect/envoy/run-tests.sh: line 1: FUNCNAME[0]: unbound variable make: *** [GNUmakefile:363: test-envoy-integ] Error 1 ``` This fix will avoid this error message and display the real cause.	2019-08-28 10:26:05 -04:00
Matt Keeler	9a5b258edf	Turned on Envoy 1.11.1 integration tests (#6347 ) I also ran this against 1.5.2 so the docs update claiming compatibility should still be accurate.	2019-08-20 10:20:13 -04:00
R.B. Boyer	72207256b9	xds: improve how envoy metrics are emitted (#6312 ) Since generated envoy clusters all are named using (mostly) SNI syntax we can have envoy read the various fields out of that structure and emit it as stats labels to the various telemetry backends. I changed the delimiter for the 'customization hash' from ':' to '~' because ':' is always reencoded by envoy as '_' when generating metrics keys.	2019-08-16 09:30:17 -05:00
R.B. Boyer	8e22d80e35	connect: fix failover through a mesh gateway to a remote datacenter (#6259 ) Failover is pushed entirely down to the data plane by creating envoy clusters and putting each successive destination in a different load assignment priority band. For example this shows that normally requests go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080: - name: foo load_assignment: cluster_name: foo policy: overprovisioning_factor: 100000 endpoints: - priority: 0 lb_endpoints: - endpoint: address: socket_address: address: 1.2.3.4 port_value: 8080 - priority: 1 lb_endpoints: - endpoint: address: socket_address: address: 6.7.8.9 port_value: 8080 Mesh gateways route requests based solely on the SNI header tacked onto the TLS layer. Envoy currently only lets you configure the outbound SNI header at the cluster layer. If you try to failover through a mesh gateway you ideally would configure the SNI value per endpoint, but that's not possible in envoy today. This PR introduces a simpler way around the problem for now: 1. We identify any target of failover that will use mesh gateway mode local or remote and then further isolate any resolver node in the compiled discovery chain that has a failover destination set to one of those targets. 2. For each of these resolvers we will perform a small measurement of comparative healths of the endpoints that come back from the health API for the set of primary target and serial failover targets. We walk the list of targets in order and if any endpoint is healthy we return that target, otherwise we move on to the next target. 3. The CDS and EDS endpoints both perform the measurements in (2) for the affected resolver nodes. 4. For CDS this measurement selects which TLS SNI field to use for the cluster (note the cluster is always going to be named for the primary target) 5. For EDS this measurement selects which set of endpoints will populate the cluster. Priority tiered failover is ignored. 
One of the big downsides to this approach to failover is that the failover detection and correction is going to be controlled by consul rather than deferring that entirely to the data plane as with the prior version. This also means that we are bound to only failover using official health signals and cannot make use of data plane signals like outlier detection to affect failover. In this specific scenario the lack of data plane signals is ok because the effectiveness is already muted by the fact that the ultimate destination endpoints will have their data plane signals scrambled when they pass through the mesh gateway wrapper anyway so we're not losing much. Another related fix is that we now use the endpoint health from the underlying service, not the health of the gateway (regardless of failover mode).	2019-08-05 13:30:35 -05:00
R.B. Boyer	6393edba53	connect: reconcile how upstream configuration works with discovery chains (#6225 ) * connect: reconcile how upstream configuration works with discovery chains The following upstream config fields for connect sidecars sanely integrate into discovery chain resolution: - Destination Namespace/Datacenter: Compilation occurs locally but using different default values for namespaces and datacenters. The xDS clusters that are created are named as they normally would be. - Mesh Gateway Mode (single upstream): If set this value overrides any value computed for any resolver for the entire discovery chain. The xDS clusters that are created may be named differently (see below). - Mesh Gateway Mode (whole sidecar): If set this value overrides any value computed for any resolver for the entire discovery chain. If this is specifically overridden for a single upstream this value is ignored in that case. The xDS clusters that are created may be named differently (see below). - Protocol (in opaque config): If set this value overrides the value computed when evaluating the entire discovery chain. If the normal chain would be TCP or if this override is set to TCP then the result is that we explicitly disable L7 Routing and Splitting. The xDS clusters that are created may be named differently (see below). - Connect Timeout (in opaque config): If set this value overrides the value for any resolver in the entire discovery chain. The xDS clusters that are created may be named differently (see below). If any of the above overrides affect the actual result of compiling the discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op override to "tcp") then the relevant parameters are hashed and provided to the xDS layer as a prefix for use in naming the Clusters. 
This is to ensure that if one Upstream discovery chain has no overrides and tangentially needs a cluster named "api.default.XXX", and another Upstream does have overrides for "api.default.XXX" that they won't cross-pollinate against the operator's wishes. Fixes #6159	2019-08-01 22:03:34 -05:00
Matt Keeler	3053342198	Envoy Mesh Gateway integration tests (#6187 ) * Allow setting the mesh gateway mode for an upstream in config files * Add envoy integration test for mesh gateways This necessitated many supporting changes in most of the other test cases. Add remote mode mesh gateways integration test	2019-07-24 17:01:42 -04:00
R.B. Boyer	e039dfd7f8	connect: rework how the service resolver subset OnlyPassing flag works (#6173 ) The main change is that we no longer filter service instances by health, preferring instead to render all results down into EDS endpoints in envoy and merely label the endpoints as HEALTHY or UNHEALTHY. When OnlyPassing is set to true we will force consul checks in a 'warning' state to render as UNHEALTHY in envoy. Fixes #6171	2019-07-23 20:20:24 -05:00
R.B. Boyer	aca2c5de3f	tests: adding new envoy integration tests for L7 service-resolvers (#6129 ) Additionally: - wait for bootstrap config entries to be applied - run the verify container in the host's PID namespace so we can kill envoys without mounting the docker socket * assert that we actually send HEALTHY and UNHEALTHY endpoints down in EDS during failover	2019-07-23 20:08:36 -05:00
R.B. Boyer	4a9f4b97e6	tests: when running envoy integration tests try to limit container bleedover between cases (#6148 )	2019-07-17 09:20:10 -05:00
R.B. Boyer	8a90185bbd	unknown fields now fail, so omit these unimplemented fields (#6125 )	2019-07-12 14:04:15 -05:00
R.B. Boyer	9138a97054	Fix bug in service-resolver redirects if the destination uses a default resolver. (#6122 ) Also: - add back an internal http endpoint to dump a compiled discovery chain for debugging purposes Before the CompiledDiscoveryChain.IsDefault() method would test: - is this chain just one resolver step? - is that resolver step just the default? But what I forgot to test: - is that resolver step for the same service that the chain represents? This last point is important because if you configured just one config entry: kind = "service-resolver" name = "web" redirect { service = "other" } and requested the chain for "web" you'd get back a default resolver for "other". In the xDS code the IsDefault() method is used to determine if this chain is "empty". If it is then we use the pre-discovery-chain logic that just uses data embedded in the Upstream object (and still lets the escape hatches function). In the example above that means certain parts of the xDS code were going to try referencing a cluster named "web..." despite the other parts of the xDS code maintaining clusters named "other...".	2019-07-12 12:21:25 -05:00
R.B. Boyer	911ed76e5b	tests: further reduce envoy integration test flakiness (#6112 ) In addition to waiting until s2 shows up healthy in the Catalog, wait until s2 endpoints show up healthy via EDS in the s1 upstream clusters.	2019-07-12 11:12:56 -05:00
R.B. Boyer	d4e58e9773	test: for envoy integration tests bump the time to wait for the upstream to be healthy (#6109 )	2019-07-10 18:07:47 -04:00
R.B. Boyer	20caa4f744	test: for envoy integration tests, wait until 's2' is healthy in consul before interrogating envoy (#6108 ) When the envoy healthy panic threshold was explicitly disabled as part of L7 traffic management it changed how envoy decided to load balance to endpoints in a cluster. This only matters when envoy is in "panic mode" aka "when you have a bunch of unhealthy endpoints". Panic mode sends traffic to unhealthy instances in certain circumstances. Note: Prior to explicitly disabling the healthy panic threshold, the default value is 50%. What was happening is that the test harness was bringing up consul, the sidecars, and the service instances all at once and sometimes the proxies wouldn't have time to be checked by consul to be labeled as 'passing' in the catalog before a round of EDS happened. The xDS server in consul effectively queries /v1/health/connect/s2 and gets 1 result, but that one result has a 'critical' check so the xDS server sends back that endpoint labeled as UNHEALTHY. Envoy sees that 100% of the endpoints in the cluster are unhealthy and would enter panic mode and still send traffic to s2. This is why the test suites PRIOR to disabling the healthy panic threshold worked. They were _incorrectly_ passing. When the healthy panic threshold is disabled, envoy never enters panic mode in this situation and thus the cluster has zero healthy endpoints so load balancing goes nowhere and the tests fail. Why does this only affect the test suites for envoy 1.8.0? My guess is that https://github.com/envoyproxy/envoy/pull/4442 was merged into the 1.9.x series and somehow that plays a role. This PR modifies the bats scripts to explicitly wait until the upstream sidecar is healthy as measured by /v1/health/connect/s2?passing BEFORE trying to interrogate envoy which should make the tests less racy.	2019-07-10 15:58:25 -05:00
Jack Pearkes	e6f1b78efb	Make cluster names SNI always (#6081 ) * Make cluster names SNI always * Update some tests * Ensure we check for prepared query types * Use sni for route cluster names * Proper mesh gateway mode defaulting when the discovery chain is used * Ignore service splits from PatchSliceOfMaps * Update some xds golden files for proper test output * Allow for grpc/http listeners/cluster configs with the disco chain * Update stats expectation	2019-07-08 12:48:48 +01:00
R.B. Boyer	4bdb690a25	activate most discovery chain features in xDS for envoy (#6024 )	2019-07-01 22:10:51 -05:00
Hans Hasselberg	33a7df3330	tls: auto_encrypt enables automatic RPC cert provisioning for consul clients (#5597 )	2019-06-27 22:22:07 +02:00
Paul Banks	9f656a2dc8	Fix envoy 1.10 exec (#5964 ) * Make exec test assert Envoy version - it was not rebuilding before and so often ran against wrong version. This makes 1.10 fail consistently. * Switch Envoy exec to use a named pipe rather than FD magic since Envoy 1.10 doesn't support that. * Refactor to use an internal shim command for piping the bootstrap through. * Fmt. So sad that vscode golang fails so often these days. * go mod tidy * revert go mod tidy changes * Revert "ignore consul-exec tests until fixed (#5986)" This reverts commit `683262a686`. * Review cleanups	2019-06-21 16:06:25 +01:00
Alvin Huang	683262a686	ignore consul-exec tests until fixed (#5986 )	2019-06-18 15:45:32 -04:00
Paul Banks	ffcfdf29fc	Upgrade xDS (go-control-plane) API to support Envoy 1.10. (#5872 ) * Upgrade xDS (go-control-plane) API to support Envoy 1.10. This includes backwards compatibility shim to work around the ext_authz package rename in 1.10. It also adds integration test support in CI for 1.10.0. * Fix go vet complaints * go mod vendor * Update Envoy version info in docs * Update website/source/docs/connect/proxies/envoy.md	2019-06-07 07:10:43 -05:00
Paul Banks	2d47b28722	Envoy integration test improvements (#5797 ) * Grab consul logs on integration test failures too and don't remove .gitignore * Don't wipe logs so we have some artifacts to upload at the end	2019-05-21 14:17:41 +01:00
Alvin Huang	19f44cd1cc	remove container after docker run exits (#5798 )	2019-05-07 10:13:07 -04:00
Paul Banks	446abd7aa2	Make central conf test work when run in a suite. (#5767 ) * Make central conf test work when run in a suite. This switches integration tests to hard restart Consul each time which causes less surprise when some tests need to set configs that don't work on consul reload. This also increases the isolation and repeatability of the tests by dropping Consul's state entirely for each case run. * Remove aborted attempt to make restart optional.	2019-05-02 12:53:06 +01:00
Paul Banks	0cfb6051ea	Add integration test for central config; fix central config WIP (#5752 ) * Add integration test for central config; fix central config WIP * Add integration test for central config; fix central config WIP * Set proxy protocol correctly and begin adding upstream support * Add upstreams to service config cache key and start new notify watcher if they change. This doesn't update the tests to pass though. * Fix some merging logic get things working manually with a hack (TODO fix properly) * Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway * Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does * Fix up service manager and API test failures * Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally * Remove version.go hack - will make integration test fail until release * Remove unused code from commands and upstream merge * Re-bump version to 1.5.0	2019-05-01 16:39:31 -07:00
Paul Banks	421ecd32fc	Connect: allow configuring Envoy for L7 Observability (#5558 ) * Add support for HTTP proxy listeners * Add customizable bootstrap configuration options * Debug logging for xDS AuthZ * Add Envoy Integration test suite with basic test coverage * Add envoy command tests to cover new cases * Add tracing integration test * Add gRPC support WIP * Merged changes from master Docker. get CI integration to work with same Dockerfile now * Make docker build optional for integration * Enable integration tests again! * http2 and grpc integration tests and fixes * Fix up command config tests * Store all container logs as artifacts in circle on fail * Add retries to outer part of stats measurements as we keep missing them in CI * Only dump logs on failing cases * Fix typos from code review * Review tidying and make tests pass again * Add debug logs to exec test. * Fix legit test failure caused by upstream rename in envoy config * Attempt to reduce cases of bad TLS handshake in CI integration tests * bring up the right service * Add prometheus integration test * Add test for denied AuthZ both HTTP and TCP * Try ANSI term for Circle	2019-04-29 17:27:57 +01:00
Hans Hasselberg	0fc1c203cc	snapshot: read meta.json correctly. (#5193 ) * snapshot: read meta.json correctly. Fixes #4452.	2019-01-08 17:06:28 +01:00
Paul Banks	bbde408b09	Update test certificates that expire this year to be way in the future	2018-05-12 10:15:45 +01:00
James Phillips	93f68555d0	Adds enable_agent_tls_for_checks configuration option which allows (#3661 ) HTTP health checks for services requiring 2-way TLS to be checked using the agent's credentials.	2017-11-07 18:22:09 -08:00
Frank Schroeder	c94751ad43	test: replace porter tool with freeport lib This patch removes the porter tool which hands out free ports from a given range with a library which does the same thing. The challenge for acquiring free ports in concurrent go test runs is that go packages are tested concurrently and run in separate processes. There has to be some inter-process synchronization in preventing processes allocating the same ports. freeport allocates blocks of ports from a range expected to be not in heavy use and implements a system-wide mutex by binding to the first port of that block for the lifetime of the application. Ports are then provided sequentially from that block and are tested on localhost before being returned as available.	2017-10-21 22:01:09 +02:00
Alex Dadgar	c7a65b4587	Testutil falls back to random ports w/o porter (#3604 ) * Testutil falls back to random ports w/o porter This PR allows the testutil server to be used without porter. * Adds sterner-sounding fallback comments.	2017-10-20 16:46:13 -07:00
Frank Schroeder	86901b887d	porter: add better warning if missing	2017-10-18 09:58:58 +02:00
James Phillips	d0e099e54c	Makes porter take over if an existing instance died.	2017-09-26 16:25:18 -07:00
James Phillips	e19f547529	Makes porter more conservative by trying to connect to ports before handing them out.	2017-09-25 17:42:53 -07:00
Frank Schröder	12216583a1	New config parser, HCL support, multiple bind addrs (#3480 ) * new config parser for agent This patch implements a new config parser for the consul agent which makes the following changes to the previous implementation: * add HCL support * all configuration fragments in tests and for default config are expressed as HCL fragments * HCL fragments can be provided on the command line so that they can eventually replace the command line flags. * HCL/JSON fragments are parsed into a temporary Config structure which can be merged using reflection (all values are pointers). The existing merge logic of overwrite for values and append for slices has been preserved. * A single builder process generates a typed runtime configuration for the agent. The new implementation is more strict and fails in the builder process if no valid runtime configuration can be generated. Therefore, additional validations in other parts of the code should be removed. The builder also pre-computes all required network addresses so that no address/port magic should be required where the configuration is used and should therefore be removed. * Upgrade github.com/hashicorp/hcl to support int64 * improve error messages * fix directory permission test * Fix rtt test * Fix ForceLeave test * Skip performance test for now until we know what to do * Update github.com/hashicorp/memberlist to update log prefix * Make memberlist use the default logger * improve config error handling * do not fail on non-existing data-dir * experiment with non-uniform timeouts to get a handle on stalled leader elections * Run tests for packages separately to eliminate the spurious port conflicts * refactor private address detection and unify approach for ipv4 and ipv6. 
Fixes #2825 * do not allow unix sockets for DNS * improve bind and advertise addr error handling * go through builder using test coverage * minimal update to the docs * more coverage tests fixed * more tests * fix makefile * cleanup * fix port conflicts with external port server 'porter' * stop test server on error * do not run api test that change global ENV concurrently with the other tests * Run remaining api tests concurrently * no need for retry with the port number service * monkey patch race condition in go-sockaddr until we understand why that fails * monkey patch hcl decoder race condition until we understand why that fails * monkey patch spurious errors in strings.EqualFold from here * add test for hcl decoder race condition. Run with go test -parallel 128 * Increase timeout again * cleanup * don't log port allocations by default * use base command arg parsing to format help output properly * handle -dc deprecation case in Build * switch autopilot.max_trailing_logs to int * remove duplicate test case * remove unused methods * remove comments about flag/config value inconsistencies * switch got and want around since the error message was misleading. * Removes a stray debug log. * Removes a stray newline in imports. * Fixes TestACL_Version8. * Runs go fmt. * Adds a default case for unknown address types. * Reorders and reformats some imports. * Adds some comments and fixes typos. * Reorders imports. 
* add unix socket support for dns later * drop all deprecated flags and arguments * fix wrong field name * remove stray node-id file * drop unnecessary patch section in test * drop duplicate test * add test for LeaveOnTerm and SkipLeaveOnInt in client mode * drop "bla" and add clarifying comment for the test * split up tests to support enterprise/non-enterprise tests * drop raft multiplier and derive values during build phase * sanitize runtime config reflectively and add test * detect invalid config fields * fix tests with invalid config fields * use different values for wan sanitization test * drop recursor in favor of recursors * allow dns_config.udp_answer_limit to be zero * make sure tests run on machines with multiple ips * Fix failing tests in a few more places by providing a bind address in the test * Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder. * Add porter to server_test.go to make tests there less flaky * go fmt	2017-09-25 11:40:42 -07:00
Frank Schroeder	01bc7dd3c4	test: log exit code in cluster.bash	2017-06-08 14:06:10 +02:00
Frank Schroeder	bab941e76f	test: add script for starting a multi-node cluster	2017-06-07 13:08:19 +02:00
James Phillips	6f7ff554d0	Updates unit test certs for another year.	2017-06-05 19:22:20 -07:00
James Phillips	dd85930b6d	Updates expired test certs and includes a script to generate new certs.	2017-05-12 09:28:21 +02:00
Kyle Havlovitz	ae6bf56ee1	Add tls client options to api/cli	2017-04-14 13:37:29 -07:00
Kyle Havlovitz	197dc10a7f	Add utility types to enable checking for unset flags	2017-02-07 20:14:41 -05:00
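
Two of the commits listed above describe health-based selection in words: e039dfd7f8 (a 'warning' check renders as UNHEALTHY when OnlyPassing is true) and 8e22d80e35 (walk the primary and failover targets in order and return the first one with any healthy endpoint). The following is a minimal Go sketch of that reasoning only, not Consul's actual code. The types `Endpoint` and `Target` and the functions `isHealthy` and `pickTarget` are hypothetical names introduced for illustration. The commit messages do not say what happens when no target is healthy; this sketch simply reports that nothing was found.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified view of one service instance.
// It is NOT Consul's real type; Status holds the aggregated check state.
type Endpoint struct {
	Address string
	Status  string // "passing", "warning", or "critical"
}

// Target is a hypothetical failover target: a name plus its endpoints.
type Target struct {
	Name      string
	Endpoints []Endpoint
}

// isHealthy follows the labelling described in e039dfd7f8: instances are
// not filtered out, only labelled. A "warning" instance counts as healthy
// unless onlyPassing is set; anything other than passing/warning does not.
func isHealthy(e Endpoint, onlyPassing bool) bool {
	switch e.Status {
	case "passing":
		return true
	case "warning":
		return !onlyPassing
	default:
		return false
	}
}

// pickTarget walks targets in order (primary first, then each failover
// target) and returns the first one with at least one healthy endpoint,
// as described in step 2 of 8e22d80e35. The false return for "nothing is
// healthy" is an assumption of this sketch.
func pickTarget(targets []Target, onlyPassing bool) (Target, bool) {
	for _, t := range targets {
		for _, e := range t.Endpoints {
			if isHealthy(e, onlyPassing) {
				return t, true
			}
		}
	}
	return Target{}, false
}

func main() {
	// Hypothetical data: a critical primary, a warning-only first failover
	// target, and a passing second failover target.
	targets := []Target{
		{Name: "api.dc1", Endpoints: []Endpoint{{"1.2.3.4:8080", "critical"}}},
		{Name: "api.dc2", Endpoints: []Endpoint{{"6.7.8.9:8080", "warning"}}},
		{Name: "api.dc3", Endpoints: []Endpoint{{"10.0.0.1:8080", "passing"}}},
	}
	for _, onlyPassing := range []bool{false, true} {
		if t, ok := pickTarget(targets, onlyPassing); ok {
			fmt.Printf("onlyPassing=%v selects %s\n", onlyPassing, t.Name)
		}
	}
}
```

With the sample data, the warning-only target is chosen when `onlyPassing` is false and skipped in favor of the passing one when it is true. Per the commit message, the real change applies this measurement in both CDS (to choose the TLS SNI value) and EDS (to choose the endpoint set).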

57 Commits