consul

Commit Graph

Author	SHA1	Message	Date
Alvin Huang	f45e495e38	Merge pull request #5376 from hashicorp/fix-tests Fix tests in prep for CircleCI Migration	2019-04-04 17:09:32 -04:00
Kyle Havlovitz	5f569fb2ac	Merge pull request #5539 from hashicorp/service-config Service config state model	2019-04-02 16:34:58 -07:00
Kyle Havlovitz	a2fa9a0019	Cleaned up some error handling/comments around config entries	2019-04-02 15:42:12 -07:00
Kyle Havlovitz	c2da314eeb	Merge pull request #5553 from hashicorp/txn-check-serialization Use the correct check duration fields when converting transaction ops	2019-04-02 10:59:36 -07:00
Hans Hasselberg	ac45b17482	fix remaining CI failures after Go 1.12.1 Upgrade (#5576)	2019-03-29 16:29:27 +01:00
Kyle Havlovitz	d16be2e269	Encode config entry FSM messages in a generic type	2019-03-28 00:06:56 -07:00
Kyle Havlovitz	f6df5c9b3b	Clean up service config state store methods	2019-03-27 16:52:38 -07:00
R.B. Boyer	0d1b496a52	acl: memdb filter of tokens-by-policy was inverted (#5575) The inversion wasn't noticed because the parallel execution of TokenList tests was operating incorrectly due to variable shadowing.	2019-03-27 15:24:44 -05:00
Jeff Mitchell	4243c3ae42	Move internal/ to sdk/ (#5568) * Move internal/ to sdk/ * Add a readme to the SDK folder	2019-03-27 08:54:56 -04:00
Jeff Mitchell	47c390025b	Convert to Go Modules (#5517) * First conversion * Use serf 0.8.2 tag and associated updated deps * * Move freeport and testutil into internal/ * Make internal/ its own module * Update imports * Add replace statements so API and normal Consul code are self-referencing for ease of development * Adapt to newer goe/values * Bump to new cleanhttp * Fix ban nonprintable chars test * Update lock bad args test The error message when the duration cannot be parsed changed in Go 1.12 (ae0c435877d3aacb9af5e706c40f9dddde5d3e67). This updates that test. * Update another test as well * Bump travis * Bump circleci * Bump go-discover and godo to get rid of launchpad dep * Bump dockerfile go version * fix tar command * Bump go-cleanhttp	2019-03-26 17:04:58 -04:00
Kyle Havlovitz	716a20d8a6	Re-add logic to handle the undocumented duration fields	2019-03-26 10:44:02 -07:00
Kyle Havlovitz	3f5e20452e	http: use the correct check duration fields when converting txn ops	2019-03-25 16:58:41 -07:00
Paul Banks	89fa5ec3ba	Connect: Fix Envoy getting stuck during load (#5499) * Connect: Fix Envoy getting stuck during load Also in this PR: - Enabled outlier detection on upstreams which will mark instances unhealthy after 5 failures (using Envoy's defaults) - Enable weighted load balancing where DNS weights are configured * Fix empty load assignments in the right place * Fix import names from review * Move millisecond parse to a helper function	2019-03-22 19:37:14 +00:00
Kyle Havlovitz	e199c37ee4	Add some basic normalize/validation logic for config entries	2019-03-22 09:25:37 -07:00
Paul Banks	d2e68a900a	Connect: Make Connect health queries unblock correctly (#5508) * Make Connect health queries unblock correctly in all cases and use optimal number of watch chans. Fixes #5506. * Node check test cases and clearer bug test doc * Comment update	2019-03-21 16:01:56 +00:00
Kyle Havlovitz	d92577c16b	Fix fsm serialization and add snapshot/restore	2019-03-20 16:13:13 -07:00
Hans Hasselberg	ea5210a30e	Release v1.4.4	2019-03-20 16:00:54 +00:00
Kyle Havlovitz	17aa6a5a34	Fill out state store/FSM functions and add tests	2019-03-19 15:56:17 -07:00
R.B. Boyer	02b2cb1d15	agent: ensure the TLS hostname verification knows about the currently configured domain (#5513)	2019-03-19 22:35:19 +01:00
Kyle Havlovitz	9d07add047	Add config types and state store table	2019-03-19 10:06:46 -07:00
Hans Hasselberg	e7134a0dab	agent: only use TestAgent when appropriate (#5502)	2019-03-18 17:06:16 +01:00
Paul Banks	0b5a078b95	Optimize health watching to single chan/goroutine. (#5449) Refs #4984. Watching chans for every node we touch in a health query is wasteful. In #4984 it shows that if there are more than 682 service instances we always fall back to watching all services which kills performance. We already have a record in MemDB that is reliably updated whenever the service health result should change thanks to per-service watch indexes. So in general, provided there is at least one service instance and we actually have a service index for it (we always do now) we only ever need to watch a single channel. This saves us from ever falling back to the general index and causing the performance cliff in #4984, but it also means fewer goroutines and work done for every blocking health query. It also saves some allocations made during the query because we no longer have to populate a WatchSet with 3 chans per service instance which saves the internal map allocation. This passes all state store tests except the one that explicitly checked for the fallback behaviour we've now optimized away and in general seems safe.	2019-03-15 20:18:48 +00:00
Pierre Souchay	88d4383410	Ensure we remove Connect proxy before deregistering the service itself (#5482) This will fix https://github.com/hashicorp/consul/issues/5351	2019-03-15 20:14:46 +00:00
Valentin Fritz	21f149de8b	Fix checks removal when removing service (#5457) Fix my recently discovered issue described here: #5456	2019-03-14 11:02:49 -04:00
R.B. Boyer	cd96af4fc0	acl: reduce complexity of token resolution process with alternative singleflighting (#5480) acl: reduce complexity of token resolution process with alternative singleflighting Switches acl resolution to use golang.org/x/sync/singleflight. For the identity/legacy lookups this is a drop-in replacement with the same overall approach to request coalescing. For policies this is technically a change in behavior, but when considered holistically is approximately performance neutral (with the benefit of less code). There are two goals with this blob of code (speaking specifically of policy resolution here): 1) Minimize cross-DC requests. 2) Minimize client-to-server LAN requests. The previous iteration of this code was optimizing for the case of many possibly different tokens being resolved concurrently that have a significant overlap in linked policies such that deduplication would be worth the complexity. While this is laudable there are some things to consider that can help to adjust expectations: 1) For v1.4+ policies are always replicated, and once a single policy shows up in a secondary DC the replicated data is considered authoritative for requests made in that DC. This means that our earlier concerns about minimizing cross-DC requests are irrelevant because there will be no cross-DC policy reads that occur. 2) For Server nodes the in-memory ACL policy cache is capped at zero, meaning it has no caching. Only Client nodes run with a cache. This means that instead of having an entire DC's worth of tokens (what a Server might see) that can have policy resolutions coalesced these nodes will only ever be seeing node-local token resolutions. In a reasonable worst-case scenario where a scheduler like Kubernetes has "filled" a node with Connect services, even that will only schedule ~100 connect services per node. If every service has a unique token there will only be 100 tokens to coalesce and even then those requests have to occur concurrently AND be hitting an empty consul cache. Instead of seeing a great coalescing opportunity for cutting down on redundant Policy resolutions, in practice it's far more likely given node densities that you'd see requests for the same token concurrently than you would for two tokens sharing a policy concurrently (to a degree that would warrant the overhead of the current variation of singleflighting). Given that, this patch switches the Policy resolution process to only singleflight by requesting token (but keeps the cache as by-policy).	2019-03-14 09:35:34 -05:00
Hans Hasselberg	7e11dd82aa	agent: enable reloading of tls config (#5419) This PR introduces reloading tls configuration. Consul will now be able to reload the TLS configuration which previously required a restart. It is not yet possible to turn TLS ON or OFF with these changes. Only when TLS is already turned on, the configuration can be reloaded. Most importantly the certificates and CAs.	2019-03-13 10:29:06 +01:00
R.B. Boyer	2e175be41b	acl: correctly extend the cache for acl identities during resolution (#5475)	2019-03-12 10:23:43 -05:00
Aestek	4bea29f15a	[catalog] Update the node's services indexes on update (#5458) Node updates were not updating the service indexes, which are used for service related queries. This caused the X-Consul-Index to stay the same after a node update as seen from a service query even though the node data is returned in health queries. If that happened in between queries the client would miss this change. We now update the indexes of the services on the node when it is updated. Fixes: #5450	2019-03-11 14:48:19 +00:00
Alvin Huang	8cb8108b1b	fix typos	2019-03-06 14:47:33 -05:00
R.B. Boyer	f4a3b9d518	fix typos reported by golangci-lint:misspell (#5434)	2019-03-06 11:13:28 -06:00
R.B. Boyer	2ffbea41c8	improve flaky LANReap tests by explicitly configuring the tombstone timeout In TestServer_LANReap autopilot is running, so the alternate flow through the serf reaping function is possible. In that situation the ReconnectTimeout is not relevant so for parity also override the TombstoneTimeout value as well. For additional parity update the TestServer_WANReap and TestClient_LANReap versions of this test in the same way even though autopilot is irrelevant here.	2019-03-05 14:34:03 -06:00
R.B. Boyer	5bea49ecb0	tests: avoid leaking child processes from agent/proxyprocess package	2019-03-05 14:29:25 -06:00
Matt Keeler	567e41ff6b	Release v1.4.3	2019-03-04 19:21:20 +00:00
Matt Keeler	90040f8bff	Fixes for CVE-2019-8336 Fix error in detecting raft replication errors. Detect redacted token secrets and prevent attempting to insert. Add a Redacted field to the TokenBatchRead and TokenRead RPC endpoints This will indicate whether token secrets have been redacted. Ensure any token with a redacted secret in secondary datacenters is removed. Test that redacted tokens cannot be replicated.	2019-03-04 19:13:24 +00:00
Hans Hasselberg	d35824b1fa	default to tls 1.2 as promised. (#5340)	2019-03-04 09:42:04 -05:00
Aestek	2aac4d5168	Register and deregister services and their checks atomically in the local state (#5012) Prevent race between register and deregister requests by saving them together in the local state on registration. Also adds more cleaning in case of failure when registering services / checks.	2019-03-04 09:34:05 -05:00
Matt Keeler	6e6910ea11	Don't modify memdb owned token data for get/list requests of tokens (#5412) Previously we were fixing up the token links directly on the *ACLToken returned by memdb. This invalidated some assumptions that a snapshot is immutable as well as potentially being able to cause a crash. The fix here is to give the policy link fixing function copy on write semantics. When no fixes are necessary we can return the memdb object directly, otherwise we copy it and create a new list of links. Eventually we might find a better way to keep those policy links in sync but for now this fixes the issue.	2019-03-04 09:28:46 -05:00
Aestek	02f991843f	Fix race condition in DNS when using cache (#5398) * Fix race condition in DNS when using cache The healthy node filtering was modifying the result from the cache, which caused a crash when multiple queries were made to the same service simultaneously. We now copy the node slice before filtering to ensure we do not modify the data stored in the cache. * Fix wording in dns cache config doc s/dns_max_age/cache_max_age/	2019-03-04 09:22:01 -05:00
Matt Keeler	200c0fb3e9	Call RemoveServer for reap events (#5317) This ensures that servers are removed from RPC routing when they are reaped.	2019-03-04 09:19:35 -05:00
R.B. Boyer	409c901f8e	test: fix concurrent map access when setting up test vault	2019-03-01 14:30:19 -06:00
R.B. Boyer	6955186239	fix ignored errors in state store internals as reported by errcheck	2019-03-01 14:18:00 -06:00
R.B. Boyer	c7067645dd	fix a few leap-year related clock math inaccuracies and failing tests	2019-03-01 13:51:49 -06:00
Matt Keeler	118adbb123	ACL Token Persistence and Reloading (#5328) This PR adds two features which will be useful for operators when ACLs are in use. 1. Tokens set in configuration files are now reloadable. 2. If `acl.enable_token_persistence` is set to `true` in the configuration, tokens set via the `v1/agent/token` endpoint are now persisted to disk and loaded when the agent starts (or during configuration reload) Note that token persistence is opt-in so our users who do not want tokens on the local disk will see no change. Some other secondary changes: * Refactored a bunch of places where the replication token is retrieved from the token store. This token isn't just for replicating ACLs and now it is named accordingly. * Allowed better paths in the `v1/agent/token/` API. Instead of paths like: `v1/agent/token/acl_replication_token` the path can now be just `v1/agent/token/replication`. The old paths remain valid. * Added a couple new API functions to set tokens via the new paths. Deprecated the old ones and pointed to the new names. The names are also generally better and don't imply that what you are setting is for ACLs but rather are setting ACL tokens. There is a minor semantic difference there especially for the replication token as again, it's no longer used only for ACL token/policy replication. The new functions will detect 404s and fall back to using the older token paths when talking to pre-1.4.3 agents. * Docs updated to reflect the API additions and to show using the new endpoints. * Updated the ACL CLI set-agent-tokens command to use the non-deprecated APIs.	2019-02-27 14:28:31 -05:00
Kyle Havlovitz	f07e928afc	Merge pull request #5325 from hashicorp/consul-ca-panic connect/ca: fix a potential panic in the Consul provider	2019-02-27 09:43:44 -08:00
Hans Hasselberg	80e7d63fc2	Centralise tls configuration part 2 (#5374) This PR is based on #5366 and continues to centralise the tls configuration in order to be reloadable eventually! This PR is another refactoring. No tests are changed, beyond calling other functions or cosmetic stuff. I added a bunch of tests, even though they might be redundant.	2019-02-27 10:14:59 +01:00
Hans Hasselberg	786b3b1095	Centralise tls configuration part 1 (#5366) In order to be able to reload the TLS configuration, we need one way to generate the different configurations. This PR introduces a `tlsutil.Configurator` which holds a `tlsutil.Config`. Afterwards it is responsible for rendering every `tls.Config`. In this particular PR I moved `IncomingHTTPSConfig`, `IncomingTLSConfig`, and `OutgoingTLSWrapper` into `tlsutil.Configurator`. This PR is a pure refactoring - not a single feature added. And not a single test added. I only slightly modified existing tests as necessary.	2019-02-26 16:52:07 +01:00
Aestek	f1cdfbe40e	Allow DNS interface to use agent cache (#5300) Adds two new configuration parameters "dns_config.use_cache" and "dns_config.cache_max_age" controlling how DNS requests use the agent cache when querying servers.	2019-02-25 14:06:01 -05:00
Alvin Huang	77eecf1046	add wait to TestClient_JoinLAN	2019-02-22 17:34:45 -05:00
Alvin Huang	136df63e2c	add retry to TestResetSessionTimerLocked	2019-02-22 17:34:45 -05:00
Alvin Huang	a7180f715a	add serf check to testDNSServiceLookupResponseLimits, checkDNSService	2019-02-22 17:34:45 -05:00

1410 Commits