consul

Commit Graph

Author	SHA1	Message	Date
Daniel Nephin	f8dbe4821d	Merge pull request #8548 from edevil/fix_flake Fix flaky TestACLResolver_Client/Concurrent-Token-Resolve	2020-08-28 19:11:24 +00:00
Daniel Nephin	607a494000	Merge pull request #8552 from pierresouchay/reload_cache_throttling_config Ensure that Cache options are reloaded when `consul reload` is performed	2020-08-28 19:05:15 +00:00
Hans Hasselberg	ae2cbbce99	agent/cache test for cache throttling. (#8396)	2020-07-30 12:41:38 +00:00
Matt Keeler	91ec880e07	Backport #8389 (#8392) # Conflicts: # agent/cache-types/catalog_list_services_test.go	2020-07-28 14:22:29 -04:00
Pierre Souchay	678489d9d1	Added ratelimit to handle throttling cache (#8226) This implements a solution for #7863. It does: Add a new config cache.entry_fetch_rate to limit the number of calls/s for a given cache entry, default value = rate.Inf Add cache.entry_fetch_max_burst size of rate limit (default value = 2) The new configuration now supports the following syntax for instance to allow 1 query every 3s: command line HCL: -hcl 'cache = { entry_fetch_rate = 0.333}' in JSON { "cache": { "entry_fetch_rate": 0.333 } }	2020-07-27 21:11:42 +00:00
Matt Keeler	a8d2e5a2c2	Disable background cache refresh for Connect Leaf Certs The rationale behind removing them is that all of our own code (xDS, builtin connect proxy) use the cache notification mechanism. This ensures that the blocking fetch behind the scenes is always executing. Therefore the only way you might go to get a certificate and have to wait is when 1) the request has never been made for that cert before or 2) you are using the v1/agent/connect/ca/leaf API for retrieving the cert yourself. In the first case, the refresh change doesn’t alter the behavior. In the second case, it can be mitigated by using blocking queries with that API which just like normal cache notification mechanism will cause the blocking fetch to be initiated and to get leaf certs as soon as needed. If you are not using blocking queries, or Envoy/xDS, or the builtin connect proxy but are retrieving the certs yourself then the HTTP endpoint might take a little longer to respond. This also renames the RefreshTimeout field on the register options to QueryTimeout to more accurately reflect that it is used for any type that supports blocking queries. # Conflicts: # agent/cache/cache.go	2020-07-21 13:51:18 -04:00
Matt Keeler	64262d22d6	Make the Agent Cache more Context aware (#8092) Blocking queries issues will still be uncancellable (that cannot be helped until we get rid of net/rpc). However this makes it so that if calling getWithIndex (like during a cache Notify go routine) we can cancel the outer routine. Previously it would keep issuing more blocking queries until the result state actually changed.	2020-06-15 15:43:32 +00:00
Daniel Nephin	81755c860a	agent/cache: remove error return from fetch A previous change removed the only error, so the return value can be removed now.	2020-04-17 11:55:01 -04:00
Daniel Nephin	4ef9fc9f27	agent/cache: reduce function arguments by removing duplicates A few of the unexported functions in agent/cache took a large number of arguments. These arguments were effectively overrides for values that were provided in RequestInfo. By using a struct we can not only reduce the number of arguments, but also simplify the logic by removing the need for overrides.	2020-04-17 11:35:07 -04:00
Daniel Nephin	5fe7043439	agent/cache: Make all cache options RegisterOptions Previously the SupportsBlocking option was specified by a method on the type, and all the other options were specified from RegisterOptions. This change moves RegisterOptions to a method on the type, and moves SupportsBlocking into the options struct. Currently there are only 2 cache-types. So all cache-types can implement this method by embedding a struct with those predefined values. In the future if a cache type needs to be registered more than once with different options it can remove the embedded type and implement the method in a way that allows for parameterization.	2020-04-16 18:56:34 -04:00
Daniel Nephin	89f41bddfe	Remove TTL from cacheEntryExpiry This should very slightly reduce the amount of memory required to store each item in the cache. It will also enable setting different TTLs based on the type of result. For example we may want to use a shorter TTL when the result indicates the resource does not exist, as storing these types of records could easily lead to a DOS caused by OOM.	2020-04-13 13:10:38 -04:00
Daniel Nephin	7246d8b6cb	agent/cache: Reduce differences between notify implementations These two notify functions are very similar. There appear to be just enough differences that trying to parameterize the differences may not improve things. For now, reduce some of the cosmetic differences so that the material differences are more obvious.	2020-04-13 13:10:38 -04:00
Daniel Nephin	66fbb13976	agent/cache: Inline the refresh function to make recursion more obvious fetch is already an exceptionally long function, but hiding the recursion in a function call likely does not help.	2020-04-13 13:10:38 -04:00
Daniel Nephin	faeaed5d0c	agent/cache: Make the return values of getEntryLocked more obvious Use named returns so that the caller has a better idea of what these bools mean. Return early to reduce the scope, and make it more obvious what values are returned in which cases. Also reduces the number of conditional expressions in each case.	2020-04-13 13:10:38 -04:00
Daniel Nephin	e9e45545dd	agent/cache: Small formatting improvements to improve readability Remove Cache.entryKey which called a single function. Format multiline struct creation one field per line.	2020-04-13 12:34:11 -04:00
Daniel Nephin	1f25bf88b8	Merge pull request #7596 from hashicorp/dnephin/agent-cache-type-entry agent/cache: move typeEntry lookup to the edge	2020-04-13 12:24:07 -04:00
Daniel Nephin	c9a87be6ee	agent/cache: move typeEntry lookup to the edge This change moves all the typeEntry lookups to the first step in the exported methods, and makes unexported internals accept the typeEntry struct. This change is primarily intended to make it easier to extract the container of caches from the Cache type. It may incidentally reduce locking in fetch, but that was not a goal.	2020-04-03 16:01:56 -04:00
Pierre Souchay	09e638a9c6	tests: more tolerance to latency for unstable test `TestCacheNotifyPolling()`. (#7574)	2020-04-03 10:29:38 +02:00
R.B. Boyer	12876983cf	avoid 'panic: Log in goroutine after TestCacheGet_refreshAge has completed' (#7276)	2020-02-12 10:01:51 -06:00
Anthony Scalisi	beb928f8de	fix spelling errors (#7135)	2020-01-27 07:00:33 -06:00
R.B. Boyer	9566df524e	agent: cache notifications work after error if the underlying RPC returns index=1 (#6547) Fixes #6521 Ensure that initial failures to fetch an agent cache entry using the notify API where the underlying RPC returns a synthetic index of 1 correctly recovers when those RPCs resume working. The bug in the Cache.notifyBlockingQuery used to incorrectly "fix" the index for the next query from 0 to 1 for all queries, when it should have not done so for queries that errored. Also fixed some things that made debugging difficult: - config entry read/list endpoints send back QueryMeta headers - xds event loops don't swallow the cache notification errors	2019-09-26 10:42:17 -05:00
Christian Muehlhaeuser	7753b97cc7	Simplified code in various places (#6176) All these changes should have no side-effects or change behavior: - Use bytes.Buffer's String() instead of a conversion - Use time.Since and time.Until where fitting - Drop unnecessary returns and assignment	2019-07-20 09:37:19 -04:00
Freddy	5873c56a03	Flaky test overhaul (#6100)	2019-07-12 09:52:26 -06:00
Hans Hasselberg	33a7df3330	tls: auto_encrypt enables automatic RPC cert provisioning for consul clients (#5597)	2019-06-27 22:22:07 +02:00
Matt Keeler	dbc48ea3f7	Fixes race condition in Agent Cache (#5796) * Fix race condition during a cache get Check the entry we pulled out of the cache while holding the lock had Fetching set. If it did then we should use the existing Waiter instead of calling fetch. The reason this is better than just calling fetch is that fetch re-gets the entry out of the entries map and the previous fetch may have finished. Therefore this prevents erroneously starting a new fetch because we just missed the last update. * Fix race condition fully The first commit still allowed for the following scenario: • No entry existing when checked in getWithIndex while holding the read lock • Then by the time we had reached fetch it had been created and finished. * always use ok when returning * comment mentioning the reading from entries. * use cacheHit consistently	2019-05-07 11:15:49 +01:00
Kyle Havlovitz	43bfc20dc8	Test an index=0 value in cache.Notify	2019-04-25 02:11:07 -07:00
Kyle Havlovitz	c269369760	Make central service config opt-in and rework the initial registration	2019-04-24 06:11:08 -07:00
R.B. Boyer	f4a3b9d518	fix typos reported by golangci-lint:misspell (#5434)	2019-03-06 11:13:28 -06:00
Paul Banks	ef9f27cbc8	connect: tame thundering herd of CSRs on CA rotation (#5228) * Support rate limiting and concurrency limiting CSR requests on servers; handle CA rotations gracefully with jitter and backoff-on-rate-limit in client * Add CSR rate limiting docs * Fix config naming and add tests for new CA configs	2019-01-22 17:19:36 +00:00
Matt Keeler	7e6b3e6a0c	Implement prepared query upstreams watching for envoy (#5224) Fixes #4969 This implements non-blocking request polling at the cache layer which is currently only used for prepared queries. Additionally this enables the proxycfg manager to poll prepared queries for use in envoy proxy upstreams.	2019-01-18 12:44:04 -05:00
Paul Banks	0638e09b6e	connect: agent leaf cert caching improvements (#5091) * Add State storage and LastResult argument into Cache so that cache.Types can safely store additional data that is eventually expired. * New Leaf cache type working and basic tests passing. TODO: more extensive testing for the Root change jitter across blocking requests, test concurrent fetches for different leaves interact nicely with rootsWatcher. * Add multi-client and delayed rotation tests. * Typos and cleanup error handling in roots watch * Add comment about how the FetchResult can be used and change ca leaf state to use a non-pointer state. * Plumb test override of root CA jitter through TestAgent so that tests are deterministic again! * Fix failing config test	2019-01-10 12:46:11 +00:00
Paul Banks	0589525ae9	agent: Don't leave old errors around in cache (#5094) * Fixes #4480. Don't leave old errors around in cache that can be hit in specific circumstances. * Move error reset to cover extreme edge case of nil Value, nil err Fetch	2019-01-08 10:06:38 +00:00
Paul Banks	298af6dca7	Quick fix for cache age flakiness in CI	2018-10-11 13:12:19 +01:00
Paul Banks	c9217c958e	merge feedback: fix typos; actually use deliverLatest added previously but not plumbed in	2018-10-10 16:55:34 +01:00
Paul Banks	96b9b95a19	Add cache.Notify to abstract watching for cache updates for types that support blocking semantics. (#4695)	2018-10-10 16:55:34 +01:00
Paul Banks	88388d760d	Support Agent Caching for Service Discovery Results (#4541) * Add cache types for catalog/services and health/services and basic test that caching works * Support non-blocking cache types with Cache-Control semantics. * Update API docs to include caching info for every endpoint. * Comment updates per PR feedback. * Add note on caching to the 10,000 foot view on the architecture page to make the new data path more clear. * Document prepared query staleness quirk and force all background requests to AllowStale so we can spread service discovery load across servers.	2018-10-10 16:55:34 +01:00
Paul Banks	e8ba527f23	Add a Close method to cache that stops background goroutines. (#4746) In a real agent the `cache` instance is alive until the agent shuts down so this is not a real leak in production, however in our test suite, every testAgent that is started and stops leaks goroutines that never get cleaned up which accumulate consuming CPU and memory through subsequent tests in the `agent` package which doesn't help our test flakiness. This adds a Close method that doesn't invalidate or clean up the cache, and still allows concurrent blocking queries to run (for up to 10 mins which might still affect tests). But at least it doesn't maintain them forever with background refresh and an expiry watcher routine. It would be nice to cancel any outstanding blocking requests as well when we close but that requires much more invasive surgery right into our RPC protocol since we don't have a way to cancel requests currently. Unscientifically this seems to make tests pass a bit quicker and more reliably locally but I can't really be sure of that!	2018-10-04 11:27:11 +01:00
Paul Banks	8cbeb29e73	Fixes #4421: General solution to stop blocking queries with index 0 (#4437) * Fix theoretical cache collision bug if/when we use more cache types with same result type * Generalized fix for blocking query handling when state store methods return zero index * Refactor test retry to only affect CI * Undo make file merge * Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert * Explicit error for Roots endpoint if connect is disabled * Fix tests that were asserting old behaviour	2018-07-25 20:26:27 +01:00
Paul Banks	2e223ea2b7	Fix hot loop in cache for RPC returning zero index.	2018-06-25 12:25:37 -07:00
Paul Banks	43b48bc06b	Get agent cache tests passing without global hit count (which is racy). Few other fixes in here just to get a clean run locally - they are all also fixed in other PRs but shouldn't conflict. This should be robust to timing between goroutines now.	2018-06-25 12:25:37 -07:00
Mitchell Hashimoto	6b1e0a3003	agent/cache: always schedule the refresh	2018-06-25 12:25:14 -07:00
Mitchell Hashimoto	cf9b377c78	agent/cache: always fetch with minimum index of 1 at least	2018-06-25 12:25:12 -07:00
Mitchell Hashimoto	6b745964c4	agent/cache: update comment from PR review to clarify	2018-06-25 12:24:08 -07:00
Mitchell Hashimoto	d609ad216b	agent/cache: update comments	2018-06-25 12:24:07 -07:00
Mitchell Hashimoto	839d3c323d	agent/cache: correct test name	2018-06-25 12:24:07 -07:00
Mitchell Hashimoto	45e49f31de	agent/cache: change behavior to return error rather than retry The cache behavior should not be to mask errors and retry. Instead, it should aim to return errors as quickly as possible. We do that here.	2018-06-25 12:24:07 -07:00
Mitchell Hashimoto	311d503fb0	agent/cache: perform backoffs on error retries on blocking queries	2018-06-25 12:24:06 -07:00
Paul Banks	1722734313	Verify trust domain on /authorize calls	2018-06-14 09:42:16 -07:00
Mitchell Hashimoto	4f3b5647e5	agent/cache: change uint8 to uint	2018-06-14 09:42:15 -07:00
Mitchell Hashimoto	fc5508f8a3	agent/cache: string through attempt rather than storing on the entry	2018-06-14 09:42:15 -07:00

71 Commits