25 Commits

Author SHA1 Message Date
Matt Keeler
dbc48ea3f7 Fixes race condition in Agent Cache (#5796)
* Fix race condition during a cache get

Check the entry we pulled out of the cache while holding the lock had Fetching set.
If it did then we should use the existing Waiter instead of calling fetch. The reason
this is better than just calling fetch is that fetch re-gets the entry out of the
entries map and the previous fetch may have finished. Therefore this prevents
erroneously starting a new fetch because we just missed the last update.

* Fix race condition fully

The first commit still allowed for the following scenario:

• No entry existing when checked in getWithIndex while holding the read lock
• Then by time we had reached fetch it had been created and finished.

* always use ok when returning

* comment mentioning the reading from entries.

* use cacheHit consistently
2019-05-07 11:15:49 +01:00
Paul Banks
ef9f27cbc8
connect: tame thundering herd of CSRs on CA rotation (#5228)
* Support rate limiting and concurrency limiting CSR requests on servers; handle CA rotations gracefully with jitter and backoff-on-rate-limit in client

* Add CSR rate limiting docs

* Fix config naming and add tests for new CA configs
2019-01-22 17:19:36 +00:00
Paul Banks
0638e09b6e
connect: agent leaf cert caching improvements (#5091)
* Add State storage and LastResult argument into Cache so that cache.Types can safely store additional data that is eventually expired.

* New Leaf cache type working and basic tests passing. TODO: more extensive testing for the Root change jitter across blocking requests, test concurrent fetches for different leaves interact nicely with rootsWatcher.

* Add multi-client and delayed rotation tests.

* Typos and cleanup error handling in roots watch

* Add comment about how the FetchResult can be used and change ca leaf state to use a non-pointer state.

* Plumb test override of root CA jitter through TestAgent so that tests are deterministic again!

* Fix failing config test
2019-01-10 12:46:11 +00:00
Paul Banks
0589525ae9
agent: Don't leave old errors around in cache (#5094)
* Fixes #4480. Don't leave old errors around in cache that can be hit in specific circumstances.

* Move error reset to cover extreme edge case of nil Value, nil err Fetch
2019-01-08 10:06:38 +00:00
Paul Banks
298af6dca7
Quick fix for cache age flakiness in CI 2018-10-11 13:12:19 +01:00
Paul Banks
88388d760d Support Agent Caching for Service Discovery Results (#4541)
* Add cache types for catalog/services and health/services and basic test that caching works

* Support non-blocking cache types with Cache-Control semantics.

* Update API docs to include caching info for every endpoint.

* Comment updates per PR feedback.

* Add note on caching to the 10,000 foot view on the architecture page to make the new data path more clear.

* Document prepared query staleness quirk and force all background requests to AllowStale so we can spread service discovery load across servers.
2018-10-10 16:55:34 +01:00
Paul Banks
8cbeb29e73
Fixes #4421: General solution to stop blocking queries with index 0 (#4437)
* Fix theoretical cache collision bug if/when we use more cache types with same result type

* Generalized fix for blocking query handling when state store methods return zero index

* Refactor test retry to only affect CI

* Undo make file merge

* Add hint to error message returned to end-user requests if Connect is not enabled when they try to request cert

* Explicit error for Roots endpoint if connect is disabled

* Fix tests that were asserting old behaviour
2018-07-25 20:26:27 +01:00
Paul Banks
2e223ea2b7 Fix hot loop in cache for RPC returning zero index. 2018-06-25 12:25:37 -07:00
Paul Banks
43b48bc06b Get agent cache tests passing without global hit count (which is racy).
Few other fixes in here just to get a clean run locally - they are all also fixed in other PRs but shouldn't conflict.

This should be robust to timing between goroutines now.
2018-06-25 12:25:37 -07:00
Mitchell Hashimoto
6b1e0a3003 agent/cache: always schedule the refresh 2018-06-25 12:25:14 -07:00
Mitchell Hashimoto
cf9b377c78 agent/cache: always fetch with minimum index of 1 at least 2018-06-25 12:25:12 -07:00
Mitchell Hashimoto
839d3c323d agent/cache: correct test name 2018-06-25 12:24:07 -07:00
Mitchell Hashimoto
45e49f31de agent/cache: change behavior to return error rather than retry
The cache behavior should not be to mask errors and retry. Instead, it
should aim to return errors as quickly as possible. We do that here.
2018-06-25 12:24:07 -07:00
Mitchell Hashimoto
311d503fb0 agent/cache: perform backoffs on error retries on blocking queries 2018-06-25 12:24:06 -07:00
Mitchell Hashimoto
cfcd733609
agent/cache: implement refresh backoff 2018-06-14 09:42:14 -07:00
Mitchell Hashimoto
02b20a0353
agent/cache: address feedback, clarify comments 2018-06-14 09:42:03 -07:00
Mitchell Hashimoto
3b550d2b72
agent/cache: rework how expiry data is stored to be more efficient 2018-06-14 09:42:03 -07:00
Mitchell Hashimoto
595193a781
agent/cache: initial TTL work 2018-06-14 09:42:02 -07:00
Mitchell Hashimoto
1df99514ca
agent/cache: send the RefreshTimeout into the backend fetch 2018-06-14 09:42:02 -07:00
Mitchell Hashimoto
db4c47df27
agent/cache: on error, return from Get immediately, don't block forever 2018-06-14 09:42:02 -07:00
Mitchell Hashimoto
fcb15e15ae
agent/cache: support timeouts for cache reads and empty fetch results 2018-06-14 09:42:01 -07:00
Mitchell Hashimoto
c329b4cb34
agent/cache: partition by DC/ACL token 2018-06-14 09:42:00 -07:00
Mitchell Hashimoto
e3c1162881
agent/cache: Reorganize some files, RequestInfo struct, prepare for partitioning 2018-06-14 09:42:00 -07:00
Mitchell Hashimoto
975be337a9
agent/cache: blank cache key means to always fetch 2018-06-14 09:42:00 -07:00
Mitchell Hashimoto
1cfb0f1922
agent/cache: initial kind-of working cache 2018-06-14 09:42:00 -07:00