Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server.
This also refactors the auto-config package a bit to split things out into multiple files.
* Add sorting to Intentions with tests
* Skip Intentions url back test
* Create comparator for intention and implement in template
* Add a intentions navigation feature test
There were several PRs that while all passed CI independently, when they all got merged into the same branch caused compilation errors in test code.
The main changes that caused issues where changing agent/cache.Cache.New to require a concrete options struct instead of a pointer. This broke the cert monitor tests and the catalog_list_services_test.go. Another change was made to unembed the http.Server from the agent.HTTPServer struct. That coupled with another change to add a test to ensure cache rate limiting coming from HTTP requests was working as expected caused compilation failures.
This implements a solution for #7863
It does:
Add a new config cache.entry_fetch_rate to limit the number of calls/s for a given cache entry, default value = rate.Inf
Add cache.entry_fetch_max_burst size of rate limit (default value = 2)
The new configuration now supports the following syntax for instance to allow 1 query every 3s:
command line HCL: -hcl 'cache = { entry_fetch_rate = 0.333}'
in JSON
{
"cache": {
"entry_fetch_rate": 0.333
}
}
Previously we used a `shouldReconcile` method in order to decide whether
a response should trigger a reconciliation of the frontend ember-data
'source of truth' or not. It's a lot nicer/clearer if this 'flag' can be set
alongside the HTTP request information, moreover we almost have the same
functionality in `If-Range`/`Partial Content` HTTP functionality.
Here we partly follow this HTTP semantics but use a custom `X-Range` header
instead.
Previously, the envoy bootstrap config would blindly copy the self_admin
cluster into the list of static clusters when configuring either
ReadyBindAddr, PrometheusBindAddr, or StatsBindAddr.
Since ingress gateways always configure the ReadyBindAddr property,
users ran into this case much more often than previously.
The "Overview" link previously went to a summary of Consul features and a
comparison to other products. This commit restores that destination at the
request of Consul PMs.
This commit previously removed the link (it went to `/` instead of `/intro`).
ac612a9cdc (diff-f98b55875118725d1373dd2da36d9ee5)
Replaces #7559
Running tests in parallel, with background goroutines, results in test output not being associated with the correct test. `go test` does not make any guarantees about output from goroutines being attributed to the correct test case.
Attaching log output from background goroutines also cause data races. If the goroutine outlives the test, it will race with the test being marked done. Previously this was noticed as a panic when logging, but with the race detector enabled it is shown as a data race.
The previous solution did not address the problem of correct test attribution because test output could still be hidden when it was associated with a test that did not fail. You would have to look at all of the log output to find the relevant lines. It also made debugging test failures more difficult because each log line was very long.
This commit attempts a new approach. Instead of printing all the logs, only print when a test fails. This should work well when there are a small number of failures, but may not work well when there are many test failures at the same time. In those cases the failures are unlikely a result of a specific test, and the log output is likely less useful.
All of the logs are printed from the test goroutine, so they should be associated with the correct test.
Also removes some test helpers that were not used, or only had a single caller. Packages which expose many functions with similar names can be difficult to use correctly.
Related:
https://github.com/golang/go/issues/38458 (may be fixed in go1.15)
https://github.com/golang/go/issues/38382#issuecomment-612940030
The rationale behind removing them is that all of our own code (xDS, builtin connect proxy) use the cache notification mechanism. This ensures that the blocking fetch behind the scenes is always executing. Therefore the only way you might go to get a certificate and have to wait is when 1) the request has never been made for that cert before or 2) you are using the v1/agent/connect/ca/leaf API for retrieving the cert yourself.
In the first case, the refresh change doesn’t alter the behavior. In the second case, it can be mitigated by using blocking queries with that API which just like normal cache notification mechanism will cause the blocking fetch to be initiated and to get leaf certs as soon as needed.
If you are not using blocking queries, or Envoy/xDS, or the builtin connect proxy but are retrieving the certs yourself then the HTTP endpoint might take a little longer to respond.
This also renames the RefreshTimeout field on the register options to QueryTimeout to more accurately reflect that it is used for any type that supports blocking queries.
The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint.
This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality.
Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.