During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.
Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:
```json
[
{
"WAN": true,
"Datacenter": "dc2",
"Segment": "",
"Keys": {
"0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
},
"PrimaryKeys": {
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
},
"NumNodes": 6
},
{
"WAN": false,
"Datacenter": "dc2",
"Segment": "",
"Keys": {
"0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"PrimaryKeys": {
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"NumNodes": 8
},
{
"WAN": false,
"Datacenter": "dc1",
"Segment": "",
"Keys": {
"0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"PrimaryKeys": {
"SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
},
"NumNodes": 8
}
]
```
I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output
Fixes#3393.
* update bindata_assetfs.go
* Release v1.8.2
* Putting source back into Dev Mode
* changelog: add entries for 1.7.6, 1.7.5 and 1.6.7
Co-authored-by: hashicorp-ci <hashicorp-ci@users.noreply.github.com>
AutoConfig will generate local tokens for clients and the ability to use local tokens is gated off of token replication being enabled and being configured with a replication token. Therefore we already have a hard requirement on having token replication enabled, this commit just makes sure to surface that to the operator instead of having to discern what the issue is from RPC errors.
This code started as an optimization to avoid doing an RPC Ping to
itself. But in a single server cluster the rebalancing was led to
believe that there were no healthy servers because foundHealthyServer
was not set. Now this is being set properly.
Fixes#8401 and #8403.
Ensure that enabling AutoConfig sets the tls configurator properly
This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.
Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server.
This also refactors the auto-config package a bit to split things out into multiple files.
This implements a solution for #7863
It does:
Add a new config cache.entry_fetch_rate to limit the number of calls/s for a given cache entry, default value = rate.Inf
Add cache.entry_fetch_max_burst size of rate limit (default value = 2)
The new configuration now supports the following syntax for instance to allow 1 query every 3s:
command line HCL: -hcl 'cache = { entry_fetch_rate = 0.333}'
in JSON
{
"cache": {
"entry_fetch_rate": 0.333
}
}
Refactoring of the agentpb package.
First move the whole thing to the top-level proto package name.
Secondly change some things around internally to have sub-packages.
# Conflicts:
# agent/consul/state/acl.go
# agent/consul/state/acl_test.go
The rationale behind removing them is that all of our own code (xDS, builtin connect proxy) use the cache notification mechanism. This ensures that the blocking fetch behind the scenes is always executing. Therefore the only way you might go to get a certificate and have to wait is when 1) the request has never been made for that cert before or 2) you are using the v1/agent/connect/ca/leaf API for retrieving the cert yourself.
In the first case, the refresh change doesn’t alter the behavior. In the second case, it can be mitigated by using blocking queries with that API which just like normal cache notification mechanism will cause the blocking fetch to be initiated and to get leaf certs as soon as needed.
If you are not using blocking queries, or Envoy/xDS, or the builtin connect proxy but are retrieving the certs yourself then the HTTP endpoint might take a little longer to respond.
This also renames the RefreshTimeout field on the register options to QueryTimeout to more accurately reflect that it is used for any type that supports blocking queries.
# Conflicts:
# agent/cache/cache.go
The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint.
This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality.
Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.
# Conflicts:
# agent/agent.go