2170 Commits

Author SHA1 Message Date
Daniel Nephin
1a5ba078a8 Merge pull request #8514 from hashicorp/dnephin/testing-improvements-1
testing: small improvements to TestSessionCreate and testutil.retry
2020-08-26 17:11:43 -04:00
Daniel Nephin
690d3f4f20 Merge pull request #8515 from hashicorp/dnephin/unexport-testing-shims
config: unexport fields and resolve TODOs in config.Builder
2020-08-26 17:09:46 -04:00
Daniel Nephin
cbfae50854 Merge pull request #8473 from hashicorp/dnephin/unmethod-consul-config
agent: convert consulConfig method to a function
2020-08-26 17:06:32 -04:00
Daniel Nephin
298c4d7e66 Merge pull request #8463 from hashicorp/dnephin/unmethod-make-node-id
agent: convert NodeID methods to functions
2020-08-26 17:05:57 -04:00
Daniel Nephin
533c53b8ef Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown
agent/consul: Remove NotifyShutdown
2020-08-26 17:04:03 -04:00
Daniel Nephin
81de78d131 Merge pull request #8500 from hashicorp/dnephin/auto-config-loader
auto-config: reduce awareness of config
2020-08-26 17:01:55 -04:00
Daniel Nephin
6dc6507abc Merge pull request #8469 from hashicorp/dnephin/config-source
config: make Source an interface to avoid the marshal/unmarshal cycle in auto-config
2020-08-26 17:00:51 -04:00
Daniel Nephin
56444e0405 Merge pull request #8546 from edevil/fix_vet
testing: Fix govet errors
2020-08-24 18:39:56 +00:00
Daniel Nephin
984318c2d8 Merge pull request #8537 from hashicorp/dnephin/fix-panic-on-connect-nil
Fix panic when decoding 'Connect: null'
2020-08-20 22:01:30 +00:00
Hans Hasselberg
bc5e2ddfc3 add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the JSON response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully set up as the primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```
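
A consumer of this response could check, per entry, whether rotation has finished. The sketch below is illustrative only: `KeyringEntry`, `ParseKeyring` and `RotationComplete` are hypothetical names, not part of Consul's API package, and it assumes rotation counts as complete when exactly one primary key is reported and its count equals `NumNodes`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KeyringEntry mirrors the fields shown in the JSON response above.
// Hypothetical type for illustration, not Consul's own struct.
type KeyringEntry struct {
	WAN         bool
	Datacenter  string
	Segment     string
	Keys        map[string]int
	PrimaryKeys map[string]int
	NumNodes    int
}

// ParseKeyring decodes a response body into a slice of entries.
func ParseKeyring(body []byte) ([]KeyringEntry, error) {
	var entries []KeyringEntry
	err := json.Unmarshal(body, &entries)
	return entries, err
}

// RotationComplete reports whether a single primary key is in use
// by every node of the entry (assumed definition of "complete").
func RotationComplete(e KeyringEntry) bool {
	if len(e.PrimaryKeys) != 1 {
		return false
	}
	for _, nodes := range e.PrimaryKeys {
		return nodes == e.NumNodes
	}
	return false
}

func main() {
	// Shortened, made-up key names; the shape follows the response above.
	body := []byte(`[{"WAN": false, "Datacenter": "dc1", "Segment": "",
		"Keys": {"k1": 3, "k2": 8}, "PrimaryKeys": {"k2": 8}, "NumNodes": 8}]`)
	entries, err := ParseKeyring(body)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("dc=%s wan=%t rotation complete: %t\n",
			e.Datacenter, e.WAN, RotationComplete(e))
	}
}
```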

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 07:51:22 +00:00
Daniel Nephin
d5179becf6 Merge pull request #8509 from hashicorp/dnephin/use-t.cleanup-in-testagent
testing: Use t.cleanup in TestAgent, and fix two flaky tests
2020-08-14 20:34:09 +00:00
R.B. Boyer
7983023acf
[backport/1.8.x] connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8494)
Backport of #8470 to 1.8.x
2020-08-13 15:26:23 -05:00
Daniel Nephin
010a8eb515 Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake
state: speed up tests that use watchLimit
2020-08-13 15:16:58 +00:00
hashicorp-ci
39b8ecb84a
update bindata_assetfs.go 2020-08-12 19:03:06 +00:00
Freddy
56993a0054 Notify alias checks when aliased service is [de]registered (#8456) 2020-08-12 15:48:23 +00:00
Hans Hasselberg
04fcbff24b Merge pull request #8471 from hashicorp/local_only
thread local-only through the layers
2020-08-12 06:56:10 +00:00
Kyle Havlovitz
ae8aa66cd7 Backport catalog index fix to 1.8.x 2020-08-11 15:13:25 -07:00
Kyle Havlovitz
eaf3a00b88 Fix a state store comment about version 2020-08-11 14:22:44 -07:00
Kyle Havlovitz
0b715f5f27 fsm: Fix snapshot bug with restoring node/service/check indexes 2020-08-11 14:22:42 -07:00
Mike Morris
99d683bfe5 changelog: Update for 1.8.2, 1.7.6, 1.7.5 and 1.6.7 (#8462)
* update bindata_assetfs.go

* Release v1.8.2

* Putting source back into Dev Mode

* changelog: add entries for 1.7.6, 1.7.5 and 1.6.7

Co-authored-by: hashicorp-ci <hashicorp-ci@users.noreply.github.com>
2020-08-07 19:00:23 -04:00
Daniel Nephin
62703e4426
Merge pull request #8438 from hashicorp/dnephin/1.8.x-backport-ineffassign
[1.8.x] Backport addition of ineffassign linter and staticcheck
2020-08-07 13:04:56 -04:00
Matt Keeler
1c3c8c7804 Require token replication to be enabled in secondary dcs when ACLs are enabled with AutoConfig (#8451)
AutoConfig will generate local tokens for clients and the ability to use local tokens is gated off of token replication being enabled and being configured with a replication token. Therefore we already have a hard requirement on having token replication enabled, this commit just makes sure to surface that to the operator instead of having to discern what the issue is from RPC errors.
2020-08-07 14:20:57 +00:00
Hans Hasselberg
ba495cd11b auto_config implies connect (#8433) 2020-08-07 10:02:30 +00:00
Hans Hasselberg
1a351d53bb Mark its own cluster as healthy when rebalancing. (#8406)
This code started as an optimization to avoid doing an RPC Ping to
itself. But in a single server cluster the rebalancing was led to
believe that there were no healthy servers because foundHealthyServer
was not set. Now this is being set properly.
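
The failure mode can be sketched as follows. This is a simplified illustration, not the actual rebalancing code: the `server` type, `hasHealthyServer` and the `ping` callback are hypothetical, and only the `foundHealthyServer` flag name comes from the description above.

```go
package main

import "fmt"

// server is a hypothetical stand-in for an entry in the server list.
type server struct {
	Name string
}

// hasHealthyServer walks the list and reports whether a healthy
// server was found. ping stands in for the RPC ping.
func hasHealthyServer(self string, servers []server, ping func(server) bool) bool {
	foundHealthyServer := false
	for _, s := range servers {
		if s.Name == self {
			// Optimization: do not RPC ping ourselves. The flag must
			// still be set here; leaving it false is what made a
			// single server cluster look like it had no healthy servers.
			foundHealthyServer = true
			break
		}
		if ping(s) {
			foundHealthyServer = true
			break
		}
	}
	return foundHealthyServer
}

func main() {
	neverHealthy := func(server) bool { return false }
	single := []server{{Name: "server-1"}}
	fmt.Println(hasHealthyServer("server-1", single, neverHealthy))
}
```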

Fixes #8401 and #8403.
2020-08-06 08:43:18 +00:00
Daniel Nephin
2bde91a2a0 Merge pull request #8404 from hashicorp/dnephin/remove-log-output-field
Use Logger consistently, instead of LogOutput
2020-08-05 18:32:16 +00:00
Daniel Nephin
4756cb6af4 Merge pull request #8437 from hashicorp/dnephin/fix-service-checks-cache-type
cache-type: Return nil value on error
2020-08-05 17:50:28 +00:00
Daniel Nephin
26e53053a9 Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4
ci: enable SA4006 staticcheck check and add ineffassign
2020-08-05 13:37:35 -04:00
freddygv
b5e858d3e1 Avoid panics during shutdown routine 2020-07-30 11:13:40 -06:00
Matt Keeler
c9b66157a1
Ensure certificates retrieved through the cache get persisted with auto-config (#8409) 2020-07-30 11:42:24 -04:00
Matt Keeler
4f98af0724 Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394)
Ensure that enabling AutoConfig sets the tls configurator properly

This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.
2020-07-30 14:16:15 +00:00
Hans Hasselberg
ae2cbbce99 agent/cache test for cache throttling. (#8396) 2020-07-30 12:41:38 +00:00
Matt Keeler
e813445e57 Agent Auto Config: Implement Certificate Generation (#8360)
Most of the groundwork was laid in previous PRs, from adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server.

This also refactors the auto-config package a bit to split things out into multiple files.
2020-07-28 19:32:22 +00:00
Matt Keeler
91ec880e07
Backport #8389 (#8392)
# Conflicts:
#	agent/cache-types/catalog_list_services_test.go
2020-07-28 14:22:29 -04:00
Pierre Souchay
678489d9d1 Added ratelimit to handle throttling cache (#8226)
This implements a solution for #7863

It does:

* Add a new config `cache.entry_fetch_rate` to limit the number of calls/s for a given cache entry (default value = rate.Inf)
* Add `cache.entry_fetch_max_burst`, the burst size of the rate limit (default value = 2)

The new configuration supports the following syntax, for instance to allow 1 query every 3s:

* command line HCL: -hcl 'cache = { entry_fetch_rate = 0.333}'
* in JSON:

{
  "cache": {
    "entry_fetch_rate": 0.333
  }
}
2020-07-27 21:11:42 +00:00
Matt Keeler
0937f70ddf Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364)
The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.
2020-07-24 14:01:58 +00:00
Matt Keeler
4d41ee3887 Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363)
This allows this to be reused elsewhere.
2020-07-23 20:05:52 +00:00
Matt Keeler
56b46436c1
Backport: #8362 (#8366)
Refactoring of the agentpb package.

First move the whole thing to the top-level proto package name.

Secondly change some things around internally to have sub-packages.
# Conflicts:
#	agent/consul/state/acl.go
#	agent/consul/state/acl_test.go
2020-07-23 12:44:27 -04:00
Daniel Nephin
4205fdf1d6 Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs
testutil: NewLogBuffer - buffer logs until a test fails
2020-07-21 19:22:29 +00:00
Matt Keeler
a8d2e5a2c2
Disable background cache refresh for Connect Leaf Certs
The rationale behind removing them is that all of our own code (xDS, builtin connect proxy) use the cache notification mechanism. This ensures that the blocking fetch behind the scenes is always executing. Therefore the only way you might go to get a certificate and have to wait is when 1) the request has never been made for that cert before or 2) you are using the v1/agent/connect/ca/leaf API for retrieving the cert yourself.

In the first case, the refresh change doesn’t alter the behavior. In the second case, it can be mitigated by using blocking queries with that API, which, just like the normal cache notification mechanism, will cause the blocking fetch to be initiated so that leaf certs are retrieved as soon as needed.

If you are not using blocking queries, or Envoy/xDS, or the builtin connect proxy but are retrieving the certs yourself then the HTTP endpoint might take a little longer to respond.

This also renames the RefreshTimeout field on the register options to QueryTimeout to more accurately reflect that it is used for any type that supports blocking queries.

# Conflicts:
#	agent/cache/cache.go
2020-07-21 13:51:18 -04:00
Matt Keeler
24e11b511e
Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate
The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint.

This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality.

Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.

# Conflicts:
#	agent/agent.go
2020-07-21 13:49:18 -04:00
Daniel Nephin
65566e2c98 Merge pull request #8290 from hashicorp/dnephin/watch-decode
watch: fix script watches with single arg
2020-07-20 18:41:48 +00:00
André
927e73d8db minor: fix docstring of DNSOnlyPassing (#8318)
In runtime.go it had "duration" but it is actually a boolean.
2020-07-16 13:48:07 +00:00
Matt Keeler
625055a556 Add ability for notifications when one of the agent tokens is updated (#8301)
Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
2020-07-14 13:54:38 +00:00
Freddy
89af0212d3 Add api mod support for /catalog/gateway-services (#8278) 2020-07-10 19:02:09 +00:00
R.B. Boyer
2142a697ad
[backport: 1.8.x] xds: version sniff envoy and switch regular expressions from 'regex' to 'safe_regex' on newer envoy versions (#8265)
cherry-pick of #8222 onto origin/release/1.8.x

Fixes: #8205
2020-07-09 17:04:23 -05:00
Matt Keeler
38251ab0e8
Pass the Config and TLS Configurator into the AutoConfig constructor
This is instead of having the AutoConfigBackend interface provide functions for retrieving them.

NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.
2020-07-09 10:38:29 -04:00
Matt Keeler
f06595992a
Rename (*Server).forward to (*Server).ForwardRPC
Also get rid of the preexisting shim in server.go, which existed only so that a method with this name could call the unexported one.
2020-07-09 10:38:16 -04:00
Matt Keeler
977eb725a7
Refactor AutoConfig RPC to not have a direct dependency on the Server type
Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.

# Conflicts:
#	agent/pool/pool.go
2020-07-09 10:37:55 -04:00
R.B. Boyer
8a5680aaf0
connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8247)
cherry-pick of #8165 onto origin/release/1.8.x
2020-07-07 16:22:30 -05:00
Chris Piraino
cbf143844f Append port number to ingress host domain (#8190)
A port can be sent in the Host header as defined in the HTTP RFC, so we
take any hosts that we want to match traffic to and also add another
host with the listener port added.
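
As a sketch of the idea (not the code from this PR), the host list can be expanded so that each host also appears with the listener port appended. The `hostsWithPort` helper and the example host name are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// hostsWithPort returns every host followed by the same host with
// ":port" appended, so a Host header with or without a port matches.
func hostsWithPort(hosts []string, port int) []string {
	out := make([]string, 0, len(hosts)*2)
	for _, h := range hosts {
		out = append(out, h, h+":"+strconv.Itoa(port))
	}
	return out
}

func main() {
	// Example host name and port chosen for illustration only.
	fmt.Println(hostsWithPort([]string{"web.example.test"}, 8080))
}
```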

Also fix an issue with envoy integration tests not running the
case-ingress-gateway-tls test.
2020-07-07 15:43:32 +00:00