Commit Graph

832 Commits

Author SHA1 Message Date
Daniel Nephin 62703e4426
Merge pull request #8438 from hashicorp/dnephin/1.8.x-backport-ineffassign
[1.8.x] Backport addition of  ineffassign linter and staticcheck
2020-08-07 13:04:56 -04:00
Hans Hasselberg ba495cd11b auto_config implies connect (#8433) 2020-08-07 10:02:30 +00:00
Daniel Nephin 2bde91a2a0 Merge pull request #8404 from hashicorp/dnephin/remove-log-output-field
Use Logger consistently, instead of LogOutput
2020-08-05 18:32:16 +00:00
Daniel Nephin 26e53053a9 Merge pull request #8034 from hashicorp/dnephin/add-linter-staticcheck-4
ci: enable SA4006 staticcheck check and add ineffassign
2020-08-05 13:37:35 -04:00
freddygv b5e858d3e1 Avoid panics during shutdown routine 2020-07-30 11:13:40 -06:00
Matt Keeler 4f98af0724 Allow setting verify_incoming* when using auto_encrypt or auto_config (#8394)
Ensure that enabling AutoConfig sets the tls configurator properly

This also refactors the TLS configurator a bit so the naming doesn’t imply only AutoEncrypt as the source of the automatically setup TLS cert info.
2020-07-30 14:16:15 +00:00
Matt Keeler e813445e57 Agent Auto Config: Implement Certificate Generation (#8360)
Most of the groundwork was laid in previous PRs between adding the cert-monitor package to extracting the logic of signing certificates out of the connect_ca_endpoint.go code and into a method on the server.

This also refactors the auto-config package a bit to split things out into multiple files.
2020-07-28 19:32:22 +00:00
Matt Keeler 0937f70ddf Move connect root retrieval and cert signing logic out of the RPC endpoints (#8364)
The code now lives on the Server type itself. This was done so that all of this could be shared with auto config certificate signing.
2020-07-24 14:01:58 +00:00
Matt Keeler 4d41ee3887 Move generation of the CA Configuration from the agent code into a method on the RuntimeConfig (#8363)
This allows this to be reused elsewhere.
2020-07-23 20:05:52 +00:00
Matt Keeler 56b46436c1
Backport: #8362 (#8366)
Refactoring of the agentpb package.

First move the whole thing to the top-level proto package name.

Secondly change some things around internally to have sub-packages.
# Conflicts:
#	agent/consul/state/acl.go
#	agent/consul/state/acl_test.go
2020-07-23 12:44:27 -04:00
Daniel Nephin 4205fdf1d6 Merge pull request #7948 from hashicorp/dnephin/buffer-test-logs
testutil: NewLogBuffer - buffer logs until a test fails
2020-07-21 19:22:29 +00:00
Matt Keeler 24e11b511e
Fix issue with changing the agent token causing failure to renew the auto-encrypt certificate
The fallback method would still work but it would get into a state where it would let the certificate expire for 10s before getting a new one. And the new one used the less secure RPC endpoint.

This is also a pretty large refactoring of the auto encrypt code. I was going to write some tests around the certificate monitoring but it was going to be impossible to get a TestAgent configured in such a way that I could write a test that ran in less than an hour or two to exercise the functionality.

Moving the certificate monitoring into its own package will allow for dependency injection and in particular mocking the cache types to control how it hands back certificates and how long those certificates should live. This will allow for exercising the main loop more than would be possible with it coupled so tightly with the Agent.

# Conflicts:
#	agent/agent.go
2020-07-21 13:49:18 -04:00
Matt Keeler 38251ab0e8
Pass the Config and TLS Configurator into the AutoConfig constructor
This is instead of having the AutoConfigBackend interface provide functions for retrieving them.

NOTE: the config is not reloadable. For now this is fine as we don’t look at any reloadable fields. If that changes then we should provide a way to make it reloadable.
2020-07-09 10:38:29 -04:00
Matt Keeler f06595992a
Rename (*Server).forward to (*Server).ForwardRPC
Also get rid of the preexisting shim in server.go that existed before to have this name just call the unexported one.
2020-07-09 10:38:16 -04:00
Matt Keeler 977eb725a7
Refactor AutoConfig RPC to not have a direct dependency on the Server type
Instead it has an interface which can be mocked for better unit testing that is deterministic and not prone to flakiness.

# Conflicts:
#	agent/pool/pool.go
2020-07-09 10:37:55 -04:00
Matt Keeler 9c64239db7 Merge pull request #8211 from hashicorp/bugfix/auto-encrypt-various 2020-07-02 13:51:34 +00:00
Matt Keeler d73d299848 Merge pull request #8218 from yurkeen/fix-dns-rcode 2020-07-01 13:13:55 +00:00
Matt Keeler 8853e38c72
Various go routine leak fixes 2020-06-25 09:36:14 -04:00
Matt Keeler 0736c42b72 Allow cancelling startup when performing auto-config (#8157)
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2020-06-19 19:16:20 +00:00
Chris Piraino 8d72225d33 Remove ACLEnforceVersion8 from tests (#8138)
The field had been deprecated for a while and was recently removed,
however a PR which added these tests prior to removal was merged.
2020-06-18 18:15:43 +00:00
Matt Keeler 6375db7b4b Merge pull request #8086 from hashicorp/feature/auto-config/client-config-inject 2020-06-18 14:45:52 +00:00
Matt Keeler 9f37a218c5 Merge pull request #8035 from hashicorp/feature/auto-config/server-rpc 2020-06-17 20:08:17 +00:00
Pierre Souchay 318495d1f8 gossip: Ensure that metadata of Consul Service is updated (#7903)
While upgrading servers to a new version, I saw that metadata of
existing servers are not upgraded, so the version and raft meta
is not up to date in catalog.

The only way to do it was to:
 * update Consul server
 * make it leave the cluster, then metadata is accurate

That's because the optimization to avoid updating catalog does
not take into account metadata, so no update on catalog is performed.
2020-06-17 10:17:33 +00:00
Matt Keeler c3b348bebb Agent Auto Configuration: Configuration Syntax Updates (#8003) 2020-06-16 19:03:59 +00:00
Matt Keeler 3c4413cbed ACL Node Identities (#7970)
A Node Identity is very similar to a service identity. Its main targeted use is to allow creating tokens for use by Consul agents that will grant the necessary permissions for all the typical agent operations (node registration, coordinate updates, anti-entropy).

Half of this commit is for golden file based tests of the acl token and role cli output. Another big updates was to refactor many of the tests in agent/consul/acl_endpoint_test.go to use the same style of tests and the same helpers. Besides being less boiler plate in the tests it also uses a common way of starting a test server with ACLs that should operate without any warnings regarding deprecated non-uuid master tokens etc.
2020-06-16 16:55:01 +00:00
Freddy 2af14433be Merge pull request #8099 from hashicorp/gateway-services-endpoint 2020-06-12 21:15:25 +00:00
Hans Hasselberg a678b47c73 acl: do not resolve local tokens from remote dcs (#8068) 2020-06-09 19:14:19 +00:00
Hans Hasselberg cfc95732f3
Tokens converted from legacy ACLs get their Hash computed (#8047) (#8054)
This allows new style token replication to work for legacy tokens as well when they change.
Fixes #5606
2020-06-08 23:36:55 +02:00
Hans Hasselberg de3e68c577 Merge pull request #7966 from hashicorp/pool_improvements
Agent connection pool cleanup
2020-06-05 19:03:24 +00:00
R.B. Boyer ebc5fc039f server: don't activate federation state replication or anti-entropy until all servers are running 1.8.0+ (#8014) 2020-06-04 21:05:49 +00:00
Matt Keeler 1e2754d59c Fix legacy management tokens in unupgraded secondary dcs (#7908)
The ACL.GetPolicy RPC endpoint was supposed to return the “parent” policy and not always the default policy. In the case of legacy management tokens the parent policy was supposed to be “manage”. The result of us not sending this properly was that operations that required specifically a management token such as saving a snapshot would not work in secondary DCs until they were upgraded.
2020-06-03 15:42:57 +00:00
Matt Keeler a539c5de88 Fix segfault due to race condition for checking server versions (#7957)
The ACL monitoring routine uses c.routers to check for server version updates. Therefore it needs to be started after initializing the routers.
2020-06-03 14:37:10 +00:00
R.B. Boyer 5404155d36 acl: allow auth methods created in the primary datacenter to optionally create global tokens (#7899) 2020-06-01 16:45:22 +00:00
R.B. Boyer c4b875cae4 acl: remove the deprecated `acl_enforce_version_8` option (#7991)
Fixes #7292
2020-06-01 10:40:22 -05:00
Jono Sosulska cedcbf3299 Replace whitelist/blacklist terminology with allowlist/denylist (#7971)
* Replace whitelist/blacklist terminology with allowlist/denylist
2020-06-01 10:40:14 -05:00
Daniel Nephin 1664067943 ci: Add staticcheck and fix most errors
Three of the checks are temporarily disabled to limit the size of the
diff, and allow us to enable all the other checks in CI.

In a follow up we can fix the issues reported by the other checks one
at a time, and enable them.
2020-06-01 10:40:04 -05:00
Pierre Souchay 876ee89d4a Allow to restrict servers that can join a given Serf Consul cluster. (#7628)
Based on work done in https://github.com/hashicorp/memberlist/pull/196
this allows to restrict the IP ranges that can join a given Serf cluster
and be a member of the cluster.

Restrictions on IPs can be done separatly using 2 new differents flags
and config options to restrict IPs for LAN and WAN Serf.
2020-06-01 10:31:32 -05:00
R.B. Boyer c2b903b597 create lib/stringslice package (#7934) 2020-05-27 16:48:01 +00:00
R.B. Boyer b527e77850 agent: handle re-bootstrapping in a secondary datacenter when WAN federation via mesh gateways is configured (#7931)
The main fix here is to always union the `primary-gateways` list with
the list of mesh gateways in the primary returned from the replicated
federation states list. This will allow any replicated (incorrect) state
to be supplemented with user-configured (correct) state in the config
file. Eventually the game of random selection whack-a-mole will pick a
winning entry and re-replicate the latest federation states from the
primary. If the user-configured state is actually the incorrect one,
then the same eventual correct selection process will work in that case,
too.

The secondary fix is actually to finish making wanfed-via-mgws actually
work as originally designed. Once a secondary datacenter has replicated
federation states for the primary AND managed to stand up its own local
mesh gateways then all of the RPCs from a secondary to the primary
SHOULD go through two sets of mesh gateways to arrive in the consul
servers in the primary (one hop for the secondary datacenter's mesh
gateway, and one hop through the primary datacenter's mesh gateway).
This was neglected in the initial implementation. While everything
works, ideally we should treat communications that go around the mesh
gateways as just provided for bootstrapping purposes.

Now we heuristically use the success/failure history of the federation
state replicator goroutine loop to determine if our current mesh gateway
route is working as intended. If it is, we try using the local gateways,
and if those don't work we fall back on trying the primary via the union
of the replicated state and the go-discover configuration flags.

This can be improved slightly in the future by possibly initializing the
gateway choice to local on startup if we already have replicated state.
This PR does not address that improvement.

Fixes #7339
2020-05-27 16:32:22 +00:00
R.B. Boyer 1765fa854e connect: ensure proxy-defaults protocol is used for upstreams (#7938) 2020-05-21 21:09:51 +00:00
Daniel Nephin 7925a0074c Merge pull request #7933 from hashicorp/dnephin/state-txn-missing-errors
state: fix unhandled error
2020-05-21 17:03:33 +00:00
Aleksandr Zagaevskiy 6aecf89418 Preserve ModifyIndex for unchanged entry in KVS TXN (#7832) 2020-05-21 17:03:16 +00:00
Daniel Nephin c02d4e1390 Merge pull request #7894 from hashicorp/dnephin/add-linter-staticcheck-1
Fix some bugs/issues found by staticcheck
2020-05-21 17:01:15 +00:00
Chris Piraino 6969d08361 Merge pull request #7898 from hashicorp/bug/update-gateways-on-config-entry-delete
Remove error from GatewayServices RPC when a service is not a gateway
2020-05-18 18:03:35 +00:00
Matt Keeler acccdbe45c
Fix identity resolution on clients and in secondary dcs (#7862)
Previously this happened to be using the method on the Server/Client that was meant to allow the ACLResolver to locally resolve tokens. On Servers that had tokens (primary or secondary dc + token replication) this function would lookup the token from raft and return the ACLIdentity. On clients this was always a noop. We inadvertently used this function instead of creating a new one when we added logging accessor ids for permission denied RPC requests. 

With this commit, a new method is used for resolving the identity properly via the ACLResolver which may still resolve locally in the case of being on a server with tokens but also supports remote token resolution.
2020-05-13 13:00:08 -04:00
Chris Piraino 7a7760bfd5
Make new gateway tests compatible with enterprise (#7856) 2020-05-12 13:48:20 -05:00
Daniel Nephin 600645b5f9 Add unconvert linter
To find unnecessary type convertions
2020-05-12 13:47:25 -04:00
Drew Bailey c9d0b83277 Value is already an int, remove type cast 2020-05-12 13:13:09 -04:00
R.B. Boyer 1efafd7523
acl: add auth method for JWTs (#7846) 2020-05-11 20:59:29 -05:00
Chris Piraino c21052457b
Return early from updateGatewayServices if nothing to update (#7838)
* Return early from updateGatewayServices if nothing to update

Previously, we returned an empty slice of gatewayServices, which caused
us to accidentally delete everything in the memdb table

* PR comment and better formatting
2020-05-11 14:46:48 -05:00