Commit Graph

2182 Commits

Author SHA1 Message Date
Freddy c18a218bbb Avoid potential proxycfg/xDS deadlock using non-blocking send 2021-02-08 23:18:38 +00:00
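The non-blocking send named in the commit above can be illustrated with a `select` that has a `default` case. This is a minimal sketch, not the actual proxycfg/xDS code; `trySend` and the channel type are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// trySend (hypothetical helper) attempts to deliver v on ch without blocking.
// It reports whether the value was delivered.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// No receiver is ready and the buffer is full: give up instead of
		// blocking, so the sender can never be stuck waiting on the receiver.
		return false
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(trySend(ch, 1)) // true: the buffer has room
	fmt.Println(trySend(ch, 2)) // false: the buffer is full and nobody is receiving
}
```

The trade-off is that a dropped value must be safe to lose, for example because a later update supersedes it.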
R.B. Boyer 556b8bd1c2 server: use the presence of stored federation state data as a sign that we already activated the federation state feature flag (#9519)
This way we only have to wait for the serf barrier to pass once before
we can make use of federation state APIs. Without this patch every
restart needs to re-compute the change.
2021-02-08 19:30:58 +00:00
R.B. Boyer eed2302b43 xds: prevent LDS flaps in mesh gateways due to unstable datacenter lists (#9651)
Also fix a similar issue in Terminating Gateways that was masked by an overzealous test.
2021-02-08 16:20:37 +00:00
R.B. Boyer bb5c2e802b xds: deduplicate mesh gateway listeners in a stable way (#9650)
In a situation where the mesh gateway is configured to bind to multiple
network interfaces, we use a feature called 'tagged addresses'.
Sometimes an address is duplicated across multiple tags such as 'lan'
and 'lan_ipv4'.

There is code to deduplicate these things when creating envoy listeners,
but that code doesn't ensure that the same tag wins every time. If the
winning tag flaps between xDS discovery requests it will cause the
listener to be drained and replaced.
2021-02-05 22:28:57 +00:00
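One way to make the same tag win every time is to visit tags in sorted order before deduplicating. The sketch below assumes a plain map of tag to address; `dedupeAddresses` and the sample addresses are hypothetical and are not taken from the Consul source.

```go
package main

import (
	"fmt"
	"sort"
)

// dedupeAddresses (hypothetical helper) returns one winning tag per unique
// address. Tags are visited in sorted order, so the lexicographically
// smallest tag wins for each address on every call.
func dedupeAddresses(tagged map[string]string) map[string]string {
	tags := make([]string, 0, len(tagged))
	for tag := range tagged {
		tags = append(tags, tag)
	}
	sort.Strings(tags) // map iteration order is random; sorting makes it stable

	winners := make(map[string]string) // address -> winning tag
	for _, tag := range tags {
		addr := tagged[tag]
		if _, seen := winners[addr]; !seen {
			winners[addr] = tag
		}
	}
	return winners
}

func main() {
	tagged := map[string]string{
		"lan":      "10.0.0.1:8443",
		"lan_ipv4": "10.0.0.1:8443",
		"wan":      "198.51.100.7:8443",
	}
	// "lan" sorts before "lan_ipv4", so it always wins for the shared address.
	fmt.Println(dedupeAddresses(tagged))
}
```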
Hans Hasselberg e6584182f2 Add flags to support CA generation for Connect (#9585) 2021-01-27 07:55:31 +00:00
R.B. Boyer 685c38a1b1 server: initialize mgw-wanfed to use local gateways more on startup (#9528)
Fixes #9342
2021-01-25 23:31:28 +00:00
hashicorp-ci dd110e8c74 Merge branch 'release/1.8.8' into remote-x 2021-01-22 20:17:04 +00:00
hashicorp-ci e2f9307430 update bindata_assetfs.go 2021-01-22 18:50:02 +00:00
R.B. Boyer f135c3b64e server: when wan federating via mesh gateways only do heuristic primary DC bypass on the leader (#9366)
Fixes #9341
2021-01-22 16:07:11 +00:00
Matt Keeler 7cddf128e9 Backport #9570 to release/1.8.x: Ensure that CA initialization does not block leader election. (#9571)
Backport of PR: 9570

After fixing that bug I uncovered a couple more:

Fix an issue where we might try to cross sign a cert when we never had a valid root.
Fix a potential issue where reconfiguring the CA could cause either the Vault or AWS PCA CA providers to delete resources that are still required by the new incarnation of the CA.
2021-01-21 09:04:30 -05:00
Matt Keeler 87f7bb475c Fix flaky test by marking mock expectations as optional (#9596)
These expectations are optional because in a slow CI environment the deadline to cancel the context might occur before the goroutine reaches issuing the RPC. Either way we are successfully ensuring context cancellation is working.
2021-01-20 15:59:13 +00:00
Matt Keeler 0d4b710c4a Special case the error returned when we have a Raft leader but are not tracking it in the ServerLookup (#9487)
This can happen when one other node in the cluster such as a client is unable to communicate with the leader server and sees it as failed. When that happens its failing status eventually gets propagated to the other servers in the cluster and eventually this can result in RPCs returning “No cluster leader” error.

That error is misleading and unhelpful for determining the root cause of the issue, as it's not Raft stability but rather a client -> server networking issue. Therefore this commit will add a new error that will be returned in that case to differentiate between the two cases.
2021-01-04 19:05:58 +00:00
hashicorp-ci bf98530f78 update bindata_assetfs.go 2020-12-10 21:46:51 +00:00
R.B. Boyer 0ecd16a382 acl: global tokens created by auth methods now correctly replicate to secondary datacenters (#9363)
Previously the tokens would fail to insert into the secondary's state
store because the AuthMethod field of the ACLToken did not point to a
known auth method from the primary.

Backport of #9351 to 1.8.x
2020-12-10 08:35:48 -06:00
hashicorp-ci 0b1d1323d7 update bindata_assetfs.go 2020-12-03 19:11:42 +00:00
Kyle Havlovitz e51bd34952 Merge pull request #9318 from hashicorp/ca-update-followup
connect: Fix issue with updating config in secondary
2020-12-02 20:18:32 +00:00
Kyle Havlovitz 6e62166f6d Merge pull request #9009 from hashicorp/update-secondary-ca
connect: Fix an issue with updating CA config in a secondary datacenter
2020-11-30 16:13:12 -08:00
freddygv 545e7379ee Merge branch 'release/1.8.x' into release/1.8.6 2020-11-19 15:45:37 -07:00
hashicorp-ci 8967edad2a update bindata_assetfs.go 2020-11-19 20:56:50 +00:00
Freddy 8ed789766b Require operator:write to get Connect CA config (#9240)
A vulnerability was identified in Consul and Consul Enterprise (“Consul”) such that operators with `operator:read` ACL permissions are able to read the Consul Connect CA configuration when explicitly configured with the `/v1/connect/ca/configuration` endpoint, including the private key. This allows the user to effectively privilege escalate by enabling the ability to mint certificates for any Consul Connect services. This would potentially allow them to masquerade (receive/send traffic) as any service in the mesh.

--

This PR increases the permissions required to read the Connect CA's private key when it was configured via the `/connect/ca/configuration` endpoint. They are now `operator:write`.
2020-11-19 13:21:51 -07:00
Freddy cfd72af36c Require operator:write to get Connect CA config (#9240)
A vulnerability was identified in Consul and Consul Enterprise (“Consul”) such that operators with `operator:read` ACL permissions are able to read the Consul Connect CA configuration when explicitly configured with the `/v1/connect/ca/configuration` endpoint, including the private key. This allows the user to effectively privilege escalate by enabling the ability to mint certificates for any Consul Connect services. This would potentially allow them to masquerade (receive/send traffic) as any service in the mesh.

--

This PR increases the permissions required to read the Connect CA's private key when it was configured via the `/connect/ca/configuration` endpoint. They are now `operator:write`.
2020-11-19 17:15:23 +00:00
Matt Keeler 0c2eea2918 Backport #9156 to 1.8.x (#9164)
The Catalog, Config Entry, KV and Session resources potentially re-validate the input as it's coming in. We need to prevent snapshot restoration failures due to missing namespaces or namespaces that are being deleted in enterprise.
2020-11-11 15:12:10 -05:00
Daniel Nephin 95ed6ec143 Merge pull request #8976 from joel0/wrap-eof
Wrap rpc error object
2020-11-11 16:51:48 +00:00
Daniel Nephin 52f8ada38e Merge pull request #9149 from joel0/wrap-errors
Use error wrapping to preserve error type info
2020-11-10 23:27:48 +00:00
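The value of error wrapping can be seen by comparing `%w` with `%v` in `fmt.Errorf`. This is a generic sketch built on the Go standard library; `callRPC`, `wrapped`, `flattened`, and `isEOF` are hypothetical names and do not come from the merged pull requests.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// callRPC is a hypothetical stand-in for an RPC call that fails with io.EOF.
func callRPC() error {
	return io.EOF
}

// wrapped adds context with %w, which keeps the original error in the chain.
func wrapped() error {
	if err := callRPC(); err != nil {
		return fmt.Errorf("rpc error making call: %w", err)
	}
	return nil
}

// flattened adds context with %v, which keeps only the error text.
func flattened() error {
	if err := callRPC(); err != nil {
		return fmt.Errorf("rpc error making call: %v", err)
	}
	return nil
}

// isEOF reports whether io.EOF appears anywhere in err's wrap chain.
func isEOF(err error) bool {
	return errors.Is(err, io.EOF)
}

func main() {
	fmt.Println(isEOF(wrapped()))   // true: type information is preserved
	fmt.Println(isEOF(flattened())) // false: only the message text survives
}
```

Callers that need to react to a specific failure, such as retrying on EOF, can only do so when the error is wrapped rather than flattened.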
Kyle Havlovitz b72e11aa9c Merge pull request #9053 from hashicorp/vault-token-lookupself
connect: Use the lookup-self endpoint for Vault token
2020-10-27 21:34:37 +00:00
hashicorp-ci 90324f1bac update bindata_assetfs.go 2020-10-23 20:32:13 +00:00
R.B. Boyer a155423f29 server: config entry replication now correctly uses namespaces in comparisons (#9024)
Previously config entries sharing a kind & name but in different
namespaces could occasionally cause "stuck states" in replication
because the namespace fields were ignored during the differential
comparison phase.

Example:

Two config entries written to the primary:

    kind=A,name=web,namespace=bar
    kind=A,name=web,namespace=foo

Under the covers these both get saved to memdb, so they are sorted by
all 3 components (kind,name,namespace) during natural iteration. This
means that before the replication code does its own incomplete sort,
the underlying data IS sorted by namespace ascending (bar comes before
foo).

After one pass of replication the primary and secondary datacenters have
the same set of config entries present. If
"kind=A,name=web,namespace=bar" were to be deleted, then things get
weird. Before replication the two sides look like:

primary: [
    kind=A,name=web,namespace=foo
]
secondary: [
    kind=A,name=web,namespace=bar
    kind=A,name=web,namespace=foo
]

The differential comparison phase walks these two lists in sorted order
and first compares "kind=A,name=web,namespace=foo" vs
"kind=A,name=web,namespace=bar" and falsely determines they are the SAME
and thus cause an update of "kind=A,name=web,namespace=foo". Then it
compares "<nothing>" with "kind=A,name=web,namespace=foo" and falsely
determines that the latter should be DELETED.

During reconciliation the deletes are processed before updates, and so
for a brief moment in the secondary "kind=A,name=web,namespace=foo" is
erroneously deleted and then immediately restored.

Unfortunately after this replication phase the final state is identical
to the initial state, so when it loops around again (rate limited) it
repeats the same set of operations indefinitely.
2020-10-23 18:42:45 +00:00
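The differential comparison described above can be sketched as a merge-style walk over two sorted lists whose ordering includes the namespace. This is an illustrative simplification, not Consul's replication code: `entry`, `compare`, and `diff` are hypothetical names, and entries with matching keys are treated as already in sync, whereas a real implementation would also compare their contents.

```go
package main

import "fmt"

// entry is a hypothetical, simplified config entry key.
type entry struct{ Kind, Name, Namespace string }

func cmpStr(a, b string) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// compare orders entries by kind, then name, then namespace. Leaving the
// namespace out of this function is the kind of bug the commit describes.
func compare(a, b entry) int {
	if c := cmpStr(a.Kind, b.Kind); c != 0 {
		return c
	}
	if c := cmpStr(a.Name, b.Name); c != 0 {
		return c
	}
	return cmpStr(a.Namespace, b.Namespace)
}

// diff walks two lists sorted by compare and returns the entries to delete
// from the secondary and the entries to copy over from the primary.
func diff(primary, secondary []entry) (deletes, updates []entry) {
	i, j := 0, 0
	for i < len(primary) && j < len(secondary) {
		c := compare(primary[i], secondary[j])
		switch {
		case c == 0: // present on both sides
			i++
			j++
		case c < 0: // only in the primary
			updates = append(updates, primary[i])
			i++
		default: // only in the secondary
			deletes = append(deletes, secondary[j])
			j++
		}
	}
	updates = append(updates, primary[i:]...)
	deletes = append(deletes, secondary[j:]...)
	return deletes, updates
}

func main() {
	primary := []entry{{"A", "web", "foo"}}
	secondary := []entry{{"A", "web", "bar"}, {"A", "web", "foo"}}
	deletes, updates := diff(primary, secondary)
	fmt.Println(len(deletes), deletes[0].Namespace, len(updates)) // 1 bar 0
}
```

With the namespace included in the comparison, only the "bar" entry is deleted and "foo" is left untouched, so the next replication pass finds nothing to do.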
R.B. Boyer 3456b57dec connect: update supported envoy point releases to 1.14.5, 1.13.6, 1.12.7, 1.11.2 for 1.8.x (#8999)
Selective backport of #8944 to 1.8.x
2020-10-22 13:26:51 -05:00
Daniel Nephin 2ed5b108c5 Merge pull request #8924 from ShimmerGlass/fix-sidecar-deregister-after-restart
Fix: service LocallyRegisteredAsSidecar property is not persisted
2020-10-22 17:27:41 +00:00
Kyle Havlovitz a8cc967a02 Merge pull request #8784 from hashicorp/renew-intermediate-primary
connect: Enable renewing the intermediate cert in the primary DC
2020-10-09 12:26:49 -07:00
Kyle Havlovitz df160fee3e Merge pull request #8862 from hashicorp/backport/1.8.x-vault-token-renew
backport(1.8.x): vault token renew
2020-10-09 08:10:45 -07:00
Matt Keeler 6cae442ef4 Add capability for the v1/connect/ca/roots endpoint to return a PEM encoded certificate chain (#8774)
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
2020-10-09 14:43:59 +00:00
Kyle Havlovitz 2dea87b5bb Run make update-vendor after cherry-pick 2020-10-07 16:40:28 -04:00
Kyle Havlovitz b8038d1814 Update vault CA for latest api client 2020-10-07 16:40:27 -04:00
Kyle Havlovitz 57a98945f5 Clean up CA shutdown logic and error 2020-10-07 16:40:27 -04:00
Kyle Havlovitz 9496780ab4 Clean up Vault renew tests and shutdown 2020-10-07 16:40:27 -04:00
Kyle Havlovitz 844e9ffe16 Use mapstructure for decoding vault data 2020-10-07 16:40:27 -04:00
Kyle Havlovitz 449103411d Add a stop function to make sure the renewer is shut down on leader change 2020-10-07 16:40:27 -04:00
Kyle Havlovitz 2fc2b61b48 Add a test for token renewal 2020-10-07 16:40:27 -04:00
Kyle Havlovitz f416c1a8bd Automatically renew the token used by the Vault CA provider 2020-10-07 16:40:27 -04:00
Hans Hasselberg 780d2d79fb fix ent error (#8750) 2020-09-25 10:41:18 -05:00
Hans Hasselberg 010abda64c use service datacenter for dns name (#8704)
* Use args.Datacenter instead of configured datacenter
2020-09-25 10:41:02 -05:00
R.B. Boyer e05c30de1f agent: when enable_central_service_config is enabled ensure agent reload doesn't revert check state to critical (#8747)
Likely introduced when #7345 landed.
2020-09-24 21:24:51 +00:00
Alexander Mykolaichuk e039087adf added permission denied error message (#8044) 2020-09-22 18:36:36 +00:00
Hans Hasselberg abd8e605cf fix TestLeader_SecondaryCA_IntermediateRenew (#8702)
* fix lessThanHalfTime
* get lock for CAProvider()
* make a var to relate both vars
* rename to getCAProviderWithLock
* move CertificateTimeDriftBuffer to agent/connect/ca
2020-09-18 08:14:09 +00:00
Mike Morris dc50eca37f test: update tags for database service registrations and queries (#8693) 2020-09-16 18:21:49 +00:00
Daniel Nephin 3ee2aa1325 Merge pull request #8685 from pierresouchay/do_not_flood_logs_with_Non-server_in_server-only_area
[BUGFIX] Avoid GetDatacenter* methods to flood Consul servers logs
2020-09-15 21:58:29 +00:00
Kyle Havlovitz 2ed68b9f45 Merge pull request #8646 from hashicorp/common-intermediate-ttl
Move IntermediateCertTTL to common CA config
2020-09-15 19:04:27 +00:00
hashicorp-ci 2394439344 update bindata_assetfs.go 2020-09-11 03:06:11 +00:00
Hans Hasselberg 89b7e80478 secondaryIntermediateCertRenewalWatch abort on success (#8588)
secondaryIntermediateCertRenewalWatch was using `retryLoopBackoff` to
renew the intermediate certificate. Once it entered the inner loop and
started `retryLoopBackoff` it would never leave that.
`retryLoopBackoffAbortOnSuccess` will return when renewing is
successful, like it was intended originally.
2020-09-04 09:49:16 +00:00
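The difference between a retry loop that never exits and one that aborts on success can be sketched as follows. This is a simplified illustration and not the signature of Consul's `retryLoopBackoffAbortOnSuccess`; `retryUntilSuccess`, `failNTimes`, and `errNotYet` are hypothetical names, and the stop channel a real server loop would need is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errNotYet = errors.New("not yet")

// retryUntilSuccess (hypothetical) calls fn until it returns nil, doubling
// the wait between attempts up to maxWait. It returns the number of attempts.
func retryUntilSuccess(fn func() error, wait, maxWait time.Duration) int {
	attempts := 0
	for {
		attempts++
		if err := fn(); err == nil {
			return attempts // leave the loop as soon as the work succeeds
		}
		time.Sleep(wait)
		wait *= 2
		if wait > maxWait {
			wait = maxWait
		}
	}
}

// failNTimes returns a function that fails n times and then succeeds.
func failNTimes(n int) func() error {
	return func() error {
		if n > 0 {
			n--
			return errNotYet
		}
		return nil
	}
}

func main() {
	attempts := retryUntilSuccess(failNTimes(2), time.Millisecond, 4*time.Millisecond)
	fmt.Println(attempts) // 3
}
```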