10240 Commits

Author SHA1 Message Date
R.B. Boyer 8e22d80e35
connect: fix failover through a mesh gateway to a remote datacenter (#6259)
Failover is pushed entirely down to the data plane by creating envoy
clusters and putting each successive destination in a different load
assignment priority band. For example this shows that normally requests
go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080:

- name: foo
  load_assignment:
    cluster_name: foo
    policy:
      overprovisioning_factor: 100000
    endpoints:
    - priority: 0
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 1.2.3.4
              port_value: 8080
    - priority: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 6.7.8.9
              port_value: 8080

Mesh gateways route requests based solely on the SNI header tacked onto
the TLS layer. Envoy currently only lets you configure the outbound SNI
header at the cluster layer.

When failing over through a mesh gateway you would ideally configure
the SNI value per endpoint, but that's not possible in envoy today.

This PR introduces a simpler way around the problem for now:

1. We identify any target of failover that will use mesh gateway mode local or
   remote and then further isolate any resolver node in the compiled discovery
   chain that has a failover destination set to one of those targets.

2. For each of these resolvers we compare the health of the endpoints that come
   back from the health API for the primary target and the serial failover
   targets. We walk the list of targets in order and if any endpoint is healthy
   we return that target, otherwise we move on to the next target.

3. The CDS and EDS endpoints both perform the measurements in (2) for the
   affected resolver nodes.

4. For CDS this measurement selects which TLS SNI field to use for the cluster
   (note the cluster is always going to be named for the primary target)

5. For EDS this measurement selects which set of endpoints will populate the
   cluster. Priority tiered failover is ignored.
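
A rough sketch of the selection in (2), for illustration only and not the actual consul code. The Endpoint type, the firstHealthyTarget name and the target names are made up here, and the message above does not say what happens when no target is healthy, so the sketch just reports that case to the caller:

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified view of one health API result.
type Endpoint struct {
	Address string
	Healthy bool
}

// firstHealthyTarget walks targets in order (primary first, then the serial
// failover targets) and returns the first target that has at least one
// healthy endpoint. The boolean is false when no target qualifies.
func firstHealthyTarget(targets []string, endpoints map[string][]Endpoint) (string, bool) {
	for _, target := range targets {
		for _, ep := range endpoints[target] {
			if ep.Healthy {
				return target, true
			}
		}
	}
	return "", false
}

func main() {
	targets := []string{"foo.dc1", "foo.dc2"}
	endpoints := map[string][]Endpoint{
		"foo.dc1": {{Address: "1.2.3.4:8080", Healthy: false}},
		"foo.dc2": {{Address: "6.7.8.9:8080", Healthy: true}},
	}
	if target, ok := firstHealthyTarget(targets, endpoints); ok {
		fmt.Println("selected target:", target) // selected target: foo.dc2
	}
}
```

Under this sketch, CDS would use the selected target to pick the SNI value and EDS would use it to pick the endpoint set, so both agree as long as they see the same health data.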

One of the big downsides to this approach is that failover detection and
correction are controlled by consul rather than deferred entirely to the data
plane as with the prior version. This also means that we can only fail over
using official health signals and cannot make use of data plane signals like
outlier detection to affect failover.

In this specific scenario the lack of data plane signals is ok: the data plane
signals for the ultimate destination endpoints are already scrambled when
traffic passes through the mesh gateway, so their effectiveness was muted
anyway and we're not losing much.

Another related fix is that we now use the endpoint health from the
underlying service, not the health of the gateway (regardless of
failover mode).
2019-08-05 13:30:35 -05:00
Alvin Huang 9f58504f1c
Merge pull request #6274 from hashicorp/merge-master-de01a1e
Merge master at de01a1e279
2019-08-02 19:13:54 -04:00
Alvin Huang 37ea271eb7 fix grpc-addr-config hosts template 2019-08-02 19:00:39 -04:00
Alvin Huang 206b2016a4 Merge remote-tracking branch 'origin/master' into release/1-6 2019-08-02 18:09:32 -04:00
R.B. Boyer 856090e893 update changelog 2019-08-02 15:36:13 -05:00
R.B. Boyer c395affc93
connect: expose an API endpoint to compile the discovery chain (#6248)
In addition to exposing compilation over the API, this cleans up the structures that are exchanged so they are easier to support and understand.

Also removed ability to configure the envoy OverprovisioningFactor.
2019-08-02 15:34:54 -05:00
Matt Keeler 1ab3d6a990
Update CHANGELOG.md 2019-08-02 16:23:00 -04:00
Matt Keeler 5ce6bfa292
Add license management functions to API client (#6268)
* Add license management functions to API client

* Get rid of jsonapi struct tags
2019-08-02 16:20:38 -04:00
Alvin Huang de01a1e279
add generic master merge into release/* branches (#6249) 2019-08-02 16:11:41 -04:00
Todd Radel 96be92f3b5
connect: generate intermediate at same time as root (#6272)
Generate intermediate at same time as root
Co-Authored-By: Freddy <freddygv@users.noreply.github.com>
2019-08-02 15:36:03 -04:00
Alvin Huang dd7b3ece64
Add arm builds (#6263)
* try arm builds

* Update .circleci/config.yml

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>

* Update .circleci/config.yml

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>

* Update .circleci/config.yml

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>
2019-08-02 15:15:59 -04:00
R.B. Boyer 64b235990d update changelog 2019-08-02 09:19:37 -05:00
R.B. Boyer dcb609af83
connect: detect and prevent circular discovery chain references (#6246) 2019-08-02 09:18:45 -05:00
John Cowen 266c288fc0
ui: Adds readonly meta data to the serviceInstance and node detail pages (#6196) 2019-08-02 13:53:52 +02:00
R.B. Boyer 2ebd73ba0d update changelog 2019-08-01 23:07:54 -05:00
R.B. Boyer 1d0efdc69e
server: if inserting bootstrap config entries fails don't silence the errors (#6256) 2019-08-01 23:07:11 -05:00
R.B. Boyer 840409d994 update changelog 2019-08-01 22:45:01 -05:00
R.B. Boyer f02924fafe
connect: simplify the compiled discovery chain data structures (#6242)
This should make them better for sending over RPC or the API.

Instead of a chain implemented explicitly like a linked list (nodes
holding pointers to other nodes), switch to a flat map of named
nodes with nodes linking to other nodes by name. The shipped
structure is just a map and a string to indicate which key to start
from.
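
A minimal sketch of that shape, assuming made-up type and field names (Node, Chain, StartNode, Next) rather than the real compiled chain structs. It is also simplified to a single successor per node, which the real structure is not limited to:

```go
package main

import "fmt"

// Node is a hypothetical chain node that refers to its successor by name
// instead of holding a pointer to it.
type Node struct {
	Type string // e.g. "router", "splitter", "resolver"
	Next string // name of the next node; empty for a terminal node
}

// Chain is the shipped shape: a flat map plus the key to start from.
type Chain struct {
	StartNode string
	Nodes     map[string]Node
}

// Walk follows Next links by name and returns the node names in order.
// It fails on a dangling name or on a cycle.
func (c Chain) Walk() ([]string, error) {
	var order []string
	seen := make(map[string]bool)
	for name := c.StartNode; name != ""; {
		if seen[name] {
			return nil, fmt.Errorf("cycle at node %q", name)
		}
		node, ok := c.Nodes[name]
		if !ok {
			return nil, fmt.Errorf("unknown node %q", name)
		}
		seen[name] = true
		order = append(order, name)
		name = node.Next
	}
	return order, nil
}

func main() {
	chain := Chain{
		StartNode: "router:api",
		Nodes: map[string]Node{
			"router:api":   {Type: "router", Next: "resolver:api"},
			"resolver:api": {Type: "resolver"},
		},
	}
	fmt.Println(chain.Walk())
}
```

Because nothing in the map is a pointer, the whole value encodes to JSON or an RPC payload without custom handling.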

Other changes:

* inline the compiler option InferDefaults as true

* introduce compiled target config to avoid needing to send back
  additional maps of Resolvers; future target-specific compiled state
  can go here

* move compiled MeshGateway out of the Resolver and into the
  TargetConfig where it makes more sense.
2019-08-01 22:44:05 -05:00
R.B. Boyer 3128937145 update changelog 2019-08-01 22:05:02 -05:00
R.B. Boyer 6393edba53
connect: reconcile how upstream configuration works with discovery chains (#6225)
* connect: reconcile how upstream configuration works with discovery chains

The following upstream config fields for connect sidecars sanely
integrate into discovery chain resolution:

- Destination Namespace/Datacenter: Compilation occurs locally but using
different default values for namespaces and datacenters. The xDS
clusters that are created are named as they normally would be.

- Mesh Gateway Mode (single upstream): If set this value overrides any
value computed for any resolver for the entire discovery chain. The xDS
clusters that are created may be named differently (see below).

- Mesh Gateway Mode (whole sidecar): If set this value overrides any
value computed for any resolver for the entire discovery chain. If this
is specifically overridden for a single upstream this value is ignored
in that case. The xDS clusters that are created may be named differently
(see below).

- Protocol (in opaque config): If set this value overrides the value
computed when evaluating the entire discovery chain. If the normal chain
would be TCP or if this override is set to TCP then the result is that
we explicitly disable L7 Routing and Splitting. The xDS clusters that
are created may be named differently (see below).

- Connect Timeout (in opaque config): If set this value overrides the
value for any resolver in the entire discovery chain. The xDS clusters
that are created may be named differently (see below).

If any of the above overrides affect the actual result of compiling the
discovery chain (e.g. "tcp" becomes "grpc" instead of being a no-op
override to "tcp") then the relevant parameters are hashed and provided
to the xDS layer as a prefix for use in naming the Clusters. This is to
ensure that if one Upstream discovery chain has no overrides and
happens to need a cluster named "api.default.XXX", and another
Upstream does have overrides for "api.default.XXX", they won't
cross-pollinate against the operator's wishes.
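
To illustrate the naming idea only: the sketch below hashes a set of override values into a short prefix. The Overrides struct, the choice of SHA-256, the 4-byte truncation and the "~" separator are all assumptions made for this example, not what consul actually does:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Overrides holds hypothetical per-upstream override values.
type Overrides struct {
	Protocol        string
	MeshGatewayMode string
	ConnectTimeout  string
}

// clusterName returns base unchanged when the overrides did not change the
// compiled result, and otherwise prefixes it with a short hash of the
// override values so the two cases never share a cluster.
func clusterName(base string, o Overrides, changedResult bool) string {
	if !changedResult {
		return base
	}
	sum := sha256.Sum256([]byte(o.Protocol + "|" + o.MeshGatewayMode + "|" + o.ConnectTimeout))
	return hex.EncodeToString(sum[:4]) + "~" + base
}

func main() {
	base := "api.default.XXX"
	fmt.Println(clusterName(base, Overrides{}, false))
	fmt.Println(clusterName(base, Overrides{Protocol: "grpc"}, true))
}
```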

Fixes #6159
2019-08-01 22:03:34 -05:00
R.B. Boyer 5e32bbdf4d update changelog 2019-08-01 13:27:14 -05:00
R.B. Boyer 8564b6bb38
connect: validate upstreams and prevent duplicates (#6224)
* connect: validate upstreams and prevent duplicates

* Actually run Upstream.Validate() instead of ignoring it as dead code.

* Prevent two upstreams from declaring the same bind address and port.
  It wouldn't work anyway.

* Prevent two upstreams from being declared that use the same
  type+name+namespace+datacenter. Due to how the Upstream.Identity()
  function worked this ended up mostly being enforced in xDS at use-time,
  but it should be enforced more clearly at register-time.
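
The two duplicate checks could look roughly like this at register-time. This is a sketch with a simplified, hypothetical Upstream struct and validateUpstreams helper, not the real Upstream.Validate() or Upstream.Identity():

```go
package main

import "fmt"

// Upstream is a simplified, hypothetical stand-in for the real struct.
type Upstream struct {
	DestinationType      string
	DestinationName      string
	DestinationNamespace string
	Datacenter           string
	LocalBindAddress     string
	LocalBindPort        int
}

// validateUpstreams rejects two upstreams that share a bind address and port,
// and two upstreams that share type+name+namespace+datacenter.
func validateUpstreams(upstreams []Upstream) error {
	seenBind := make(map[string]bool)
	seenID := make(map[string]bool)
	for _, u := range upstreams {
		bind := fmt.Sprintf("%s:%d", u.LocalBindAddress, u.LocalBindPort)
		if seenBind[bind] {
			return fmt.Errorf("upstreams cannot share bind address %s", bind)
		}
		seenBind[bind] = true

		id := fmt.Sprintf("%s/%s/%s/%s",
			u.DestinationType, u.DestinationName, u.DestinationNamespace, u.Datacenter)
		if seenID[id] {
			return fmt.Errorf("upstream %s declared more than once", id)
		}
		seenID[id] = true
	}
	return nil
}

func main() {
	err := validateUpstreams([]Upstream{
		{DestinationType: "service", DestinationName: "db", LocalBindAddress: "127.0.0.1", LocalBindPort: 9191},
		{DestinationType: "service", DestinationName: "cache", LocalBindAddress: "127.0.0.1", LocalBindPort: 9191},
	})
	fmt.Println(err)
}
```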
2019-08-01 13:26:02 -05:00
Omer Zach 6785e33d8a Fix typo in architecture.html.md (#6261) 2019-08-01 12:21:37 -06:00
Venkata Krishna Annam 80f091e107 docs: Fix minor mistakes in index.html.md (#6239) 2019-08-01 12:57:26 -05:00
Sarah Adams 896749d585
fix 'consul connect envoy' to try to use previously-configured grpc port (#6245)
fix 'consul connect envoy' to try to use previously-configured grpc port on running agent before defaulting to 8502
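
The fallback order amounts to something like the following. resolveGRPCPort is a hypothetical helper, and how the port is actually read from the running agent is not shown here:

```go
package main

import "fmt"

// defaultGRPCPort is the fallback named in the commit message.
const defaultGRPCPort = 8502

// resolveGRPCPort prefers the port the running agent reports and only falls
// back to the default when the agent reports none (modeled here as <= 0).
func resolveGRPCPort(agentPort int) int {
	if agentPort > 0 {
		return agentPort
	}
	return defaultGRPCPort
}

func main() {
	fmt.Println(resolveGRPCPort(9502)) // 9502
	fmt.Println(resolveGRPCPort(0))    // 8502
}
```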

Fixes #5011
2019-08-01 09:53:34 -07:00
Paul Banks e87cef2bb8 Revert "connect: support AWS PCA as a CA provider" (#6251)
This reverts commit 3497b7c00d.
2019-07-31 09:08:10 -04:00
Todd Radel 3497b7c00d
connect: support AWS PCA as a CA provider (#6189)
Port AWS PCA provider from consul-ent
2019-07-30 22:57:51 -04:00
Todd Radel 2552f4a11a
connect: Support RSA keys in addition to ECDSA (#6055)
Support RSA keys in addition to ECDSA
2019-07-30 17:47:39 -04:00
Freddy 919316f188
Update CHANGELOG.md 2019-07-30 11:03:16 -06:00
freddygv 1a14b94441 Update default gossip encryption key size to 32 bytes 2019-07-30 09:45:41 -06:00
Matt Keeler 5bb8d60786
Update CHANGELOG.md 2019-07-30 09:58:38 -04:00
Matt Keeler 1fdda51839
Fix envoy canBind (#6238)
* Fix envoy cli canBind function

The string form of an Addr included the CIDR suffix, so the string comparison never matched.

* Remove debug prints
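
The mismatch is easy to reproduce with the standard net package. The sketch below is one way to compare on the IP alone; matchesAddr and exampleAddr are hypothetical helpers and not the actual canBind fix:

```go
package main

import (
	"fmt"
	"net"
)

// exampleAddr builds an interface-style address whose string form carries a
// CIDR suffix.
func exampleAddr() net.Addr {
	return &net.IPNet{IP: net.ParseIP("10.0.0.5"), Mask: net.CIDRMask(24, 32)}
}

// matchesAddr compares on the IP only when the address is an *net.IPNet,
// and falls back to the plain string form for other address types.
func matchesAddr(a net.Addr, want string) bool {
	if ipNet, ok := a.(*net.IPNet); ok {
		return ipNet.IP.String() == want
	}
	return a.String() == want
}

func main() {
	a := exampleAddr()
	fmt.Println(a.String())                 // 10.0.0.5/24
	fmt.Println(a.String() == "10.0.0.5")   // false
	fmt.Println(matchesAddr(a, "10.0.0.5")) // true
}
```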
2019-07-30 09:56:56 -04:00
hashicorp-ci 847b90288a Merge Consul OSS branch 'master' at commit a1725e6b52 2019-07-30 02:00:29 +00:00
Matt Keeler a1725e6b52 Fix flaky tests (#6229) 2019-07-29 15:07:25 -04:00
Matt Keeler 5288dec952
Update CHANGELOG.md 2019-07-29 11:19:39 -04:00
Matt Keeler 64bc7a6c47
Update CHANGELOG.md 2019-07-29 11:17:58 -04:00
Matt Keeler fcc18c1675
Fix prepared query upstream endpoint generation (#6236)
Use the correct SNI value for prepared query upstreams
2019-07-29 11:15:55 -04:00
hashicorp-ci 5ff04a303e
Release v1.6.0-beta3 2019-07-26 23:15:20 +00:00
hashicorp-ci 1527616c8c
update bindata_assetfs.go 2019-07-26 23:15:20 +00:00
Alvin Huang b2944bdbe1 Merge remote-tracking branch 'origin/master' into release/1-6 2019-07-26 16:22:53 -04:00
Matt Keeler 0481187152
Update CHANGELOG.md 2019-07-26 15:59:20 -04:00
Matt Keeler 35e67b1d1a
Fix CA Replication when ACLs are enabled (#6201)
Secondary CA initialization steps are:

• Wait until the primary is capable of signing intermediate certs. We use serf metadata to check the versions of servers in the primary, which avoids needing a token like the previous implementation that used RPCs. We require at least one alive server in the primary and that all alive servers meet the version requirement.
• Initialize the secondary CA by getting the primary to sign an intermediate
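
The readiness rule in the first step can be sketched as below. serverInfo and primaryCanSign are made-up names, the real check reads versions out of serf metadata, and the minimum version used in main is only a placeholder because the message above does not state it:

```go
package main

import "fmt"

// serverInfo is a hypothetical summary of one primary-DC server as seen
// through serf metadata.
type serverInfo struct {
	Alive               bool
	Major, Minor, Patch int
}

// atLeast reports whether s is at or above the given version.
func atLeast(s serverInfo, major, minor, patch int) bool {
	if s.Major != major {
		return s.Major > major
	}
	if s.Minor != minor {
		return s.Minor > minor
	}
	return s.Patch >= patch
}

// primaryCanSign requires at least one alive server, and every alive server
// must meet the minimum version. Servers that are not alive are ignored.
func primaryCanSign(servers []serverInfo, major, minor, patch int) bool {
	alive := 0
	for _, s := range servers {
		if !s.Alive {
			continue
		}
		alive++
		if !atLeast(s, major, minor, patch) {
			return false
		}
	}
	return alive > 0
}

func main() {
	servers := []serverInfo{
		{Alive: true, Major: 1, Minor: 6, Patch: 0},
		{Alive: false, Major: 1, Minor: 5, Patch: 3}, // ignored: not alive
	}
	// 1.6.0 is a placeholder minimum for this example.
	fmt.Println(primaryCanSign(servers, 1, 6, 0)) // true
}
```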

When a primary dc is configured, if no existing CA is initialized and for whatever reason we cannot initialize a secondary CA, the secondary DC will remain without a CA. As soon as it can, it will initialize the secondary CA by pulling the primary's roots and getting the primary to sign an intermediate.

This also fixes a segfault that can happen during leadership revocation. There was a spot in the secondaryCARootsWatch that was getting the CA Provider and executing methods on it without a nil check. Under normal circumstances it won't be nil, but during leadership revocation it gets set to nil. Therefore there is a window between closing the stop chan and the goroutine actually stopping in which it could read a nil provider and cause a segfault.
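
The guard is the usual read-then-nil-check pattern. A minimal sketch with hypothetical types (Provider, staticProvider, server) that do not mirror the real CA provider interface:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Provider is a hypothetical, trimmed-down CA provider interface.
type Provider interface {
	ActiveRoot() (string, error)
}

// staticProvider is a stand-in implementation for the example.
type staticProvider struct{ root string }

func (p staticProvider) ActiveRoot() (string, error) { return p.root, nil }

var errNoProvider = errors.New("CA provider is not set")

type server struct {
	mu       sync.RWMutex
	provider Provider
}

// setProvider installs a provider, or clears it when p is nil (as happens
// during leadership revocation).
func (s *server) setProvider(p Provider) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.provider = p
}

// activeRoot copies the provider under the lock and checks it for nil before
// calling into it, so a concurrent revocation yields an error, not a crash.
func (s *server) activeRoot() (string, error) {
	s.mu.RLock()
	p := s.provider
	s.mu.RUnlock()
	if p == nil {
		return "", errNoProvider
	}
	return p.ActiveRoot()
}

func main() {
	s := &server{}
	s.setProvider(staticProvider{root: "root-1"})
	fmt.Println(s.activeRoot())

	s.setProvider(nil) // simulate leadership revocation
	fmt.Println(s.activeRoot())
}
```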
2019-07-26 15:57:57 -04:00
Matt Keeler 59454c7edc
Set --max-obj-name-len 256 when execing Envoy (#6202)
* Pass --max-obj-name-len 256 to envoy

* Update test expectations.

* Add a note about requiring the max-obj-name-len option to be set
2019-07-26 15:43:15 -04:00
Todd Radel dbae899796
Merge pull request #6210 from hashicorp/docs/fix-ambassador-link
Fix links to ambassador website
2019-07-26 14:29:03 -04:00
R.B. Boyer 3ca566a152
Merge pull request #6223 from hashicorp/master-merge-b3541c4f3
Master merge b3541c4f3
2019-07-26 11:44:01 -05:00
R.B. Boyer c6c4a2251a Merge Consul OSS branch master at commit b3541c4f34 2019-07-26 10:34:24 -05:00
Jack Pearkes b3541c4f34 Putting source back into Dev Mode 2019-07-25 17:58:56 -07:00
hashicorp-ci a42ded477c
Release v1.5.3 2019-07-25 23:41:17 +00:00
hashicorp-ci 0c9c5bfa98
update bindata_assetfs.go 2019-07-25 23:41:16 +00:00
Jack Pearkes 43996ce05f
Update CHANGELOG.md 2019-07-25 14:20:11 -07:00