This commit adds the xDS resources needed for INBOUND traffic from peer
clusters:
- 1 filter chain for all inbound peering requests.
- 1 cluster for all inbound peering requests.
- 1 endpoint per voting server with the gRPC TLS port configured.
There is one filter chain and cluster because unlike with WAN
federation, peer clusters will not attempt to dial individual servers.
Peer clusters will only dial the local mesh gateway addresses.
* feat(ingress gateway: support configuring limits in ingress-gateway config entry
- a new Defaults field with max_connections, max_pending_connections, max_requests
is added to ingress gateway config entry
- new field max_connections, max_pending_connections, max_requests in
individual services to overwrite the value in Default
- added unit test and integration test
- updated doc
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* draft commit
* add changelog, update test
* remove extra param
* fix test
* update type to account for nil value
* add test for custom passive health check
* update comments and tests
* update description in docs
* fix missing commas
* add golden files
* add support to http in tgateway egress destination
* fix slice sorting to include both address and port when using server_names
* fix listener loop for http destination
* fix routes to generate a route per port and a virtualhost per port-address combination
* sort virtual hosts list to have a stable order
* extract redundant serviceNode
Now that peered upstreams can generate envoy resources (#13758), we need a way to disambiguate local from peered resources in our metrics. The key difference is that datacenter and partition will be replaced with peer, since in the context of peered resources partition is ambiguous (could refer to the partition in a remote cluster or one that exists locally). The partition and datacenter of the proxy will always be that of the source service.
Regexes were updated to make emitting datacenter and partition labels mutually exclusive with peer labels.
Listener filter names were updated to better match the existing regex.
Cluster names assigned to peered upstreams were updated to be synthesized from local peer name (it previously used the externally provided primary SNI, which contained the peer name from the other side of the peering). Integration tests were updated to assert for the new peer labels.
Peered upstreams has a separate loop in xds from discovery chain upstreams. This PR adds similar but slightly modified code to add filters for peered upstream listeners, clusters, and endpoints in the case of transparent proxy.
Because peerings are pairwise, between two tuples of (datacenter,
partition) having any exported reference via a discovery chain that
crosses out of the peered datacenter or partition will ultimately not be
able to work for various reasons. The biggest one is that there is no
way in the ultimate destination to configure an intention that can allow
an external SpiffeID to access a service.
This PR ensures that a user simply cannot do this, so they won't run
into weird situations like this.
When the protocol is http-like, and an intention has a peered source
then the normal RBAC mTLS SAN field check is replaces with a joint combo
of:
mTLS SAN field must be the service's local mesh gateway leaf cert
AND
the first XFCC header (from the MGW) must have a URI field that matches the original intention source
Also:
- Update the regex program limit to be much higher than the teeny
defaults, since the RBAC regex constructions are more complicated now.
- Fix a few stray panics in xds generation.
This is only configured in xDS when a service with an L7 protocol is
exported.
They also load any relevant trust bundles for the peered services to
eventually use for L7 SPIFFE validation during mTLS termination.
When converting from Consul intentions to xds RBAC rules, services imported from other peers must encode additional data like partition (from the remote cluster) and trust domain.
This PR updates the PeeringTrustBundle to hold the sending side's local partition as ExportedPartition. It also updates RBAC code to encode SpiffeIDs of imported services with the ExportedPartition and TrustDomain.
Mesh gateways can use hostnames in their tagged addresses (#7999). This is useful
if you were to expose a mesh gateway using a cloud networking load balancer appliance
that gives you a DNS name but no reliable static IPs.
Envoy cannot accept hostnames via EDS and those must be configured using CDS.
There was already logic when configuring gateways in other locations in the code, but
given the illusions in play for peering the downstream of a peered service wasn't aware
that it should be doing that.
Also:
- ensuring that we always try to use wan-like addresses to cross peer boundaries.
Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied.
Note: this does not change the callers to use these mesh gateways.
Envoy's SPIFFE certificate validation extension allows for us to
validate against different root certificates depending on the trust
domain of the dialing proxy.
If there are any trust bundles from peers in the config snapshot then we
use the SPIFFE validator as the validation context, rather than the
usual TrustedCA.
The injected validation config includes the local root certificates as
well.
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.
Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.
This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
Description
Add x-fowarded-client-cert information on trusted incoming connections.
Envoy provides support forwarding and annotating the
x-forwarded-client-cert header via the forward_client_cert_details
set_current_client_cert_details filter fields. It would be helpful for
consul to support this directly in its config. The escape hatches are
a bit cumbersome for this purpose.
This has been implemented on incoming connections to envoy. Outgoing
(from the local service through the sidecar) will not have a
certificate, and so are left alone.
A service on an incoming connection will now get headers something like this:
```
X-Forwarded-Client-Cert:[By=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/counting;Hash=61ad5cbdfcb50f5a3ec0ca60923d61613c149a9d4495010a64175c05a0268ab2;Cert="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Chain="-----BEGIN%20CERTIFICATE-----%0AMIICHDCCAcOgAwIBAgIBCDAKBggqhkjOPQQDAjAxMS8wLQYDVQQDEyZwcmktMTli%0AYXdyb2YuY29uc3VsLmNhLmVmYWQ3MjgyLmNvbnN1bDAeFw0yMjA0MjkwMzE0NTBa%0AFw0yMjA1MDIwMzE0NTBaMAAwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARVIZ7Y%0AZEXfbOGBfxGa7Vuok1MIng%2FuzLQK2xLVlSTIPDbO5hstTGP%2B%2FGx182PYFP3jYqk5%0Aq6rYWe1wiPNMA30Io4H8MIH5MA4GA1UdDwEB%2FwQEAwIDuDAdBgNVHSUEFjAUBggr%0ABgEFBQcDAgYIKwYBBQUHAwEwDAYDVR0TAQH%2FBAIwADApBgNVHQ4EIgQgrp4q50oX%0AHHghMbxz5Bk8OJFWMdfgH0Upr350WlhyxvkwKwYDVR0jBCQwIoAgUe6uERAIj%2FLM%0AyuFzDc3Wbp9TGAKBJYAwyhF14ToOQCMwYgYDVR0RAQH%2FBFgwVoZUc3BpZmZlOi8v%0AZWZhZDcyODItZDliMi0zMjk4LWY2ZDgtMzhiMzdmYjU4ZGYzLmNvbnN1bC9ucy9k%0AZWZhdWx0L2RjL2RjMS9zdmMvZGFzaGJvYXJkMAoGCCqGSM49BAMCA0cAMEQCIDwb%0AFlchufggNTijnQ5SUcvTZrWlZyq%2FrdVC20nbbmWLAiAVshNNv1xBqJI1NmY2HI9n%0AgRMfb8aEPVSuxEHhqy57eQ%3D%3D%0A-----END%20CERTIFICATE-----%0A";Subject="";URI=spiffe://efad7282-d9b2-3298-f6d8-38b37fb58df3.consul/ns/default/dc/dc1/svc/dashboard]
```
Closes#12852
Just like standard upstreams the order of applicability in descending precedence:
1. caller's `service-defaults` upstream override for destination
2. caller's `service-defaults` upstream defaults
3. destination's `service-resolver` ConnectTimeout
4. system default of 5s
Co-authored-by: mrspanishviking <kcardenas@hashicorp.com>
- `tls.incoming`: applies to the inbound mTLS targeting the public
listener on `connect-proxy` and `terminating-gateway` envoy instances
- `tls.outgoing`: applies to the outbound mTLS dialing upstreams from
`connect-proxy` and `ingress-gateway` envoy instances
Fixes#11966
Prior to this PR for the envoy xDS golden tests in the agent/xds package we
were hand-creating a proxycfg.ConfigSnapshot structure in the proper format for
input to the xDS generator. Over time this intermediate structure has gotten
trickier to build correctly for the various tests.
This PR proposes to switch to using the existing mechanism for turning a
structs.NodeService and a sequence of cache.UpdateEvent copies into a
proxycfg.ConfigSnapshot, as that is less error prone to construct and aligns
more with how the data arrives.
NOTE: almost all of this is in test-related code. I tried super hard to craft
correct event inputs to get the golden files to be the same, or similar enough
after construction to feel ok that i recreated the spirit of the original test
cases.
Transparent proxies can set up filter chains that allow direct
connections to upstream service instances. Services that can be dialed
directly are stored in the PassthroughUpstreams map of the proxycfg
snapshot.
Previously these addresses were not being cleaned up based on new
service health data. The list of addresses associated with an upstream
service would only ever grow.
As services scale up and down, eventually they will have instances
assigned to an IP that was previously assigned to a different service.
When IP addresses are duplicated across filter chain match rules the
listener config will be rejected by Envoy.
This commit updates the proxycfg snapshot management so that passthrough
addresses can get cleaned up when no longer associated with a given
upstream.
There is still the possibility of a race condition here where due to
timing an address is shared between multiple passthrough upstreams.
That concern is mitigated by #12195, but will be further addressed
in a follow-up.
The gist here is that now we use a value-type struct proxycfg.UpstreamID
as the map key in ConfigSnapshot maps where we used to use "upstream
id-ish" strings. These are internal only and used just for bidirectional
trips through the agent cache keyspace (like the discovery chain target
struct).
For the few places where the upstream id needs to be projected into xDS,
that's what (proxycfg.UpstreamID).EnvoyID() is for. This lets us ALWAYS
inject the partition and namespace into these things without making
stuff like the golden testdata diverge.
* xds: refactor ingress listener SDS configuration
* xds: update resolveListenerSDS call args in listeners_test
* ingress: add TLS min, max and cipher suites to GatewayTLSConfig
* xds: implement envoyTLSVersions and envoyTLSCipherSuites
* xds: merge TLS config
* xds: configure TLS parameters with ingress TLS context from leaf
* xds: nil check in resolveListenerTLSConfig validation
* xds: nil check in makeTLSParameters* functions
* changelog: add entry for TLS params on ingress config entries
* xds: remove indirection for TLS params in TLSConfig structs
* xds: return tlsContext, nil instead of ambiguous err
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
* xds: switch zero checks to types.TLSVersionUnspecified
* ingress: add validation for ingress config entry TLS params
* ingress: validate listener TLS config
* xds: add basic ingress with TLS params tests
* xds: add ingress listeners mixed TLS min version defaults precedence test
* xds: add more explicit tests for ingress listeners inheriting gateway defaults
* xds: add test for single TLS listener on gateway without TLS defaults
* xds: regen golden files for TLSVersionInvalid zero value, add TLSVersionAuto listener test
* types/tls: change TLSVersion to string
* types/tls: update TLSCipherSuite to string type
* types/tls: implement validation functions for TLSVersion and TLSCipherSuites, make some maps private
* api: add TLS params to GatewayTLSConfig, add tests
* api: add TLSMinVersion to ingress gateway config entry test JSON
* xds: switch to Envoy TLS cipher suite encoding from types package
* xds: fixup validation for TLSv1_3 min version with cipher suites
* add some kitchen sink tests and add a missing struct tag
* xds: check if mergedCfg.TLSVersion is in TLSVersionsWithConfigurableCipherSuites
* xds: update connectTLSEnabled comment
* xds: remove unsued resolveGatewayServiceTLSConfig function
* xds: add makeCommonTLSContextFromLeafWithoutParams
* types/tls: add LessThan comparator function for concrete values
* types/tls: change tlsVersions validation map from string to TLSVersion keys
* types/tls: remove unused envoyTLSCipherSuites
* types/tls: enable chacha20 cipher suites for Consul agent
* types/tls: remove insecure cipher suites from allowed config
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 and TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 are both explicitly listed as insecure and disabled in the Go source.
Refs https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/crypto/tls/cipher_suites.go;l=329-330
* types/tls: add ValidateConsulAgentCipherSuites function, make direct lookup map private
* types/tls: return all unmatched cipher suites in validation errors
* xds: check that Envoy API value matching TLS version is found when building TlsParameters
* types/tls: check that value is found in map before appending to slice in MarshalEnvoyTLSCipherSuiteStrings
* types/tls: cast to string rather than fmt.Printf in TLSCihperSuite.String()
* xds: add TLSVersionUnspecified to list of configurable cipher suites
* structs: update note about config entry warning
* xds: remove TLS min version cipher suite unconfigurable test placeholder
* types/tls: update tests to remove assumption about private map values
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
The duo of `makeUpstreamFilterChainForDiscoveryChain` and `makeListenerForDiscoveryChain` were really hard to reason about, and led to concealing a bug in their branching logic. There were several issues here:
- They tried to accomplish too much: determining filter name, cluster name, and whether RDS should be used.
- They embedded logic to handle significantly different kinds of upstream listeners (passthrough, prepared query, typical services, and catch-all)
- They needed to coalesce different data sources (Upstream and CompiledDiscoveryChain)
Rather than handling all of those tasks inside of these functions, this PR pulls out the RDS/clusterName/filterName logic.
This refactor also fixed a bug with the handling of [UpstreamDefaults](https://www.consul.io/docs/connect/config-entries/service-defaults#defaults). These defaults get stored as UpstreamConfig in the proxy snapshot with a DestinationName of "*", since they apply to all upstreams. However, this wildcard destination name must not be used when creating the name of the associated upstream cluster. The coalescing logic in the original functions here was in some situations creating clusters with a `*.` prefix, which is not a valid destination.
Fixes an issue described in #10132, where if two DCs are WAN federated
over mesh gateways, and the gateway in the non-primary DC is terminated
and receives a new IP address (as is commonly the case when running them
on ephemeral compute instances) the primary DC is unable to re-establish
its connection until the agent running on its own gateway is restarted.
This was happening because we always preferred gateways discovered by
the `Internal.ServiceDump` RPC (which would fail because there's no way
to dial the remote DC) over those discovered in the federation state,
which is replicated as long as the primary DC's gateway is reachable.
Previously SAN validation for prepared queries was broken because we
validated against the name, namespace, and datacenter for prepared
queries.
However, prepared queries can target:
- Services with a name that isn't their own
- Services in multiple datacenters
This means that the SpiffeID to validate needs to be based on the
prepared query endpoints, and not the prepared query's upstream
definition.
This commit updates prepared query clusters to account for that.
- The TestNodeService helper created services with the fixed name "web",
and now that name is overridable.
- The discovery chain snapshot didn't have prepared query endpoints so
the endpoints tests were missing data for prepared queries
In the absence of stats_tags to handle this pattern, when we pass
"ingress_upstream.$port" as the stat_prefix, Envoy splits up that prefix
and makes the port a part of the metric name.
For example:
- stat_prefix: ingress_upstream.8080
This leads to metric names like envoy_http_8080_no_route. Changing the
stat_prefix to ingress_upstream_80880 yields the expected metric names
such as envoy_http_no_route.
Note that we don't encode the destination's name/ns/dc in this
stat_prefix because for HTTP services ingress gateways use a single
filter chain. Only cluster metrics are available on a per-upstream
basis.
This change makes it so that the stat prefix for terminating gateways
matches that of connect proxies. By using the structure of
"upstream.svc.ns.dc" we can extract labels for the destination service,
namespace, and datacenter.
Initially we were loading every potential upstream address into Envoy
and then routing traffic to the logical upstream service. The downside
of this behavior is that traffic meant to go to a specific instance
would be load balanced across ALL instances.
Traffic to specific instance IPs should be forwarded to the original
destination and if it's a destination in the mesh then we should ensure
the appropriate certificates are used.
This PR makes transparent proxying a Kubernetes-only feature for now
since support for other environments requires generating virtual IPs,
and Consul does not do that at the moment.
The only thing that needed fixing up pertained to this section of the 1.18.x release notes:
> grpc_stats: the default value for stats_for_all_methods is switched from true to false, in order to avoid possible memory exhaustion due to an untrusted downstream sending a large number of unique method names. The previous default value was deprecated in version 1.14.0. This only changes the behavior when the value is not set. The previous behavior can be used by setting the value to true. This behavior change by be overridden by setting runtime feature envoy.deprecated_features.grpc_stats_filter_enable_stats_for_all_methods_by_default.
For now to maintain status-quo I'm explicitly setting `stats_for_all_methods=true` in all versions to avoid relying upon the default.
Additionally the naming of the emitted metrics for these gRPC requests changed slightly so the integration test assertions for `case-grpc` needed adjusting.
Since we currently do no version switching this removes 75% of the PR
noise.
To generate all *.golden files were removed and then I ran:
go test ./agent/xds -update
Note that this does NOT upgrade to xDS v3. That will come in a future PR.
Additionally:
- Ignored staticcheck warnings about how github.com/golang/protobuf is deprecated.
- Shuffled some agent/xds imports in advance of a later xDS v3 upgrade.
- Remove support for envoy 1.13.x but don't add in 1.17.x yet. We have to wait until the xDS v3 support is added in a follow-up PR.
Fixes#8425
This PR updates the tags that we generate for Envoy stats.
Several of these come with breaking changes, since we can't keep two stats prefixes for a filter.
Extend Consul’s intentions model to allow for request-based access control enforcement for HTTP-like protocols in addition to the existing connection-based enforcement for unspecified protocols (e.g. tcp).
Related changes:
- hard-fail the xDS connection attempt if the envoy version is known to be too old to be supported
- remove the RouterMatchSafeRegex proxy feature since all supported envoy versions have it
- stop using --max-obj-name-len (due to: envoyproxy/envoy#11740)
A port can be sent in the Host header as defined in the HTTP RFC, so we
take any hosts that we want to match traffic to and also add another
host with the listener port added.
Also fix an issue with envoy integration tests not running the
case-ingress-gateway-tls test.
Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Currently when passing hostname clusters to Envoy, we set each service instance registered with Consul as an LbEndpoint for the cluster.
However, Envoy can only handle one per cluster:
[2020-06-04 18:32:34.094][1][warning][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Cluster rejected: Error adding/updating cluster(s) dc2.internal.ddd90499-9b47-91c5-4616-c0cbf0fc358a.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint, server.dc2.consul: LOGICAL_DNS clusters must have a single locality_lb_endpoint and a single lb_endpoint
Envoy is currently handling this gracefully by only picking one of the endpoints. However, we should avoid passing multiple to avoid these warning logs.
This PR:
* Ensures we only pass one endpoint, which is tied to one service instance.
* We prefer sending an endpoint which is marked as Healthy by Consul.
* If no endpoints are healthy we emit a warning and skip the cluster.
* If multiple unique hostnames are spread across service instances we emit a warning and let the user know which will be resolved.
Previously, we did not require the 'service-name.*' host header value
when on a single http service was exposed. However, this allows a user
to get into a situation where, if they add another service to the
listener, suddenly the previous service's traffic might not be routed
correctly. Thus, we always require the Host header, even if there is
only 1 service.
Also, we add the make the default domain matching more restrictive by
matching "service-name.ingress.*" by default. This lines up better with
the namespace case and more accurately matches the Consul DNS value we
expect people to use in this case.
The DNS resolution will be handled by Envoy and defaults to LOGICAL_DNS. This discovery type can be overridden on a per-gateway basis with the envoy_dns_discovery_type Gateway Option.
If a service contains an instance with a hostname as an address we set the Envoy cluster to use DNS as the discovery type rather than EDS. Since both mesh gateways and terminating gateways route to clusters using SNI, whenever there is a mix of hostnames and IP addresses associated with a service we use the hostname + CDS rather than the IPs + EDS.
Note that we detect hostnames by attempting to parse the service instance's address as an IP. If it is not a valid IP we assume it is a hostname.
* Standardize support for Tagged and BindAddresses in Ingress Gateways
This updates the TaggedAddresses and BindAddresses behavior for Ingress
to match Mesh/Terminating gateways. The `consul connect envoy` command
now also allows passing an address without a port for tagged/bind
addresses.
* Update command/connect/envoy/envoy.go
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
* PR comments
* Check to see if address is an actual IP address
* Update agent/xds/listeners.go
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
* fix whitespace
Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
- Validate that this cannot be set on a 'tcp' listener nor on a wildcard
service.
- Add Hosts field to api and test in consul config write CLI
- xds: Configure envoy with user-provided hosts from ingress gateways
This commit adds the necessary changes to allow an ingress gateway to
route traffic from a single defined port to multiple different upstream
services in the Consul mesh.
To do this, we now require all HTTP requests coming into the ingress
gateway to specify a Host header that matches "<service-name>.*" in
order to correctly route traffic to the correct service.
- Differentiate multiple listener's route names by port
- Adds a case in xds for allowing default discovery chains to create a
route configuration when on an ingress gateway. This allows default
services to easily use host header routing
- ingress-gateways have a single route config for each listener
that utilizes domain matching to route to different services.
This commit copies many of the connect-proxy xds testcases and reuses
for ingress gateways. This allows us to more easily see changes to the
envoy configuration when make updates to ingress gateways.
* Implements a simple, tcp ingress gateway workflow
This adds a new type of gateway for allowing Ingress traffic into Connect from external services.
Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
If a proxied service is a gRPC or HTTP2 service, but a path is exposed
using the HTTP1 or TCP protocol, Envoy should not be configured with
`http2ProtocolOptions` for the cluster backing the path.
A situation where this comes up is a gRPC service whose healthcheck or
metrics route (e.g. for Prometheus) is an HTTP1 service running on
a different port. Previously, if these were exposed either using
`Expose: { Checks: true }` or `Expose: { Paths: ... }`, Envoy would
still be configured to communicate with the path over HTTP2, which would
not work properly.
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:
There are several distinct chunks of code that are affected:
* new flags and config options for the server
* retry join WAN is slightly different
* retry join code is shared to discover primary mesh gateways from secondary datacenters
* because retry join logic runs in the *agent* and the results of that
operation for primary mesh gateways are needed in the *server* there are
some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
at multiple layers of abstraction just to pass the data down to the right
layer.
* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers
* the function signature for RPC dialing picked up a new required field (the
node name of the destination)
* several new RPCs for manipulating a FederationState object:
`FederationState:{Apply,Get,List,ListMeshGateways}`
* 3 read-only internal APIs for debugging use to invoke those RPCs from curl
* raft and fsm changes to persist these FederationStates
* replication for FederationStates as they are canonically stored in the
Primary and replicated to the Secondaries.
* a special derivative of anti-entropy that runs in secondaries to snapshot
their local mesh gateway `CheckServiceNodes` and sync them into their upstream
FederationState in the primary (this works in conjunction with the
replication to distribute addresses for all mesh gateways in all DCs to all
other DCs)
* a "gateway locator" convenience object to make use of this data to choose
the addresses of gateways to use for any given RPC or gossip operation to a
remote DC. This gets data from the "retry join" logic in the agent and also
directly calls into the FSM.
* RPC (`:8300`) on the server sniffs the first byte of a new connection to
determine if it's actually doing native TLS. If so it checks the ALPN header
for protocol determination (just like how the existing system uses the
type-byte marker).
* 2 new kinds of protocols are exclusively decoded via this native TLS
mechanism: one for ferrying "packet" operations (udp-like) from the gossip
layer and one for "stream" operations (tcp-like). The packet operations
re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
overhead.
* the server instances specially wrap the `memberlist.NetTransport` when running
with gateway federation enabled (in a `wanfed.Transport`). The general gist is
that if it tries to dial a node in the SAME datacenter (deduced by looking
at the suffix of the node name) there is no change. If dialing a DIFFERENT
datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
gateways to eventually end up in a server's :8300 port.
* a new flag when launching a mesh gateway via `consul connect envoy` to
indicate that the servers are to be exposed. This sets a special service
meta when registering the gateway into the catalog.
* `proxycfg/xds` notice this metadata blob to activate additional watches for
the FederationState objects as well as the location of all of the consul
servers in that datacenter.
* `xds:` if the extra metadata is in place additional clusters are defined in a
DC to bulk sink all traffic to another DC's gateways. For the current
datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
balances all servers as well as one mini-cluster per node
(`<node>.server.<dc>.consul`)
* the `consul tls cert create` command got a new flag (`-node`) to help create
an additional SAN in certs that can be used with this flavor of federation.