Commit Graph

61 Commits

Author SHA1 Message Date
R.B. Boyer f557509e58
xds: allow for peered upstreams to use tagged addresses that are hostnames (#13422)
Mesh gateways can use hostnames in their tagged addresses (#7999). This is useful
if you were to expose a mesh gateway using a cloud networking load balancer appliance
that gives you a DNS name but no reliable static IPs.

Envoy cannot accept hostnames via EDS and those must be configured using CDS.
There was already logic when configuring gateways in other locations in the code, but
given the illusions in play for peering the downstream of a peered service wasn't aware
that it should be doing that.

Also:
- ensuring that we always try to use wan-like addresses to cross peer boundaries.
2022-06-10 16:11:40 -05:00
R.B. Boyer ab758b7b32
peering: allow mesh gateways to proxy L4 peered traffic (#13339)
Mesh gateways will now enable tcp connections with SNI names including peering information so that those connections may be proxied.

Note: this does not change the callers to use these mesh gateways.
2022-06-06 14:20:41 -05:00
R.B. Boyer 019aeaa57d
peering: update how cross-peer upstreams and represented in proxycfg and rendered in xds (#13362)
This removes unnecessary, vestigal remnants of discovery chains.
2022-06-03 16:42:50 -05:00
Freddy a09c776645 Update public listener with SPIFFE Validator
Envoy's SPIFFE certificate validation extension allows for us to
validate against different root certificates depending on the trust
domain of the dialing proxy.

If there are any trust bundles from peers in the config snapshot then we
use the SPIFFE validator as the validation context, rather than the
usual TrustedCA.

The injected validation config includes the local root certificates as
well.
2022-06-01 17:06:33 -06:00
Freddy 74ca6406ea
Configure upstream TLS context with peer root certs (#13321)
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.

Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.

This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.
2022-06-01 15:53:52 -06:00
Dan Upton 2427e38839
Enable servers to configure arbitrary proxies from the catalog (#13244)
OSS port of enterprise PR 1822

Includes the necessary changes to the `proxycfg` and `xds` packages to enable
Consul servers to configure arbitrary proxies using catalog data.

Broadly, `proxycfg.Manager` now has public methods for registering,
deregistering, and listing registered proxies — the existing local agent
state-sync behavior has been moved into a separate component that makes use of
these methods.

When an xDS session is started for a proxy service in the catalog, a goroutine
will be spawned to watch the service in the server's state store and
re-register it with the `proxycfg.Manager` whenever it is updated (and clean
it up when the client goes away).
2022-05-27 12:38:52 +01:00
Mark Anderson 98a2e282be Fixup acl.EnterpriseMeta
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
2022-04-05 15:11:49 -07:00
R.B. Boyer e79ce8ab03
xds: adding control of the mesh-wide min/max TLS versions and cipher suites from the mesh config entry (#12601)
- `tls.incoming`: applies to the inbound mTLS targeting the public
  listener on `connect-proxy` and `terminating-gateway` envoy instances

- `tls.outgoing`: applies to the outbound mTLS dialing upstreams from
  `connect-proxy` and `ingress-gateway` envoy instances

Fixes #11966
2022-03-30 13:43:59 -05:00
freddygv cbea3d203c Fix race of upstreams with same passthrough ip
Due to timing, a transparent proxy could have two upstreams to dial
directly with the same address.

For example:
- The orders service can dial upstreams shipping and payment directly.
- An instance of shipping at address 10.0.0.1 is deregistered.
- Payments is scaled up and scheduled to have address 10.0.0.1.
- The orders service receives the event for the new payments instance
before seeing the deregistration for the shipping instance. At this
point two upstreams have the same passthrough address and Envoy will
reject the listener configuration.

To disambiguate this commit considers the Raft index when storing
passthrough addresses. In the example above, 10.0.0.1 would only be
associated with the newer payments service instance.
2022-02-10 17:01:57 -07:00
freddygv 659ebc05a9 Ensure passthrough addresses get cleaned up
Transparent proxies can set up filter chains that allow direct
connections to upstream service instances. Services that can be dialed
directly are stored in the PassthroughUpstreams map of the proxycfg
snapshot.

Previously these addresses were not being cleaned up based on new
service health data. The list of addresses associated with an upstream
service would only ever grow.

As services scale up and down, eventually they will have instances
assigned to an IP that was previously assigned to a different service.
When IP addresses are duplicated across filter chain match rules the
listener config will be rejected by Envoy.

This commit updates the proxycfg snapshot management so that passthrough
addresses can get cleaned up when no longer associated with a given
upstream.

There is still the possibility of a race condition here where due to
timing an address is shared between multiple passthrough upstreams.
That concern is mitigated by #12195, but will be further addressed
in a follow-up.
2022-02-10 17:01:57 -07:00
R.B. Boyer 424f3cdd2c
proxycfg: introduce explicit UpstreamID in lieu of bare string (#12125)
The gist here is that now we use a value-type struct proxycfg.UpstreamID
as the map key in ConfigSnapshot maps where we used to use "upstream
id-ish" strings. These are internal only and used just for bidirectional
trips through the agent cache keyspace (like the discovery chain target
struct).

For the few places where the upstream id needs to be projected into xDS,
that's what (proxycfg.UpstreamID).EnvoyID() is for. This lets us ALWAYS
inject the partition and namespace into these things without making
stuff like the golden testdata diverge.
2022-01-20 10:12:04 -06:00
freddygv 2fe27b748d Check ingress upstreams when gating chain watches 2021-12-13 18:56:28 -07:00
freddygv 70d6358426 Store intention upstreams in snapshot 2021-12-13 18:56:13 -07:00
freddygv 60066e5154 Exclude default partition from GatewayKey string
This will behave the way we handle SNI and SPIFFE IDs, where the default
partition is excluded.

Excluding the default ensures that don't attempt to compare default.dc2
to dc2 in OSS.
2021-11-01 14:45:52 -06:00
freddygv e3666b0bc4 Update GatewayKeys deduplication
Federation states data is only keyed on datacenter, so it cannot be
directly compared against keys for gateway groups.
2021-11-01 13:58:53 -06:00
freddygv 90ce897456 Store GatewayKey in proxycfg snapshot for re-use 2021-11-01 13:58:53 -06:00
freddygv 3a2061544d Fixup partitions assertion 2021-10-27 11:15:25 -06:00
freddygv 12923f5ebc PR comments 2021-10-27 11:15:25 -06:00
freddygv a33b6923e0 Account for partitions in xds gen for mesh gw
This commit avoids skipping gateways in remote partitions of the local
DC when generating listeners/clusters/endpoints.
2021-10-27 11:15:25 -06:00
freddygv 110fae820a Update xds pkg to account for GatewayKey 2021-10-27 09:03:56 -06:00
freddygv 7e65678c52 Update mesh gateway proxy watches for partitions
This commit updates mesh gateway watches for cross-partitions
communication.

* Mesh gateways are keyed by partition and datacenter.

* Mesh gateways will now watch gateways in partitions that export
services to their partition.

* Mesh gateways in non-default partitions will not have cross-datacenter
watches. They are not involved in traditional WAN federation.
2021-10-27 09:03:56 -06:00
freddygv 37a16e9487 Replace Split with SplitN 2021-10-26 23:36:01 -06:00
freddygv 62e0fc62c1 Configure sidecars to watch gateways in partitions
Previously the datacenter of the gateway was the key identifier, now it
is the datacenter and partition.

When dialing services in other partitions or datacenters we now watch
the appropriate partition.
2021-10-26 23:35:37 -06:00
Paul Banks 136928a90f Minor PR typo and cleanup fixes 2021-09-23 10:13:19 +01:00
Paul Banks ccbda0c285 Update proxycfg to hold more ingress config state 2021-09-23 10:08:02 +01:00
Paul Banks 4e39f03d5b Add ingress-gateway config for SDS 2021-09-23 10:08:02 +01:00
Paul Banks f439dfc04f Ingress gateway header manip plumbing 2021-09-10 21:09:24 +01:00
freddygv 47da00d3c7 Validate SANs for passthrough clusters and failovers 2021-07-14 22:21:55 -06:00
Freddy 429f9d8bb8
Add flag for transparent proxies to dial individual instances (#10329) 2021-06-09 14:34:17 -06:00
Freddy 078c40425f
Rename "cluster" config entry to "mesh" (#10127)
This config entry is being renamed primarily because in k8s the name
cluster could be confusing given that the config entry applies across
federated datacenters.

Additionally, this config entry will only apply to Consul as a service
mesh, so the more generic "cluster" name is not needed.
2021-04-28 16:13:29 -06:00
freddygv 7bd51ff536 Replace TransparentProxy bool with ProxyMode
This PR replaces the original boolean used to configure transparent
proxy mode. It was replaced with a string mode that can be set to:

- "": Empty string is the default for when the setting should be
defaulted from other configuration like config entries.
- "direct": Direct mode is how applications originally opted into the
mesh. Proxy listeners need to be dialed directly.
- "transparent": Transparent mode enables configuring Envoy as a
transparent proxy. Traffic must be captured and redirected to the
inbound and outbound listeners.

This PR also adds a struct for transparent proxy specific configuration.
Initially this is not stored as a pointer. Will revisit that decision
before GA.
2021-04-12 09:35:14 -06:00
R.B. Boyer 499fee73b3
connect: add toggle to globally disable wildcard outbound network access when transparent proxy is enabled (#9973)
This adds a new config entry kind "cluster" with a single special name "cluster" where this can be controlled.
2021-04-06 13:19:59 -05:00
freddygv a54d6a9010 Update proxycfg for transparent proxy 2021-03-17 13:40:39 -06:00
R.B. Boyer 43193a35c6
xds: prevent LDS flaps in mesh gateways due to unstable datacenter lists (#9651)
Also fix a similar issue in Terminating Gateways that was masked by an overzealous test.
2021-02-08 10:19:57 -06:00
R.B. Boyer 74d5df7c7a
xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569) 2020-08-27 12:20:58 -05:00
Daniel Nephin 068b43df90 Enable gofmt simplify
Code changes done automatically with 'gofmt -s -w'
2020-06-16 13:21:11 -04:00
freddygv 19e3954603 Move compound service names to use ServiceName type 2020-06-12 13:47:43 -06:00
Freddy 9ed325ba8b
Enable gateways to resolve hostnames to IPv4 addresses (#7999)
The DNS resolution will be handled by Envoy and defaults to LOGICAL_DNS. This discovery type can be overridden on a per-gateway basis with the envoy_dns_discovery_type Gateway Option.

If a service contains an instance with a hostname as an address we set the Envoy cluster to use DNS as the discovery type rather than EDS. Since both mesh gateways and terminating gateways route to clusters using SNI, whenever there is a mix of hostnames and IP addresses associated with a service we use the hostname + CDS rather than the IPs + EDS.

Note that we detect hostnames by attempting to parse the service instance's address as an IP. If it is not a valid IP we assume it is a hostname.
2020-06-03 15:28:45 -06:00
Chris Piraino 0bd5618cb2 Cleanup proxycfg for TLS
- Use correct enterprise metadata for finding config entry
- nil out cancel functions on config snapshot copy
- Look at HostsSet when checking validity
2020-05-07 10:22:57 -05:00
Kyle Havlovitz f14c54e25e Add TLS option and DNS SAN support to ingress config
xds: Only set TLS context for ingress listener when requested
2020-05-06 15:12:02 -05:00
Chris Piraino 881760f701 xds: Use only the port number as the configured route name
This removes duplication of protocol from the stats_prefix
2020-05-06 15:06:13 -05:00
Kyle Havlovitz 247f9eaf13 Allow ingress gateways to route traffic based on Host header
This commit adds the necessary changes to allow an ingress gateway to
route traffic from a single defined port to multiple different upstream
services in the Consul mesh.

To do this, we now require all HTTP requests coming into the ingress
gateway to specify a Host header that matches "<service-name>.*" in
order to correctly route traffic to the correct service.

- Differentiate multiple listener's route names by port
- Adds a case in xds for allowing default discovery chains to create a
  route configuration when on an ingress gateway. This allows default
  services to easily use host header routing
- ingress-gateways have a single route config for each listener
  that utilizes domain matching to route to different services.
2020-05-06 15:06:13 -05:00
Freddy 137a2c32c6
TLS Origination for Terminating Gateways (#7671) 2020-04-27 16:25:37 -06:00
freddygv 034d7d83d4 Fix snapshot IsEmpty 2020-04-27 11:08:41 -06:00
freddygv c0e1751878 Allow terminating-gateway to setup listener before servicegroups are known 2020-04-27 11:08:40 -06:00
freddygv 913b13f31f Add subset support 2020-04-27 11:08:40 -06:00
freddygv 24207226ca Add proxycfg state management for terminating-gateways 2020-04-27 11:07:06 -06:00
Kyle Havlovitz e9e8c0e730
Ingress Gateways for TCP services (#7509)
* Implements a simple, tcp ingress gateway workflow

This adds a new type of gateway for allowing Ingress traffic into Connect from external services.

Co-authored-by: Chris Piraino <cpiraino@hashicorp.com>
2020-04-16 14:00:48 -07:00
R.B. Boyer 6adad71125
wan federation via mesh gateways (#6884)
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:

There are several distinct chunks of code that are affected:

* new flags and config options for the server

* retry join WAN is slightly different

* retry join code is shared to discover primary mesh gateways from secondary datacenters

* because retry join logic runs in the *agent* and the results of that
  operation for primary mesh gateways are needed in the *server* there are
  some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
  at multiple layers of abstraction just to pass the data down to the right
  layer.

* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers

* the function signature for RPC dialing picked up a new required field (the
  node name of the destination)

* several new RPCs for manipulating a FederationState object:
  `FederationState:{Apply,Get,List,ListMeshGateways}`

* 3 read-only internal APIs for debugging use to invoke those RPCs from curl

* raft and fsm changes to persist these FederationStates

* replication for FederationStates as they are canonically stored in the
  Primary and replicated to the Secondaries.

* a special derivative of anti-entropy that runs in secondaries to snapshot
  their local mesh gateway `CheckServiceNodes` and sync them into their upstream
  FederationState in the primary (this works in conjunction with the
  replication to distribute addresses for all mesh gateways in all DCs to all
  other DCs)

* a "gateway locator" convenience object to make use of this data to choose
  the addresses of gateways to use for any given RPC or gossip operation to a
  remote DC. This gets data from the "retry join" logic in the agent and also
  directly calls into the FSM.

* RPC (`:8300`) on the server sniffs the first byte of a new connection to
  determine if it's actually doing native TLS. If so it checks the ALPN header
  for protocol determination (just like how the existing system uses the
  type-byte marker).

* 2 new kinds of protocols are exclusively decoded via this native TLS
  mechanism: one for ferrying "packet" operations (udp-like) from the gossip
  layer and one for "stream" operations (tcp-like). The packet operations
  re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
  overhead.

* the server instances specially wrap the `memberlist.NetTransport` when running
  with gateway federation enabled (in a `wanfed.Transport`). The general gist is
  that if it tries to dial a node in the SAME datacenter (deduced by looking
  at the suffix of the node name) there is no change. If dialing a DIFFERENT
  datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
  gateways to eventually end up in a server's :8300 port.

* a new flag when launching a mesh gateway via `consul connect envoy` to
  indicate that the servers are to be exposed. This sets a special service
  meta when registering the gateway into the catalog.

* `proxycfg/xds` notice this metadata blob to activate additional watches for
  the FederationState objects as well as the location of all of the consul
  servers in that datacenter.

* `xds:` if the extra metadata is in place additional clusters are defined in a
  DC to bulk sink all traffic to another DC's gateways. For the current
  datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
  balances all servers as well as one mini-cluster per node
  (`<node>.server.<dc>.consul`)

* the `consul tls cert create` command got a new flag (`-node`) to help create
  an additional SAN in certs that can be used with this flavor of federation.
2020-03-09 15:59:02 -05:00
Matt Keeler 4c9577678e
xDS Mesh Gateway Resolver Subset Fixes (#7294)
* xDS Mesh Gateway Resolver Subset Fixes

The first fix was that clusters were being generated for every service resolver subset regardless of there being any service instances of the associated service in that dc. The previous logic didn’t care at all but now it will omit generating those clusters unless we also have service instances that should be proxied.

The second fix was to respect the DefaultSubset of a service resolver so that mesh-gateways would configure the endpoints of the unnamed subset cluster to only those endpoints matched by the default subsets filters.

* Refactor the gateway endpoint generation to be a little easier to read
2020-02-19 11:57:55 -05:00