This commit swaps the partition field to the local partition for
discovery chains targeting peers. Prior to this change, peer upstreams
would always use a value of default regardless of which partition they
exist in. This caused several issues in xds / proxycfg because of id
mismatches.
Some prior fixes were made to deal with one-off id mismatches that this
PR also cleans up, since they are no longer needed.
Protobuf Refactoring for Multi-Module Cleanliness
This commit includes the following:
Moves all packages that were within proto/ to proto/private
Rewrites imports to account for the packages being moved
Adds in buf.work.yaml to enable buf workspaces
Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml
Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes)
Why:
In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage.
There were some recent changes to have our own ratelimiting annotations.
The two combined were not working when I was trying to use them together (attempting to rebase another branch)
Buf workspaces should be the solution to the problem
Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root.
This resulted in proto file name conflicts in the Go global protobuf type registry.
The solution to that was to add in a private/ directory into the path within the proto/ directory.
That then required rewriting all the imports.
Is this safe?
AFAICT yes
The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc)
Other than imports, there were no changes to any generated code as a result of this.
* Stub proxycfg handler for API gateway
* Add Service Kind constants/handling for API Gateway
* Begin stubbing for SDS
* Add new Secret type to xDS order of operations
* Continue stubbing of SDS
* Iterate on proxycfg handler for API gateway
* Handle BoundAPIGateway config entry subscription in proxycfg-glue
* Add API gateway to config snapshot validation
* Add API gateway to config snapshot clone, leaf, etc.
* Subscribe to bound route + cert config entries on bound-api-gateway
* Track routes + certs on API gateway config snapshot
* Generate DeepCopy() for types used in watch.Map
* Watch all active references on api-gateway, unwatch inactive
* Track loading of initial bound-api-gateway config entry
* Use proper proto package for SDS mapping
* Use ResourceReference instead of ServiceName, collect resources
* Fix typo, add + remove TODOs
* Watch discovery chains for TCPRoute
* Add TODO for updating gateway services for api-gateway
* make proto
* Regenerate deep-copy for proxycfg
* Set datacenter on upstream ID from query source
* Watch discovery chains for http-route service backends
* Add ServiceName getter to HTTP+TCP Service structs
* Clean up unwatched discovery chains on API Gateway
* Implement watch for ingress leaf certificate
* Collect upstreams on http-route + tcp-route updates
* Remove unused GatewayServices update handler
* Remove unnecessary gateway services logic for API Gateway
* Remove outdate TODO
* Use .ToIngress where appropriate, including TODO for cleaning up
* Cancel before returning error
* Remove GatewayServices subscription
* Add godoc for handlerAPIGateway functions
* Update terminology from Connect => Consul Service Mesh
Consistent with terminology changes in https://github.com/hashicorp/consul/pull/12690
* Add missing TODO
* Remove duplicate switch case
* Rerun deep-copy generator
* Use correct property on config snapshot
* Remove unnecessary leaf cert watch
* Clean up based on code review feedback
* Note handler properties that are initialized but set elsewhere
* Add TODO for moving helper func into structs pkg
* Update generated DeepCopy code
* gofmt
* Begin stubbing for SDS
* Start adding tests
* Remove second BoundAPIGateway case in glue
* TO BE PICKED: fix formatting of str
* WIP
* Fix merge conflict
* Implement HTTP Route to Discovery Chain config entries
* Stub out function to create discovery chain
* Add discovery chain merging code (#16131)
* Test adding TCP and HTTP routes
* Add some tests for the synthesizer
* Run go mod tidy
* Pairing with N8
* Run deep copy
* Clean up GatewayChainSynthesizer
* Fix missing assignment of BoundAPIGateway topic
* Separate out synthesizeChains and toIngressTLS
* Fix build errors
* Ensure synthesizer skips non-matching routes by protocol
* Rebase on N8s work
* Generate DeepCopy() for API gateway listener types
* Improve variable name
* Regenerate DeepCopy() code
* Fix linting issue
* fix protobuf import
* Fix more merge conflict errors
* Fix synthesize test
* Run deep copy
* Add URLRewrite to proto
* Update agent/consul/discoverychain/gateway_tcproute.go
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* Remove APIGatewayConfigEntry that was extra
* Error out if route kind is unknown
* Fix formatting errors in proto
---------
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
* Stub proxycfg handler for API gateway
* Add Service Kind constants/handling for API Gateway
* Begin stubbing for SDS
* Add new Secret type to xDS order of operations
* Continue stubbing of SDS
* Iterate on proxycfg handler for API gateway
* Handle BoundAPIGateway config entry subscription in proxycfg-glue
* Add API gateway to config snapshot validation
* Add API gateway to config snapshot clone, leaf, etc.
* Subscribe to bound route + cert config entries on bound-api-gateway
* Track routes + certs on API gateway config snapshot
* Generate DeepCopy() for types used in watch.Map
* Watch all active references on api-gateway, unwatch inactive
* Track loading of initial bound-api-gateway config entry
* Use proper proto package for SDS mapping
* Use ResourceReference instead of ServiceName, collect resources
* Fix typo, add + remove TODOs
* Watch discovery chains for TCPRoute
* Add TODO for updating gateway services for api-gateway
* make proto
* Regenerate deep-copy for proxycfg
* Set datacenter on upstream ID from query source
* Watch discovery chains for http-route service backends
* Add ServiceName getter to HTTP+TCP Service structs
* Clean up unwatched discovery chains on API Gateway
* Implement watch for ingress leaf certificate
* Collect upstreams on http-route + tcp-route updates
* Remove unused GatewayServices update handler
* Remove unnecessary gateway services logic for API Gateway
* Remove outdate TODO
* Use .ToIngress where appropriate, including TODO for cleaning up
* Cancel before returning error
* Remove GatewayServices subscription
* Add godoc for handlerAPIGateway functions
* Update terminology from Connect => Consul Service Mesh
Consistent with terminology changes in https://github.com/hashicorp/consul/pull/12690
* Add missing TODO
* Remove duplicate switch case
* Rerun deep-copy generator
* Use correct property on config snapshot
* Remove unnecessary leaf cert watch
* Clean up based on code review feedback
* Note handler properties that are initialized but set elsewhere
* Add TODO for moving helper func into structs pkg
* Update generated DeepCopy code
* gofmt
* Generate DeepCopy() for API gateway listener types
* Improve variable name
* Regenerate DeepCopy() code
* Fix linting issue
* Temporarily remove the secret type from resource generation
Ensure nothing in the troubleshoot go module depends on consul's top level module. This is so we can import troubleshoot into consul-k8s and not import all of consul.
* turns troubleshoot into a go module [authored by @curtbushko]
* gets the envoy protos into the troubleshoot module [authored by @curtbushko]
* adds a new go module `envoyextensions` which has xdscommon and extensioncommon folders that both the xds package and the troubleshoot package can import
* adds testing and linting for the new go modules
* moves the unit tests in `troubleshoot/validateupstream` that depend on proxycfg/xds into the xds package, with a comment describing why those tests cannot be in the troubleshoot package
* fixes all the imports everywhere as a result of these changes
Co-authored-by: Curt Bushko <cbushko@gmail.com>
Fix configuration merging for implicit tproxy upstreams.
Change the merging logic so that the wildcard upstream has correct proxy-defaults
and service-defaults values combined into it. It did not previously merge all fields,
and the wildcard upstream did not exist unless service-defaults existed (it ignored
proxy-defaults, essentially).
Change the way we fetch upstream configuration in the xDS layer so that it falls back
to the wildcard when no matching upstream is found. This is what allows implicit peer
upstreams to have the correct "merged" config.
Change proxycfg to always watch local mesh gateway endpoints whenever a peer upstream
is found. This simplifies the logic so that we do not have to inspect the "merged"
configuration on peer upstreams to extract the mesh gateway mode.
* Protobuf Modernization
Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf
Marshallers (protobuf and json) needed some changes to account for different APIs.
Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with whats available in the structpb well known type package.
This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces.
* Fix go-mod-tidy make target to work on all modules
* feat(ingress-gateway): support outlier detection of upstream service for ingress gateway
* changelog
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
This commit adds the xDS resources needed for INBOUND traffic from peer
clusters:
- 1 filter chain for all inbound peering requests.
- 1 cluster for all inbound peering requests.
- 1 endpoint per voting server with the gRPC TLS port configured.
There is one filter chain and cluster because unlike with WAN
federation, peer clusters will not attempt to dial individual servers.
Peer clusters will only dial the local mesh gateway addresses.
* feat(ingress gateway: support configuring limits in ingress-gateway config entry
- a new Defaults field with max_connections, max_pending_connections, max_requests
is added to ingress gateway config entry
- new field max_connections, max_pending_connections, max_requests in
individual services to overwrite the value in Default
- added unit test and integration test
- updated doc
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Routing peering control plane traffic through mesh gateways can be
enabled or disabled at runtime with the mesh config entry.
This commit updates proxycfg to add or cancel watches for local servers
depending on this central config.
Note that WAN federation over mesh gateways is determined by a service
metadata flag, and any updates to the gateway service registration will
force the creation of a new snapshot. If enabled, WAN-fed over mesh
gateways will trigger a local server watch on initialize().
Because of this we will only add/remove server watches if WAN federation
over mesh gateways is disabled.
Now that peered upstreams can generate envoy resources (#13758), we need a way to disambiguate local from peered resources in our metrics. The key difference is that datacenter and partition will be replaced with peer, since in the context of peered resources partition is ambiguous (could refer to the partition in a remote cluster or one that exists locally). The partition and datacenter of the proxy will always be that of the source service.
Regexes were updated to make emitting datacenter and partition labels mutually exclusive with peer labels.
Listener filter names were updated to better match the existing regex.
Cluster names assigned to peered upstreams were updated to be synthesized from local peer name (it previously used the externally provided primary SNI, which contained the peer name from the other side of the peering). Integration tests were updated to assert for the new peer labels.
Peered upstreams has a separate loop in xds from discovery chain upstreams. This PR adds similar but slightly modified code to add filters for peered upstream listeners, clusters, and endpoints in the case of transparent proxy.
Because peerings are pairwise, between two tuples of (datacenter,
partition) having any exported reference via a discovery chain that
crosses out of the peered datacenter or partition will ultimately not be
able to work for various reasons. The biggest one is that there is no
way in the ultimate destination to configure an intention that can allow
an external SpiffeID to access a service.
This PR ensures that a user simply cannot do this, so they won't run
into weird situations like this.
Mesh gateways can use hostnames in their tagged addresses (#7999). This is useful
if you were to expose a mesh gateway using a cloud networking load balancer appliance
that gives you a DNS name but no reliable static IPs.
Envoy cannot accept hostnames via EDS and those must be configured using CDS.
There was already logic when configuring gateways in other locations in the code, but
given the illusions in play for peering the downstream of a peered service wasn't aware
that it should be doing that.
Also:
- ensuring that we always try to use wan-like addresses to cross peer boundaries.
This is the OSS portion of enterprise PR 1994
Rather than directly interrogating the agent-local state for HTTP
checks using the `HTTPCheckFetcher` interface, we now rely on the
config snapshot containing the checks.
This reduces the number of changes required to support server xDS
sessions.
It's not clear why the fetching approach was introduced in
931d167ebb.
For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.
Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.
This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.