Fix issue with wanfed lan ip conflicts.
Prior to this commit, the connection pools were unaware which datacenter the
connection was associated with. This meant that any time servers with
overlapping LAN IP addresses and node shortnames existed, they would be
incorrectly co-located in the same pool. Whenever this occurred, the servers
would get stuck in an infinite loop of forwarding RPCs to themselves (rather
than the intended remote DC) until they eventually run out of memory.
Most notably, this issue can occur whenever wan federation through mesh
gateways is enabled.
This fix adds extra metadata to specify which DC the connection is associated
with in the pool.
* fix: update watch endpoint to default based on scope
* test: additional test
* refactor: rename list validate function
* refactor: rename validate<Op>Request() -> ensure<Op>RequestValid() for consistency
* cover all protocols in local_app golden tests
* fix xds tests
* updating latest
* fix broken test
* add sorting of routers to TestBuildLocalApp to get rid of the flaking
* init
* computed exported service
* make proto
* exported services resource
* exported services test
* added some tests and namespace exported service
* partition exported services
* computed service
* computed services tests
* register types
* fix comment
* make proto lint
* fix proto format make proto
* make codegen
* Update proto-public/pbmulticluster/v1alpha1/computed_exported_services.proto
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* Update internal/multicluster/internal/types/computed_exported_services.go
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* using different way of resource creation in tests
* make proto
* fix computed exported services test
* fix tests
* differnet validation for computed services for ent and ce
* Acls for exported services
* added validations for enterprise features in ce
* fix error
* fix acls test
* Update internal/multicluster/internal/types/validation_exported_services_ee.go
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* removed the create method
* update proto
* removed namespace
* created seperate function for ce and ent
* test files updated and validations fixed
* added nil checks
* fix tests
* added comments
* removed tenancy check
* added mutation function
* fix mutation method
* fix list permissions in test
* fix pr comments
* fix tests
* lisence
* busl license
* Update internal/multicluster/internal/types/helpers_ce.go
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* Update internal/multicluster/internal/types/helpers_ce.go
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* Update internal/multicluster/internal/types/helpers_ce.go
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* make proto
* some pr comments addressed
* some pr comments addressed
* acls helper
* some comment changes
* removed unused files
* fixes
* fix function in file
* caps
* some positioing
* added test for validation error
* fix names
* made valid a function
* remvoed patch
* removed mutations
* v2 beta1
* v2beta1
* rmeoved v1alpha1
* validate error
* merge ent
* some nits
* removed dup func
* removed nil check
---------
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* xds: Ensure v2 route match is populated for gRPC
Similar to HTTP, ensure that route match config (which is required by
Envoy) is populated when default values are used.
Because the default matches generated for gRPC contain a single empty
`GRPCRouteMatch`, and that proto does not directly support prefix-based
config, an interpretation of the empty struct is needed to generate the
same output that the `HTTPRouteMatch` is explicitly configured to
provide in internal/mesh/internal/controllers/routes/generate.go.
* xds: Ensure protocol set for gRPC resources
Add explicit protocol in `ProxyStateTemplate` builders and validate it
is always set on clusters. This ensures that HTTP filters and
`http2_protocol_options` are populated in all the necessary places for
gRPC traffic and prevents future unintended omissions of non-TCP
protocols.
Co-authored-by: John Murret <john.murret@hashicorp.com>
---------
Co-authored-by: John Murret <john.murret@hashicorp.com>
* NET-5397 - wire up golden tests from sidecar-proxy controller for xds controller and xdsv2
* WIP
* WIP
* everything matching except leafCerts. need to mock those
* single port destinations working except mixed destinations
* golden test input to xds controller tests for destinations
* proposed fix for failover group naming errors
* clean up test to use helper.
* clean up test to use helper.
* fix test file
* add docstring for test function.
* add docstring for test function.
* fix linting error
* fixing test after route fix merged into main
* first source test works
* WIP
* modify all source files
* source tests pass
* fixing tests after bug fix in main
* got first destination working.
* adding destinations
* fix docstring for test
* fixing tests after bug fix in main
* adding source proxies
* fixing tests after bug fix in main
* got first destination working.
* adding destinations
* fix docstring for test
* fixing tests after bug fix in main
* got first destination working.
* adding destinations
* fix docstring for test
* fixing tests after bug fix in main
* NET-5397 - wire up golden tests from sidecar-proxy controller for xds controller and xdsv2
* WIP
* WIP
* everything matching except leafCerts. need to mock those
* single port destinations working except mixed destinations
* golden test input to xds controller tests for destinations
* proposed fix for failover group naming errors
* clean up test to use helper.
* clean up test to use helper.
* fix test file
* add docstring for test function.
* add docstring for test function.
* fix linting error
* fixing test after route fix merged into main
* first source test works
* WIP
* modify all source files
* source tests pass
* fixing tests after bug fix in main
* got first destination working.
* adding destinations
* fix docstring for test
* fixing tests after bug fix in main
Prior to the introduction of this configuration, grpc keepalive messages were
sent after 2 hours of inactivity on the stream. This posed issues in various
scenarios where the server-side xds connection balancing was unaware that envoy
instances were uncleanly killed / force-closed, since the connections would
only be cleaned up after ~5 minutes of TCP timeouts occurred. Setting this
config to a 30 second interval with a 20 second timeout ensures that at most,
it should take up to 50 seconds for a dead xds connection to be closed.
Ensure LB policy set for locality-aware routing (CE)
`overprovisioningFactor` should be overridden with the expected value
(100,000) when there are multiple endpoint groups. Update code and
tests to enforce this.
This is an Enterprise feature. This commit represents the CE portions of
the change; tests are added in the corresponding `consul-enterprise`
change.
This change adds ACL hooks to the remaining catalog and mesh resources, excluding any computed ones. Those will for now continue using the default operator:x permissions.
It refactors a lot of the common testing functions so that they can be re-used between resources.
There are also some types that we don't yet support (e.g. virtual IPs) that this change adds ACL hooks to for future-proofing.
* xdsv2: support l7 by adding xfcc policy/headers, tweaking routes, and make a bunch of listeners l7 tests pass
* sidecarproxycontroller: add l7 local app support
* trafficpermissions: make l4 traffic permissions work on l7 workloads
* rename route name field for consistency with l4 cluster name field
* resolve conflicts and rebase
* fix: ensure route name is used in l7 destination route name as well. previously it was only in the route names themselves, now the route name and l7 destination route name line up
When the v2 catalog experiment is enabled the old v1 catalog apis will be
forcibly disabled at both the API (json) layer and the RPC (msgpack) layer.
This will also disable anti-entropy as it uses the v1 api.
This includes all of /v1/catalog/*, /v1/health/*, most of /v1/agent/*,
/v1/config/*, and most of /v1/internal/*.
This PR fixes an issue where upstreams did not correctly inherit the proper
namespace / partition from the parent service when attempting to fetch the
upstream protocol due to inconsistent normalization.
Some of the merge-service-configuration logic would normalize to default, while
some of the proxycfg logic would normalize to match the parent service. Due to
this mismatch in logic, an incorrect service-defaults configuration entry would
be fetched and have its protocol applied to the upstream.
* Add InboundPeerTrustBundle maps to Terminating Gateway
* Add notify and cancelation of watch for inbound peer trust bundles
* Pass peer trust bundles to the RBAC creation function
* Regenerate Golden Files
* add changelog, also adds another spot that needed peeredTrustBundles
* Add basic test for terminating gateway with peer trust bundle
* Add intention to cluster peered golden test
* rerun codegen
* update changelog
* really update the changelog
---------
Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com>
Fix issues with empty sources
* Validate that each permission on traffic permissions resources has at least one source.
* Don't construct RBAC policies when there aren't any principals. This resulted in Envoy rejecting xDS updates with a validation error.
```
error=
| rpc error: code = Internal desc = Error adding/updating listener(s) public_listener: Proto constraint validation failed (RBACValidationError.Rules: embedded message failed validation | caused by RBACValidationError.Policies[consul-intentions-layer4-1]: embedded message failed validation | caused by PolicyValidationError.Principals: value must contain at least 1 item(s)): rules {
```
The ACLs.Read hook for a resource only allows for the identity of a
resource to be passed in for use in authz consideration. For some
resources we wish to allow for the current stored value to dictate how
to enforce the ACLs (such as reading a list of applicable services from
the payload and allowing service:read on any of them to control reading the enclosing resource).
This change update the interface to usually accept a *pbresource.ID,
but if the hook decides it needs more data it returns a sentinel error
and the resource service knows to defer the authz check until after
fetching the data from storage.
* dns token
fix whitespace for docs and comments
fix test cases
fix test cases
remove tabs in help text
Add changelog
Peering dns test
Peering dns test
Partial implementation of Peered DNS test
Swap to new topology lib
expose dns port for integration tests on client
remove partial test implementation
remove extra port exposure
remove changelog from the ent pr
Add dns token to set-agent-token switch
Add enterprise golden file
Use builtin/dns template in tests
Update ent dns policy
Update ent dns template test
remove local gen certs
fix templated policy specs
* add changelog
* go mod tidy
* add namespace proto and registration
* fix proto generation
* add missing copywrite headers
* fix proto linter errors
* fix exports and Type export
* add mutate hook and more validation
* add more validation rules and tests
* Apply suggestions from code review
Co-authored-by: Semir Patel <semir.patel@hashicorp.com>
* fix owner error and add test
* remove ACL for now
* add tests around space suffix prefix.
* only fait when ns and ap are default, add test for it
---------
Co-authored-by: Semir Patel <semir.patel@hashicorp.com>
Ensure that configuring a FailoverPolicy for a service that is reachable via a xRoute or a direct upstream causes an envoy aggregate cluster to be created for the original cluster name, but with separate clusters for each one of the possible destinations.
Adding coauthors who mobbed/paired at various points throughout last week.
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: Iryna Shustava <iryna@hashicorp.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>
Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: Michael Wilkerson <mwilkerson@hashicorp.com>
Configure Envoy to use the same HTTP protocol version used by the
downstream caller when forwarding requests to a local application that
is configured with the protocol set to either `http2` or `grpc`.
This allows upstream applications that support both HTTP/1.1 and
HTTP/2 on a single port to receive requests using either protocol. This
is beneficial when the application primarily communicates using HTTP/2,
but also needs to support HTTP/1.1, such as to respond to Kubernetes
HTTP readiness/liveness probes.
Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
Previously, when using implicit upstreams, we'd build outbound listener per destination instead of one for all destinations. This will result in port conflicts when trying to send this config to envoy.
This PR also makes sure that leaf and root references are always added (before we would only add it if there are inbound non-mesh ports).
Also, black-hole traffic when there are no inbound ports other than mesh
Reworks the sidecar controller to accept ComputedRoutes as an input and use it to generate appropriate ProxyStateTemplate resources containing L4/L7 mesh configuration.
The renaming of files from oss -> ce caused incorrect snapshots
to be created due to ce writes now happening prior to ent writes.
When this happens various entities will attempt to be restored
from the snapshot prior to a partition existing and will cause a
panic to occur.
* Refactors the leafcert package to not have a dependency on agent/consul and agent/cache to avoid import cycles. This way the xds controller can just import the leafcert package to use the leafcert manager.
The leaf cert logic in the controller:
* Sets up watches for leaf certs that are referenced in the ProxyStateTemplate (which generates the leaf certs too).
* Gets the leaf cert from the leaf cert cache
* Stores the leaf cert in the ProxyState that's pushed to xds
* For the cert watches, this PR also uses a bimapper + a thin wrapper to map leaf cert events to related ProxyStateTemplates
Since bimapper uses a resource.Reference or resource.ID to map between two resource types, I've created an internal type for a leaf certificate to use for the resource.Reference, since it's not a v2 resource.
The wrapper allows mapping events to resources (as opposed to mapping resources to resources)
The controller tests:
Unit: Ensure that we resolve leaf cert references
Lifecycle: Ensure that when the CA is updated, the leaf cert is as well
Also adds a new spiffe id type, and adds workload identity and workload identity URI to leaf certs. This is so certs are generated with the new workload identity based SPIFFE id.
* Pulls out some leaf cert test helpers into a helpers file so it
can be used in the xds controller tests.
* Wires up leaf cert manager dependency
* Support getting token from proxytracker
* Add workload identity spiffe id type to the authorize and sign functions
---------
Co-authored-by: John Murret <john.murret@hashicorp.com>
* mesh-controller: handle L4 protocols for a proxy without upstreams
* sidecar-controller: Support explicit destinations for L4 protocols and single ports.
* This controller generates and saves ProxyStateTemplate for sidecar proxies.
* It currently supports single-port L4 ports only.
* It keeps a cache of all destinations to make it easier to compute and retrieve destinations.
* It will update the status of the pbmesh.Upstreams resource if anything is invalid.
* endpoints-controller: add workload identity to the service endpoints resource
* small fixes
* review comments
* Address PR comments
* sidecar-proxy controller: Add support for transparent proxy
This currently does not support inferring destinations from intentions.
* PR review comments
* mesh-controller: handle L4 protocols for a proxy without upstreams
* sidecar-controller: Support explicit destinations for L4 protocols and single ports.
* This controller generates and saves ProxyStateTemplate for sidecar proxies.
* It currently supports single-port L4 ports only.
* It keeps a cache of all destinations to make it easier to compute and retrieve destinations.
* It will update the status of the pbmesh.Upstreams resource if anything is invalid.
* endpoints-controller: add workload identity to the service endpoints resource
* small fixes
* review comments
* Make sure endpoint refs route to mesh port instead of an app port
* Address PR comments
* fixing copyright
* tidy imports
* sidecar-proxy controller: Add support for transparent proxy
This currently does not support inferring destinations from intentions.
* tidy imports
* add copyright headers
* Prefix sidecar proxy test files with source and destination.
* Update controller_test.go
* NET-5132 - Configure multiport routing for connect proxies in TProxy mode
* formatting golden files
* reverting golden files and adding changes in manually. build implicit destinations still has some issues.
* fixing files that were incorrectly repeating the outbound listener
* PR comments
* extract AlpnProtocol naming convention to getAlpnProtocolFromPortName(portName)
* removing address level filtering.
* adding license to resources_test.go
---------
Co-authored-by: Iryna Shustava <iryna@hashicorp.com>
Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com>
This commit adds support for transparent proxy to the sidecar proxy controller. As we do not yet support inferring destinations from intentions, this assumes that all services in the cluster are destinations.