This implements the Filter field on pbcatalog.WorkloadSelector to be
a post-fetch in-memory filter using the https://github.com/hashicorp/go-bexpr
expression language to filter resources based on their envelope metadata fields.
All existing usages of WorkloadSelector should be able to make use of the filter.
* xdsv2: support l7 by adding xfcc policy/headers, tweaking routes, and make a bunch of listeners l7 tests pass
* sidecarproxycontroller: add l7 local app support
* trafficpermissions: make l4 traffic permissions work on l7 workloads
* rename route name field for consistency with l4 cluster name field
* resolve conflicts and rebase
* fix: ensure route name is used in l7 destination route name as well. previously it was only in the route names themselves, now the route name and l7 destination route name line up
Sometimes workloads could come with unspecified protocols such as when running on Kubernetes. Currently, if this is the case, we will just default to tcp protocol.
However, to make sidecar-proxy controller work with l7 protocols we should instead inherit the protocol from service. This change adds tracking for services that a workload is part of and attempts to inherit the protocol whenever services a workload is part of doesn't have conflicting protocols.
This change builds on #19043 and #19067 and updates the sidecar controller to use those computed resources. This achieves several benefits:
* The cache is now simplified which helps us solve for previous bugs (such as multiple Upstreams/Destinations targeting the same service would overwrite each other)
* We no longer need proxy config cache
* We no longer need to do merging of proxy configs as part of the controller logic
* Controller watches are simplified because we no longer need to have complex mapping using cache and can instead use the simple ReplaceType mapper.
It also makes several other improvements/refactors:
* Unifies all caches into one. This is because originally the caches were more independent, however, now that they need to interact with each other it made sense to unify them where sidecar proxy controller uses one cache with 3 bimappers
* Unifies cache and mappers. Mapper already needed all caches anyway and so it made sense to make the cache do the mapping also now that the cache is unified.
* Gets rid of service endpoints watches. This was needed to get updates in a case when service's identities have changed and we need to update proxy state template's spiffe IDs for those destinations. This will however generate a lot of reconcile requests for this controller as service endpoints objects can change a lot because they contain workload's health status. This is solved by adding a status to the service object tracking "bound identities" and have service endpoints controller update it. Having service's status updated allows us to get updates in the sidecar proxy controller because it's already watching service objects
* Add a watch for workloads. We need it so that we get updates if workload's ports change. This also ensures that we update cached identities in case workload's identity changes.
This commit adds a new type ComputedDestinations that will contain all destinations from any Destinations resources and will be name-aligned with a workload. This also adds an explicit-destinations controller that computes these resources.
This is needed to simplify the tracking we need to do currently in the sidecar-proxy controller and makes it easier to query all explicit destinations that apply to a workload.
* Introduce a new type `ComputedProxyConfiguration` and add a controller for it. This is needed for two reasons. The first one is that external integrations like kubernetes may need to read the fully computed and sorted proxy configuration per workload. The second reasons is that it makes sidecar-proxy controller logic quite a bit simpler as it no longer needs to do this.
* Generalize workload selection mapper and fix a bug where it would delete IDs from the tree if only one is left after a removal is done.
Fix issues with empty sources
* Validate that each permission on traffic permissions resources has at least one source.
* Don't construct RBAC policies when there aren't any principals. This resulted in Envoy rejecting xDS updates with a validation error.
```
error=
| rpc error: code = Internal desc = Error adding/updating listener(s) public_listener: Proto constraint validation failed (RBACValidationError.Rules: embedded message failed validation | caused by RBACValidationError.Policies[consul-intentions-layer4-1]: embedded message failed validation | caused by PolicyValidationError.Principals: value must contain at least 1 item(s)): rules {
```
Convert more of the xRoutes features that were skipped in an earlier PR into ComputedRoutes and make them work:
- DestinationPolicy defaults
- more timeouts
- load balancer policy
- request/response header mutations
- urlrewrite
- GRPCRoute matches
xRoute resource types contain a slice of parentRefs to services that they
manipulate traffic for. All xRoutes that have a parentRef to given Service
will be merged together to generate a ComputedRoutes resource
name-aligned with that Service.
This means that a write of an xRoute with 2 parent ref pointers will cause
at most 2 reconciles for ComputedRoutes.
If that xRoute's list of parentRefs were ever to be reduced, or otherwise
lose an item, that subsequent map event will only emit events for the current
set of refs. The removed ref will not cause the generated ComputedRoutes
related to that service to be re-reconciled to omit the influence of that xRoute.
To combat this, we will store on the ComputedRoutes resource a
BoundResources []*pbresource.Reference field with references to all
resources that were used to influence the generated output.
When the routes controller reconciles, it will use a bimapper to index this
influence, and the dependency mappers for the xRoutes will look
themselves up in that index to discover additional (former) ComputedRoutes
that need to be notified as well.
xRoute resources are not name-aligned with the Services they control. They
have a list of "parent ref" services that they alter traffic flow for, and they
contain a list of "backend ref" services that they direct that traffic to.
The ACLs should be:
- list: (default)
- read:
- ALL service:<parent_ref_service>:read
- write:
- ALL service:<parent_ref_service>:write
- ALL service:<backend_ref_service>:read
DestinationPolicy resources are name-aligned with the Service they control.
The ACLs should be:
- list: (default)
- read: service:<resource_name>:read
- write: service:<resource_name>:write
FailoverPolicy resources are name-aligned with the Service they control.
They also contain a list of possible failover destinations that are References
to other Services.
The ACLs should be:
- list: (default)
- read: service:<resource_name>:read
- write: service:<resource_name>:write + service:<destination_name>:read (for any destination)
The ACLs.Read hook for a resource only allows for the identity of a
resource to be passed in for use in authz consideration. For some
resources we wish to allow for the current stored value to dictate how
to enforce the ACLs (such as reading a list of applicable services from
the payload and allowing service:read on any of them to control reading the enclosing resource).
This change update the interface to usually accept a *pbresource.ID,
but if the hook decides it needs more data it returns a sentinel error
and the resource service knows to defer the authz check until after
fetching the data from storage.
* add multiple upstream ports to golden file test for destination builder
* NET-5131 - add unit tests for multiple ported upstreams
* fix merge conflicts
* add namespace proto and registration
* fix proto generation
* add missing copywrite headers
* fix proto linter errors
* fix exports and Type export
* add mutate hook and more validation
* add more validation rules and tests
* Apply suggestions from code review
Co-authored-by: Semir Patel <semir.patel@hashicorp.com>
* fix owner error and add test
* remove ACL for now
* add tests around space suffix prefix.
* only fait when ns and ap are default, add test for it
---------
Co-authored-by: Semir Patel <semir.patel@hashicorp.com>
Ensure that configuring a FailoverPolicy for a service that is reachable via a xRoute or a direct upstream causes an envoy aggregate cluster to be created for the original cluster name, but with separate clusters for each one of the possible destinations.
Adding coauthors who mobbed/paired at various points throughout last week.
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: Iryna Shustava <iryna@hashicorp.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>
Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: Michael Wilkerson <mwilkerson@hashicorp.com>