### Description
<!-- Please describe why you're making this change, in plain English.
-->
- Currently the jwt-auth filter doesn't take into account the service
identity when validating jwt-auth, it only takes into account the path
and jwt provider during validation. This causes issues when multiple
source intentions restrict access to an endpoint with different JWT
providers.
- To fix these issues, rather than use the JWT auth filter for
validation, we use it in metadata mode and allow it to forward the
successful validated JWT token payload to the RBAC filter which will
make the decisions.
This PR ensures requests with and without JWT tokens successfully go
through the jwt-authn filter. The filter however only forwards the data
for successful/valid tokens. On the RBAC filter level, we check the
payload for claims and token issuer + existing rbac rules.
### Testing & Reproduction steps
<!--
* In the case of bugs, describe how to replicate
* If any manual tests were done, document the steps and the conditions
to replicate
* Call out any important/ relevant unit tests, e2e tests or integration
tests you have added or are adding
-->
- This test covers a multi level jwt requirements (requirements at top
level and permissions level). It also assumes you have envoy running,
you have a redis and a sidecar proxy service registered, and have a way
to generate jwks with jwt. I mostly use:
https://www.scottbrady91.com/tools/jwt for this.
- first write your proxy defaults
```
Kind = "proxy-defaults"
name = "global"
config {
protocol = "http"
}
```
- Create two providers
```
Kind = "jwt-provider"
Name = "auth0"
Issuer = "https://ronald.local"
JSONWebKeySet = {
Local = {
JWKS = "eyJrZXlzIjog....."
}
}
```
```
Kind = "jwt-provider"
Name = "okta"
Issuer = "https://ronald.local"
JSONWebKeySet = {
Local = {
JWKS = "eyJrZXlzIjogW3...."
}
}
```
- add a service intention
```
Kind = "service-intentions"
Name = "redis"
JWT = {
Providers = [
{
Name = "okta"
},
]
}
Sources = [
{
Name = "*"
Permissions = [{
Action = "allow"
HTTP = {
PathPrefix = "/workspace"
}
JWT = {
Providers = [
{
Name = "okta"
VerifyClaims = [
{
Path = ["aud"]
Value = "my_client_app"
},
{
Path = ["sub"]
Value = "5be86359073c434bad2da3932222dabe"
}
]
},
]
}
},
{
Action = "allow"
HTTP = {
PathPrefix = "/"
}
JWT = {
Providers = [
{
Name = "auth0"
},
]
}
}]
}
]
```
- generate 3 jwt tokens: 1 from auth0 jwks, 1 from okta jwks with
different claims than `/workspace` expects and 1 with correct claims
- connect to your envoy (change service and address as needed) to view
logs and potential errors. You can add: `-- --log-level debug` to see
what data is being forwarded
```
consul connect envoy -sidecar-for redis1 -grpc-addr 127.0.0.1:8502
```
- Make the following requests:
```
curl -s -H "Authorization: Bearer $Auth0_TOKEN" --insecure --cert leaf.cert --key leaf.key --cacert connect-ca.pem https://localhost:20000/workspace -v
RBAC filter denied
curl -s -H "Authorization: Bearer $Okta_TOKEN_with_wrong_claims" --insecure --cert leaf.cert --key leaf.key --cacert connect-ca.pem https://localhost:20000/workspace -v
RBAC filter denied
curl -s -H "Authorization: Bearer $Okta_TOKEN_with_correct_claims" --insecure --cert leaf.cert --key leaf.key --cacert connect-ca.pem https://localhost:20000/workspace -v
Successful request
```
### TODO
* [x] Update test coverage
* [ ] update integration tests (follow-up PR)
* [x] appropriate backport labels added
* fix(cli): remove failing check from 'connect envoy' registration for api gateway
* test(integration): add tests to check catalog statsus of gateways on startup
* remove extra sleep comment
* Update test/integration/consul-container/libs/assert/service.go
* changelog
- Provide more realistics examples for setting properties not already
supported natively by Consul
- Remove superfluous commas from HCL, correct target service name, and
fix service defaults vs. proxy defaults in examples
- Align existing integration test to updated docs
* Implement the Catalog V2 controller integration container tests
This now allows the container tests to import things from the root module. However for now we want to be very restrictive about which packages we allow importing.
* Add an upgrade test for the new catalog
Currently this should be dormant and not executed. However its put in place to detect breaking changes in the future and show an example of how to do an upgrade test with integration tests structured like catalog v2.
* Make testutil.Retry capable of performing cleanup operations
These cleanup operations are executed after each retry attempt.
* Move TestContext to taking an interface instead of a concrete testing.T
This allows this to be used on a retry.R or generally anything that meets the interface.
* Move to using TestContext instead of background contexts
Also this forces all test methods to implement the Cleanup method now instead of that being an optional interface.
Co-authored-by: Daniel Upton <daniel@floppy.co>
Ensure that the embedded api struct is properly parsed when
deserializing config containing a set ResourceFilter.Services field.
Also enhance existing integration test to guard against bugs and
exercise this field.
TLDR with many modules the versions included in each diverged quite a bit. Attempting to use Go Workspaces produces a bunch of errors.
This commit:
1. Fixes envoy-library-references.sh to work again
2. Ensures we are pulling in go-control-plane@v0.11.0 everywhere (previously it was at that version in some modules and others were much older)
3. Remove one usage of golang/protobuf that caused us to have a direct dependency on it.
4. Remove deprecated usage of the Endpoint field in the grpc resolver.Target struct. The current version of grpc (v1.55.0) has removed that field and recommended replacement with URL.Opaque and calls to the Endpoint() func when needing to consume the previous field.
4. `go work init <all the paths to go.mod files>` && `go work sync`. This syncrhonized versions of dependencies from the main workspace/root module to all submodules
5. Updated .gitignore to ignore the go.work and go.work.sum files. This seems to be standard practice at the moment.
6. Update doc comments in protoc-gen-consul-rate-limit to be go fmt compatible
7. Upgraded makefile infra to perform linting, testing and go mod tidy on all modules in a flexible manner.
8. Updated linter rules to prevent usage of golang/protobuf
9. Updated a leader peering test to account for an extra colon in a grpc error message.
* endpoints xds cluster configuration
* resources test fix
* fix reversion in resources_test
* Update agent/proxycfg/api_gateway.go
Co-authored-by: John Maguire <john.maguire@hashicorp.com>
* gofmt
* Modify getReadyUpstreams to filter upstreams by listener (#17410)
Each listener would previously have all upstreams from any route that bound to the listener. This is problematic when a route bound to one listener also binds to other listeners and so includes upstreams for multiple listeners. The list for a given listener would then wind up including upstreams for other listeners.
* Update agent/proxycfg/api_gateway.go
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* Restore import blocking
* Skip to next route if route has no upstreams
* cleanup
* change set from bool to empty struct
---------
Co-authored-by: John Maguire <john.maguire@hashicorp.com>
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* upgrade test: add targetimage name as parameter to upgrade function
- the image name of latest version and target version could be
different. Add the parameter of targetImage to the upgrade
function
* fix a bug of expected error
* Add MaxEjectionPercent to config entry
* Add BaseEjectionTime to config entry
* Add MaxEjectionPercent and BaseEjectionTime to protobufs
* Add MaxEjectionPercent and BaseEjectionTime to api
* Fix integration test breakage
* Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream confings
* Website docs for MaxEjectionPercent and BaseEjection time
* Add `make docs` to browse docs at http://localhost:3000
* Changelog entry
* so that is the difference between consul-docker and dev-docker
* blah
* update proto funcs
* update proto
---------
Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>
* TProxy integration test
* Fix GHA compatibility integration test command
Previously, when test splitting allocated multiple test directories to a
runner, the workflow ran `go tests "./test/dir1 ./test/dir2"` which
results in a directory not found error. This fixes that.
* Fix straggler from renaming Register->RegisterTypes
* somehow a lint failure got through previously
* Fix lint-consul-retry errors
* adding in fix for success jobs getting skipped. (#17132)
* Temporarily disable inmem backend conformance test to get green pipeline
* Another test needs disabling
---------
Co-authored-by: John Murret <john.murret@hashicorp.com>
* normalize status conditions for gateways and routes
* Added tests for checking condition status and panic conditions for
validating combinations, added dummy code for fsm store
* get rid of unneeded gateway condition generator struct
* Remove unused file
* run go mod tidy
* Update tests, add conflicted gateway status
* put back removed status for test
* Fix linting violation, remove custom conflicted status
* Update fsm commands oss
* Fix incorrect combination of type/condition/status
* cleaning up from PR review
* Change "invalidCertificate" to be of accepted status
* Move status condition enums into api package
* Update gateways controller and generated code
* Update conditions in fsm oss tests
* run go mod tidy on consul-container module to fix linting
* Fix type for gateway endpoint test
* go mod tidy from changes to api
* go mod tidy on troubleshoot
* Fix route conflicted reason
* fix route conflict reason rename
* Fix text for gateway conflicted status
* Add valid certificate ref condition setting
* Revert change to resolved refs to be handled in future PR
* add ability to start container tests in debug mode and attach a debugger to consul while running it.
* add a debug message with the debug port
* use pod to get the right port
* fix image used in basic test
* add more data to identify which container to debug.
* fix comment
Co-authored-by: Evan Culver <eculver@users.noreply.github.com>
* rename debugUri to debugURI
---------
Co-authored-by: Evan Culver <eculver@users.noreply.github.com>
- added Sameness Group to config entries
- added Sameness Group to subscriptions
* generated proto files
* added Sameness Group events to the state store
- added test cases
* Refactored health RPC Client
- moved code that is common to rpcclient under rpcclient common.go. This will help set us up to support future RPC clients
* Refactored proxycfg glue views
- Moved views to rpcclient config entry. This will allow us to reuse this code for a config entry client
* added config entry RPC Client
- Copied most of the testing code from rpcclient/health
* hooked up new rpcclient in agent
* fixed documentation and comments for clarity
* Exand route flattening test for multiple namespaces
* Add helper for checking http route config entry exists without checking for bound
status
* Fix port and hostname check for http route flattening test
* NET-2397: Add readme.md to upgrade test subdirectory
* remove test code
* fix link and update steps of adding new test cases (#16654)
* fix link and update steps of adding new test cases
* Apply suggestions from code review
Co-authored-by: Nick Irvine <115657443+nfi-hashicorp@users.noreply.github.com>
---------
Co-authored-by: Nick Irvine <115657443+nfi-hashicorp@users.noreply.github.com>
---------
Co-authored-by: cskh <hui.kang@hashicorp.com>
Co-authored-by: Nick Irvine <115657443+nfi-hashicorp@users.noreply.github.com>
* add snapshot restore test
* add logstore as test parameter
* Use the correct image version
* make sure we read the logs from a followers to test the follower snapshot install path.
* update to raf-wal v0.3.0
* add changelog.
* updating changelog for bug description and removed integration test.
* setting up test container builder to only set logStore for 1.15 and higher
---------
Co-authored-by: Paul Banks <pbanks@hashicorp.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
* Refactored "NewGatewayService" to handle namespaces, fixed
TestHTTPRouteFlattening test
* Fixed existing http_route tests for namespacing
* Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept
namespace for creating connect services and regular services
* Use require instead of assert after creating namespaces in
http_route_tests
* Refactor NewConnectService and NewGatewayService functions to use cfg
objects to reduce number of method args
* Rename field on SidecarConfig in tests from `SidecarServiceName` to
`Name` to avoid stutter
* Fix issue where terminating gateway service resolvers weren't properly cleaned up
* Add integration test for cleaning up resolvers
* Add changelog entry
* Use state test and drop integration test
* test(gateways): add API Gateway HTTPRoute ParentRef change test
* test(gateways): add checkRouteError helper
* test(gateways): remove EOF check
in CI this seems to sometimes be 'connection reset by peer' instead
* Update test/integration/consul-container/test/gateways/http_route_test.go
* wip, proof of concept, gateway service being registered, don't know how to hit it
* checkpoint
* Fix up API Gateway go tests (#16297)
* checkpoint, getting InvalidDiscoveryChain route protocol does not match targeted service protocol
* checkpoint
* httproute hittable
* tests working, one header test failing
* differentiate services by status code, minor cleanup
* working tests
* updated GetPort interface
* fix getport
---------
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
* Include secret type when building resources from config snapshot
* First pass at generating envoy secrets from api-gateway snapshot
* Update comments for xDS update order
* Add secret type + corresponding golden files to existing tests
* Initialize test helpers for testing api-gateway resource generation
* Generate golden files for new api-gateway xDS resource test
* Support ADS for TLS certificates on api-gateway
* Configure TLS on api-gateway listeners
* Inline TLS cert code
* update tests
* Add SNI support so we can have multiple certificates
* Remove commented out section from helper
* regen deep-copy
* Add tcp tls test
---------
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* Simple API Gateway e2e test for tcp routes
* Drop DNSSans since we don't front the Gateway with a leaf cert
* WIP listener tests for api-gateway
* Return early if no routes
* Add back in leaf cert to testing
* Fix merge conflicts
* Re-add kind to setup
* Fix iteration over listener upstreams
* New tcp listener test
* Add tests for API Gateway with TCP and HTTP routes
* Move zero-route check back
* Drop generateIngressDNSSANs
* Check for chains not routes
---------
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
* draft
* expose internal admin port and add proxy test
* update tests
* move comment
* add failure case, fix lint issues
* cleanup
* handle error
* revert changes to service interface
* address review comments
* fix merge conflict
* merge the tests so cluster is created once
* fix other test
* [API Gateway] Add integration test for conflicted TCP listeners
* [API Gateway] Update simple test to leverage intentions and multiple listeners
* Fix broken unit test
* [API Gateway] Add integration test for HTTP routes
* [API Gateway] Add integration test for conflicted TCP listeners
* [API Gateway] Update simple test to leverage intentions and multiple listeners
* Fix broken unit test
* PR suggestions
Prior to this commit, secondary datacenters could not be initialized
as peering acceptors if ACLs were enabled. This is due to the fact that
internal server-to-server API calls would fail because the management
token was not generated. This PR makes it so that both primary and
secondary datacenters generate their own management token whenever
a leader is elected in their respective clusters.
1. Upgraded agent can inherit the persisted token and join the cluster
2. Agent token prior to upgrade is still valid after upgrade
3. Enable ACL in the agent configuration
* rate limit test
* Have tests for the 3 modes
* added assertions for logs and metrics
* add comments to test sections
* add check for rate limit exceeded text in log assertion section.
* fix linting error
* updating test to use KV get and put. move log assertion tolast.
* Adding logging for blocking messages in enforcing mode. refactoring tests.
* modified test description
* formatting
* Apply suggestions from code review
Co-authored-by: Dan Upton <daniel@floppy.co>
* Update test/integration/consul-container/test/ratelimit/ratelimit_test.go
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
* expand log checking so that it ensures both logs are they when they are supposed to be and not there when they are not expected to be.
* add retry on test
* Warn once when rate limit exceed regardless of enforcing vs permissive.
* Update test/integration/consul-container/test/ratelimit/ratelimit_test.go
Co-authored-by: Dan Upton <daniel@floppy.co>
Co-authored-by: Dan Upton <daniel@floppy.co>
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
- remove dep on consul main module
- use 'consul tls' subcommands instead of tlsutil
- use direct json config construction instead of agent/config structs
- merge libcluster and libagent packages together
- more widely use BuildContext
- get the OSS/ENT runner stuff working properly
- reduce some flakiness
- fix some correctness related to http/https API
* Protobuf Modernization
Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf
Marshallers (protobuf and json) needed some changes to account for different APIs.
Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with whats available in the structpb well known type package.
This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces.
* Fix go-mod-tidy make target to work on all modules
* Refactoring the peering integ test to accommodate coming changes of other upgrade scenarios.
- Add a utils package under test that contains methods to set up various test scenarios.
- Deduplication: have a single CreatingPeeringClusterAndSetup replace
CreatingAcceptingClusterAndSetup and CreateDialingClusterAndSetup.
- Separate peering cluster creation and server registration.
* Apply suggestions from code review
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Adds automation for generating the map of `gRPC Method Name → Rate Limit Type`
used by the middleware introduced in #15550, and will ensure we don't forget
to add new endpoints.
Engineers must annotate their RPCs in the proto file like so:
```
rpc Foo(FooRequest) returns (FooResponse) {
option (consul.internal.ratelimit.spec) = {
operation_type: READ,
};
}
```
When they run `make proto` a protoc plugin `protoc-gen-consul-rate-limit` will
be installed that writes rate-limit specs as a JSON array to a file called
`.ratelimit.tmp` (one per protobuf package/directory).
After running Buf, `make proto` will execute a post-process script that will
ingest all of the `.ratelimit.tmp` files and generate a Go file containing the
mappings in the `agent/grpc-middleware` package. In the enterprise repository,
it will write an additional file with the enterprise-only endpoints.
If an engineer forgets to add the annotation to a new RPC, the plugin will
return an error like so:
```
RPC Foo is missing rate-limit specification, fix it with:
import "proto-public/annotations/ratelimit/ratelimit.proto";
service Bar {
rpc Foo(...) returns (...) {
option (hashicorp.consul.internal.ratelimit.spec) = {
operation_type: OPERATION_READ | OPERATION_WRITE | OPERATION_EXEMPT,
};
}
}
```
In the future, this annotation can be extended to support rate-limit
category (e.g. KV vs Catalog) and to determine the retry policy.
* feat(ingress-gateway): support outlier detection of upstream service for ingress gateway
* changelog
Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>
* integ-test: fix flaky test - case-cfg-splitter-peering-ingress-gateways
* add retry peering to all peering cases
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* integ-test: test consul upgrade from the snapshot of a running cluster
* use Target version as default
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
* update go version to 1.18 for api and sdk, go mod tidy
* removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward.
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
This is instead of the current behavior where we feed the config entries in using the config_entries.bootstrap configuration which oddly races against other setup code in some circumstances.
I converted ALL tests to explicitly create config entries.
The integration test TestEnvoy/case-ingress-gateway-multiple-services is flaky
and this possibly reduces the flakiness by explicitly waiting for services to show
up in the catalog as healthy before waiting for them to show up in envoy as
healthy which gives it just a bit more time to sync.