This PR fixes an issue where upstreams did not correctly inherit the proper
namespace / partition from the parent service when attempting to fetch the
upstream protocol due to inconsistent normalization.
Some of the merge-service-configuration logic would normalize to default, while
some of the proxycfg logic would normalize to match the parent service. Due to
this mismatch in logic, an incorrect service-defaults configuration entry would
be fetched and have its protocol applied to the upstream.
* Add InboundPeerTrustBundle maps to Terminating Gateway
* Add notify and cancelation of watch for inbound peer trust bundles
* Pass peer trust bundles to the RBAC creation function
* Regenerate Golden Files
* add changelog, also adds another spot that needed peeredTrustBundles
* Add basic test for terminating gateway with peer trust bundle
* Add intention to cluster peered golden test
* rerun codegen
* update changelog
* really update the changelog
---------
Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com>
Ongoing work to support Nomad Workload Identity for authenticating with Consul
will mean that Nomad's service registration sync with Consul will want to use
Consul tokens scoped to individual workloads for registering services and
checks. The `CheckRegister` method in the API doesn't have an option to pass the
token in, which prevent us from sharing the same Consul connection for all
workloads. Add a `CheckRegisterOpts` to match the behavior of
`ServiceRegisterOpts`.
Ongoing work to support Nomad Workload Identity for authenticating with Consul
will mean that Nomad's service registration sync with Consul will want to use
Consul tokens scoped to individual workloads for registering services and
checks. The `ServiceRegisterOpts` type in the API doesn't have an option to pass
the token in, which prevent us from sharing the same Consul connection for all
workloads. Add a `Token` field to match the behavior of `ServiceDeregisterOpts`.
* dns token
fix whitespace for docs and comments
fix test cases
fix test cases
remove tabs in help text
Add changelog
Peering dns test
Peering dns test
Partial implementation of Peered DNS test
Swap to new topology lib
expose dns port for integration tests on client
remove partial test implementation
remove extra port exposure
remove changelog from the ent pr
Add dns token to set-agent-token switch
Add enterprise golden file
Use builtin/dns template in tests
Update ent dns policy
Update ent dns template test
remove local gen certs
fix templated policy specs
* add changelog
* go mod tidy
Configure Envoy to use the same HTTP protocol version used by the
downstream caller when forwarding requests to a local application that
is configured with the protocol set to either `http2` or `grpc`.
This allows upstream applications that support both HTTP/1.1 and
HTTP/2 on a single port to receive requests using either protocol. This
is beneficial when the application primarily communicates using HTTP/2,
but also needs to support HTTP/1.1, such as to respond to Kubernetes
HTTP readiness/liveness probes.
Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>
* debug since
* fix docs
* chagelog added
* fix go mod
* debug test fix
* fix test
* tabs test fix
* Update .changelog/18797.txt
Co-authored-by: Ganesh S <ganesh.seetharaman@hashicorp.com>
---------
Co-authored-by: Ganesh S <ganesh.seetharaman@hashicorp.com>
* Add response header filters to http-route config entry definitions
* Map response header filters from config entry when constructing route destination
* Support response header modifiers at the service level as well
* Update protobuf definitions
* Update existing unit tests
* Add response filters to route consolidation logic
* Make existing unit tests more robust
* Add missing docstring
* Add changelog entry
* Add response filter modifiers to existing integration test
* Add more robust testing for response header modifiers in the discovery chain
* Add more robust testing for request header modifiers in the discovery chain
* Modify test to verify that service filter modifiers take precedence over rule filter modifiers
* [NET-5325] ACL templated policies support in tokens and roles
- Add API support for creating tokens/roles with templated-policies
- Add CLI support for creating tokens/roles with templated-policies
* adding changelog
This PR enables the GetEnvoyBootstrapParams endpoint to construct envoy bootstrap parameters from v2 catalog and mesh resources.
* Make bootstrap request and response parameters less specific to services so that we can re-use them for workloads or service instances.
* Remove ServiceKind from bootstrap params response. This value was unused previously and is not needed for V2.
* Make access logs generation generic so that we can generate them using v1 or v2 resources.
Add support for querying tokens by service name
The consul-k8s endpoints controller has a workflow where it fetches all tokens.
This is not performant for large clusters, where there may be a sizable number
of tokens. This commit attempts to alleviate that problem and introduces a new
way to query by the token's service name.
Fix issue where agentless endpoints would fail to populate after snapshot restore.
Fixes an issue that was introduced in #17775. This issue happens because
a long-lived pointer to the state store is held, which is unsafe to do.
Snapshot restorations will swap out this state store, meaning that the
proxycfg watches would break for agentless.
* init
* tests added and few fixes
* revert arg message
* changelog added
* removed var declaration
* fix CI
* fix test
* added node name and status
* updated save.mdx
* added example
* fix tense
* fix description
* fix for #18406 , non presence of consul-version meta
* removed redundant checks
* updated mock-api to mimic api response for synthetic nodes
* added test to test getDistinctConsulVersions method with synthetic-node case
* updated typo in comments
* added change log
* Added oss config entries for Policy and JWT on APIGW
* Updated structs for config entry
* Updated comments, ran deep-copy
* Move JWT configuration into OSS file
* Add in the config entry OSS file for jwts
* Added changelog
* fixing proto spacing
* Moved to using manually written deep copy method
* Use pointers for override/default fields in apigw config entries
* Run gen scripts for changed types
* Add logging to locality policy application
In OSS, this is currently a no-op.
* Inherit locality when registering sidecars
When sidecar locality is not explicitly configured, inherit locality
from the proxied service.
* OTElExporter now uses an EndpointProvider to discover the endpoint
* OTELSink uses a ConfigProvider to obtain filters and labels configuration
* improve tests for otel_sink
* Regex logic is moved into client for a method on the TelemetryConfig object
* Create a telemetry_config_provider and update deps to use it
* Fix conversion
* fix import newline
* Add logger to hcp client and move telemetry_config out of the client.go file
* Add a telemetry_config.go to refactor client.go
* Update deps
* update hcp deps test
* Modify telemetry_config_providers
* Check for nil filters
* PR review updates
* Fix comments and move around pieces
* Fix comments
* Remove context from client struct
* Moved ctx out of sink struct and fixed filters, added a test
* Remove named imports, use errors.New if not fformatting
* Remove HCP dependencies in telemetry package
* Add success metric and move lock only to grab the t.cfgHahs
* Update hash
* fix nits
* Create an equals method and add tests
* Improve telemetry_config_provider.go tests
* Add race test
* Add missing godoc
* Remove mock for MetricsClient
* Avoid goroutine test panics
* trying to kick CI lint issues by upgrading mod
* imprve test code and add hasher for testing
* Use structure logging for filters, fix error constants, and default to allow all regex
* removed hashin and modify logic to simplify
* Improve race test and fix PR feedback by removing hash equals and avoid testing the timer.Ticker logic, and instead unit test
* Ran make go-mod-tidy
* Use errtypes in the test
* Add changelog
* add safety check for exporter endpoint
* remove require.Contains by using error types, fix structure logging, and fix success metric typo in exporter
* Fixed race test to have changing config values
* Send success metric before modifying config
* Avoid the defer and move the success metric under
* [CC-5719] Add support for builtin global-read-only policy
* Add changelog
* Add read-only to docs
* Fix some minor issues.
* Change from ReplaceAll to Sprintf
* Change IsValidPolicy name to return an error instead of bool
* Fix PolicyList test
* Fix other tests
* Apply suggestions from code review
Co-authored-by: Paul Glass <pglass@hashicorp.com>
* Fix state store test for policy list.
* Fix naming issues
* Update acl/validation.go
Co-authored-by: Chris Thain <32781396+cthain@users.noreply.github.com>
* Update agent/consul/acl_endpoint.go
---------
Co-authored-by: Paul Glass <pglass@hashicorp.com>
Co-authored-by: Chris Thain <32781396+cthain@users.noreply.github.com>
Prevent partial application of Envoy extensions
Ensure that non-required extensions do not change xDS resources before
exiting on failure by cloning proto messages prior to applying each
extension.
To support this change, also move `CanApply` checks up a layer and make
them prior to attempting extension application, s.t. we avoid
unnecessary copies where extensions can't be applied.
Last, ensure that we do not allow panics from `CanApply` or `Extend`
checks to escape the attempted extension application.
* Fix topoloy intention with mixed connect-native/normal services.
If a service is registered twice, once with connect-native and once
without, the topology views would prune the existing intentions. This
change brings the code more in line with the transparent proxy behavior.
* Dedupe nodes in the ServiceTopology ui endpoint (like done with tags).
* Consider a service connect-native as soon as one instance is.
* api-gateway: subscribe to bound-api-gateway only after receiving api-gateway
This fixes a race condition due to our dependency on having the listener(s) from the api-gateway config entry in order to fully and properly process the resources on the bound-api-gateway config entry.
* Apply suggestions from code review
* Add changelog entry
Backfill changelog entry for c2bbe67 and 7402d06
Add a changelog entry for the follow-up PR since it was specific to the
fix and references the original change.
Update Go version to 1.20.6
This resolves [CVE-2023-29406]
(https://nvd.nist.gov/vuln/detail/CVE-2023-29406) for uses of the
`net/http` standard library.
Note that until the follow-up to #18124 is done, the version of Go used
in those impacted tests will need to remain on 1.20.5.
Bump golang.org/x/net to 0.12.0
While not necessary to directly address CVE-2023-29406 (which should be
handled by using a patched version of Go when building), an
accompanying change to HTTP/2 error handling does impact agent code.
See https://go-review.googlesource.com/c/net/+/506995 for the HTTP/2
change.
Bump this dependency across our submodules as well for the sake of
potential indirect consumers of `x/net/http`.
Updating RootPKIPath but not IntermediatePKIPath would not update
leaf signing certs with the new root. Unsure if this happens in practice
but manual testing showed it is a bug that would break mesh and agent
connections once the old root is pruned.
* update UINodes and UINodeInfo response with consul-version info added as NodeMeta, fetched from serf members
* update test cases TestUINodes, TestUINodeInfo
* added nil check for map
* add consul-version in local agent node metadata
* get consul version from serf member and add this as node meta in catalog register request
* updated ui mock response to include consul versions as node meta
* updated ui trans and added version as query param to node list route
* updates in ui templates to display consul version with filter and sorts
* updates in ui - model class, serializers,comparators,predicates for consul version feature
* added change log for Consul Version Feature
* updated to get version from consul service, if for some reason not available from serf
* updated changelog text
* updated dependent testcases
* multiselection version filter
* Update agent/consul/state/catalog.go
comments updated
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
---------
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
* fix(cli): remove failing check from 'connect envoy' registration for api gateway
* test(integration): add tests to check catalog statsus of gateways on startup
* remove extra sleep comment
* Update test/integration/consul-container/libs/assert/service.go
* changelog
This PR fixes a bug that was introduced in:
https://github.com/hashicorp/consul/pull/16021
A user setting a protocol in proxy-defaults would cause tproxy implicit
upstreams to not honor the upstream service's protocol set in its
`ServiceDefaults.Protocol` field, and would instead always use the
proxy-defaults value.
Due to the fact that upstreams configured with "tcp" can successfully contact
upstream "http" services, this issue was not recognized until recently (a
proxy-defaults with "tcp" and a listening service with "http" would make
successful requests, but not the opposite).
As a temporary work-around, users experiencing this issue can explicitly set
the protocol on the `ServiceDefaults.UpstreamConfig.Overrides`, which should
take precedence.
The fix in this PR removes the proxy-defaults protocol from the wildcard
upstream that tproxy uses to configure implicit upstreams. When the protocol
was included, it would always overwrite the value during discovery chain
compilation, which was not correct. The discovery chain compiler also consumes
proxy defaults to determine the protocol, so simply excluding it from the
wildcard upstream config map resolves the issue.
* # This is a combination of 9 commits.
# This is the 1st commit message:
init without tests
# This is the commit message #2:
change log
# This is the commit message #3:
fix tests
# This is the commit message #4:
fix tests
# This is the commit message #5:
added tests
# This is the commit message #6:
change log breaking change
# This is the commit message #7:
removed breaking change
# This is the commit message #8:
fix test
# This is the commit message #9:
keeping the test behaviour same
* # This is a combination of 12 commits.
# This is the 1st commit message:
init without tests
# This is the commit message #2:
change log
# This is the commit message #3:
fix tests
# This is the commit message #4:
fix tests
# This is the commit message #5:
added tests
# This is the commit message #6:
change log breaking change
# This is the commit message #7:
removed breaking change
# This is the commit message #8:
fix test
# This is the commit message #9:
keeping the test behaviour same
# This is the commit message #10:
made enable debug atomic bool
# This is the commit message #11:
fix lint
# This is the commit message #12:
fix test true enable debug
* parent 10f500e895d92cc3691ade7b74a33db755d22039
author absolutelightning <ashesh.vidyut@hashicorp.com> 1687352587 +0530
committer absolutelightning <ashesh.vidyut@hashicorp.com> 1687352592 +0530
init without tests
change log
fix tests
fix tests
added tests
change log breaking change
removed breaking change
fix test
keeping the test behaviour same
made enable debug atomic bool
fix lint
fix test true enable debug
using enable debug in agent as atomic bool
test fixes
fix tests
fix tests
added update on correct locaiton
fix tests
fix reloadable config enable debug
fix tests
fix init and acl 403
* revert commit
* Ensure RSA keys are at least 2048 bits in length
* Add changelog
* update key length check for FIPS compliance
* Fix no new variables error and failing to return when error exists from
validating
* clean up code for better readability
* actually return value
* Fix a bug that wrongly trims domains when there is an overlap with DC name
Before this change, when DC name and domain/alt-domain overlap, the domain name incorrectly trimmed from the query.
Example:
Given: datacenter = dc-test, alt-domain = test.consul.
Querying for "test-node.node.dc-test.consul" will faile, because the
code was trimming "test.consul" instead of just ".consul"
This change, fixes the issue by adding dot (.) before trimming
* trimDomain: ensure domain trimmed without modyfing original domains
* update changelog
---------
Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>
Update CA provider docs
Clarify that providers can differ between
primary and secondary datacenters
Provide a comparison chart for consul vs
vault CA providers
Loosen Vault CA provider validation for RootPKIPath
Update Vault CA provider documentation
* Reject inbound Prop Override patch with Services
Services filtering is only supported for outbound TrafficDirection patches.
* Improve Prop Override unexpected type validation
- Guard against additional invalid parent and target types
- Add specific error handling for Any fields (unsupported)
Fix issue with streaming service health watches.
This commit fixes an issue where the health streams were unaware of service
export changes. Whenever an exported-services config entry is modified, it is
effectively an ACL change.
The bug would be triggered by the following situation:
- no services are exported
- an upstream watch to service X is spawned
- the streaming backend filters out data for service X (due to lack of exports)
- service X is finally exported
In the situation above, the streaming backend does not trigger a refresh of its
data. This means that any events that were supposed to have been received prior
to the export are NOT backfilled, and the watches never see service X spawning.
We currently have decided to not trigger a stream refresh in this situation due
to the potential for a thundering herd effect (touching exports would cause a
re-fetch of all watches for that partition, potentially). Therefore, a local
blocking-query approach was added by this commit for agentless.
It's also worth noting that the streaming subscription is currently bypassed
most of the time with agentful, because proxycfg has a `req.Source.Node != ""`
which prevents the `streamingEnabled` check from passing. This means that while
agents should technically have this same issue, they don't experience it with
mesh health watches.
Note that this is a temporary fix that solves the issue for proxycfg, but not
service-discovery use cases.
* agent: remove agent cache dependency from service mesh leaf certificate management
This extracts the leaf cert management from within the agent cache.
This code was produced by the following process:
1. All tests in agent/cache, agent/cache-types, agent/auto-config,
agent/consul/servercert were run at each stage.
- The tests in agent matching .*Leaf were run at each stage.
- The tests in agent/leafcert were run at each stage after they
existed.
2. The former leaf cert Fetch implementation was extracted into a new
package behind a "fake RPC" endpoint to make it look almost like all
other cache type internals.
3. The old cache type was shimmed to use the fake RPC endpoint and
generally cleaned up.
4. I selectively duplicated all of Get/Notify/NotifyCallback/Prepopulate
from the agent/cache.Cache implementation over into the new package.
This was renamed as leafcert.Manager.
- Code that was irrelevant to the leaf cert type was deleted
(inlining blocking=true, refresh=false)
5. Everything that used the leaf cert cache type (including proxycfg
stuff) was shifted to use the leafcert.Manager instead.
6. agent/cache-types tests were moved and gently replumbed to execute
as-is against a leafcert.Manager.
7. Inspired by some of the locking changes from derek's branch I split
the fat lock into N+1 locks.
8. The waiter chan struct{} was eventually replaced with a
singleflight.Group around cache updates, which was likely the biggest
net structural change.
9. The awkward two layers or logic produced as a byproduct of marrying
the agent cache management code with the leaf cert type code was
slowly coalesced and flattened to remove confusion.
10. The .*Leaf tests from the agent package were copied and made to work
directly against a leafcert.Manager to increase direct coverage.
I have done a best effort attempt to port the previous leaf-cert cache
type's tests over in spirit, as well as to take the e2e-ish tests in the
agent package with Leaf in the test name and copy those into the
agent/leafcert package to get more direct coverage, rather than coverage
tangled up in the agent logic.
There is no net-new test coverage, just coverage that was pushed around
from elsewhere.
* backport ent changes to oss
* Update .changelog/_5669.txt
Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>
---------
Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>