consul

Commit Graph

Author	SHA1	Message	Date
Ranjandas	1b1f33f224	Fixes Secondary ConnectCA update (#17846 ) This fixes a bug that was identified which resulted in subsequent ConnectCA configuration update not to persist in the cluster.	2023-06-29 14:24:24 +00:00
Samantha	f019457815	tlsutil: Fix check TLS configuration (#17481 ) * tlsutil: Fix check TLS configuration * Rewording docs. * Update website/content/docs/services/configuration/checks-configuration-reference.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> * Fix typos and add changelog entry. --------- Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2023-06-28 09:24:51 -07:00
John Maguire	67a239a821	Ensure RSA keys are at least 2048 bits in length (#17911 ) * Ensure RSA keys are at least 2048 bits in length * Add changelog * update key length check for FIPS compliance * Fix no new variables error and failing to return when error exists from validating * clean up code for better readability * actually return value	2023-06-28 15:34:09 +00:00
Joshua Timmons	55056be093	Add emit_tags_as_labels to envoy bootstrap config when using Consul Telemetry Collector (#17888 )	2023-06-27 12:34:38 -04:00
Alex Simenduev	33a2d90852	Fix a bug that wrongly trims domains when there is an overlap with DC name (#17160 ) * Fix a bug that wrongly trims domains when there is an overlap with DC name Before this change, when the DC name and domain/alt-domain overlap, the domain name is incorrectly trimmed from the query. Example: Given: datacenter = dc-test, alt-domain = test.consul. Querying for "test-node.node.dc-test.consul" will fail, because the code was trimming "test.consul" instead of just ".consul". This change fixes the issue by adding a dot (.) before trimming * trimDomain: ensure domain trimmed without modifying original domains * update changelog --------- Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2023-06-26 10:57:11 -04:00
cskh	f16c5d87ab	watch: support -filter for consul watch: checks, services, nodes, service (#17780 ) * watch: support -filter for watch checks * Add filter for watch nodes, services, and service - unit test added - Add changelog - update doc	2023-06-23 12:00:46 -04:00
Chris S. Kim	a4653de8da	CA provider doc updates and Vault provider minor update (#17831 ) Update CA provider docs Clarify that providers can differ between primary and secondary datacenters Provide a comparison chart for consul vs vault CA providers Loosen Vault CA provider validation for RootPKIPath Update Vault CA provider documentation	2023-06-21 19:34:42 +00:00
George Bolo	82441a27fa	fixes #17732 - AccessorID in request body should be optional when updating ACL token (#17739 ) * AccessorID in request body should be optional when updating ACL token * add a test case * fix test case * add changelog entry for PR #17739	2023-06-21 13:31:40 -05:00
Ronald	5f95f5f6d8	Stop referenced jwt providers from being deleted (#17755 ) * Stop referenced jwt providers from being deleted	2023-06-16 10:31:53 -04:00
Michael Zalimeni	f9aa7aebb3	Property Override validation improvements (#17759 ) * Reject inbound Prop Override patch with Services Services filtering is only supported for outbound TrafficDirection patches. * Improve Prop Override unexpected type validation - Guard against additional invalid parent and target types - Add specific error handling for Any fields (unsupported)	2023-06-15 13:51:47 -04:00
Derek Menteer	04edace1de	Fix issue with streaming service health watches. (#17775 ) Fix issue with streaming service health watches. This commit fixes an issue where the health streams were unaware of service export changes. Whenever an exported-services config entry is modified, it is effectively an ACL change. The bug would be triggered by the following situation: - no services are exported - an upstream watch to service X is spawned - the streaming backend filters out data for service X (due to lack of exports) - service X is finally exported In the situation above, the streaming backend does not trigger a refresh of its data. This means that any events that were supposed to have been received prior to the export are NOT backfilled, and the watches never see service X spawning. We currently have decided to not trigger a stream refresh in this situation due to the potential for a thundering herd effect (touching exports would cause a re-fetch of all watches for that partition, potentially). Therefore, a local blocking-query approach was added by this commit for agentless. It's also worth noting that the streaming subscription is currently bypassed most of the time with agentful, because proxycfg has a `req.Source.Node != ""` which prevents the `streamingEnabled` check from passing. This means that while agents should technically have this same issue, they don't experience it with mesh health watches. Note that this is a temporary fix that solves the issue for proxycfg, but not service-discovery use cases.	2023-06-15 12:46:58 -05:00
Derek Menteer	8c74a1d33e	Add transparent proxy enhancements changelog (#17757 )	2023-06-15 11:48:39 -05:00
Ashesh Vidyut	fa40654885	[NET-3865] [Supportability] Additional Information in the output of 'consul operator raft list-peers' (#17582 ) * init * fix tests * added -detailed in docs * added change log * fix doc * checking for entry in map * fix tests * removed detailed flag * removed detailed flag * revert unwanted changes * removed unwanted changes * updated change log * pr review comment changes * pr comment changes single API instead of two * fix change log * fix tests * fix tests * fix test operator raft endpoint test * Update .changelog/17582.txt Co-authored-by: Semir Patel <semir.patel@hashicorp.com> * nits * updated docs --------- Co-authored-by: Semir Patel <semir.patel@hashicorp.com>	2023-06-14 15:12:50 +00:00
David Yu	212e0902fb	Bump Alpine to 3.18 (#17719 ) * Update Dockerfile * Create 17719.txt	2023-06-14 01:02:05 +00:00
Dan Stough	d497623266	docs: missing changelog for _5517 (#17706 )	2023-06-13 15:11:33 -04:00
R.B. Boyer	72f991d8d3	agent: remove agent cache dependency from service mesh leaf certificate management (#17075 ) * agent: remove agent cache dependency from service mesh leaf certificate management This extracts the leaf cert management from within the agent cache. This code was produced by the following process: 1. All tests in agent/cache, agent/cache-types, agent/auto-config, agent/consul/servercert were run at each stage. - The tests in agent matching .Leaf were run at each stage. - The tests in agent/leafcert were run at each stage after they existed. 2. The former leaf cert Fetch implementation was extracted into a new package behind a "fake RPC" endpoint to make it look almost like all other cache type internals. 3. The old cache type was shimmed to use the fake RPC endpoint and generally cleaned up. 4. I selectively duplicated all of Get/Notify/NotifyCallback/Prepopulate from the agent/cache.Cache implementation over into the new package. This was renamed as leafcert.Manager. - Code that was irrelevant to the leaf cert type was deleted (inlining blocking=true, refresh=false) 5. Everything that used the leaf cert cache type (including proxycfg stuff) was shifted to use the leafcert.Manager instead. 6. agent/cache-types tests were moved and gently replumbed to execute as-is against a leafcert.Manager. 7. Inspired by some of the locking changes from derek's branch I split the fat lock into N+1 locks. 8. The waiter chan struct{} was eventually replaced with a singleflight.Group around cache updates, which was likely the biggest net structural change. 9. The awkward two layers of logic produced as a byproduct of marrying the agent cache management code with the leaf cert type code were slowly coalesced and flattened to remove confusion. 10. The .Leaf tests from the agent package were copied and made to work directly against a leafcert.Manager to increase direct coverage. 
I have done a best effort attempt to port the previous leaf-cert cache type's tests over in spirit, as well as to take the e2e-ish tests in the agent package with Leaf in the test name and copy those into the agent/leafcert package to get more direct coverage, rather than coverage tangled up in the agent logic. There is no net-new test coverage, just coverage that was pushed around from elsewhere.	2023-06-13 10:54:45 -05:00
Dan Stough	bba5cd8455	fix: stop peering delete routine on leader loss (#17483 )	2023-06-13 10:20:56 -04:00
Ashesh Vidyut	d54d5fb85c	[NET-4107][Supportability] Log Level set to TRACE and duration set to 5m for consul-debug (#17596 ) * changed duration to 5 mins and log level to trace * documentation update * change log	2023-06-13 11:07:46 +05:30
Joshua Timmons	28d81ec79f	Fix two WAL metrics in docs/agent/telemetry.mdx (#17593 )	2023-06-12 18:50:59 -04:00
Andrew Stucki	3cb70566a9	[API Gateway] Fix rate limiting for API gateways (#17631 ) * [API Gateway] Fix rate limiting for API gateways * Add changelog * Fix failing unit tests * Fix operator usage tests for api package	2023-06-09 08:22:32 -04:00
Michael Zalimeni	30e0c234ab	Update list of Envoy versions (#17546 )	2023-06-09 02:37:49 +00:00
Ronald	7ae457c586	enterprise changelog update for audit (#17625 )	2023-06-08 19:50:51 -04:00
Ronald	17f4689379	backport ent changes to oss (#17614 ) * backport ent changes to oss * Update .changelog/_5669.txt Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> --------- Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>	2023-06-08 16:34:31 +00:00
Andrew Stucki	9a4f503b2b	[API Gateway] Fix trust domain for external peered services in synthesis code (#17609 ) * [API Gateway] Fix trust domain for external peered services in synthesis code * Add changelog	2023-06-08 12:18:17 -04:00
Ronald	8118aae5c1	Add writeAuditRPCEvent to agent_oss (#17607 ) * Add writeAuditRPCEvent to agent_oss * fix the other diffs * backport change log	2023-06-07 22:35:48 +00:00
Joshua Timmons	7a2ee145bf	Fix metric names in Consul agent telemetry docs (#17577 )	2023-06-06 14:42:30 -04:00
Andrew Stucki	f9d9d4db60	Fix subscribing/fetching objects not in the default partition (#17581 ) * Fix subscribing/fetching objects not in the default namespace * add changelog	2023-06-06 09:09:33 -04:00
Andrew Stucki	4ddb88ec7e	Fix up case where subscription is terminated due to ACLs changing or a snapshot restore occurring (#17566 ) * Fix up case where subscription is terminated due to ACLs changing or a snapshot restore occurring * Add changelog entry * Switch to use errors.Is	2023-06-05 13:10:17 -04:00
Dave Rawks	a55d368a0e	Resolves issue-16844 - systemd notify by default (#16845 ) * updates `consul.service` systemd service unit to use `Type=notify` to resolve issue #16844 * add changelog update to match	2023-06-02 10:04:48 -07:00
Poonam Jadhav	d9e18b4bf0	changelog: add changelog for reporting (#17535 )	2023-06-02 08:59:48 -04:00
Dan Stough	a043981cc6	Revert "fix(connect envoy): set initial_fetch_timeout to wait for initial xDS… (#17317 )" (#17540 ) This reverts commit `be7d2a4d84`.	2023-06-01 13:10:41 -04:00
Andrew Stucki	ca12ce926b	[API Gateway] Fix use of virtual resolvers in HTTPRoutes (#17055 ) * [API Gateway] Fix use of virtual resolvers in routes * Add changelog entry	2023-05-31 16:58:40 -04:00
Nathan Coleman	b438a07326	Export peering cli (#15654 ) * Sujata's peering-cli branch * Added error message for connecting to cluster * We can export service to peer * export handling multiple peers * export handles multiple peers * export now can handle multiple services * Export after 1st cleanup * Successful export * Added the namespace option * Add .changelog entry * go mod tidy * Stub unit tests for peering export command * added export in peering.go * Adding export_test * Moved the code to services from peers and cleaned the serviceNamespace * Added support for exporting to partitions * Fixed partition bug * Added unit tests for export command * Add multi-tenancy flags * gofmt * Add some helpful comments * Exclude namespace + partition flags when running OSS * cleaned up partition stuff * Validate required flags differently for OSS vs. ENT * Update success output to include only the requested consumers * cleaned up * fixed broken test * gofmt * Include all flags in OSS build * Remove example previously added to peering command * Move stray import into correct block * Update changelog entry to include support for exporting to a partition * Add required-ness label to consumer-peers flag description * Update command/services/export/export.go Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * Add docs placeholder for new services export command * Moved piece of code to OSS * Break config entry init + update into separate functions * fixed * Vary existing service export comparison for OSS vs. 
ENT * Move OSS-specific test to export_oss_test.go * Set config entry name based on partition being exported from * Set namespace on added services * Adding namespace * Remove export documentation We will include documentation in a followup PR * Consolidate code from export_oss into export.go * Consolidated export_oss_test.go and export_test.go * Add example of partition export to command synopsis * Allow empty peers flag if partitions flag provided * Add test coverage for -consumer-partitions flag * Update command/services/export/export.go Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Update command/services/export/export.go Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Update changelog entry * Use "cluster peers" to clear up any possible confusion * Update test assertions --------- Co-authored-by: 20sr20 <sujata@hashicorp.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>	2023-05-31 14:27:35 -04:00
Dhia Ayachi	da94cbdb25	add changelog (#17528 )	2023-05-31 13:29:59 -04:00
Jared Kirschner	b9c9d79778	Accept ap, datacenter, and namespace query params (#17525 ) This commit only contains the OSS PR (datacenter query param support). A separate enterprise PR adds support for ap and namespace query params. Resources in Consul can exist within scopes such as datacenters, cluster peers, admin partitions, and namespaces. You can refer to those resources from interfaces such as the CLI, HTTP API, DNS, and configuration files. Some scope levels have consistent naming: cluster peers are always referred to as "peer". Other scope levels use a short-hand in DNS lookups... - "ns" for namespace - "ap" for admin partition - "dc" for datacenter ...But use long-hand in CLI commands: - "namespace" for namespace - "partition" for admin partition - and "datacenter" However, HTTP API query parameters do not follow a consistent pattern, supporting short-hand for some scopes but long-hand for others: - "ns" for namespace - "partition" for admin partition - and "dc" for datacenter. This inconsistency is confusing, especially for users who have been exposed to providing scope names through another interface such as CLI or DNS queries. This commit improves UX by consistently supporting both short-hand and long-hand forms of the namespace, partition, and datacenter scopes in HTTP API query parameters.	2023-05-31 11:50:24 -04:00
Nick Ethier	44f90132e0	hoststats: add package for collecting host statistics including cpu memory and disk usage (#17038 )	2023-05-30 18:43:29 +00:00
Ronald	55e283dda9	[NET-3092] JWT Verify claims handling (#17452 ) * [NET-3092] JWT Verify claims handling	2023-05-30 13:38:33 -04:00
Dan Stough	bc9bb99a56	build(deps): update UBI base image to 9.2 (#17513 )	2023-05-30 12:48:13 -04:00
Chris Thain	65b8ccdc1b	Enable Network filters for Wasm Envoy Extension (#17505 )	2023-05-30 07:17:33 -07:00
Ashvitha	091925bcb7	HCP Telemetry Feature (#17460 ) * Move hcp client to subpackage hcpclient (#16800) * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * [HCP Observability] OTELExporter (#17128) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * [HCP Observability] OTELSink (#17159) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with 
comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * Do not pass in waitgroup and use error channel instead. 
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Fix nits * [HCP Observability] Init OTELSink in Telemetry (#17162) * Move hcp client to subpackage hcpclient (#16800) * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and 
clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * [HCP Observability] OTELExporter (#17128) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add 
NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Do not pass in waitgroup and use error channel instead. * Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Initialize OTELSink with sync.Map for all the instrument stores. * Added telemetry agent to client and init sink in deps * Fixed client * Initalize sink in deps * init sink in telemetry library * Init deps before telemetry * Use concrete telemetry.OtelSink type * add /v1/metrics * Avoid returning err for telemetry init * move sink init within the IsCloudEnabled() * Use HCPSinkOpts in deps instead * update golden test for configuration file * Switch to using extra sinks in the telemetry library * keep name MetricsConfig * fix log in verifyCCMRegistration * Set logger in context * pass around MetricSink in deps * Fix imports * Rebased onto otel sink pr * Fix URL in test * [HCP Observability] OTELSink (#17159) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual 
error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * Do not pass in waitgroup and use error channel instead. 
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Fix nits * pass extraSinks as function param instead * Add default interval as package export * remove verifyCCM func * Add clusterID * Fix import and add t.Parallel() for missing tests * Kick Vercel CI * Remove scheme from endpoint path, and fix error logging * return metrics.MetricSink for sink method * Update SDK * [HCP Observability] Metrics filtering and Labels in Go Metrics sink (#17184) * Move hcp client to subpackage hcpclient (#16800) * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix 
lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * [HCP Observability] OTELExporter (#17128) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one 
abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Do not pass in waitgroup and use error channel instead. * Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Initialize OTELSink with sync.Map for all the instrument stores. 
* Added telemetry agent to client and init sink in deps * Fixed client * Initalize sink in deps * init sink in telemetry library * Init deps before telemetry * Use concrete telemetry.OtelSink type * add /v1/metrics * Avoid returning err for telemetry init * move sink init within the IsCloudEnabled() * Use HCPSinkOpts in deps instead * update golden test for configuration file * Switch to using extra sinks in the telemetry library * keep name MetricsConfig * fix log in verifyCCMRegistration * Set logger in context * pass around MetricSink in deps * Fix imports * Rebased onto otel sink pr * Fix URL in test * [HCP Observability] OTELSink (#17159) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. 
* Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * Do not pass in waitgroup and use error channel instead. 
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Fix nits * pass extraSinks as function param instead * Add default interval as package export * remove verifyCCM func * Add clusterID * Fix import and add t.Parallel() for missing tests * Kick Vercel CI * Remove scheme from endpoint path, and fix error logging * return metrics.MetricSink for sink method * Update SDK * Added telemetry agent to client and init sink in deps * Add node_id and __replica__ default labels * add function for default labels and set x-hcp-resource-id * Fix labels tests * Commit suggestion for getDefaultLabels Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com> * Fixed server.id, and t.Parallel() * Make defaultLabels a method on the TelemetryConfig object * Rename FilterList to lowercase filterList * Cleanup filter implemetation by combining regex into a single one, and making the type lowercase * Fix append * use regex directly for filters * Fix x-resource-id test to use mocked value * Fix log.Error formats * Forgot the len(opts.Label) optimization) * Use cfg.NodeID instead --------- Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com> * remove replic tag (#17484) * [HCP Observability] Add custom metrics for OTEL sink, improve logging, upgrade modules and cleanup metrics client (#17455) * Add custom metrics for Exporter and transform operations * Improve deps logging Run go mod tidy * Upgrade SDK and OTEL * Remove the partial success implemetation and check for HTTP status code in metrics client * Add x-channel * cleanup logs in deps.go based on PR feedback * Change to debug log and lowercase * address test operation feedback * use GetHumanVersion on version * Fix error wrapping * Fix metric names * [HCP Observability] Turn off retries for now until dynamically configurable (#17496) * Remove retries for now until dynamic configuration is possible * Clarify comment * Update changelog * improve changelog --------- Co-authored-by: Joshua Timmons 
<joshua.timmons1@gmail.com>	2023-05-29 16:11:08 -04:00
Michael Zalimeni	5a46a8c604	Add `builtin/property-override` Envoy Extension (#17487 ) `property-override` is an extension that allows for arbitrarily patching Envoy resources based on resource matching filters. Patch operations resemble a subset of the JSON Patch spec with minor differences to facilitate patching pre-defined (protobuf) schemas. See Envoy Extension product documentation for more details. Co-authored-by: Eric Haberkorn <eric.haberkorn@hashicorp.com> Co-authored-by: Kyle Havlovitz <kyle@hashicorp.com>	2023-05-26 19:52:09 +00:00
Chris Thain	516eb4febc	Add `builtin/ext-authz` Envoy Extension (#17495 )	2023-05-26 12:22:54 -07:00
Lincoln Stoll	3605fde865	perf: Remove expensive reflection from raft/mesh hot path (#16552 ) * perf: Remove expensive reflection from raft/mesh hot path Replaces a reflection-based copy of a struct in the mesh topology with a deep-copy generated implementation. This is in the hot-path of raft FSM updates, and the reflection overhead was a substantial part of mesh registration times (~90%). This could manifest as raft thread saturation, and resulting instability. Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com> * add changelog --------- Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com> Co-authored-by: John Murret <john.murret@hashicorp.com>	2023-05-26 11:42:05 -06:00
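The swap described in the commit above, from a reflection-based struct copy to a generated deep-copy method, can be illustrated with a minimal sketch. The `Topology` and `Upstream` types here are hypothetical stand-ins, not Consul's real mesh topology structs, and the method is hand-written in the style a deep-copy generator would typically emit.

```go
package main

import "fmt"

// Upstream and Topology are hypothetical types used only for illustration.
type Upstream struct {
	Name string
	Tags []string
}

type Topology struct {
	Upstreams map[string]*Upstream
}

// DeepCopy copies every field explicitly, so no reflect calls are needed
// at runtime. Maps, pointers and slices are duplicated rather than shared.
func (t *Topology) DeepCopy() *Topology {
	if t == nil {
		return nil
	}
	out := &Topology{}
	if t.Upstreams != nil {
		out.Upstreams = make(map[string]*Upstream, len(t.Upstreams))
		for k, v := range t.Upstreams {
			if v == nil {
				out.Upstreams[k] = nil
				continue
			}
			cp := *v // copies the scalar fields
			if v.Tags != nil {
				cp.Tags = append([]string(nil), v.Tags...) // fresh backing array
			}
			out.Upstreams[k] = &cp
		}
	}
	return out
}

func main() {
	orig := &Topology{Upstreams: map[string]*Upstream{
		"db": {Name: "db", Tags: []string{"v1"}},
	}}
	cp := orig.DeepCopy()
	cp.Upstreams["db"].Tags[0] = "v2"
	// The original is unaffected by changes to the copy.
	fmt.Println(orig.Upstreams["db"].Tags[0], cp.Upstreams["db"].Tags[0])
}
```

Because the copy logic is ordinary compiled code, its cost is proportional to the data copied rather than to type inspection, which is the property the commit relies on for the raft FSM hot path.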
Derek Menteer	a90c9ce2b0	Fix ACL check on health endpoint (#17424 ) Fix ACL check on health endpoint Prior to this change, the service health API would not explicitly return an error whenever a token with invalid permissions was given, and it would instead return empty results. With this change, a "Permission denied" error is returned whenever data is queried. This is done to better support the agent cache, which performs a fetch backoff sleep whenever ACL errors are encountered. Affected endpoints are: `/v1/health/connect/` and `/v1/health/ingress/`.	2023-05-24 16:35:55 -05:00
Derek Menteer	e2f15cfe56	Fix namespaced peer service updates / deletes. (#17456 ) * Fix namespaced peer service updates / deletes. This change fixes a function so that namespaced services are correctly queried when handling updates / deletes. Prior to this change, some peered services would not correctly be un-exported. * Add changelog.	2023-05-24 16:32:45 -05:00
Dan Stough	d935c7b466	[OSS] gRPC Blocking Queries (#17426 ) * feat: initial grpc blocking queries * changelog and docs update	2023-05-23 17:29:10 -04:00
Paul Glass	7f4fd2735a	Only synthesize anonymous token in primary DC (#17231 ) * Only synthesize anonymous token in primary DC * Add integration test for wan fed issue	2023-05-23 09:38:04 -05:00
Michael Zalimeni	b8d2640429	Disable remote proxy patching except AWS Lambda (#17415 ) To avoid unintended tampering with remote downstreams via service config, refactor BasicEnvoyExtender and RuntimeConfig to disallow typical Envoy extensions from being applied to non-local proxies. Continue to allow this behavior for AWS Lambda and the read-only Validate builtin extensions. Addresses CVE-2023-2816.	2023-05-23 11:55:06 +00:00
John Landa	8f6b9fe177	Add ACLs Enabled field to consul agent startup status message (#17086 ) * Add ACLs Enabled field to consul agent startup status message * Add changelog * Update startup messages to include default ACL policy configuration * Correct import groupings	2023-05-16 13:47:02 -05:00
Connor	0789661ce5	Rename hcp-metrics-collector to consul-telemetry-collector (#17327 ) * Rename hcp-metrics-collector to consul-telemetry-collector * Fix docs * Fix doc comment --------- Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>	2023-05-16 14:36:05 -04:00
Dan Stough	be7d2a4d84	fix(connect envoy): set initial_fetch_timeout to wait for initial xDS… (#17317 ) * fix(connect envoy): set initial_fetch_timeout to wait for initial xDS indefinitely --------- Co-authored-by: Kiril Angov <kiril.angov@gmail.com>	2023-05-15 10:45:16 -04:00
Dan Bond	95f462d5f1	agent: prevent very old servers re-joining a cluster with stale data (#17171 ) * agent: configure server lastseen timestamp Signed-off-by: Dan Bond <danbond@protonmail.com> * use correct config Signed-off-by: Dan Bond <danbond@protonmail.com> * add comments Signed-off-by: Dan Bond <danbond@protonmail.com> * use default age in test golden data Signed-off-by: Dan Bond <danbond@protonmail.com> * add changelog Signed-off-by: Dan Bond <danbond@protonmail.com> * fix runtime test Signed-off-by: Dan Bond <danbond@protonmail.com> * agent: add server_metadata Signed-off-by: Dan Bond <danbond@protonmail.com> * update comments Signed-off-by: Dan Bond <danbond@protonmail.com> * correctly check if metadata file does not exist Signed-off-by: Dan Bond <danbond@protonmail.com> * follow instructions for adding new config Signed-off-by: Dan Bond <danbond@protonmail.com> * add comments Signed-off-by: Dan Bond <danbond@protonmail.com> * update comments Signed-off-by: Dan Bond <danbond@protonmail.com> * Update agent/agent.go Co-authored-by: Dan Upton <daniel@floppy.co> * agent/config: add validation for duration with min Signed-off-by: Dan Bond <danbond@protonmail.com> * docs: add new server_rejoin_age_max config definition Signed-off-by: Dan Bond <danbond@protonmail.com> * agent: add unit test for checking server last seen Signed-off-by: Dan Bond <danbond@protonmail.com> * agent: log continually for 60s before erroring Signed-off-by: Dan Bond <danbond@protonmail.com> * pr comments Signed-off-by: Dan Bond <danbond@protonmail.com> * remove unneeded todo * agent: fix error message Signed-off-by: Dan Bond <danbond@protonmail.com> --------- Signed-off-by: Dan Bond <danbond@protonmail.com> Co-authored-by: Dan Upton <daniel@floppy.co>	2023-05-15 04:05:47 -07:00
R.B. Boyer	cd80ea18ff	grpc: ensure grpc resolver correctly uses lan/wan addresses on servers (#17270 ) The grpc resolver implementation is fed from changes to the router.Router. Within the router there is a map of various areas storing the addressing information for servers in those areas. All map entries are of the WAN variety except a single special entry for the LAN. Addressing information in the LAN "area" are local addresses intended for use when making a client-to-server or server-to-server request. The client agent correctly updates this LAN area when receiving lan serf events, so by extension the grpc resolver works fine in that scenario. The server agent only initially populates a single entry in the LAN area (for itself) on startup, and then never mutates that area map again. For normal RPCs a different structure is used for LAN routing. Additionally when selecting a server to contact in the local datacenter it will randomly select addresses from either the LAN or WAN addressed entries in the map. Unfortunately this means that the grpc resolver stack as it exists on server agents is either broken or only accidentally functions by having servers dial each other over the WAN-accessible address. If the operator disables the serf wan port completely likely this incidental functioning would break. This PR enforces that local requests for servers (both for stale reads or leader forwarded requests) exclusively use the LAN "area" information and also fixes it so that servers keep that area up to date in the router. A test for the grpc resolver logic was added, as well as a higher level full-stack test to ensure the externally perceived bug does not return.	2023-05-11 11:08:57 -05:00
cskh	48f7d99305	snapshot: some improvements to the snapshot process (#17236 ) * snapshot: some improvements to the snapshot process Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2023-05-09 15:28:52 -04:00
Derek Menteer	4f6da20fe5	Fix multiple issues related to proxycfg health queries. (#17241 ) Fix multiple issues related to proxycfg health queries. 1. The datacenter was not being provided to a proxycfg query, which resulted in bypassing agentless query optimizations and using the normal API instead. 2. The health rpc endpoint would return a zero index when insufficient ACLs were detected. This would result in the agent cache performing an infinite loop of queries in rapid succession without backoff.	2023-05-09 12:37:58 -05:00
Derek Menteer	50ef6a697e	Fix issue with peer stream node cleanup. (#17235 ) Fix issue with peer stream node cleanup. This commit encompasses a few problems that are closely related due to their proximity in the code. 1. The peerstream utilizes node IDs in several locations to determine which nodes / services / checks should be cleaned up or created. While VM deployments with agents will likely always have a node ID, agentless uses synthetic nodes and does not populate the field. This means that for consul-k8s deployments, all services were likely bundled together into the same synthetic node in some code paths (but not all), resulting in strange behavior. The Node.Node field should be used instead as a unique identifier, as it should always be populated. 2. The peerstream cleanup process for unused nodes uses an incorrect query for node deregistration. This query is NOT namespace aware and results in the node (and corresponding services) being deregistered prematurely whenever it has zero default-namespace services and 1+ non-default-namespace services registered on it. This issue is tricky to find due to the incorrect logic mentioned in #1, combined with the fact that the affected services must be co-located on the same node as the currently deregistering service for this to be encountered. 3. The stream tracker did not understand differences between services in different namespaces and could therefore report incorrect numbers. It was updated to utilize the full service name to avoid conflicts and return proper results.	2023-05-08 13:13:25 -05:00
John Murret	6fa104409e	security: update go version to 1.20.4 (#17240 ) * update go version to 1.20.3 * add changelog * rename changelog file to remove underscore * update to use 1.20.4 * update change log entry to reflect 1.20.4	2023-05-08 11:57:11 -06:00
John Eikenberry	bd76fdeaeb	enable auto-tidy expired issuers in vault (as CA) When using vault as a CA and generating the local signing cert, try to enable the PKI endpoint's auto-tidy feature with it set to tidy expired issuers.	2023-05-03 20:30:37 +00:00
Eric Haberkorn	2c0da88ce7	fix panic in `injectSANMatcher` when `tlsContext` is `nil` (#17185 )	2023-04-28 16:27:57 -04:00
Paul Glass	e4a341c88a	Permissive mTLS: Config entry filtering and CLI warnings (#17183 ) This adds filtering for service-defaults: consul config list -filter 'MutualTLSMode == "permissive"'. It adds CLI warnings when the CLI writes a config entry and sees that either service-defaults or proxy-defaults contains MutualTLSMode=permissive, or sees that the mesh config entry contains AllowEnablingPermissiveMutualTLSMode=true.	2023-04-28 12:51:36 -05:00
R.B. Boyer	6b4986907d	peering: ensure that merged central configs of peered upstreams for partitioned downstreams work (#17179 ) Partitioned downstreams with peered upstreams could not properly merge central config info (i.e. proxy-defaults and service-defaults things like mesh gateway modes) if the upstream had an empty DestinationPartition field in Enterprise. Due to data flow, if this setup is done using Consul client agents the field is never empty and thus does not experience the bug. When a service is registered directly to the catalog, as is the case for consul-dataplane use, this field may be empty and the internal machinery of the merging function doesn't handle this well. This PR ensures the internal machinery of that function is referentially self-consistent.	2023-04-28 12:36:08 -05:00
John Landa	eded58b62a	Remove artificial ACLTokenMaxTTL limit for configuring acl token expiry (#17066 ) * Remove artificial ACLTokenMaxTTL limit for configuring acl token expiry * Add changelog * Remove test on default MaxTokenTTL * Change to imperative tense for changelog entry	2023-04-28 10:57:30 -05:00
Freddy	e02ef16f02	Update HCP bootstrapping to support existing clusters (#16916 ) * Persist HCP management token from server config We want to move away from injecting an initial management token into Consul clusters linked to HCP. The reasoning is that by using a separate class of token we can have more flexibility in terms of allowing HCP's token to co-exist with the user's management token. Down the line we can also more easily adjust the permissions attached to HCP's token to limit its scope. With these changes, the cloud management token is like the initial management token in that it has the same global management policy and if it is created it effectively bootstraps the ACL system. * Update SDK and mock HCP server The HCP management token will now be sent in a special field rather than as Consul's "initial management" token configuration. This commit also updates the mock HCP server to more accurately reflect the behavior of the CCM backend. * Refactor HCP bootstrapping logic and add tests We want to allow users to link Consul clusters that already exist to HCP. Existing clusters need care when bootstrapped by HCP, since we do not want to do things like change ACL/TLS settings for a running cluster. Additional changes: * Deconstruct MaybeBootstrap so that it can be tested. The HCP Go SDK requires HTTPS to fetch a token from the Auth URL, even if the backend server is mocked. By pulling the hcp.Client creation out we can modify its TLS configuration in tests while keeping the secure behavior in production code. * Add light validation for data received/loaded. * Sanitize initial_management token from received config, since HCP will only ever use the CloudConfig.MangementToken. * Add changelog entry	2023-04-27 22:27:39 +02:00
John Maguire	391ed069c4	APIGW: Update how status conditions for certificates are handled (#17115 ) * Move status condition for invalid certificate to reference the listener that is using the certificate * Fix where we set the condition status for listeners and certificate refs, added tests * Add changelog	2023-04-27 15:54:44 +00:00
Semir Patel	5eaeb7b8e5	Support Envoy's MaxEjectionPercent and BaseEjectionTime config entries for passive health checks (#15979 ) * Add MaxEjectionPercent to config entry * Add BaseEjectionTime to config entry * Add MaxEjectionPercent and BaseEjectionTime to protobufs * Add MaxEjectionPercent and BaseEjectionTime to api * Fix integration test breakage * Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream configs * Website docs for MaxEjectionPercent and BaseEjectionTime * Add `make docs` to browse docs at http://localhost:3000 * Changelog entry * so that is the difference between consul-docker and dev-docker * blah * update proto funcs * update proto --------- Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>	2023-04-26 15:59:48 -07:00
Anita Akaeze	d4cacc7232	Merge pull request #5200 from hashicorp/NET-3758 (#17102 ) * Merge pull request #5200 from hashicorp/NET-3758 NET-3758: connect: update supported envoy versions to 1.26.0 * lint	2023-04-24 18:23:24 +00:00
Paul Banks	a011d8c944	Bump raft to 1.5.0 (#17081 ) * Bump raft to 1.5.0 * Add CHANGELOG entry * Add CHANGELOG entry with right extension (thanks VSCode) * Add CHANGELOG entry with right extension (thanks VSCode) * Go mod tidy	2023-04-21 20:13:55 +01:00
Paul Glass	77ecff3209	Permissive mTLS (#17035 ) This implements permissive mTLS, which allows toggling services into "permissive" mTLS mode. Permissive mTLS mode allows incoming "non Consul-mTLS" traffic to be forwarded unmodified to the application. * Update service-defaults and proxy-defaults config entries with a MutualTLSMode field * Update the mesh config entry with an AllowEnablingPermissiveMutualTLS field and implement the necessary validation. AllowEnablingPermissiveMutualTLS must be true to allow changing to MutualTLSMode=permissive, but this does not require that all proxy-defaults and service-defaults are currently in strict mode. * Update xDS listener config to add a "permissive filter chain" when MutualTLSMode=permissive for a particular service. The permissive filter chain matches incoming traffic by the destination port. If the destination port matches the service port from the catalog, then no mTLS is required and the traffic is forwarded unmodified to the application.	2023-04-19 14:45:00 -05:00
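The port-matching rule in the permissive mTLS commit above reduces to a small decision function. The sketch below is only an illustration of that rule: `requiresMTLS` and the mode constants are hypothetical names, and the real behavior is expressed as Envoy filter chain matches in xDS listener config rather than as Go logic.

```go
package main

import "fmt"

// Hypothetical constants mirroring the MutualTLSMode values named in the commit.
const (
	modeStrict     = "strict"
	modePermissive = "permissive"
)

// requiresMTLS reports whether traffic arriving on destPort must present
// mesh mTLS. In permissive mode, traffic addressed directly to the service
// port from the catalog is passed through without mTLS.
func requiresMTLS(mode string, destPort, servicePort int) bool {
	if mode == modePermissive && destPort == servicePort {
		return false
	}
	return true
}

func main() {
	// The port numbers are illustrative.
	fmt.Println(requiresMTLS(modePermissive, 8080, 8080))  // direct to service port
	fmt.Println(requiresMTLS(modePermissive, 20000, 8080)) // any other port
	fmt.Println(requiresMTLS(modeStrict, 8080, 8080))      // strict mode
}
```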
R.B. Boyer	d07aac8d7e	Revert "cache: refactor agent cache fetching to prevent unnecessary f… (#16818 ) (#17046 ) Revert "cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956)" Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>	2023-04-19 13:17:21 -05:00
Kyle Havlovitz	bdc3dd14c2	Avoid decoding nil pointer in map walker (#17048 )	2023-04-19 10:23:38 -07:00
Kevin Wang	268f93e6f4	Bump the golang.org/x/net to 0.7.0 to address CVE-2022-41723 (#16754 ) * Bump the golang.org/x/net to 0.7.0 to address CVE-2022-41723 https://nvd.nist.gov/vuln/detail/CVE-2022-41723 * Add changelog entry --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>	2023-04-18 17:31:08 +00:00
Andrei Komarov	eb9f671eaf	api: enable query options on agent force-leave endpoint (#15987 )	2023-04-18 11:31:48 -05:00
Nathan Coleman	5410139575	Update list of Envoy versions (#16889 ) * Update list of Envoy versions * Update docs + CI + tests * Add changelog entry * Add newly-released Envoy versions 1.23.8 and 1.24.6 * Add newly-released Envoy version 1.22.11	2023-04-12 17:43:15 -04:00
Dhia Ayachi	b85a149eaf	Memdb Txn Commit race condition fix (#16871 ) * Add a test to reproduce the race condition * Fix race condition by publishing the event after the commit and adding a lock to prevent out of order events. * split publish to generate the list of events before committing the transaction. * add changelog * remove extra func * Apply suggestions from code review Co-authored-by: Dan Upton <daniel@floppy.co> * add comment to explain test --------- Co-authored-by: Dan Upton <daniel@floppy.co>	2023-04-12 13:18:01 -04:00
Derek Menteer	1bcaeabfc3	Remove deprecated service-defaults upstream behavior. (#16957 ) Prior to this change, peer services would be targeted by service-default overrides as long as the new `peer` field was not found in the config entry. This commit removes that deprecated backwards-compatibility behavior. Now it is necessary to specify the `peer` field in order for upstream overrides to apply to a peer upstream.	2023-04-11 10:20:33 -05:00
Chris Thain	175bb1a303	Wasm Envoy HTTP extension (#16877 )	2023-04-06 14:12:07 -07:00
Freddy	f6de5ff635	Allow dialer to re-establish terminated peering (#16776 ) Currently, if an acceptor peer deletes a peering the dialer's peering will eventually get to a "terminated" state. If the two clusters need to be re-peered the acceptor will re-generate the token but the dialer will encounter this error on the call to establish: "failed to get addresses to dial peer: failed to refresh peer server addresses, will continue to use initial addresses: there is no active peering for "<<<ID>>>"" This is because in `exchangeSecret().GetDialAddresses()` we will get an error if fetching addresses for an inactive peering. The peering shows up as inactive at this point because of the existing terminated state. Rather than checking whether a peering is active we can instead check whether it was deleted. This way users do not need to delete terminated peerings in the dialing cluster before re-establishing them.	2023-04-03 12:07:45 -06:00
Eric Haberkorn	0d1d2fc4c9	add order by locality failover to Consul enterprise (#16791 )	2023-03-30 10:08:38 -04:00
John Maguire	c833464daf	Update normalization of route refs (#16789 ) * Use merge of enterprise meta's rather than new custom method * Add merge logic for tcp routes * Add changelog * Normalize certificate refs on gateways * Fix infinite call loop * Explicitly call enterprise meta	2023-03-28 11:23:49 -04:00
Michael Wilkerson	e5d58c59c9	changes to support new PQ enterprise fields (#16793 )	2023-03-27 15:40:49 -07:00
John Maguire	351bdc3c0d	Fix struct tags for TCPService enterprise meta (#16781 ) * Fix struct tags for TCPService enterprise meta * Add changelog	2023-03-27 16:17:04 +00:00
Derek Menteer	2236975011	Change partition for peers in discovery chain targets (#16769 ) This commit swaps the partition field to the local partition for discovery chains targeting peers. Prior to this change, peer upstreams would always use a value of default regardless of which partition they exist in. This caused several issues in xds / proxycfg because of id mismatches. Some prior fixes were made to deal with one-off id mismatches that this PR also cleans up, since they are no longer needed.	2023-03-24 15:40:19 -05:00
Luke Kysow	4845816c60	Changelog for audit logging fix. (#16700 ) * Changelog for audit logging fix.	2023-03-22 13:06:53 -07:00
Eric Haberkorn	3c5c53aa80	fix bug where pqs that failover to a cluster peer don't un-fail over (#16729 )	2023-03-22 09:24:13 -04:00
Nitya Dhanushkodi	b9bd2c3780	peering: peering partition failover fixes (#16673 ) add local source partition for peered upstreams	2023-03-20 10:00:29 -07:00
John Maguire	1ef9f4dade	Fix route subscription when using namespaces (#16677 ) * Fix route subscription when using namespaces * Update changelog * Fix changelog entry to reference that the bug was enterprise only	2023-03-20 12:42:30 -04:00
Melisa Griffin	606f8fbbab	Adds check to verify that the API Gateway is being created with at least one listener	2023-03-20 12:37:30 -04:00
Dhia Ayachi	b9d8552e25	Snapshot restore tests (#16647 ) * add snapshot restore test * add logstore as test parameter * Use the correct image version * make sure we read the logs from a follower to test the follower snapshot install path. * update to raft-wal v0.3.0 * add changelog. * updating changelog for bug description and removed integration test. * setting up test container builder to only set logStore for 1.15 and higher --------- Co-authored-by: Paul Banks <pbanks@hashicorp.com> Co-authored-by: John Murret <john.murret@hashicorp.com>	2023-03-18 14:43:22 -06:00
Andrew Stucki	501b87fd31	[API Gateway] Fix invalid cluster causing gateway programming delay (#16661 ) * Add test for http routes * Add fix * Fix tests * Add changelog entry * Refactor and fix flaky tests	2023-03-17 13:31:04 -04:00
Valeriia Ruban	b473151994	fix: add AccessorID property to PUT token request (#16660 )	2023-03-16 18:57:59 -07:00
Valeriia Ruban	ad25ba3068	feat: update typography to consume hds styles (#16577 )	2023-03-14 19:49:14 -07:00
Derek Menteer	8f75d99299	Fix issue with trust bundle read ACL check. (#16630 ) This commit fixes an issue where trust bundles could not be read by services in a non-default namespace, unless they had excessive ACL permissions given to them. Prior to this change, `service:write` was required in the default namespace in order to read the trust bundle. Now, `service:write` to a service in any namespace is sufficient.	2023-03-14 12:24:33 -05:00
Chris S. Kim	d5677e5680	Preserve CARoots when updating Vault CA configuration (#16592 ) If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates.	2023-03-13 17:32:59 -04:00
Ashvitha	f95ffe0355	Allow HCP metrics collection for Envoy proxies Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory where a unix socket will be created with the name `<namespace>_<proxy_id>.sock` to forward Envoy metrics. If set, this will configure: - In bootstrap configuration a local stats_sink and static cluster. These will forward metrics to a loopback listener sent over xDS. - A dynamic listener listening at the socket path that the previously defined static cluster is sending metrics to. - A dynamic cluster that will forward traffic received at this listener to the hcp-metrics-collector service. Reasons for having a static cluster pointing at a dynamic listener: - We want to secure the metrics stream using TLS, but the stats sink can only be defined in bootstrap config. With dynamic listeners/clusters we can use the proxy's leaf certificate issued by the Connect CA, which isn't available at bootstrap time. - We want to intelligently route to the HCP collector. Configuring its address at bootstrap time limits our flexibility routing-wise. More on this below. Reasons for defining the collector as an upstream in `proxycfg`: - The HCP collector will be deployed as a mesh service. - Certificate management is taken care of, as mentioned above. - Service discovery and routing logic is automatically taken care of, meaning that no code changes are required in the xds package. - Custom routing rules can be added for the collector using discovery chain config entries. Initially the collector is expected to be deployed to each admin partition, but in the future could be deployed centrally in the default partition. These config entries could even be managed by HCP itself.	2023-03-10 13:52:54 -07:00
Tyler Wendlandt	e6aeb31a26	UI: Fix htmlsafe errors throughout the app (#16574 ) * Upgrade ember-intl * Add changelog * Add yarn lock	2023-03-09 12:43:35 -07:00
Eric Haberkorn	89de91b263	fix bug that can lead to peering service deletes impacting the state of local services (#16570 )	2023-03-08 11:24:03 -05:00
John Eikenberry	f5641ffccc	support vault auth config for alicloud ca provider Add support for using existing vault auto-auth configurations as the provider configuration when using Vault's CA provider with AliCloud. AliCloud requires 2 extra fields to enable it to use STS (its preferred auth setup). Our vault-plugin-auth-alicloud package contained a method to help generate them as they require you to make an http call to a faked endpoint proxy to get them (url and headers base64 encoded).	2023-03-07 03:02:05 +00:00
Valeriia Ruban	63204b5183	feat: update notification to use hds toast component (#16519 )	2023-03-06 14:10:09 -08:00
Chris S. Kim	8daddff08d	Follow-up fixes to consul connect envoy command (#16530 )	2023-03-06 10:32:06 -05:00
Ronald	bf501a337b	Improve ux around ACL token to help users avoid overwriting node/service identities (#16506 ) * Deprecate merge-node-identities and merge-service-identities flags * added tests for node identities changes * added changelog file and docs	2023-03-06 15:00:39 +00:00
Melisa Griffin	fc232326a0	NET-2904 Fixes API Gateway Route Service Weight Division Error	2023-03-06 08:41:57 -05:00
Andrew Stucki	897e5ef2d3	Add some basic UI improvements for api-gateway services (#16508 ) * Add some basic ui improvements for api-gateway services * Add changelog entry * Use ternary for null check * Update gateway doc links * rename changelog entry for new PR * Fix test	2023-03-03 16:59:04 -05:00
Melisa Griffin	129eca8fdb	NET-2903 Normalize weight for http routes (#16512 ) * NET-2903 Normalize weight for http routes * Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2023-03-03 16:39:59 -05:00
R.B. Boyer	9a485cdb49	proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497 ) Receiving an "acl not found" error from an RPC in the agent cache and the streaming/event components will cause any request loops to cease under the assumption that they will never work again if the token was destroyed. This prevents log spam (#14144, #9738). Unfortunately due to things like: - authz requests going to stale servers that may not have witnessed the token creation yet - authz requests in a secondary datacenter happening before the tokens get replicated to that datacenter - authz requests from a primary TO a secondary datacenter happening before the tokens get replicated to that datacenter The caller will get an "acl not found" before the token exists, rather than just after. The machinery added above in the linked PRs will kick in and prevent the request loop from looping around again once the tokens actually exist. For `consul-dataplane` usages, where xDS is served by the Consul servers rather than the clients ultimately this is not a problem because in that scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS stream needing data for a specific service in the catalog. If the watching goroutines are terminated it ripples down and terminates the xDS stream, which CDP will eventually re-establish and restart everything. For Consul client usages, the `agent/proxycfg` machinery is ahead-of-time launched at service registration time (called "local" in some of the proxycfg machinery) so when the xDS stream comes in the data is already ready to go. If the watching goroutines terminate it should terminate the xDS stream, but there's no mechanism to re-spawn the watching goroutines. If the xDS stream reconnects it will see no `ConfigSnapshot` and will not get one again until the client agent is restarted, or the service is re-registered with something changed in it. 
This PR fixes a few things in the machinery: - there was an inadvertent deadlock in fetching snapshot from the proxycfg machinery by xDS, such that when the watching goroutine terminated the snapshots would never be fetched. This caused some of the xDS machinery to get indefinitely paused and not finish the teardown properly. - Every 30s we now attempt to re-insert all locally registered services into the proxycfg machinery. - When services are re-inserted into the proxycfg machinery we special case "dead" ones such that we unilaterally replace them rather than doing that conditionally.	2023-03-03 14:27:53 -06:00
John Eikenberry	56ffee6d42	add provider ca support for approle auth-method Adds support for the approle auth-method. Only handles using the approle role/secret to auth and it doesn't support the agent's extra management configuration options (wrap and delete after read) as they are not required as part of the auth (ie. they are vault agent things).	2023-03-03 19:29:53 +00:00
Andrew Stucki	cc0765b87d	Fix resolution of service resolvers with subsets for external upstreams (#16499 ) * Fix resolution of service resolvers with subsets for external upstreams * Add tests * Add changelog entry * Update view filter logic	2023-03-03 14:17:11 -05:00
Andrew Stucki	5deffbd95b	Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498 ) * Fix issue where terminating gateway service resolvers weren't properly cleaned up * Add integration test for cleaning up resolvers * Add changelog entry * Use state test and drop integration test	2023-03-03 09:56:57 -05:00
Andrew Stucki	4b661d1e0c	Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495 ) * Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable * Regenerate golden files * Add RequestTimeout field * Add changelog entry	2023-03-03 09:37:12 -05:00
John Eikenberry	e8eec1fa80	add provider ca auth support for kubernetes Adds support for Kubernetes jwt/token file based auth. Only needs to read the file and save the contents as the jwt/token.	2023-03-02 22:05:40 +00:00
John Eikenberry	4211069080	add provider ca support for jwt file base auth Adds support for a jwt token in a file. Simply reads the file and sends the read in jwt along to the vault login. It also supports a legacy mode with the jwt string being passed directly. In which case the path is made optional.	2023-03-02 20:33:06 +00:00
Ronald	4f8594b28f	Improve ux to help users avoid overwriting fields of ACL tokens, roles and policies (#16288 ) * Deprecate merge-policies and add options add-policy-name/add-policy-id to improve CLI token update command * deprecate merge-roles fields * Fix potential flakey tests and update ux to remove 'completely' + typo fixes	2023-03-01 15:00:37 -05:00
cskh	3970115753	fix (cli): return error msg if acl policy not found (#16485 ) * fix: return error msg if acl policy not found * changelog * add test	2023-03-01 19:50:03 +00:00
John Eikenberry	4f2d9a91e5	add provider ca auth-method support for azure Does the required dance with the local HTTP endpoint to get the required data for the jwt based auth setup in Azure. Keeps support for 'legacy' mode where all login data is passed on via the auth methods parameters. Refactored check for hardcoded /login fields.	2023-03-01 00:07:33 +00:00
R.B. Boyer	26820219cd	cli: ensure acl token read -self works (#16445 ) Fixes a regression in #16044 The consul acl token read -self cli command should not require an -accessor-id because typically the persona invoking this would not already know the accessor id of their own token.	2023-02-28 10:58:29 -06:00
Tyler Wendlandt	a0862e64a5	UI: Fix rendering issue in search and lists (#16444 ) * Upgrade ember-cli-string-helpers * add extra lock change	2023-02-27 16:31:47 -07:00
Valeriia Ruban	859abf8a34	fix: ui tests run is fixed (applying class attribute twice to the hbs element caused the issue) (#16428 )	2023-02-24 23:46:45 -08:00
Valeriia Ruban	d9e6748738	feat: update alerts to Hds::Alert component (CC-4035) (#16412 )	2023-02-24 20:07:12 -08:00
Valeriia Ruban	0c66bbf2b4	[UI] CC-4031: change from Action, a and button to hds::Button (#16251 )	2023-02-22 13:05:15 -08:00
Nathan Coleman	c0384c2e30	Add changelog entry for API Gateway (Beta) (#16369 ) * Placeholder commit for changelog entry * Add changelog entry announcing support for API Gateway on VMs * Adjust casing	2023-02-22 13:10:05 -06:00
Derek Menteer	5309f68bc0	Upgrade Alpine image to 3.17 (#16358 )	2023-02-22 10:09:41 -06:00
Derek Menteer	ad865f549b	Fix issue with peer services incorrectly appearing as connect-enabled. (#16339 ) Prior to this commit, all peer services were transmitted as connect-enabled as long as one or more mesh-gateways were healthy. With this change, there is now a difference between typical services and connect services transmitted via peering. A service will be reported as "connect-enabled" as long as any of these conditions are met: 1. a connect-proxy sidecar is registered for the service name. 2. a connect-native instance of the service is registered. 3. a service resolver / splitter / router is registered for the service name. 4. a terminating gateway has registered the service.	2023-02-21 13:59:36 -06:00
cskh	8e5942f5ca	fix: add tls config to unix socket when https is used (#16301 ) * fix: add tls config to unix socket when https is used * unit test and changelog	2023-02-21 08:28:13 -05:00
malizz	c9c49ea3a2	new docs for consul and consul-k8s troubleshoot command (#16284 ) * new docs for consul and consul-k8s troubleshoot command * add changelog * add troubleshoot command * address comments, and update cli output to match * revert changes to troubleshoot upstreams, changes will happen in separate pr * Update .changelog/16284.txt Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com> * address comments * update trouble proxy output * add missing s, add required fields in usage --------- Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>	2023-02-17 13:25:49 -08:00
Dan Stough	f1436109ea	[OSS] security: update go to 1.20.1 (#16263 ) * security: update go to 1.20.1	2023-02-17 15:04:12 -05:00
Dhia Ayachi	388876c206	add server side rate-limiter changelog entry (#16292 )	2023-02-16 19:21:50 +00:00
Derek Menteer	30112288c8	Fix mesh gateways incorrectly matching peer locality. (#16257 ) Fix mesh gateways incorrectly matching peer locality. This fixes an issue where local mesh gateways use an incorrect address when attempting to forward traffic to a peered datacenter. Prior to this change it would use the lan address instead of the wan if the locality matched. This should never be done for peering, since we must route all traffic through the remote mesh gateway.	2023-02-16 09:22:41 -06:00
Curt Bushko	1d9ee50681	[OSS] connect: Bump Envoy 1.22.5 to 1.22.7, 1.23.2 to 1.23.4, 1.24.0 to 1.24.2, add 1.25.1, remove 1.21.5 (#16274 ) * Bump Envoy 1.22.5 to 1.22.7, 1.23.2 to 1.23.4, 1.24.0 to 1.24.2, add 1.25.1, remove 1.21.5	2023-02-15 11:45:43 -05:00
Tyler Wendlandt	3f22879106	UI: CC-4032 - Update sidebar width (#16204 ) * Update chrome-width var to be 280px * Formatting & Changelog	2023-02-13 11:48:31 -07:00
Valeriia Ruban	663a5642c2	[UI]: update Ember to 3.27 (#16227 ) * Upgrade to 3.25 via ember-cli-update * v3.25.3...v3.26.1 * v3.26.1...v3.27.0 Co-authored-by: Michael Klein <michael@firstiwaslike.com>	2023-02-10 13:32:19 -08:00
Derek Menteer	4f2ce60654	Fix peering acceptors in secondary datacenters. (#16230 ) Prior to this commit, secondary datacenters could not be initialized as peering acceptors if ACLs were enabled. This is due to the fact that internal server-to-server API calls would fail because the management token was not generated. This PR makes it so that both primary and secondary datacenters generate their own management token whenever a leader is elected in their respective clusters.	2023-02-10 09:47:17 -06:00
skpratt	6f0b226b0d	ACL error improvements: incomplete bootstrapping and non-existent token (#16105 ) * add bootstrapping detail for acl errors * error detail improvements * update acl bootstrapping test coverage * update namespace errors * update test coverage * add changelog * update message for unbootstrapped error * consolidate error message code and update changelog * logout message change	2023-02-08 23:49:44 +00:00
Kyle Havlovitz	898e59b13c	Add the `operator usage instances` command and api endpoint (#16205 ) This endpoint shows total services, connect service instances and billable service instances in the local datacenter or globally. Billable instances = total service instances - connect services - consul server instances.	2023-02-08 12:07:21 -08:00
Paul Banks	5397e9ee7f	Adding experimental support for a more efficient LogStore implementation (#16176 ) * Adding experimental support for a more efficient LogStore implementation * Adding changelog entry * Fix go mod tidy issues	2023-02-08 16:50:22 +00:00
skpratt	9199e99e21	Update token language to distinguish Accessor and Secret ID usage (#16044 ) * remove legacy tokens * remove lingering legacy token references from docs * update language and naming for token secrets and accessor IDs * updates all tokenID references to clarify accessorID * remove token type references and lookup tokens by accessorID index * remove unnecessary constants * replace additional tokenID param names * Add warning info for deprecated -id parameter Co-authored-by: Paul Glass <pglass@hashicorp.com> * Update field comment Co-authored-by: Paul Glass <pglass@hashicorp.com> --------- Co-authored-by: Paul Glass <pglass@hashicorp.com>	2023-02-07 12:26:30 -06:00
Dhia Ayachi	e42ab7e429	Remove empty tags 2 (#16113 ) * Add support for RemoveEmptyTags in API client * Add changelog --------- Co-authored-by: Rémi Lapeyre <remi.lapeyre@lenstra.fr>	2023-02-06 11:12:43 -08:00
skpratt	a010902978	Remove legacy acl policies (#15922 ) * remove legacy tokens * remove legacy acl policies * flatten test policies to _prefix address oss feedback re: phrasing and tests	2023-02-06 15:35:52 +00:00
Derek Menteer	2f149d60cc	[OSS] Add Peer field to service-defaults upstream overrides (#15956 ) * Add Peer field to service-defaults upstream overrides. * add api changes, compat mode for service default overrides * Fixes based on testing --------- Co-authored-by: DanStough <dan.stough@hashicorp.com>	2023-02-03 10:51:53 -05:00
Paul Glass	a884d0d7c7	Use agent token for service/check deregistration during anti-entropy (#16097 ) Use only the agent token for deregistration during anti-entropy The previous behavior had the agent attempt to use the "service" token (i.e. from the `token` field in a service definition file), and if that was not set then it would use the agent token. The previous behavior was problematic because, if the service token had been deleted, the deregistration request would fail. The agent would retry the deregistration during each anti-entropy sync, and the situation would never resolve. The new behavior is to only/always use the agent token for service and check deregistration during anti-entropy. This approach is: * Simpler: No fallback logic to try different tokens * Faster (slightly): No time spent attempting the service token * Correct: The agent token is able to deregister services on that agent's node, because: * node:write permissions allow deregistration of services/checks on that node. * The agent token must have node:write permission, or else the agent is not able to (de)register itself into the catalog Co-authored-by: Vesa Hagström <weeezes@gmail.com>	2023-02-03 08:45:11 -06:00
Kyle Havlovitz	d53c331a37	Add a flag for enabling debug logs to the `connect envoy` command (#15988 ) * Add a flag for enabling debug logs to the `connect envoy` command * Update website/content/commands/connect/envoy.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Add changelog note * Add debug log note to envoy proxy doc page * Update website/content/docs/connect/proxies/envoy.mdx Co-authored-by: Kendall Strautman <36613477+kendallstrautman@users.noreply.github.com> * Wording tweak in envoy bootstrap section --------- Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: Kendall Strautman <36613477+kendallstrautman@users.noreply.github.com>	2023-01-31 13:30:20 -08:00
skpratt	ad43846755	Remove legacy acl tokens (#15947 ) * remove legacy tokens * Update test comment Co-authored-by: Paul Glass <pglass@hashicorp.com> * fix imports * update docs for additional CLI changes * add test case for anonymous token * set deprecated api fields to json ignore and fix patch errors * update changelog to breaking-change * fix import * update api docs to remove legacy reference * fix docs nav data --------- Co-authored-by: Paul Glass <pglass@hashicorp.com>	2023-01-27 09:17:07 -06:00
Ronald	6167aef641	Warn when the token query param is used for auth (#16009 )	2023-01-24 16:21:41 +00:00
cskh	25396d81c9	Apply agent partition to load services and agent api (#16024 ) * Apply agent partition to load services and agent api changelog	2023-01-20 12:59:26 -05:00
Ashwin Venkatesh	a1e2a4f8d6	Add support for envoy readiness flags (#16015 ) * Add support for envoy readiness flags - add flags `envoy-ready-bind-port` and `envoy-ready-bind-addr` on consul connect envoy to create a ready listener on that address.	2023-01-19 16:54:11 -05:00
Chris Thain	2f4c8e50f2	Support Vault agent auth config for AWS/GCP CA provider auth (#15970 )	2023-01-18 11:53:04 -08:00
Derek Menteer	2facf50923	Fix configuration merging for implicit tproxy upstreams. (#16000 ) Fix configuration merging for implicit tproxy upstreams. Change the merging logic so that the wildcard upstream has correct proxy-defaults and service-defaults values combined into it. It did not previously merge all fields, and the wildcard upstream did not exist unless service-defaults existed (it ignored proxy-defaults, essentially). Change the way we fetch upstream configuration in the xDS layer so that it falls back to the wildcard when no matching upstream is found. This is what allows implicit peer upstreams to have the correct "merged" config. Change proxycfg to always watch local mesh gateway endpoints whenever a peer upstream is found. This simplifies the logic so that we do not have to inspect the "merged" configuration on peer upstreams to extract the mesh gateway mode.	2023-01-18 13:43:53 -06:00
Dan Upton	7a55de375c	xds: don't attempt to load-balance sessions for local proxies (#15789 ) Previously, we'd begin a session with the xDS concurrency limiter regardless of whether the proxy was registered in the catalog or in the server's local agent state. This caused problems for users who run `consul connect envoy` directly against a server rather than a client agent, as the server's locally registered proxies wouldn't be included in the limiter's capacity. Now, the `ConfigSource` is responsible for beginning the session and we only do so for services in the catalog. Fixes: https://github.com/hashicorp/consul/issues/15753	2023-01-18 12:33:21 -06:00
Chris S. Kim	e4a268e33e	Warn if ACL is enabled but no token is provided to Envoy (#15967 )	2023-01-16 12:31:56 -05:00
Derek Menteer	19a46d6ca4	Enforce lowercase peer names. (#15697 ) Enforce lowercase peer names. Prior to this change peer names could be mixed case. This can cause issues, as peer names are used as DNS labels in various locations. It also caused issues with envoy configuration.	2023-01-13 14:20:28 -06:00
Frank DiRocco	59a3a0749c	Update go-discover to support ECS discovery (#13782 ) Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2023-01-12 12:06:29 -06:00
Dan Stough	5d3643f4f0	docs(access logs): new docs for access logging (#15948 ) Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2023-01-11 16:41:02 +00:00
Paul Glass	f5231b9157	Add new config_file_service_registration token (#15828 )	2023-01-10 10:24:02 -06:00
Chris S. Kim	a7b34d50fc	Output user-friendly name for anonymous token (#15884 )	2023-01-09 12:28:53 -06:00
Derek Menteer	7b4f45e2d5	Fix issue where TLS configuration was ignored for unix sockets in consul connect envoy. (#15913 ) Fix issue where TLS configuration was ignored for unix sockets in consul connect envoy. Disable xds check on bootstrap mode and change check to warn only.	2023-01-06 12:34:49 -06:00
Eric Haberkorn	8d923c1789	Add the Lua Envoy extension (#15906 )	2023-01-06 12:13:40 -05:00
Dan Upton	d53ce39c32	grpc: switch servers and retry on error (#15892 ) This is the OSS portion of enterprise PR 3822. Adds a custom gRPC balancer that replicates the router's server cycling behavior. Also enables automatic retries for RESOURCE_EXHAUSTED errors, which we now get for free.	2023-01-05 10:21:27 +00:00
Nick Irvine	6fb628c07d	fix: return error when config file with unknown extension is passed (#15107 )	2023-01-04 16:57:00 -08:00
Florian Apolloner	077b0a48a3	Allow Operator Generated bootstrap token (#14437 ) Add support to provide an initial token via the bootstrap HTTP API, similar to hashicorp/nomad#12520	2023-01-04 20:19:33 +00:00
Dan Upton	7c7503c849	grpc/acl: relax permissions required for "core" endpoints (#15346 ) Previously, these endpoints required `service:write` permission on _any_ service as a sort of proxy for "is the caller allowed to participate in the mesh?". Now, they're called as part of the process of establishing a server connection by any consumer of the consul-server-connection-manager library, which will include non-mesh workloads (e.g. Consul KV as a storage backend for Vault) as well as ancillary components such as consul-k8s' acl-init process, which likely won't have `service:write` permission. So this commit relaxes those requirements to accept any valid ACL token on the following gRPC endpoints: - `hashicorp.consul.dataplane.DataplaneService/GetSupportedDataplaneFeatures` - `hashicorp.consul.serverdiscovery.ServerDiscoveryService/WatchServers` - `hashicorp.consul.connectca.ConnectCAService/WatchRoots`	2023-01-04 12:40:34 +00:00
Derek Menteer	1f7e7abeac	Fix issue with incorrect proxycfg watch on upstream peer-targets. (#15865 ) This fixes an issue where the incorrect partition was given to the upstream target watch, which meant that failover logic would not work correctly.	2023-01-03 10:44:08 -06:00
Derek Menteer	f3776894bf	Fix agent cache incorrectly notifying unchanged protobufs. (#15866 ) Fix agent cache incorrectly notifying unchanged protobufs. This change fixes a situation where the protobuf private fields would be read by reflect.DeepEqual() and indicate data was modified. This resulted in change notifications being fired every time, which could cause performance problems in proxycfg.	2023-01-03 10:11:56 -06:00
Dan Stough	b3bd3a6586	[OSS] feat: access logs for listeners and listener filters (#15864 ) * feat: access logs for listeners and listener filters * changelog * fix integration test	2022-12-22 15:18:15 -05:00
Michael Wilkerson	1b28b89439	Enhancement: Consul Compatibility Checking (#15818 ) * add functions for returning the max and min Envoy major versions - added an UnsupportedEnvoyVersions list - removed an unused error from TestDetermineSupportedProxyFeaturesFromString - modified minSupportedVersion to use the function for getting the Min Envoy major version. Using just the major version without the patch is equivalent to using `.0` * added a function for executing the envoy --version command - added a new exec.go file to not be locked to unix system * added envoy version check when using consul connect envoy * added changelog entry * added docs change	2022-12-20 09:58:19 -08:00
Derek Menteer	74b11c416c	Fix incorrect protocol check on discovery chains with peer targets. (#15833 )	2022-12-20 10:15:03 -06:00
Nitya Dhanushkodi	d382ca0aec	extensions: refactor serverless plugin to use extensions from config entry fields (#15817 ) docs: update config entry docs and the Lambda manual registration docs Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com> Co-authored-by: Eric <eric@haberkorn.co>	2022-12-19 12:19:37 -08:00
Chris S. Kim	f7b7f5d4b6	Error out `consul connect envoy` if agent explicitly disabled grpc (#15794 ) Co-authored-by: Paul Glass <pglass@hashicorp.com>	2022-12-19 14:37:27 -05:00
Chris S. Kim	831680d2c5	Add custom balancer to always remove subConns (#15701 ) The new balancer is a patched version of gRPC's default pick_first balancer which removes the behavior of preserving the active subconnection if a list of new addresses contains the currently active address.	2022-12-19 17:39:31 +00:00
Dan Stough	20f9e606b2	docs: update changelog from 1.14.3, 1.13.5, 1.12.8 (#15804 )	2022-12-14 18:47:35 -05:00
Pier-Luc Caron St-Pierre	76fc2f6562	connect: Add support for ConsulResolver to specify a filter expression (#15659 ) * connect: Add support for ConsulResolver to specify a filter expression	2022-12-14 12:41:07 -08:00
Paul Glass	619032cfcd	Deprecate -join and -join-wan (#15598 )	2022-12-14 20:28:25 +00:00
cskh	04bf24c8c1	feat(ingress-gateway): support outlier detection of upstream service for ingress gateway (#15614 ) * feat(ingress-gateway): support outlier detection of upstream service for ingress gateway * changelog Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>	2022-12-13 11:51:37 -05:00
Derek Menteer	e87d35e313	Fix DialedDirectly configuration for Consul dataplane. (#15760 ) Fix DialedDirectly configuration for Consul dataplane.	2022-12-13 09:16:31 -06:00
Kyle Schochenmaier	06ce35d480	add changelog for enterprise 3846 (#15773 )	2022-12-12 15:08:02 -06:00
John Murret	cd53120cd7	agent: Fix assignment of error when auto-reloading cert and key file changes. (#15769 ) * Adding the setting of errors missing in config file watcher code in agent. * add changelog	2022-12-12 12:24:39 -07:00
Dan Stough	98ef5f28dd	[OSS] security: update x/net module (#15737 ) Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2022-12-08 16:45:44 -05:00
Eric Haberkorn	4268c1c25c	Remove the `connect.enable_serverless_plugin` agent configuration option (#15710 )	2022-12-08 14:46:42 -05:00
Dhia Ayachi	0402fd23a3	update go version to 1.19.4 (#15705 ) * update go version to 1.19.4 * add changelog	2022-12-07 15:11:22 -05:00
Derek Menteer	97ec5279aa	Fix local mesh gateway with peering discovery chains. (#15690 ) Fix local mesh gateway with peering discovery chains. Prior to this patch, discovery chains with peers would not properly honor the mesh gateway mode for two reasons. 1. An incorrect target upstream ID was used to lookup the mesh gateway mode. To fix this, the parent upstream uid is now used instead of the discovery-chain-target-uid to find the intended mesh gateway mode. 2. The watch for local mesh gateways was never initialized for discovery chains. To fix this, the discovery chains are now scanned, and a local GW watch is spawned if: the mesh gateway mode is local and the target is a peering connection.	2022-12-07 13:07:42 -06:00
R.B. Boyer	900584ca82	connect: ensure all vault connect CA tests use limited privilege tokens (#15669 ) All of the current integration tests where Vault is the Connect CA now use non-root tokens for the test. This helps us detect privilege changes in the vault model so we can keep our guides up to date. One larger change was that the RenewIntermediate function got refactored slightly so it could be used from a test, rather than the large duplicated function we were testing in a test which seemed error prone.	2022-12-06 10:06:36 -06:00
R.B. Boyer	4940a728ab	Detect Vault 1.11+ import in secondary datacenters and update default issuer (#15661 ) The fix outlined and merged in #15253 fixed the issue as it occurs in the primary DC. There is a similar issue that arises when vault is used as the Connect CA in a secondary datacenter that is fixed by this PR. Additionally: this PR adds support to run the existing suite of vault related integration tests against the last 4 versions of vault (1.9, 1.10, 1.11, 1.12)	2022-12-05 15:39:21 -06:00
Jared Kirschner	5efdd8bb91	Clarify Vault CA changelog entry (#15662 )	2022-12-02 20:16:49 -05:00
Dao Thanh Tung	b890c40ce4	Fixing CLI ACL token processing unexpected precedence (#15274 ) * Fixing CLI ACL token processing unexpected precedence * Minor flow format and add Changelog * Fixed failed tests and improve error logging message * Add unit test cases and minor changes from code review * Unset env var once the test case finishes running * remove label FINISH	2022-12-02 12:19:52 -05:00
Michael Wilkerson	ae9a1e681e	added changelog for enterprise only change (#15621 )	2022-11-30 11:39:20 -08:00
Tyler Wendlandt	b8347ae8c6	ui: Add ServerExternalAddresses to peer token create form (#15555 ) * ui: Add ServerExternalAddresses field to token generation * Add test for ServerExternalAddresses on peer token create * Add changelog entry * Update translations * Format hbs files * Update translations	2022-11-30 11:42:36 -07:00
R.B. Boyer	11a277f372	peering: better represent non-passing states during peer check flattening (#15615 ) During peer stream replication we flatten checks from the source cluster and build one thin overall check to hide the irrelevant details from the consuming cluster. This flattening logic did correctly flip to non-passing if there were any non-passing checks, but WHICH status it got during that was random (warn/error). Also it didn't represent "maintenance" operations. There is an api package call AggregatedStatus which more correctly flattened check statuses. This PR replicated the more complete logic into the peer stream package.	2022-11-30 11:29:21 -06:00
Freddy	941f6da202	Remove log line about server mgmt token init (#15610 ) * Remove log line about server mgmt token init Currently the server management token is only being bootstrapped in the primary datacenter. That means that servers on the secondary datacenter will never have this token available, and would log this line any time a token is resolved. Bootstrapping the token in secondary datacenters will be done in a follow-up. * Add changelog entry	2022-11-29 17:56:03 -05:00
James Oulman	7e78fb7818	Add support for configuring Envoys route idle_timeout (#14340 ) * Add idleTimeout Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com>	2022-11-29 17:43:15 -05:00
Derek Menteer	95dc0c7b30	Add peering `.service` and `.node` DNS lookups. (#15596 ) Add peering `.service` and `.node` DNS lookups.	2022-11-29 12:23:18 -06:00
cskh	97c9432843	fix(peering): increase the gRPC limit to 8MB (#15503 ) * fix(peering): increase the gRPC limit to 50MB * changelog * update gRPC limit to 8MB	2022-11-28 17:48:43 -05:00
Chris S. Kim	c9ec9fa320	Fix Vault managed intermediate PKI bug (#15525 )	2022-11-28 16:17:58 -05:00
Chris S. Kim	cc819ad83b	[OSS] Add boilerplate for proto files implementing BlockableQuery (#15554 )	2022-11-25 15:46:56 -05:00
Chris S. Kim	386da5439a	Use rpcHoldTimeout to calculate blocking timeout (#15541 ) Adds buffer to clients so that servers have time to respond to blocking queries.	2022-11-24 10:13:02 -05:00
Chris Thain	b030a3ee99	Add changelog for snapshot agent updates (#15516 )	2022-11-22 06:11:46 -08:00
Jared Kirschner	3e7e8ae9c5	Support RFC 2782 for prepared query DNS lookups (#14465 ) Format: _<query id or name>._tcp.query[.<datacenter>].<domain>	2022-11-20 17:21:24 -05:00
Derek Menteer	6fa8fa4fca	Fix issue with connect Envoy choosing incorrect TLS settings. (#15466 ) This commit fixes a situation where the API TLS configuration incorrectly influences the GRPC port TLS configuration for XDS.	2022-11-18 14:36:20 -06:00
Derek Menteer	f52f3c5afc	Fix SDK to support older versions of Consul. (#15423 ) This change was necessary, because the configuration was always generated with a gRPC TLS port, which did not exist in Consul 1.13, and would result in the server failing to launch with an error. This code checks the version of Consul and conditionally adds the gRPC TLS port, only if the version number is greater than 1.14.	2022-11-18 10:32:01 -06:00
Alexander Scheel	2b90307f6d	Detect Vault 1.11+ import, update default issuer (#15253 ) Consul used to rely on implicit issuer selection when calling Vault endpoints to issue new CSRs. Vault 1.11+ changed that behavior, which caused Consul to check the wrong (previous) issuer when renewing its Intermediate CA. This patch allows Consul to explicitly set a default issuer when it detects that the response from Vault is 1.11+. Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2022-11-17 16:29:49 -05:00
Derek Menteer	dc27e35f82	Consul 1.14 post-release updates (#15382 ) * Update changelog with 1.14 notes. * gomod version bumps for 1.14 release.	2022-11-15 14:22:43 -06:00
Kyle Havlovitz	f4c3e54b11	auto-config: relax node name validation for JWT authorization (#15370 ) * auto-config: relax node name validation for JWT authorization This changes the JWT authorization logic to allow all non-whitespace, non-quote characters when validating node names. Consul had previously allowed these characters in node names, until this validation was added to fix a security vulnerability with whitespace/quotes being passed to the `bexpr` library. This unintentionally broke node names with characters like `.` which aren't related to this vulnerability. * Update website/content/docs/agent/config/cli-flags.mdx Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>	2022-11-14 18:24:40 -06:00
Dhia Ayachi	225ae55e83	Leadership transfer cmd (#14132 ) * add leadership transfer command * add RPC call test (flaky) * add missing import * add changelog * add command registration * Apply suggestions from code review Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> * add the possibility of providing an id to raft leadership transfer. Add few tests. * delete old file from cherry pick * rename changelog filename to PR # * rename changelog and fix import * fix failing test * check for OperatorWrite Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> * rename from leader-transfer to transfer-leader * remove version check and add test for operator read * move struct to operator.go * first pass * add code for leader transfer in the grpc backend and tests * wire the http endpoint to the new grpc endpoint * remove the RPC endpoint * remove non needed struct * fix naming * add mog glue to API * fix comment * remove dead code * fix linter error * change package name for proto file * remove error wrapping * fix failing test * add command registration * add grpc service mock tests * fix receiver to be pointer * use defined values Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> * reuse MockAclAuthorizer * add documentation * remove usage of external.TokenFromContext * fix failing tests * fix proto generation * Apply suggestions from code review Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Apply suggestions from code review * add more context in doc for the reason * Apply suggestions from docs code review Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * regenerate proto * fix linter errors Co-authored-by: github-team-consul-core <github-team-consul-core@hashicorp.com> Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com> Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2022-11-14 15:35:12 -05:00
Freddy	706866fa00	Ensure that NodeDump imported nodes are filtered (#15356 )	2022-11-14 12:35:20 -07:00
Nitya Dhanushkodi	9e060b8e1b	add changelog (#15351 )	2022-11-14 13:23:09 -06:00
Kyle Havlovitz	dde5c524ad	connect: strip port from DNS SANs for ingress gateway leaf cert (#15320 ) * connect: strip port from DNS SANs for ingress gateway leaf cert * connect: format DNS SANs in CreateCSR * connect: Test wildcard case when formatting SANs	2022-11-14 10:27:03 -08:00
Chris S. Kim	050f26c71a	Add changelog (#15327 )	2022-11-14 11:23:02 -05:00
Derek Menteer	931cec42b3	Prevent serving TLS via ports.grpc (#15339 ) Prevent serving TLS via ports.grpc We remove the ability to run the ports.grpc in TLS mode to avoid confusion and to simplify configuration. This breaking change ensures that any user currently using ports.grpc in an encrypted mode will receive an error message indicating that ports.grpc_tls must be explicitly used. The suggested action for these users is to simply swap their ports.grpc to ports.grpc_tls in the configuration file. If both ports are defined, or if the user has not configured TLS for grpc, then the error message will not be printed.	2022-11-11 14:29:22 -06:00
Kyle Schochenmaier	bf0f61a878	removes ioutil usage everywhere which was deprecated in go1.16 (#15297 ) * update go version to 1.18 for api and sdk, go mod tidy * removes ioutil usage everywhere which was deprecated in go1.16 in favour of io and os packages. Also introduces a lint rule which forbids use of ioutil going forward. Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>	2022-11-10 10:26:01 -06:00
malizz	b51f0e25e9	update ACLs for cluster peering (#15317 ) * update ACLs for cluster peering * add changelog * Update .changelog/15317.txt Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com> Co-authored-by: Eric Haberkorn <erichaberkorn@gmail.com>	2022-11-09 13:02:58 -08:00
malizz	b9a9e1219c	update config defaults, add docs (#15302 ) * update config defaults, add docs * update grpc tls port for non-default values * add changelog * Update website/content/docs/upgrading/upgrade-specific.mdx Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> * Update website/content/docs/agent/config/config-files.mdx Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> * update logic for setting grpc tls port value * move default config to default.go, update changelog * update docs * Fix config tests. * Fix linter error. * Fix ConnectCA tests. * Cleanup markdown on upgrade notes. Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> Co-authored-by: Derek Menteer <derek.menteer@hashicorp.com>	2022-11-09 09:29:55 -08:00
Derek Menteer	418bd62c44	Fix mesh gateway configuration with proxy-defaults (#15186 ) * Fix mesh gateway proxy-defaults not affecting upstreams. * Clarify distinction with upstream settings Top-level mesh gateway mode in proxy-defaults and service-defaults gets merged into NodeService.Proxy.MeshGateway, and only gets merged with the mode attached to an upstream in proxycfg/xds. * Fix mgw mode usage for peered upstreams There were a couple issues with how mgw mode was being handled for peered upstreams. For starters, mesh gateway mode from proxy-defaults and the top-level of service-defaults gets stored in NodeService.Proxy.MeshGateway, but the upstream watch for peered data was only considering the mesh gateway config attached in NodeService.Proxy.Upstreams[i]. This means that applying a mesh gateway mode via global proxy-defaults or service-defaults on the downstream would not have an effect. Separately, transparent proxy watches for peered upstreams didn't consider mesh gateway mode at all. This commit addresses the first issue by ensuring that we overlay the upstream config for peered upstreams as we do for non-peered. The second issue is addressed by re-using setupWatchesForPeeredUpstream when handling transparent proxy updates. Note that for transparent proxies we do not yet support mesh gateway mode per upstream, so the NodeService.Proxy.MeshGateway mode is used. * Fix upstream mesh gateway mode handling in xds This commit ensures that when determining the mesh gateway mode for peered upstreams we consider the NodeService.Proxy.MeshGateway config as a baseline. In absence of this change, setting a mesh gateway mode via proxy-defaults or the top-level of service-defaults will not have an effect for peered upstreams. * Merge service/proxy defaults in cfg resolver Previously the mesh gateway mode for connect proxies would be merged at three points: 1. On servers, in ComputeResolvedServiceConfig. 2. On clients, in MergeServiceConfig. 3. On clients, in proxycfg/xds. The first merge returns a ServiceConfigResponse where there is a top-level MeshGateway config from proxy/service-defaults, along with per-upstream config. The second merge combines per-upstream config specified at the service instance with per-upstream config specified centrally. The third merge combines the NodeService.Proxy.MeshGateway config containing proxy/service-defaults data with the per-upstream mode. This third merge is easy to miss, which led to peered upstreams not considering the mesh gateway mode from proxy-defaults. This commit removes the third merge, and ensures that all mesh gateway config is available at the upstream. This way proxycfg/xds do not need to do additional overlays. * Ensure that proxy-defaults is considered in wc Upstream defaults become a synthetic Upstream definition under a wildcard key "". Now that proxycfg/xds expect Upstream definitions to have the final MeshGateway values, this commit ensures that values from proxy-defaults/service-defaults are the default for this synthetic upstream. Add changelog. Co-authored-by: freddygv <freddy@hashicorp.com>	2022-11-09 10:14:29 -06:00
Derek Menteer	b64972d486	Bring back parameter ServerExternalAddresses in GenerateToken endpoint (#15267 ) Re-add ServerExternalAddresses parameter in GenerateToken endpoint This reverts commit `5e156772f6` and adds extra functionality to support newer peering behaviors.	2022-11-08 14:55:18 -06:00
cskh	a3f57cc5e8	fix(mesh-gateway): remove deregistered service from mesh gateway (#15272 ) * fix(mesh-gateway): remove deregistered service from mesh gateway * changelog Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com> Co-authored-by: Evan Culver <eculver@users.noreply.github.com>	2022-11-07 20:30:15 -05:00
Freddy	7f5f7e9cf9	Avoid blocking child type updates on parent ack (#15083 )	2022-11-07 18:10:42 -07:00
cskh	94d232ab1e	integ test: reduce flakiness due to compound output from retry (#15233 ) * integ test: avoid flakiness due to compound output from retry * changelog	2022-11-02 14:08:17 -04:00
Evan Culver	62d4517f9e	connect: Add Envoy 1.24 to integration tests, remove Envoy 1.20 (#15093 )	2022-10-31 10:50:45 -05:00
Eric Haberkorn	cf50bdbe20	Fix peering metrics bug (#15178 ) This bug was caused by the peering health metric being set to NaN.	2022-10-28 10:51:12 -04:00
Chris S. Kim	0e176dd6aa	Allow consul debug on non-ACL consul servers (#15155 )	2022-10-27 09:25:18 -04:00
cskh	a9427e1310	fix(peering): nil pointer in calling handleUpdateService (#15160 ) * fix(peering): nil pointer in calling handleUpdateService * changelog	2022-10-26 11:50:34 -04:00
Luke Kysow	d3aa2bd9c5	ingress-gateways: don't log error when registering gateway (#15001 ) * ingress-gateways: don't log error when registering gateway Previously, when an ingress gateway was registered without a corresponding ingress gateway config entry, an error was logged because the watch on the config entry returned a nil result. This is expected so don't log an error.	2022-10-25 10:55:44 -07:00
Luke Kysow	fbd47e1161	config entry: hardcode proxy-defaults name as global (#14833 ) * config entry: hardcode proxy-defaults name as global proxy-defaults can only have the name global. Because of this, we support not even setting the name in the config file: ``` kind = "proxy-defaults" ``` Previously, writing this would result in the output: ``` Config entry written: proxy-defaults/ ``` Now it will output: ``` Config entry written: proxy-defaults/global ``` This change follows what was done for the new Mesh config entry.	2022-10-25 10:55:15 -07:00
Luke Kysow	9999672fd7	autoencrypt: helpful error for clients with wrong dc (#14832 ) * autoencrypt: helpful error for clients with wrong dc If clients have set a different datacenter than the servers they're connecting with for autoencrypt, give a helpful error message.	2022-10-25 10:13:41 -07:00
R.B. Boyer	3c44116a8f	cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956 ) This continues the work done in #14908 where a crude solution to prevent a goroutine leak was implemented. The former code would launch a perpetual goroutine family every iteration (+1 +1) and the fixed code simply caused a new goroutine family to first cancel the prior one to prevent the leak (-1 +1 == 0). This PR refactors this code completely to: - make it more understandable - remove the recursion-via-goroutine strangeness - prevent unnecessary RPC fetches when the prior one has errored. The core issue arose from a conflation of the entry.Fetching field to mean: - there is an RPC (blocking query) in flight right now - there is a goroutine running to manage the RPC fetch retry loop The problem is that the goroutine-leak-avoidance check would treat Fetching like (2), but within the body of a goroutine it would flip that boolean back to false before the retry sleep. This would cause a new chain of goroutines to launch which #14908 would correct crudely. The refactored code uses a plain for-loop and changes the semantics to track state for "is there a goroutine associated with this cache entry" instead of the former. We use a uint64 unique identity per goroutine instead of a boolean so that any orphaned goroutines can tell when they've been replaced when the expiry loop deletes a cache entry while the goroutine is still running and is later replaced.	2022-10-25 10:27:26 -05:00
Chris S. Kim	0aaf49fed7	Add changelog	2022-10-24 16:12:08 -04:00
cskh	db82ffe503	fix(peering): replicating wan address (#15108 ) * fix(peering): replicating wan address * add changelog * unit test	2022-10-24 15:44:57 -04:00
Iryna Shustava	2a25669b13	cli/sdk: Allow redirection to a different consul dns port (#15050 )	2022-10-21 13:15:32 -06:00
Chris S. Kim	f2209b2fbe	Add changelog	2022-10-20 14:32:42 -04:00
cskh	eae5f87002	changelog	2022-10-19 16:37:50 -04:00
Nitya Dhanushkodi	5e156772f6	Remove ability to specify external addresses in GenerateToken endpoint (#14930 ) * Reverts "update generate token endpoint to take external addresses (#13844)" This reverts commit `f47319b7c6`.	2022-10-19 09:31:36 -07:00
Tyler Wendlandt	2a9cc3f084	Merge pull request #14971 from hashicorp/ui/feature/agentless-nodes-banner ui: agentless nodes notice banner	2022-10-19 09:06:46 -06:00
Kyle Havlovitz	5c3427608b	Merge pull request #15035 from hashicorp/vault-ttl-update-warn Warn instead of returning error when missing intermediate mount tune permissions	2022-10-18 15:41:52 -07:00
Chris S. Kim	29a297d3e9	Refactor client RPC timeouts (#14965 ) Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.	2022-10-18 15:05:09 -04:00
Kyle Havlovitz	d122108992	Warn instead of returning an error when intermediate mount tune permission is missing	2022-10-18 12:01:25 -07:00
R.B. Boyer	fe2d41ddad	cache: prevent goroutine leak in agent cache (#14908 ) There is a bug in the error handling code for the Agent cache subsystem discovered: 1. NotifyCallback calls notifyBlockingQuery which calls getWithIndex in a loop (which backs off on-error up to 1 minute) 2. getWithIndex calls fetch if there’s no valid entry in the cache 3. fetch starts a goroutine which calls Fetch on the cache-type, waits for a while (again with backoff up to 1 minute for errors) and then calls fetch to trigger a refresh The end result being that every 1 minute notifyBlockingQuery spawns an ancestry of goroutines that essentially lives forever. This PR ensures that the goroutine started by `fetch` cancels any prior goroutine spawned by the same line for the same key. In isolated testing where a cache type was tweaked to indefinitely error, this patch prevented goroutine counts from skyrocketing.	2022-10-17 14:38:10 -05:00
R.B. Boyer	02a858efa0	ca: fix a masked bug in leaf cert generation that would not be notified of root cert rotation after the first one (#15005 ) In practice this was masked by #14956 and was only uncovered fixing the other bug. go test ./agent -run TestAgentConnectCALeafCert_goodNotLocal would fail when only #14956 was fixed.	2022-10-17 13:24:27 -05:00
Chris S. Kim	3d2dffff16	Merge pull request #13388 from deblasis/feature/health-checks_windows_service Feature: Health checks windows service	2022-10-17 09:26:19 -04:00
Kyle Havlovitz	aaf892a383	Extend tcp keepalive settings to work for terminating gateways as well	2022-10-14 17:05:46 -07:00
Kyle Havlovitz	2c569f6b9c	Update docs and add tcp_keepalive_probes setting	2022-10-14 17:05:46 -07:00
Freddy	24d0c8801a	Merge pull request #14981 from hashicorp/peering/dial-through-gateways	2022-10-14 09:44:56 -06:00
Tyler Wendlandt	0c8563f060	Merge pull request #14986 from hashicorp/ui/feature/filter-node-healthchecks-agentless UI: filter node healthchecks on agentless service instances	2022-10-14 09:33:45 -06:00
Dan Upton	328e3ff563	proxycfg: rate-limit delivery of config snapshots (#14960 ) Adds a user-configurable rate limiter to proxycfg snapshot delivery, with a default limit of 250 updates per second. This addresses a problem observed in our load testing of Consul Dataplane where updating a "global" resource such as a wildcard intention or the proxy-defaults config entry could starve the Raft or Memberlist goroutines of CPU time, causing general cluster instability.	2022-10-14 15:52:00 +01:00
Dan Upton	e6b55d1d81	perf: remove expensive reflection from xDS hot path (#14934 ) Replaces the reflection-based implementation of proxycfg's ConfigSnapshot.Clone with code generated by deep-copy. While load testing server-based xDS (for consul-dataplane) we discovered this method is extremely expensive. The ConfigSnapshot struct, directly or indirectly, contains a copy of many of the structs in the agent/structs package, which creates a large graph for copystructure.Copy to traverse at runtime, on every proxy reconfiguration.	2022-10-14 10:26:42 +01:00
wenincode	363db8c849	Add changelog entry	2022-10-13 18:54:39 -06:00
Freddy	ee4cdc4985	Merge pull request #14935 from hashicorp/fix/alias-leak	2022-10-13 16:31:15 -06:00
freddygv	da68ed70c1	Add changelog entry	2022-10-13 16:09:32 -06:00
freddygv	f48d7fbe04	Add changelog entry	2022-10-13 16:03:15 -06:00
wenincode	f9575be4c7	Add changelog	2022-10-13 10:43:57 -06:00
Derek Menteer	caa1396255	Add remote peer partition and datacenter info.	2022-10-13 10:37:41 -05:00
Tyler Wendlandt	e8748503c3	Merge pull request #14970 from hashicorp/ui/feature/filter-synthetic-nodes ui: Filter synthetic nodes on nodes list page	2022-10-13 09:12:03 -06:00
Michael Klein	5ac1bc9cc0	Merge pull request #14947 from hashicorp/ui/feat/peer-detail-page ui: peer detail view	2022-10-13 17:03:57 +02:00
Michael Klein	ceeb823d01	Add changelog for peers detail page	2022-10-13 16:45:03 +02:00
Dan Upton	cbb4a030c4	xds: properly merge central config for "agentless" services (#14962 )	2022-10-13 12:04:59 +01:00
Dan Upton	0af9f16343	bug: fix goroutine leaks caused by incorrect usage of `WatchCh` (#14916 ) memdb's `WatchCh` method creates a goroutine that will publish to the returned channel when the watchset is triggered or the given context is canceled. Although this is called out in its godoc comment, it's not obvious that this method creates a goroutine whose lifecycle you need to manage. In the xDS capacity controller, we were calling `WatchCh` on each iteration of the control loop, meaning the number of goroutines would grow on each autopilot event until there was catalog churn. In the catalog config source, we were calling `WatchCh` with the background context, meaning that the goroutine would keep running after the sync loop had terminated.	2022-10-13 12:04:27 +01:00

1327 Commits