consul

Commit Graph

Author	SHA1	Message	Date
Chris S. Kim	a4653de8da	CA provider doc updates and Vault provider minor update (#17831) Update CA provider docs Clarify that providers can differ between primary and secondary datacenters Provide a comparison chart for consul vs vault CA providers Loosen Vault CA provider validation for RootPKIPath Update Vault CA provider documentation	2023-06-21 19:34:42 +00:00
George Bolo	82441a27fa	fixes #17732 - AccessorID in request body should be optional when updating ACL token (#17739) * AccessorID in request body should be optional when updating ACL token * add a test case * fix test case * add changelog entry for PR #17739	2023-06-21 13:31:40 -05:00
Ronald	5f95f5f6d8	Stop referenced jwt providers from being deleted (#17755) * Stop referenced jwt providers from being deleted	2023-06-16 10:31:53 -04:00
Michael Zalimeni	f9aa7aebb3	Property Override validation improvements (#17759) * Reject inbound Prop Override patch with Services Services filtering is only supported for outbound TrafficDirection patches. * Improve Prop Override unexpected type validation - Guard against additional invalid parent and target types - Add specific error handling for Any fields (unsupported)	2023-06-15 13:51:47 -04:00
Derek Menteer	04edace1de	Fix issue with streaming service health watches. (#17775) Fix issue with streaming service health watches. This commit fixes an issue where the health streams were unaware of service export changes. Whenever an exported-services config entry is modified, it is effectively an ACL change. The bug would be triggered by the following situation: - no services are exported - an upstream watch to service X is spawned - the streaming backend filters out data for service X (due to lack of exports) - service X is finally exported In the situation above, the streaming backend does not trigger a refresh of its data. This means that any events that were supposed to have been received prior to the export are NOT backfilled, and the watches never see service X spawning. We currently have decided to not trigger a stream refresh in this situation due to the potential for a thundering herd effect (touching exports would cause a re-fetch of all watches for that partition, potentially). Therefore, a local blocking-query approach was added by this commit for agentless. It's also worth noting that the streaming subscription is currently bypassed most of the time with agentful, because proxycfg has a `req.Source.Node != ""` which prevents the `streamingEnabled` check from passing. This means that while agents should technically have this same issue, they don't experience it with mesh health watches. Note that this is a temporary fix that solves the issue for proxycfg, but not service-discovery use cases.	2023-06-15 12:46:58 -05:00
Derek Menteer	8c74a1d33e	Add transparent proxy enhancements changelog (#17757)	2023-06-15 11:48:39 -05:00
Ashesh Vidyut	fa40654885	[NET-3865] [Supportability] Additional Information in the output of 'consul operator raft list-peers' (#17582) * init * fix tests * added -detailed in docs * added change log * fix doc * checking for entry in map * fix tests * removed detailed flag * removed detailed flag * revert unwanted changes * removed unwanted changes * updated change log * pr review comment changes * pr comment changes single API instead of two * fix change log * fix tests * fix tests * fix test operator raft endpoint test * Update .changelog/17582.txt Co-authored-by: Semir Patel <semir.patel@hashicorp.com> * nits * updated docs --------- Co-authored-by: Semir Patel <semir.patel@hashicorp.com>	2023-06-14 15:12:50 +00:00
David Yu	212e0902fb	Bump Alpine to 3.18 (#17719) * Update Dockerfile * Create 17719.txt	2023-06-14 01:02:05 +00:00
Dan Stough	d497623266	docs: missing changelog for _5517 (#17706)	2023-06-13 15:11:33 -04:00
R.B. Boyer	72f991d8d3	agent: remove agent cache dependency from service mesh leaf certificate management (#17075) * agent: remove agent cache dependency from service mesh leaf certificate management This extracts the leaf cert management from within the agent cache. This code was produced by the following process: 1. All tests in agent/cache, agent/cache-types, agent/auto-config, agent/consul/servercert were run at each stage. - The tests in agent matching .Leaf were run at each stage. - The tests in agent/leafcert were run at each stage after they existed. 2. The former leaf cert Fetch implementation was extracted into a new package behind a "fake RPC" endpoint to make it look almost like all other cache type internals. 3. The old cache type was shimmed to use the fake RPC endpoint and generally cleaned up. 4. I selectively duplicated all of Get/Notify/NotifyCallback/Prepopulate from the agent/cache.Cache implementation over into the new package. This was renamed as leafcert.Manager. - Code that was irrelevant to the leaf cert type was deleted (inlining blocking=true, refresh=false) 5. Everything that used the leaf cert cache type (including proxycfg stuff) was shifted to use the leafcert.Manager instead. 6. agent/cache-types tests were moved and gently replumbed to execute as-is against a leafcert.Manager. 7. Inspired by some of the locking changes from derek's branch I split the fat lock into N+1 locks. 8. The waiter chan struct{} was eventually replaced with a singleflight.Group around cache updates, which was likely the biggest net structural change. 9. The awkward two layers or logic produced as a byproduct of marrying the agent cache management code with the leaf cert type code was slowly coalesced and flattened to remove confusion. 10. The .Leaf tests from the agent package were copied and made to work directly against a leafcert.Manager to increase direct coverage. I have done a best effort attempt to port the previous leaf-cert cache type's tests over in spirit, as well as to take the e2e-ish tests in the agent package with Leaf in the test name and copy those into the agent/leafcert package to get more direct coverage, rather than coverage tangled up in the agent logic. There is no net-new test coverage, just coverage that was pushed around from elsewhere.	2023-06-13 10:54:45 -05:00
Dan Stough	bba5cd8455	fix: stop peering delete routine on leader loss (#17483)	2023-06-13 10:20:56 -04:00
Ashesh Vidyut	d54d5fb85c	[NET-4107][Supportability] Log Level set to TRACE and duration set to 5m for consul-debug (#17596) * changed duration to 5 mins and log level to trace * documentation update * change log	2023-06-13 11:07:46 +05:30
Joshua Timmons	28d81ec79f	Fix two WAL metrics in docs/agent/telemetry.mdx (#17593)	2023-06-12 18:50:59 -04:00
Andrew Stucki	3cb70566a9	[API Gateway] Fix rate limiting for API gateways (#17631) * [API Gateway] Fix rate limiting for API gateways * Add changelog * Fix failing unit tests * Fix operator usage tests for api package	2023-06-09 08:22:32 -04:00
Michael Zalimeni	30e0c234ab	Update list of Envoy versions (#17546)	2023-06-09 02:37:49 +00:00
Ronald	7ae457c586	enterprise changelog update for audit (#17625)	2023-06-08 19:50:51 -04:00
Ronald	17f4689379	backport ent changes to oss (#17614) * backport ent changes to oss * Update .changelog/_5669.txt Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> --------- Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>	2023-06-08 16:34:31 +00:00
Andrew Stucki	9a4f503b2b	[API Gateway] Fix trust domain for external peered services in synthesis code (#17609) * [API Gateway] Fix trust domain for external peered services in synthesis code * Add changelog	2023-06-08 12:18:17 -04:00
Ronald	8118aae5c1	Add writeAuditRPCEvent to agent_oss (#17607) * Add writeAuditRPCEvent to agent_oss * fix the other diffs * backport change log	2023-06-07 22:35:48 +00:00
Joshua Timmons	7a2ee145bf	Fix metric names in Consul agent telemetry docs (#17577)	2023-06-06 14:42:30 -04:00
Andrew Stucki	f9d9d4db60	Fix subscribing/fetching objects not in the default partition (#17581) * Fix subscribing/fetching objects not in the default namespace * add changelog	2023-06-06 09:09:33 -04:00
Andrew Stucki	4ddb88ec7e	Fix up case where subscription is terminated due to ACLs changing or a snapshot restore occurring (#17566) * Fix up case where subscription is terminated due to ACLs changing or a snapshot restore occurring * Add changelog entry * Switch to use errors.Is	2023-06-05 13:10:17 -04:00
Dave Rawks	a55d368a0e	Resolves issue-16844 - systemd notify by default (#16845) * updates `consul.service` systemd service unit to use `Type=notify` to resolve issue #16844 * add changelog update to match	2023-06-02 10:04:48 -07:00
Poonam Jadhav	d9e18b4bf0	changelog: add changelog for reporting (#17535)	2023-06-02 08:59:48 -04:00
Dan Stough	a043981cc6	Revert "fix(connect envoy): set initial_fetch_timeout to wait for initial xDS… (#17317)" (#17540) This reverts commit `be7d2a4d84`.	2023-06-01 13:10:41 -04:00
Andrew Stucki	ca12ce926b	[API Gateway] Fix use of virtual resolvers in HTTPRoutes (#17055) * [API Gateway] Fix use of virtual resolvers in routes * Add changelog entry	2023-05-31 16:58:40 -04:00
Nathan Coleman	b438a07326	Export peering cli (#15654) * Sujata's peering-cli branch * Added error message for connecting to cluster * We can export service to peer * export handling multiple peers * export handles multiple peers * export now can handle multiple services * Export after 1st cleanup * Successful export * Added the namespace option * Add .changelog entry * go mod tidy * Stub unit tests for peering export command * added export in peering.go * Adding export_test * Moved the code to services from peers and cleaned the serviceNamespace * Added support for exporting to partitions * Fixed partition bug * Added unit tests for export command * Add multi-tenancy flags * gofmt * Add some helpful comments * Exclude namespace + partition flags when running OSS * cleaned up partition stuff * Validate required flags differently for OSS vs. ENT * Update success output to include only the requested consumers * cleaned up * fixed broken test * gofmt * Include all flags in OSS build * Remove example previously added to peering command * Move stray import into correct block * Update changelog entry to include support for exporting to a partition * Add required-ness label to consumer-peers flag description * Update command/services/export/export.go Co-authored-by: Dan Stough <dan.stough@hashicorp.com> * Add docs placeholder for new services export command * Moved piece of code to OSS * Break config entry init + update into separate functions * fixed * Vary existing service export comparison for OSS vs. ENT * Move OSS-specific test to export_oss_test.go * Set config entry name based on partition being exported from * Set namespace on added services * Adding namespace * Remove export documentation We will include documentation in a followup PR * Consolidate code from export_oss into export.go * Consolidated export_oss_test.go and export_test.go * Add example of partition export to command synopsis * Allow empty peers flag if partitions flag provided * Add test coverage for -consumer-partitions flag * Update command/services/export/export.go Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Update command/services/export/export.go Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com> * Update changelog entry * Use "cluster peers" to clear up any possible confusion * Update test assertions --------- Co-authored-by: 20sr20 <sujata@hashicorp.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>	2023-05-31 14:27:35 -04:00
Dhia Ayachi	da94cbdb25	add changelog (#17528)	2023-05-31 13:29:59 -04:00
Jared Kirschner	b9c9d79778	Accept ap, datacenter, and namespace query params (#17525) This commit only contains the OSS PR (datacenter query param support). A separate enterprise PR adds support for ap and namespace query params. Resources in Consul can exists within scopes such as datacenters, cluster peers, admin partitions, and namespaces. You can refer to those resources from interfaces such as the CLI, HTTP API, DNS, and configuration files. Some scope levels have consistent naming: cluster peers are always referred to as "peer". Other scope levels use a short-hand in DNS lookups... - "ns" for namespace - "ap" for admin partition - "dc" for datacenter ...But use long-hand in CLI commands: - "namespace" for namespace - "partition" for admin partition - and "datacenter" However, HTTP API query parameters do not follow a consistent pattern, supporting short-hand for some scopes but long-hand for others: - "ns" for namespace - "partition" for admin partition - and "dc" for datacenter. This inconsistency is confusing, especially for users who have been exposed to providing scope names through another interface such as CLI or DNS queries. This commit improves UX by consistently supporting both short-hand and long-hand forms of the namespace, partition, and datacenter scopes in HTTP API query parameters.	2023-05-31 11:50:24 -04:00
Nick Ethier	44f90132e0	hoststats: add package for collecting host statistics including cpu memory and disk usage (#17038)	2023-05-30 18:43:29 +00:00
Ronald	55e283dda9	[NET-3092] JWT Verify claims handling (#17452) * [NET-3092] JWT Verify claims handling	2023-05-30 13:38:33 -04:00
Dan Stough	bc9bb99a56	build(deps): update UBI base image to 9.2 (#17513)	2023-05-30 12:48:13 -04:00
Chris Thain	65b8ccdc1b	Enable Network filters for Wasm Envoy Extension (#17505)	2023-05-30 07:17:33 -07:00
Ashvitha	091925bcb7	HCP Telemetry Feature (#17460 ) * Move hcp client to subpackage hcpclient (#16800) * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * [HCP Observability] OTELExporter (#17128) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * [HCP Observability] OTELSink (#17159) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with 
comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * Do not pass in waitgroup and use error channel instead. 
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Fix nits * [HCP Observability] Init OTELSink in Telemetry (#17162) * Move hcp client to subpackage hcpclient (#16800) * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and 
clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * [HCP Observability] OTELExporter (#17128) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add 
NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Do not pass in waitgroup and use error channel instead. * Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Initialize OTELSink with sync.Map for all the instrument stores. * Added telemetry agent to client and init sink in deps * Fixed client * Initalize sink in deps * init sink in telemetry library * Init deps before telemetry * Use concrete telemetry.OtelSink type * add /v1/metrics * Avoid returning err for telemetry init * move sink init within the IsCloudEnabled() * Use HCPSinkOpts in deps instead * update golden test for configuration file * Switch to using extra sinks in the telemetry library * keep name MetricsConfig * fix log in verifyCCMRegistration * Set logger in context * pass around MetricSink in deps * Fix imports * Rebased onto otel sink pr * Fix URL in test * [HCP Observability] OTELSink (#17159) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual 
error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * Do not pass in waitgroup and use error channel instead. 
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Fix nits * pass extraSinks as function param instead * Add default interval as package export * remove verifyCCM func * Add clusterID * Fix import and add t.Parallel() for missing tests * Kick Vercel CI * Remove scheme from endpoint path, and fix error logging * return metrics.MetricSink for sink method * Update SDK * [HCP Observability] Metrics filtering and Labels in Go Metrics sink (#17184) * Move hcp client to subpackage hcpclient (#16800) * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * [HCP Observability] New MetricsClient (#17100) * Client configured with TLS using HCP config and retry/throttle * Add tests and godoc for metrics client * close body after request * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * remove clone * Extract CloudConfig and mock for future PR * Switch to hclog.FromContext * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix 
lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. * Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO,rebase and clenaup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Seperate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * [HCP Observability] OTELExporter (#17128) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one 
abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Do not pass in waitgroup and use error channel instead. * Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Initialize OTELSink with sync.Map for all the instrument stores. 
* Added telemetry agent to client and init sink in deps * Fixed client * Initalize sink in deps * init sink in telemetry library * Init deps before telemetry * Use concrete telemetry.OtelSink type * add /v1/metrics * Avoid returning err for telemetry init * move sink init within the IsCloudEnabled() * Use HCPSinkOpts in deps instead * update golden test for configuration file * Switch to using extra sinks in the telemetry library * keep name MetricsConfig * fix log in verifyCCMRegistration * Set logger in context * pass around MetricSink in deps * Fix imports * Rebased onto otel sink pr * Fix URL in test * [HCP Observability] OTELSink (#17159) * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Create new OTELExporter which uses the MetricsClient Add transform because the conversion is in an /internal package * Fix lint error * early return when there are no metrics * Add NewOTELExporter() function * Downgrade to metrics SDK version: v1.15.0-rc.1 * Fix imports * fix small nits with comments and url.URL * Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile * Cleanup error handling and clarify empty metrics case * Fix input/expected naming in otel_transform_test.go * add comment for metric tracking * Add a general isEmpty method * Add clear error types * update to latest version 1.15.0 of OTEL * Client configured with TLS using HCP config and retry/throttle * run go mod tidy * Remove one abstraction to use the config from deps * Address PR feedback * Initialize OTELSink with sync.Map for all the instrument stores. * Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests. 
* Switch to mutex instead of sync.Map to avoid type assertion * Add gauge store * Clarify comments * return concrete sink type * Fix lint errors * Move gauge store to be within sink * Use context.TODO, rebase and cleanup opts handling * Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1 * Fix imports * Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx * Add lots of documentation to the OTELSink * Fix gauge store comment and check ok * Add select and ctx.Done() check to gauge callback * use require.Equal for attributes * Fixed import naming * Remove float64 calls and add a NewGaugeStore method * Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store * Generate 100 gauge operations * Separate the labels into goroutines in sink test * Generate kv store for the test case keys to avoid using uuid * Added a race test with 300 samples for OTELSink * Do not pass in waitgroup and use error channel instead. 
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel * Fix nits * pass extraSinks as function param instead * Add default interval as package export * remove verifyCCM func * Add clusterID * Fix import and add t.Parallel() for missing tests * Kick Vercel CI * Remove scheme from endpoint path, and fix error logging * return metrics.MetricSink for sink method * Update SDK * Added telemetry agent to client and init sink in deps * Add node_id and __replica__ default labels * add function for default labels and set x-hcp-resource-id * Fix labels tests * Commit suggestion for getDefaultLabels Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com> * Fixed server.id, and t.Parallel() * Make defaultLabels a method on the TelemetryConfig object * Rename FilterList to lowercase filterList * Cleanup filter implemetation by combining regex into a single one, and making the type lowercase * Fix append * use regex directly for filters * Fix x-resource-id test to use mocked value * Fix log.Error formats * Forgot the len(opts.Label) optimization) * Use cfg.NodeID instead --------- Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com> * remove replic tag (#17484) * [HCP Observability] Add custom metrics for OTEL sink, improve logging, upgrade modules and cleanup metrics client (#17455) * Add custom metrics for Exporter and transform operations * Improve deps logging Run go mod tidy * Upgrade SDK and OTEL * Remove the partial success implemetation and check for HTTP status code in metrics client * Add x-channel * cleanup logs in deps.go based on PR feedback * Change to debug log and lowercase * address test operation feedback * use GetHumanVersion on version * Fix error wrapping * Fix metric names * [HCP Observability] Turn off retries for now until dynamically configurable (#17496) * Remove retries for now until dynamic configuration is possible * Clarify comment * Update changelog * improve changelog --------- Co-authored-by: Joshua Timmons 
<joshua.timmons1@gmail.com>	2023-05-29 16:11:08 -04:00
Michael Zalimeni	5a46a8c604	Add `builtin/property-override` Envoy Extension (#17487 ) `property-override` is an extension that allows for arbitrarily patching Envoy resources based on resource matching filters. Patch operations resemble a subset of the JSON Patch spec with minor differences to facilitate patching pre-defined (protobuf) schemas. See Envoy Extension product documentation for more details. Co-authored-by: Eric Haberkorn <eric.haberkorn@hashicorp.com> Co-authored-by: Kyle Havlovitz <kyle@hashicorp.com>	2023-05-26 19:52:09 +00:00
Chris Thain	516eb4febc	Add `builtin/ext-authz` Envoy Extension (#17495 )	2023-05-26 12:22:54 -07:00
Lincoln Stoll	3605fde865	perf: Remove expensive reflection from raft/mesh hot path (#16552 ) * perf: Remove expensive reflection from raft/mesh hot path Replaces a reflection-based copy of a struct in the mesh topology with a deep-copy generated implementation. This is in the hot-path of raft FSM updates, and the reflection overhead was a substantial part of mesh registration times (~90%). This could manifest as raft thread saturation, and resulting instability. Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com> * add changelog --------- Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com> Co-authored-by: John Murret <john.murret@hashicorp.com>	2023-05-26 11:42:05 -06:00
Derek Menteer	a90c9ce2b0	Fix ACL check on health endpoint (#17424 ) Fix ACL check on health endpoint Prior to this change, the service health API would not explicitly return an error whenever a token with invalid permissions was given, and it would instead return empty results. With this change, a "Permission denied" error is returned whenever data is queried. This is done to better support the agent cache, which performs a fetch backoff sleep whenever ACL errors are encountered. Affected endpoints are: `/v1/health/connect/` and `/v1/health/ingress/`.	2023-05-24 16:35:55 -05:00
Derek Menteer	e2f15cfe56	Fix namespaced peer service updates / deletes. (#17456 ) * Fix namespaced peer service updates / deletes. This change fixes a function so that namespaced services are correctly queried when handling updates / deletes. Prior to this change, some peered services would not correctly be un-exported. * Add changelog.	2023-05-24 16:32:45 -05:00
Dan Stough	d935c7b466	[OSS] gRPC Blocking Queries (#17426 ) * feat: initial grpc blocking queries * changelog and docs update	2023-05-23 17:29:10 -04:00
Paul Glass	7f4fd2735a	Only synthesize anonymous token in primary DC (#17231 ) * Only synthesize anonymous token in primary DC * Add integration test for wan fed issue	2023-05-23 09:38:04 -05:00
Michael Zalimeni	b8d2640429	Disable remote proxy patching except AWS Lambda (#17415 ) To avoid unintended tampering with remote downstreams via service config, refactor BasicEnvoyExtender and RuntimeConfig to disallow typical Envoy extensions from being applied to non-local proxies. Continue to allow this behavior for AWS Lambda and the read-only Validate builtin extensions. Addresses CVE-2023-2816.	2023-05-23 11:55:06 +00:00
John Landa	8f6b9fe177	Add ACLs Enabled field to consul agent startup status message (#17086 ) * Add ACLs Enabled field to consul agent startup status message * Add changelog * Update startup messages to include default ACL policy configuration * Correct import groupings	2023-05-16 13:47:02 -05:00
Connor	0789661ce5	Rename hcp-metrics-collector to consul-telemetry-collector (#17327 ) * Rename hcp-metrics-collector to consul-telemetry-collector * Fix docs * Fix doc comment --------- Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>	2023-05-16 14:36:05 -04:00
Dan Stough	be7d2a4d84	fix(connect envoy): set initial_fetch_timeout to wait for initial xDS… (#17317 ) * fix(connect envoy): set initial_fetch_timeout to wait for initial xDS indefinitely --------- Co-authored-by: Kiril Angov <kiril.angov@gmail.com>	2023-05-15 10:45:16 -04:00
Dan Bond	95f462d5f1	agent: prevent very old servers re-joining a cluster with stale data (#17171 ) * agent: configure server lastseen timestamp Signed-off-by: Dan Bond <danbond@protonmail.com> * use correct config Signed-off-by: Dan Bond <danbond@protonmail.com> * add comments Signed-off-by: Dan Bond <danbond@protonmail.com> * use default age in test golden data Signed-off-by: Dan Bond <danbond@protonmail.com> * add changelog Signed-off-by: Dan Bond <danbond@protonmail.com> * fix runtime test Signed-off-by: Dan Bond <danbond@protonmail.com> * agent: add server_metadata Signed-off-by: Dan Bond <danbond@protonmail.com> * update comments Signed-off-by: Dan Bond <danbond@protonmail.com> * correctly check if metadata file does not exist Signed-off-by: Dan Bond <danbond@protonmail.com> * follow instructions for adding new config Signed-off-by: Dan Bond <danbond@protonmail.com> * add comments Signed-off-by: Dan Bond <danbond@protonmail.com> * update comments Signed-off-by: Dan Bond <danbond@protonmail.com> * Update agent/agent.go Co-authored-by: Dan Upton <daniel@floppy.co> * agent/config: add validation for duration with min Signed-off-by: Dan Bond <danbond@protonmail.com> * docs: add new server_rejoin_age_max config definition Signed-off-by: Dan Bond <danbond@protonmail.com> * agent: add unit test for checking server last seen Signed-off-by: Dan Bond <danbond@protonmail.com> * agent: log continually for 60s before erroring Signed-off-by: Dan Bond <danbond@protonmail.com> * pr comments Signed-off-by: Dan Bond <danbond@protonmail.com> * remove unneeded todo * agent: fix error message Signed-off-by: Dan Bond <danbond@protonmail.com> --------- Signed-off-by: Dan Bond <danbond@protonmail.com> Co-authored-by: Dan Upton <daniel@floppy.co>	2023-05-15 04:05:47 -07:00
R.B. Boyer	cd80ea18ff	grpc: ensure grpc resolver correctly uses lan/wan addresses on servers (#17270 ) The grpc resolver implementation is fed from changes to the router.Router. Within the router there is a map of various areas storing the addressing information for servers in those areas. All map entries are of the WAN variety except a single special entry for the LAN. Addressing information in the LAN "area" are local addresses intended for use when making a client-to-server or server-to-server request. The client agent correctly updates this LAN area when receiving lan serf events, so by extension the grpc resolver works fine in that scenario. The server agent only initially populates a single entry in the LAN area (for itself) on startup, and then never mutates that area map again. For normal RPCs a different structure is used for LAN routing. Additionally when selecting a server to contact in the local datacenter it will randomly select addresses from either the LAN or WAN addressed entries in the map. Unfortunately this means that the grpc resolver stack as it exists on server agents is either broken or only accidentally functions by having servers dial each other over the WAN-accessible address. If the operator disables the serf wan port completely likely this incidental functioning would break. This PR enforces that local requests for servers (both for stale reads or leader forwarded requests) exclusively use the LAN "area" information and also fixes it so that servers keep that area up to date in the router. A test for the grpc resolver logic was added, as well as a higher level full-stack test to ensure the externally perceived bug does not return.	2023-05-11 11:08:57 -05:00
cskh	48f7d99305	snapshot: some improvements to the snapshot process (#17236 ) * snapshot: some improvements to the snapshot process Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com>	2023-05-09 15:28:52 -04:00
Derek Menteer	4f6da20fe5	Fix multiple issues related to proxycfg health queries. (#17241 ) Fix multiple issues related to proxycfg health queries. 1. The datacenter was not being provided to a proxycfg query, which resulted in bypassing agentless query optimizations and using the normal API instead. 2. The health rpc endpoint would return a zero index when insufficient ACLs were detected. This would result in the agent cache performing an infinite loop of queries in rapid succession without backoff.	2023-05-09 12:37:58 -05:00
Derek Menteer	50ef6a697e	Fix issue with peer stream node cleanup. (#17235 ) Fix issue with peer stream node cleanup. This commit encompasses a few problems that are closely related due to their proximity in the code. 1. The peerstream utilizes node IDs in several locations to determine which nodes / services / checks should be cleaned up or created. While VM deployments with agents will likely always have a node ID, agentless uses synthetic nodes and does not populate the field. This means that for consul-k8s deployments, all services were likely bundled together into the same synthetic node in some code paths (but not all), resulting in strange behavior. The Node.Node field should be used instead as a unique identifier, as it should always be populated. 2. The peerstream cleanup process for unused nodes uses an incorrect query for node deregistration. This query is NOT namespace aware and results in the node (and corresponding services) being deregistered prematurely whenever it has zero default-namespace services and 1+ non-default-namespace services registered on it. This issue is tricky to find due to the incorrect logic mentioned in #1, combined with the fact that the affected services must be co-located on the same node as the currently deregistering service for this to be encountered. 3. The stream tracker did not understand differences between services in different namespaces and could therefore report incorrect numbers. It was updated to utilize the full service name to avoid conflicts and return proper results.	2023-05-08 13:13:25 -05:00
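
Point 3 of the last entry above (50ef6a697e) says the stream tracker could report incorrect numbers because it did not distinguish services in different namespaces, and that it now uses the full service name. The following is a minimal Go sketch of that kind of key collision, not code from the Consul repository: `serviceName`, `countByName`, and `countByFullName` are hypothetical names chosen for illustration only.

```go
package main

import "fmt"

// serviceName is a hypothetical namespace-qualified service identifier.
// It is an illustration only and is not a type from the Consul codebase.
type serviceName struct {
	Namespace string
	Name      string
}

// countByName counts distinct services using only the bare name, so
// services that share a name across namespaces collapse into one entry.
func countByName(svcs []serviceName) int {
	seen := make(map[string]struct{})
	for _, s := range svcs {
		seen[s.Name] = struct{}{}
	}
	return len(seen)
}

// countByFullName counts distinct services using namespace plus name,
// so same-named services in different namespaces stay separate.
func countByFullName(svcs []serviceName) int {
	seen := make(map[serviceName]struct{})
	for _, s := range svcs {
		seen[s] = struct{}{}
	}
	return len(seen)
}

func main() {
	svcs := []serviceName{
		{Namespace: "default", Name: "api"},
		{Namespace: "team-a", Name: "api"},
		{Namespace: "default", Name: "web"},
	}
	fmt.Println("by bare name:", countByName(svcs))     // prints "by bare name: 2"
	fmt.Println("by full name:", countByFullName(svcs)) // prints "by full name: 3"
}
```

With the bare name as the key, `default/api` and `team-a/api` are counted once, which is the sort of undercount the commit message describes. Keying on the full name keeps them apart.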

1121 Commits