Commit Graph

1161 Commits

Author SHA1 Message Date
Dan Stough bba5cd8455
fix: stop peering delete routine on leader loss (#17483) 2023-06-13 10:20:56 -04:00
Ashesh Vidyut d54d5fb85c
[NET-4107][Supportability] Log Level set to TRACE and duration set to 5m for consul-debug (#17596)
* changed duration to 5 mins and log level to trace

* documentation update

* change log
2023-06-13 11:07:46 +05:30
Joshua Timmons 28d81ec79f
Fix two WAL metrics in docs/agent/telemetry.mdx (#17593) 2023-06-12 18:50:59 -04:00
Andrew Stucki 3cb70566a9
[API Gateway] Fix rate limiting for API gateways (#17631)
* [API Gateway] Fix rate limiting for API gateways

* Add changelog

* Fix failing unit tests

* Fix operator usage tests for api package
2023-06-09 08:22:32 -04:00
Michael Zalimeni 30e0c234ab
Update list of Envoy versions (#17546) 2023-06-09 02:37:49 +00:00
Ronald 7ae457c586
enterprise changelog update for audit (#17625) 2023-06-08 19:50:51 -04:00
Ronald 17f4689379
backport ent changes to oss (#17614)
* backport ent changes to oss

* Update .changelog/_5669.txt

Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>

---------

Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com>
2023-06-08 16:34:31 +00:00
Andrew Stucki 9a4f503b2b
[API Gateway] Fix trust domain for external peered services in synthesis code (#17609)
* [API Gateway] Fix trust domain for external peered services in synthesis code

* Add changelog
2023-06-08 12:18:17 -04:00
Ronald 8118aae5c1
Add writeAuditRPCEvent to agent_oss (#17607)
* Add writeAuditRPCEvent to agent_oss

* fix the other diffs

* backport change log
2023-06-07 22:35:48 +00:00
Joshua Timmons 7a2ee145bf
Fix metric names in Consul agent telemetry docs (#17577) 2023-06-06 14:42:30 -04:00
Andrew Stucki f9d9d4db60
Fix subscribing/fetching objects not in the default partition (#17581)
* Fix subscribing/fetching objects not in the default namespace

* add changelog
2023-06-06 09:09:33 -04:00
Andrew Stucki 4ddb88ec7e
Fix up case where subscription is terminated due to ACLs changing or a snapshot restore occurring (#17566)
* Fix up case where subscription is terminated due to ACLs changing or a snapshot restore occurring

* Add changelog entry

* Switch to use errors.Is
2023-06-05 13:10:17 -04:00
Dave Rawks a55d368a0e
Resolves issue-16844 - systemd notify by default (#16845)
* updates `consul.service` systemd service unit to use `Type=notify` to
  resolve issue #16844
* add changelog update to match
2023-06-02 10:04:48 -07:00
Poonam Jadhav d9e18b4bf0
changelog: add changelog for reporting (#17535) 2023-06-02 08:59:48 -04:00
Dan Stough a043981cc6
Revert "fix(connect envoy): set initial_fetch_timeout to wait for initial xDS… (#17317)" (#17540)
This reverts commit be7d2a4d84.
2023-06-01 13:10:41 -04:00
Andrew Stucki ca12ce926b
[API Gateway] Fix use of virtual resolvers in HTTPRoutes (#17055)
* [API Gateway] Fix use of virtual resolvers in routes

* Add changelog entry
2023-05-31 16:58:40 -04:00
Nathan Coleman b438a07326
Export peering cli (#15654)
* Sujata's peering-cli branch

* Added error message for connecting to cluster

* We can export service to peer

* export handling multiple peers

* export handles multiple peers

* export now can handle multiple services

* Export after 1st cleanup

* Successful export

* Added the namespace option

* Add .changelog entry

* go mod tidy

* Stub unit tests for peering export command

* added export in peering.go

* Adding export_test

* Moved the code to services from peers and cleaned the serviceNamespace

* Added support for exporting to partitions

* Fixed partition bug

* Added unit tests for export command

* Add multi-tenancy flags

* gofmt

* Add some helpful comments

* Exclude namespace + partition flags when running OSS

* cleaned up partition stuff

* Validate required flags differently for OSS vs. ENT

* Update success output to include only the requested consumers

* cleaned up

* fixed broken test

* gofmt

* Include all flags in OSS build

* Remove example previously added to peering command

* Move stray import into correct block

* Update changelog entry to include support for exporting to a partition

* Add required-ness label to consumer-peers flag description

* Update command/services/export/export.go

Co-authored-by: Dan Stough <dan.stough@hashicorp.com>

* Add docs placeholder for new services export command

* Moved piece of code to OSS

* Break config entry init + update into separate functions

* fixed

* Vary existing service export comparison for OSS vs. ENT

* Move OSS-specific test to export_oss_test.go

* Set config entry name based on partition being exported from

* Set namespace on added services

* Adding namespace

* Remove export documentation

We will include documentation in a followup PR

* Consolidate code from export_oss into export.go

* Consolidated export_oss_test.go and export_test.go

* Add example of partition export to command synopsis

* Allow empty peers flag if partitions flag provided

* Add test coverage for -consumer-partitions flag

* Update command/services/export/export.go

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Update command/services/export/export.go

Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>

* Update changelog entry

* Use "cluster peers" to clear up any possible confusion

* Update test assertions

---------

Co-authored-by: 20sr20 <sujata@hashicorp.com>
Co-authored-by: Dan Stough <dan.stough@hashicorp.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
2023-05-31 14:27:35 -04:00
Dhia Ayachi da94cbdb25
add changelog (#17528) 2023-05-31 13:29:59 -04:00
Jared Kirschner b9c9d79778
Accept ap, datacenter, and namespace query params (#17525)
This commit only contains the OSS PR (datacenter query param support).
A separate enterprise PR adds support for ap and namespace query params.

Resources in Consul can exists within scopes such as datacenters, cluster
peers, admin partitions, and namespaces. You can refer to those resources from
interfaces such as the CLI, HTTP API, DNS, and configuration files.

Some scope levels have consistent naming: cluster peers are always referred to
as "peer".

Other scope levels use a short-hand in DNS lookups...
- "ns" for namespace
- "ap" for admin partition
- "dc" for datacenter

...But use long-hand in CLI commands:
- "namespace" for namespace
- "partition" for admin partition
- and "datacenter"

However, HTTP API query parameters do not follow a consistent pattern,
supporting short-hand for some scopes but long-hand for others:
- "ns" for namespace
- "partition" for admin partition
- and "dc" for datacenter.

This inconsistency is confusing, especially for users who have been exposed to
providing scope names through another interface such as CLI or DNS queries.

This commit improves UX by consistently supporting both short-hand and
long-hand forms of the namespace, partition, and datacenter scopes in HTTP API
query parameters.
2023-05-31 11:50:24 -04:00
Nick Ethier 44f90132e0
hoststats: add package for collecting host statistics including cpu memory and disk usage (#17038) 2023-05-30 18:43:29 +00:00
Ronald 55e283dda9
[NET-3092] JWT Verify claims handling (#17452)
* [NET-3092] JWT Verify claims handling
2023-05-30 13:38:33 -04:00
Dan Stough bc9bb99a56
build(deps): update UBI base image to 9.2 (#17513) 2023-05-30 12:48:13 -04:00
Chris Thain 65b8ccdc1b
Enable Network filters for Wasm Envoy Extension (#17505) 2023-05-30 07:17:33 -07:00
Ashvitha 091925bcb7
HCP Telemetry Feature (#17460)
* Move hcp client to subpackage hcpclient (#16800)

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* [HCP Observability] OTELExporter (#17128)

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* [HCP Observability] OTELSink (#17159)

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Initialize OTELSink with sync.Map for all the instrument stores.

* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.

* Switch to mutex instead of sync.Map to avoid type assertion

* Add gauge store

* Clarify comments

* return concrete sink type

* Fix lint errors

* Move gauge store to be within sink

* Use context.TODO,rebase and clenaup opts handling

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Fix imports

* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx

* Add lots of documentation to the OTELSink

* Fix gauge store comment and check ok

* Add select and ctx.Done() check to gauge callback

* use require.Equal for attributes

* Fixed import naming

* Remove float64 calls and add a NewGaugeStore method

* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store

* Generate 100 gauge operations

* Seperate the labels into goroutines in sink test

* Generate kv store for the test case keys to avoid using uuid

* Added a race test with 300 samples for OTELSink

* Do not pass in waitgroup and use error channel instead.

* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel

* Fix nits

* [HCP Observability] Init OTELSink in Telemetry (#17162)

* Move hcp client to subpackage hcpclient (#16800)

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Initialize OTELSink with sync.Map for all the instrument stores.

* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.

* Switch to mutex instead of sync.Map to avoid type assertion

* Add gauge store

* Clarify comments

* return concrete sink type

* Fix lint errors

* Move gauge store to be within sink

* Use context.TODO,rebase and clenaup opts handling

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Fix imports

* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx

* Add lots of documentation to the OTELSink

* Fix gauge store comment and check ok

* Add select and ctx.Done() check to gauge callback

* use require.Equal for attributes

* Fixed import naming

* Remove float64 calls and add a NewGaugeStore method

* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store

* Generate 100 gauge operations

* Seperate the labels into goroutines in sink test

* Generate kv store for the test case keys to avoid using uuid

* Added a race test with 300 samples for OTELSink

* [HCP Observability] OTELExporter (#17128)

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Do not pass in waitgroup and use error channel instead.

* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Initialize OTELSink with sync.Map for all the instrument stores.

* Added telemetry agent to client and init sink in deps

* Fixed client

* Initalize sink in deps

* init sink in telemetry library

* Init deps before telemetry

* Use concrete telemetry.OtelSink type

* add /v1/metrics

* Avoid returning err for telemetry init

* move sink init within the IsCloudEnabled()

* Use HCPSinkOpts in deps instead

* update golden test for configuration file

* Switch to using extra sinks in the telemetry library

* keep name MetricsConfig

* fix log in verifyCCMRegistration

* Set logger in context

* pass around MetricSink in deps

* Fix imports

* Rebased onto otel sink pr

* Fix URL in test

* [HCP Observability] OTELSink (#17159)

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Initialize OTELSink with sync.Map for all the instrument stores.

* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.

* Switch to mutex instead of sync.Map to avoid type assertion

* Add gauge store

* Clarify comments

* return concrete sink type

* Fix lint errors

* Move gauge store to be within sink

* Use context.TODO,rebase and clenaup opts handling

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Fix imports

* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx

* Add lots of documentation to the OTELSink

* Fix gauge store comment and check ok

* Add select and ctx.Done() check to gauge callback

* use require.Equal for attributes

* Fixed import naming

* Remove float64 calls and add a NewGaugeStore method

* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store

* Generate 100 gauge operations

* Seperate the labels into goroutines in sink test

* Generate kv store for the test case keys to avoid using uuid

* Added a race test with 300 samples for OTELSink

* Do not pass in waitgroup and use error channel instead.

* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel

* Fix nits

* pass extraSinks as function param instead

* Add default interval as package export

* remove verifyCCM func

* Add clusterID

* Fix import and add t.Parallel() for missing tests

* Kick Vercel CI

* Remove scheme from endpoint path, and fix error logging

* return metrics.MetricSink for sink method

* Update SDK

* [HCP Observability] Metrics filtering and Labels in Go Metrics sink (#17184)

* Move hcp client to subpackage hcpclient (#16800)

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* [HCP Observability] New MetricsClient (#17100)

* Client configured with TLS using HCP config and retry/throttle

* Add tests and godoc for metrics client

* close body after request

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* remove clone

* Extract CloudConfig and mock for future PR

* Switch to hclog.FromContext

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Initialize OTELSink with sync.Map for all the instrument stores.

* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.

* Switch to mutex instead of sync.Map to avoid type assertion

* Add gauge store

* Clarify comments

* return concrete sink type

* Fix lint errors

* Move gauge store to be within sink

* Use context.TODO,rebase and clenaup opts handling

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Fix imports

* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx

* Add lots of documentation to the OTELSink

* Fix gauge store comment and check ok

* Add select and ctx.Done() check to gauge callback

* use require.Equal for attributes

* Fixed import naming

* Remove float64 calls and add a NewGaugeStore method

* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store

* Generate 100 gauge operations

* Seperate the labels into goroutines in sink test

* Generate kv store for the test case keys to avoid using uuid

* Added a race test with 300 samples for OTELSink

* [HCP Observability] OTELExporter (#17128)

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Do not pass in waitgroup and use error channel instead.

* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Initialize OTELSink with sync.Map for all the instrument stores.

* Added telemetry agent to client and init sink in deps

* Fixed client

* Initalize sink in deps

* init sink in telemetry library

* Init deps before telemetry

* Use concrete telemetry.OtelSink type

* add /v1/metrics

* Avoid returning err for telemetry init

* move sink init within the IsCloudEnabled()

* Use HCPSinkOpts in deps instead

* update golden test for configuration file

* Switch to using extra sinks in the telemetry library

* keep name MetricsConfig

* fix log in verifyCCMRegistration

* Set logger in context

* pass around MetricSink in deps

* Fix imports

* Rebased onto otel sink pr

* Fix URL in test

* [HCP Observability] OTELSink (#17159)

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package

* Fix lint error

* early return when there are no metrics

* Add NewOTELExporter() function

* Downgrade to metrics SDK version: v1.15.0-rc.1

* Fix imports

* fix small nits with comments and url.URL

* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile

* Cleanup error handling and clarify empty metrics case

* Fix input/expected naming in otel_transform_test.go

* add comment for metric tracking

* Add a general isEmpty method

* Add clear error types

* update to latest version 1.15.0 of OTEL

* Client configured with TLS using HCP config and retry/throttle

* run go mod tidy

* Remove one abstraction to use the config from deps

* Address PR feedback

* Initialize OTELSink with sync.Map for all the instrument stores.

* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.

* Switch to mutex instead of sync.Map to avoid type assertion

* Add gauge store

* Clarify comments

* return concrete sink type

* Fix lint errors

* Move gauge store to be within sink

* Use context.TODO,rebase and clenaup opts handling

* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1

* Fix imports

* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx

* Add lots of documentation to the OTELSink

* Fix gauge store comment and check ok

* Add select and ctx.Done() check to gauge callback

* use require.Equal for attributes

* Fixed import naming

* Remove float64 calls and add a NewGaugeStore method

* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store

* Generate 100 gauge operations

* Seperate the labels into goroutines in sink test

* Generate kv store for the test case keys to avoid using uuid

* Added a race test with 300 samples for OTELSink

* Do not pass in waitgroup and use error channel instead.

* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel

* Fix nits

* pass extraSinks as function param instead

* Add default interval as package export

* remove verifyCCM func

* Add clusterID

* Fix import and add t.Parallel() for missing tests

* Kick Vercel CI

* Remove scheme from endpoint path, and fix error logging

* return metrics.MetricSink for sink method

* Update SDK

* Added telemetry agent to client and init sink in deps

* Add node_id and __replica__ default labels

* add function for default labels and set x-hcp-resource-id

* Fix labels tests

* Commit suggestion for getDefaultLabels

Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com>

* Fixed server.id, and t.Parallel()

* Make defaultLabels a method on the TelemetryConfig object

* Rename FilterList to lowercase filterList

* Cleanup filter implemetation by combining regex into a single one, and making the type lowercase

* Fix append

* use regex directly for filters

* Fix x-resource-id test to use mocked value

* Fix log.Error formats

* Forgot the len(opts.Label) optimization)

* Use cfg.NodeID instead

---------

Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com>

* remove replic tag (#17484)

* [HCP Observability] Add custom metrics for OTEL sink, improve logging, upgrade modules and cleanup metrics client (#17455)

* Add custom metrics for Exporter and transform operations

* Improve deps logging

Run go mod tidy

* Upgrade SDK and OTEL

* Remove the partial success implemetation and check for HTTP status code in metrics client

* Add x-channel

* cleanup logs in deps.go based on PR feedback

* Change to debug log and lowercase

* address test operation feedback

* use GetHumanVersion on version

* Fix error wrapping

* Fix metric names

* [HCP Observability] Turn off retries for now until dynamically configurable (#17496)

* Remove retries for now until dynamic configuration is possible

* Clarify comment

* Update changelog

* improve changelog

---------

Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com>
2023-05-29 16:11:08 -04:00
Michael Zalimeni 5a46a8c604
Add `builtin/property-override` Envoy Extension (#17487)
`property-override` is an extension that allows for arbitrarily
patching Envoy resources based on resource matching filters. Patch
operations resemble a subset of the JSON Patch spec with minor
differences to facilitate patching pre-defined (protobuf) schemas.

See Envoy Extension product documentation for more details.

Co-authored-by: Eric Haberkorn <eric.haberkorn@hashicorp.com>
Co-authored-by: Kyle Havlovitz <kyle@hashicorp.com>
2023-05-26 19:52:09 +00:00
Chris Thain 516eb4febc
Add `builtin/ext-authz` Envoy Extension (#17495) 2023-05-26 12:22:54 -07:00
Lincoln Stoll 3605fde865
perf: Remove expensive reflection from raft/mesh hot path (#16552)
* perf: Remove expensive reflection from raft/mesh hot path

Replaces a reflection-based copy of a struct in the mesh topology with a
deep-copy generated implementation.

This is in the hot-path of raft FSM updates, and the reflection overhead was a
substantial part of mesh registration times (~90%). This could manifest as raft
thread saturation, and resulting instability.

Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com>

* add changelog

---------

Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
2023-05-26 11:42:05 -06:00
Derek Menteer a90c9ce2b0
Fix ACL check on health endpoint (#17424)
Fix ACL check on health endpoint

Prior to this change, the service health API would not explicitly return an
error whenever a token with invalid permissions was given, and it would instead
return empty results.  With this change, a "Permission denied" error is returned
whenever data is queried. This is done to better support the agent cache, which
performs a fetch backoff sleep whenever ACL errors are encountered.  Affected
endpoints are: `/v1/health/connect/` and `/v1/health/ingress/`.
2023-05-24 16:35:55 -05:00
Derek Menteer e2f15cfe56
Fix namespaced peer service updates / deletes. (#17456)
* Fix namespaced peer service updates / deletes.

This change fixes a function so that namespaced services are
correctly queried when handling updates / deletes. Prior to this
change, some peered services would not correctly be un-exported.

* Add changelog.
2023-05-24 16:32:45 -05:00
Dan Stough d935c7b466
[OSS] gRPC Blocking Queries (#17426)
* feat: initial grpc blocking queries

* changelog and docs update
2023-05-23 17:29:10 -04:00
Paul Glass 7f4fd2735a
Only synthesize anonymous token in primary DC (#17231)
* Only synthesize anonymous token in primary DC
* Add integration test for wan fed issue
2023-05-23 09:38:04 -05:00
Michael Zalimeni b8d2640429
Disable remote proxy patching except AWS Lambda (#17415)
To avoid unintended tampering with remote downstreams via service
config, refactor BasicEnvoyExtender and RuntimeConfig to disallow
typical Envoy extensions from being applied to non-local proxies.

Continue to allow this behavior for AWS Lambda and the read-only
Validate builtin extensions.

Addresses CVE-2023-2816.
2023-05-23 11:55:06 +00:00
John Landa 8f6b9fe177
Add ACLs Enabled field to consul agent startup status message (#17086)
* Add ACLs Enabled field to consul agent startup status message

* Add changelog

* Update startup messages to include default ACL policy configuration

* Correct import groupings
2023-05-16 13:47:02 -05:00
Connor 0789661ce5
Rename hcp-metrics-collector to consul-telemetry-collector (#17327)
* Rename hcp-metrics-collector to consul-telemetry-collector

* Fix docs

* Fix doc comment

---------

Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
2023-05-16 14:36:05 -04:00
Dan Stough be7d2a4d84
fix(connect envoy): set initial_fetch_timeout to wait for initial xDS… (#17317)
* fix(connect envoy): set initial_fetch_timeout to wait for initial xDS indefinitely

---------

Co-authored-by: Kiril Angov <kiril.angov@gmail.com>
2023-05-15 10:45:16 -04:00
Dan Bond 95f462d5f1
agent: prevent very old servers re-joining a cluster with stale data (#17171)
* agent: configure server lastseen timestamp

Signed-off-by: Dan Bond <danbond@protonmail.com>

* use correct config

Signed-off-by: Dan Bond <danbond@protonmail.com>

* add comments

Signed-off-by: Dan Bond <danbond@protonmail.com>

* use default age in test golden data

Signed-off-by: Dan Bond <danbond@protonmail.com>

* add changelog

Signed-off-by: Dan Bond <danbond@protonmail.com>

* fix runtime test

Signed-off-by: Dan Bond <danbond@protonmail.com>

* agent: add server_metadata

Signed-off-by: Dan Bond <danbond@protonmail.com>

* update comments

Signed-off-by: Dan Bond <danbond@protonmail.com>

* correctly check if metadata file does not exist

Signed-off-by: Dan Bond <danbond@protonmail.com>

* follow instructions for adding new config

Signed-off-by: Dan Bond <danbond@protonmail.com>

* add comments

Signed-off-by: Dan Bond <danbond@protonmail.com>

* update comments

Signed-off-by: Dan Bond <danbond@protonmail.com>

* Update agent/agent.go

Co-authored-by: Dan Upton <daniel@floppy.co>

* agent/config: add validation for duration with min

Signed-off-by: Dan Bond <danbond@protonmail.com>

* docs: add new server_rejoin_age_max config definition

Signed-off-by: Dan Bond <danbond@protonmail.com>

* agent: add unit test for checking server last seen

Signed-off-by: Dan Bond <danbond@protonmail.com>

* agent: log continually for 60s before erroring

Signed-off-by: Dan Bond <danbond@protonmail.com>

* pr comments

Signed-off-by: Dan Bond <danbond@protonmail.com>

* remove unneeded todo

* agent: fix error message

Signed-off-by: Dan Bond <danbond@protonmail.com>

---------

Signed-off-by: Dan Bond <danbond@protonmail.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
2023-05-15 04:05:47 -07:00
R.B. Boyer cd80ea18ff
grpc: ensure grpc resolver correctly uses lan/wan addresses on servers (#17270)
The grpc resolver implementation is fed from changes to the
router.Router. Within the router there is a map of various areas storing
the addressing information for servers in those areas. All map entries
are of the WAN variety except a single special entry for the LAN.

Addressing information in the LAN "area" are local addresses intended
for use when making a client-to-server or server-to-server request.

The client agent correctly updates this LAN area when receiving lan serf
events, so by extension the grpc resolver works fine in that scenario.

The server agent only initially populates a single entry in the LAN area
(for itself) on startup, and then never mutates that area map again.
For normal RPCs a different structure is used for LAN routing.

Additionally when selecting a server to contact in the local datacenter
it will randomly select addresses from either the LAN or WAN addressed
entries in the map.

Unfortunately this means that the grpc resolver stack as it exists on
server agents is either broken or only accidentally functions by having
servers dial each other over the WAN-accessible address. If the operator
disables the serf wan port completely likely this incidental functioning
would break.

This PR enforces that local requests for servers (both for stale reads
or leader forwarded requests) exclusively use the LAN "area" information
and also fixes it so that servers keep that area up to date in the
router.

A test for the grpc resolver logic was added, as well as a higher level
full-stack test to ensure the externally perceived bug does not return.
2023-05-11 11:08:57 -05:00
cskh 48f7d99305
snapshot: some improvments to the snapshot process (#17236)
* snapshot: some improvments to the snapshot process

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
2023-05-09 15:28:52 -04:00
Derek Menteer 4f6da20fe5
Fix multiple issues related to proxycfg health queries. (#17241)
Fix multiple issues related to proxycfg health queries.

1. The datacenter was not being provided to a proxycfg query, which resulted in
bypassing agentless query optimizations and using the normal API instead.

2. The health rpc endpoint would return a zero index when insufficient ACLs were
detected. This would result in the agent cache performing an infinite loop of
queries in rapid succession without backoff.
2023-05-09 12:37:58 -05:00
Derek Menteer 50ef6a697e
Fix issue with peer stream node cleanup. (#17235)
Fix issue with peer stream node cleanup.

This commit encompasses a few problems that are closely related due to their
proximity in the code.

1. The peerstream utilizes node IDs in several locations to determine which
nodes / services / checks should be cleaned up or created. While VM deployments
with agents will likely always have a node ID, agentless uses synthetic nodes
and does not populate the field. This means that for consul-k8s deployments, all
services were likely bundled together into the same synthetic node in some code
paths (but not all), resulting in strange behavior. The Node.Node field should
be used instead as a unique identifier, as it should always be populated.

2. The peerstream cleanup process for unused nodes uses an incorrect query for
node deregistration. This query is NOT namespace aware and results in the node
(and corresponding services) being deregistered prematurely whenever it has zero
default-namespace services and 1+ non-default-namespace services registered on
it. This issue is tricky to find due to the incorrect logic mentioned in #1,
combined with the fact that the affected services must be co-located on the same
node as the currently deregistering service for this to be encountered.

3. The stream tracker did not understand differences between services in
different namespaces and could therefore report incorrect numbers. It was
updated to utilize the full service name to avoid conflicts and return proper
results.
2023-05-08 13:13:25 -05:00
John Murret 6fa104409e
security: update go version to 1.20.4 (#17240)
* update go version to 1.20.3

* add changelog

* rename changelog file to remove underscore

* update to use 1.20.4

* update change log entry to reflect 1.20.4
2023-05-08 11:57:11 -06:00
John Eikenberry bd76fdeaeb
enable auto-tidy expired issuers in vault (as CA)
When using vault as a CA and generating the local signing cert, try to
enable the PKI endpoint's auto-tidy feature with it set to tidy expired
issuers.
2023-05-03 20:30:37 +00:00
Eric Haberkorn 2c0da88ce7
fix panic in `injectSANMatcher` when `tlsContext` is `nil` (#17185) 2023-04-28 16:27:57 -04:00
Paul Glass e4a341c88a
Permissive mTLS: Config entry filtering and CLI warnings (#17183)
This adds filtering for service-defaults: consul config list -filter 'MutualTLSMode == "permissive"'.

It adds CLI warnings when the CLI writes a config entry and sees that either service-defaults or proxy-defaults contains MutualTLSMode=permissive, or sees that the mesh config entry contains AllowEnablingPermissiveMutualTLSMode=true.
2023-04-28 12:51:36 -05:00
R.B. Boyer 6b4986907d
peering: ensure that merged central configs of peered upstreams for partitioned downstreams work (#17179)
Partitioned downstreams with peered upstreams could not properly merge central config info (i.e. proxy-defaults and service-defaults things like mesh gateway modes) if the upstream had an empty DestinationPartition field in Enterprise.

Due to data flow, if this setup is done using Consul client agents the field is never empty and thus does not experience the bug.

When a service is registered directly to the catalog as is the case for consul-dataplane use this field may be empty and and the internal machinery of the merging function doesn't handle this well.

This PR ensures the internal machinery of that function is referentially self-consistent.
2023-04-28 12:36:08 -05:00
John Landa eded58b62a
Remove artificial ACLTokenMaxTTL limit for configuring acl token expiry (#17066)
* Remove artificial ACLTokenMaxTTL limit for configuring acl token expiry

* Add changelog

* Remove test on default MaxTokenTTL

* Change to imperitive tense for changelog entry
2023-04-28 10:57:30 -05:00
Freddy e02ef16f02
Update HCP bootstrapping to support existing clusters (#16916)
* Persist HCP management token from server config

We want to move away from injecting an initial management token into
Consul clusters linked to HCP. The reasoning is that by using a separate
class of token we can have more flexibility in terms of allowing HCP's
token to co-exist with the user's management token.

Down the line we can also more easily adjust the permissions attached to
HCP's token to limit it's scope.

With these changes, the cloud management token is like the initial
management token in that iit has the same global management policy and
if it is created it effectively bootstraps the ACL system.

* Update SDK and mock HCP server

The HCP management token will now be sent in a special field rather than
as Consul's "initial management" token configuration.

This commit also updates the mock HCP server to more accurately reflect
the behavior of the CCM backend.

* Refactor HCP bootstrapping logic and add tests

We want to allow users to link Consul clusters that already exist to
HCP. Existing clusters need care when bootstrapped by HCP, since we do
not want to do things like change ACL/TLS settings for a running
cluster.

Additional changes:

* Deconstruct MaybeBootstrap so that it can be tested. The HCP Go SDK
  requires HTTPS to fetch a token from the Auth URL, even if the backend
  server is mocked. By pulling the hcp.Client creation out we can modify
  its TLS configuration in tests while keeping the secure behavior in
  production code.

* Add light validation for data received/loaded.

* Sanitize initial_management token from received config, since HCP will
  only ever use the CloudConfig.MangementToken.

* Add changelog entry
2023-04-27 22:27:39 +02:00
John Maguire 391ed069c4
APIGW: Update how status conditions for certificates are handled (#17115)
* Move status condition for invalid certifcate to reference the listener
that is using the certificate

* Fix where we set the condition status for listeners and certificate
refs, added tests

* Add changelog
2023-04-27 15:54:44 +00:00
Semir Patel 5eaeb7b8e5
Support Envoy's MaxEjectionPercent and BaseEjectionTime config entries for passive health checks (#15979)
* Add MaxEjectionPercent to config entry

* Add BaseEjectionTime to config entry

* Add MaxEjectionPercent and BaseEjectionTime to protobufs

* Add MaxEjectionPercent and BaseEjectionTime to api

* Fix integration test breakage

* Verify MaxEjectionPercent and BaseEjectionTime in integration test upstream confings

* Website docs for MaxEjectionPercent and BaseEjection time

* Add `make docs` to browse docs at http://localhost:3000

* Changelog entry

* so that is the difference between consul-docker and dev-docker

* blah

* update proto funcs

* update proto

---------

Co-authored-by: Maliz <maliheh.monshizadeh@hashicorp.com>
2023-04-26 15:59:48 -07:00
Anita Akaeze d4cacc7232
Merge pull request #5200 from hashicorp/NET-3758 (#17102)
* Merge pull request #5200 from hashicorp/NET-3758

NET-3758: connect: update supported envoy versions to 1.26.0

* lint
2023-04-24 18:23:24 +00:00
Paul Banks a011d8c944
Bump raft to 1.5.0 (#17081)
* Bump raft to 1.5.0

* Add CHANGELOG entry

* Add CHANGELOG entry with right extension (thanks VSCode)

* Add CHANGELOG entry with right extension (thanks VSCode)

* Go mod tidy
2023-04-21 20:13:55 +01:00
Paul Glass 77ecff3209
Permissive mTLS (#17035)
This implements permissive mTLS , which allows toggling services into "permissive" mTLS mode.
Permissive mTLS mode allows incoming "non Consul-mTLS" traffic to be forward unmodified to the application.

* Update service-defaults and proxy-defaults config entries with a MutualTLSMode field
* Update the mesh config entry with an AllowEnablingPermissiveMutualTLS field and implement the necessary validation. AllowEnablingPermissiveMutualTLS must be true to allow changing to MutualTLSMode=permissive, but this does not require that all proxy-defaults and service-defaults are currently in strict mode.
* Update xDS listener config to add a "permissive filter chain" when MutualTLSMode=permissive for a particular service. The permissive filter chain matches incoming traffic by the destination port. If the destination port matches the service port from the catalog, then no mTLS is required and the traffic sent is forwarded unmodified to the application.
2023-04-19 14:45:00 -05:00
R.B. Boyer d07aac8d7e
Revert "cache: refactor agent cache fetching to prevent unnecessary f… (#16818) (#17046)
Revert "cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956)"

Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>
2023-04-19 13:17:21 -05:00
Kyle Havlovitz bdc3dd14c2
Avoid decoding nil pointer in map walker (#17048) 2023-04-19 10:23:38 -07:00
Kevin Wang 268f93e6f4
Bump the golang.org/x/net to 0.7.0 to address CVE-2022-41723 (#16754)
* Bump the golang.org/x/net to 0.7.0 to address CVE-2022-41723

https://nvd.nist.gov/vuln/detail/CVE-2022-41723

* Add changelog entry

---------

Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
2023-04-18 17:31:08 +00:00
Andrei Komarov eb9f671eaf
api: enable query options on agent force-leave endpoint (#15987) 2023-04-18 11:31:48 -05:00
Nathan Coleman 5410139575
Update list of Envoy versions (#16889)
* Update list of Envoy versions

* Update docs + CI + tests

* Add changelog entry

* Add newly-released Envoy versions 1.23.8 and 1.24.6

* Add newly-released Envoy version 1.22.11
2023-04-12 17:43:15 -04:00
Dhia Ayachi b85a149eaf
Memdb Txn Commit race condition fix (#16871)
* Add a test to reproduce the race condition

* Fix race condition by publishing the event after the commit and adding a lock to prevent out of order events.

* split publish to generate the list of events before committing the transaction.

* add changelog

* remove extra func

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* add comment to explain test

---------

Co-authored-by: Dan Upton <daniel@floppy.co>
2023-04-12 13:18:01 -04:00
Derek Menteer 1bcaeabfc3
Remove deprecated service-defaults upstream behavior. (#16957)
Prior to this change, peer services would be targeted by service-default
overrides as long as the new `peer` field was not found in the config entry.
This commit removes that deprecated backwards-compatibility behavior. Now
it is necessary to specify the `peer` field in order for upstream overrides
to apply to a peer upstream.
2023-04-11 10:20:33 -05:00
Chris Thain 175bb1a303
Wasm Envoy HTTP extension (#16877) 2023-04-06 14:12:07 -07:00
Freddy f6de5ff635
Allow dialer to re-establish terminated peering (#16776)
Currently, if an acceptor peer deletes a peering the dialer's peering
will eventually get to a "terminated" state. If the two clusters need to
be re-peered the acceptor will re-generate the token but the dialer will
encounter this error on the call to establish:

"failed to get addresses to dial peer: failed to refresh peer server
addresses, will continue to use initial addresses: there is no active
peering for "<<<ID>>>""

This is because in `exchangeSecret().GetDialAddresses()` we will get an
error if fetching addresses for an inactive peering. The peering shows
up as inactive at this point because of the existing terminated state.

Rather than checking whether a peering is active we can instead check
whether it was deleted. This way users do not need to delete terminated
peerings in the dialing cluster before re-establishing them.
2023-04-03 12:07:45 -06:00
Eric Haberkorn 0d1d2fc4c9
add order by locality failover to Consul enterprise (#16791) 2023-03-30 10:08:38 -04:00
John Maguire c833464daf
Update normalization of route refs (#16789)
* Use merge of enterprise meta's rather than new custom method

* Add merge logic for tcp routes

* Add changelog

* Normalize certificate refs on gateways

* Fix infinite call loop

* Explicitly call enterprise meta
2023-03-28 11:23:49 -04:00
Michael Wilkerson e5d58c59c9
changes to support new PQ enterprise fields (#16793) 2023-03-27 15:40:49 -07:00
John Maguire 351bdc3c0d
Fix struct tags for TCPService enterprise meta (#16781)
* Fix struct tags for TCPService enterprise meta

* Add changelog
2023-03-27 16:17:04 +00:00
Derek Menteer 2236975011
Change partition for peers in discovery chain targets (#16769)
This commit swaps the partition field to the local partition for
discovery chains targeting peers. Prior to this change, peer upstreams
would always use a value of default regardless of which partition they
exist in. This caused several issues in xds / proxycfg because of id
mismatches.

Some prior fixes were made to deal with one-off id mismatches that this
PR also cleans up, since they are no longer needed.
2023-03-24 15:40:19 -05:00
Luke Kysow 4845816c60
Changelog for audit logging fix. (#16700)
* Changelog for audit logging fix.
2023-03-22 13:06:53 -07:00
Eric Haberkorn 3c5c53aa80
fix bug where pqs that failover to a cluster peer dont un-fail over (#16729) 2023-03-22 09:24:13 -04:00
Nitya Dhanushkodi b9bd2c3780
peering: peering partition failover fixes (#16673)
add local source partition for peered upstreams
2023-03-20 10:00:29 -07:00
John Maguire 1ef9f4dade
Fix route subscription when using namespaces (#16677)
* Fix route subscription when using namespaces

* Update changelog

* Fix changelog entry to reference that the bug was enterprise only
2023-03-20 12:42:30 -04:00
Melisa Griffin 606f8fbbab
Adds check to verify that the API Gateway is being created with at least one listener 2023-03-20 12:37:30 -04:00
Dhia Ayachi b9d8552e25
Snapshot restore tests (#16647)
* add snapshot restore test

* add logstore as test parameter

* Use the correct image version

* make sure we read the logs from a followers to test the follower snapshot install path.

* update to raf-wal v0.3.0

* add changelog.

* updating changelog for bug description and removed integration test.

* setting up test container builder to only set logStore for 1.15 and higher

---------

Co-authored-by: Paul Banks <pbanks@hashicorp.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
2023-03-18 14:43:22 -06:00
Andrew Stucki 501b87fd31
[API Gateway] Fix invalid cluster causing gateway programming delay (#16661)
* Add test for http routes

* Add fix

* Fix tests

* Add changelog entry

* Refactor and fix flaky tests
2023-03-17 13:31:04 -04:00
Valeriia Ruban b473151994
fix: add AccessorID property to PUT token request (#16660) 2023-03-16 18:57:59 -07:00
Valeriia Ruban ad25ba3068
feat: update typography to consume hds styles (#16577) 2023-03-14 19:49:14 -07:00
Derek Menteer 8f75d99299
Fix issue with trust bundle read ACL check. (#16630)
This commit fixes an issue where trust bundles could not be read
by services in a non-default namespace, unless they had excessive
ACL permissions given to them.

Prior to this change, `service:write` was required in the default
namespace in order to read the trust bundle. Now, `service:write`
to a service in any namespace is sufficient.
2023-03-14 12:24:33 -05:00
Chris S. Kim d5677e5680
Preserve CARoots when updating Vault CA configuration (#16592)
If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates.
2023-03-13 17:32:59 -04:00
Ashvitha f95ffe0355
Allow HCP metrics collection for Envoy proxies
Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>

Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory
where a unix socket will be created with the name
`<namespace>_<proxy_id>.sock` to forward Envoy metrics.

If set, this will configure:
- In bootstrap configuration a local stats_sink and static cluster.
  These will forward metrics to a loopback listener sent over xDS.

- A dynamic listener listening at the socket path that the previously
  defined static cluster is sending metrics to.

- A dynamic cluster that will forward traffic received at this listener
  to the hcp-metrics-collector service.


Reasons for having a static cluster pointing at a dynamic listener:
- We want to secure the metrics stream using TLS, but the stats sink can
  only be defined in bootstrap config. With dynamic listeners/clusters
  we can use the proxy's leaf certificate issued by the Connect CA,
  which isn't available at bootstrap time.

- We want to intelligently route to the HCP collector. Configuring its
  addreess at bootstrap time limits our flexibility routing-wise. More
  on this below.

Reasons for defining the collector as an upstream in `proxycfg`:
- The HCP collector will be deployed as a mesh service.

- Certificate management is taken care of, as mentioned above.

- Service discovery and routing logic is automatically taken care of,
  meaning that no code changes are required in the xds package.

- Custom routing rules can be added for the collector using discovery
  chain config entries. Initially the collector is expected to be
  deployed to each admin partition, but in the future could be deployed
  centrally in the default partition. These config entries could even be
  managed by HCP itself.
2023-03-10 13:52:54 -07:00
Tyler Wendlandt e6aeb31a26
UI: Fix htmlsafe errors throughout the app (#16574)
* Upgrade ember-intl

* Add changelog

* Add yarn lock
2023-03-09 12:43:35 -07:00
Eric Haberkorn 89de91b263
fix bug that can lead to peering service deletes impacting the state of local services (#16570) 2023-03-08 11:24:03 -05:00
John Eikenberry f5641ffccc
support vault auth config for alicloud ca provider
Add support for using existing vault auto-auth configurations as the
provider configuration when using Vault's CA provider with AliCloud.

AliCloud requires 2 extra fields to enable it to use STS (it's preferred
auth setup). Our vault-plugin-auth-alicloud package contained a method
to help generate them as they require you to make an http call to
a faked endpoint proxy to get them (url and headers base64 encoded).
2023-03-07 03:02:05 +00:00
Valeriia Ruban 63204b5183
feat: update notification to use hds toast component (#16519) 2023-03-06 14:10:09 -08:00
Chris S. Kim 8daddff08d
Follow-up fixes to consul connect envoy command (#16530) 2023-03-06 10:32:06 -05:00
Ronald bf501a337b
Improve ux around ACL token to help users avoid overwriting node/service identities (#16506)
* Deprecate merge-node-identities and merge-service-identities flags

* added tests for node identities changes

* added changelog file and docs
2023-03-06 15:00:39 +00:00
Melisa Griffin fc232326a0
NET-2904 Fixes API Gateway Route Service Weight Division Error 2023-03-06 08:41:57 -05:00
Andrew Stucki 897e5ef2d3
Add some basic UI improvements for api-gateway services (#16508)
* Add some basic ui improvements for api-gateway services

* Add changelog entry

* Use ternary for null check

* Update gateway doc links

* rename changelog entry for new PR

* Fix test
2023-03-03 16:59:04 -05:00
Melisa Griffin 129eca8fdb
NET-2903 Normalize weight for http routes (#16512)
* NET-2903 Normalize weight for http routes

* Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2023-03-03 16:39:59 -05:00
R.B. Boyer 9a485cdb49
proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497)
Receiving an "acl not found" error from an RPC in the agent cache and the
streaming/event components will cause any request loops to cease under the
assumption that they will never work again if the token was destroyed. This
prevents log spam (#14144, #9738).

Unfortunately due to things like:

- authz requests going to stale servers that may not have witnessed the token
  creation yet

- authz requests in a secondary datacenter happening before the tokens get
  replicated to that datacenter

- authz requests from a primary TO a secondary datacenter happening before the
  tokens get replicated to that datacenter

The caller will get an "acl not found" *before* the token exists, rather than
just after. The machinery added above in the linked PRs will kick in and
prevent the request loop from looping around again once the tokens actually
exist.

For `consul-dataplane` usages, where xDS is served by the Consul servers
rather than the clients ultimately this is not a problem because in that
scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS
stream needing data for a specific service in the catalog. If the watching
goroutines are terminated it ripples down and terminates the xDS stream, which
CDP will eventually re-establish and restart everything.

For Consul client usages, the `agent/proxycfg` machinery is ahead-of-time
launched at service registration time (called "local" in some of the proxycfg
machinery) so when the xDS stream comes in the data is already ready to go. If
the watching goroutines terminate it should terminate the xDS stream, but
there's no mechanism to re-spawn the watching goroutines. If the xDS stream
reconnects it will see no `ConfigSnapshot` and will not get one again until
the client agent is restarted, or the service is re-registered with something
changed in it.

This PR fixes a few things in the machinery:

- there was an inadvertent deadlock in fetching snapshot from the proxycfg
  machinery by xDS, such that when the watching goroutine terminated the
  snapshots would never be fetched. This caused some of the xDS machinery to
  get indefinitely paused and not finish the teardown properly.

- Every 30s we now attempt to re-insert all locally registered services into
  the proxycfg machinery.

- When services are re-inserted into the proxycfg machinery we special case
  "dead" ones such that we unilaterally replace them rather that doing that
  conditionally.
2023-03-03 14:27:53 -06:00
John Eikenberry 56ffee6d42
add provider ca support for approle auth-method
Adds support for the approle auth-method. Only handles using the approle
role/secret to auth and it doesn't support the agent's extra management
configuration options (wrap and delete after read) as they are not
required as part of the auth (ie. they are vault agent things).
2023-03-03 19:29:53 +00:00
Andrew Stucki cc0765b87d
Fix resolution of service resolvers with subsets for external upstreams (#16499)
* Fix resolution of service resolvers with subsets for external upstreams

* Add tests

* Add changelog entry

* Update view filter logic
2023-03-03 14:17:11 -05:00
Andrew Stucki 5deffbd95b
Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498)
* Fix issue where terminating gateway service resolvers weren't properly cleaned up

* Add integration test for cleaning up resolvers

* Add changelog entry

* Use state test and drop integration test
2023-03-03 09:56:57 -05:00
Andrew Stucki 4b661d1e0c
Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495)
* Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable

* Regenerate golden files

* Add RequestTimeout field

* Add changelog entry
2023-03-03 09:37:12 -05:00
John Eikenberry e8eec1fa80
add provider ca auth support for kubernetes
Adds support for Kubernetes jwt/token file based auth. Only needs to
read the file and save the contents as the jwt/token.
2023-03-02 22:05:40 +00:00
John Eikenberry 4211069080
add provider ca support for jwt file base auth
Adds support for a jwt token in a file. Simply reads the file and sends
the read in jwt along to the vault login.

It also supports a legacy mode with the jwt string being passed
directly. In which case the path is made optional.
2023-03-02 20:33:06 +00:00
Ronald 4f8594b28f
Improve ux to help users avoid overwriting fields of ACL tokens, roles and policies (#16288)
* Deprecate merge-policies and add options add-policy-name/add-policy-id to improve CLI token update command

* deprecate merge-roles fields

* Fix potential flakey tests and update ux to remove 'completely' + typo fixes
2023-03-01 15:00:37 -05:00
cskh 3970115753
fix (cli): return error msg if acl policy not found (#16485)
* fix: return error msg if acl policy not found

* changelog

* add test
2023-03-01 19:50:03 +00:00
John Eikenberry 4f2d9a91e5
add provider ca auth-method support for azure
Does the required dance with the local HTTP endpoint to get the required
data for the jwt based auth setup in Azure. Keeps support for 'legacy'
mode where all login data is passed on via the auth methods parameters.
Refactored check for hardcoded /login fields.
2023-03-01 00:07:33 +00:00
R.B. Boyer 26820219cd
cli: ensure acl token read -self works (#16445)
Fixes a regression in #16044

The consul acl token read -self cli command should not require an -accessor-id because typically the persona invoking this would not already know the accessor id of their own token.
2023-02-28 10:58:29 -06:00
Tyler Wendlandt a0862e64a5
UI: Fix rendering issue in search and lists (#16444)
* Upgrade ember-cli-string-helpers

* add extra lock change
2023-02-27 16:31:47 -07:00
Valeriia Ruban 859abf8a34
fix: ui tests run is fixed (applying class attribute twice to the hbs element caused the issue (#16428) 2023-02-24 23:46:45 -08:00