3219 Commits

Author SHA1 Message Date
Nitya Dhanushkodi
c9e5177b35 proxycfg: Ensure that endpoints for explicit upstreams in other datacenters are watched in transparent mode (#10391)
Co-authored-by: Freddy Vallenilla <freddy@hashicorp.com>
2021-06-15 18:03:52 +00:00
Dhia Ayachi
d4aa152850 improve monitor performance (#10368)
* remove flush for each write to http response in the agent monitor endpoint

* fix race condition when we stop and start monitor multiple times, the doneCh is closed and never recover.

* start log reading goroutine before adding the sink to avoid filling the log channel before getting a chance of reading from it

* flush every 500ms to optimize log writing in the http server side.

* add changelog file

* add issue url to changelog

* fix changelog url

* Update changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use ticker to flush and avoid race condition when flushing in a different goroutine

* stop the ticker when done

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Revert "fix race condition when we stop and start monitor multiple times, the doneCh is closed and never recover."

This reverts commit 1eeddf7a

* wait for log consumer loop to start before registering the sink

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-06-15 16:23:20 +00:00
R.B. Boyer
f72774618d xds: ensure that dependent xDS resources are reconfigured during primary type warming (#10381)
Updates to a cluster will clear the associated endpoints, and updates to
a listener will clear the associated routes. Update the incremental xDS
logic to account for this implicit cleanup so that we can finish warming
the clusters and listeners.

Fixes #10379
2021-06-14 22:21:04 +00:00
Freddy
645e406ca0 Rename CatalogDestinationsOnly (#10397)
CatalogDestinationsOnly is a passthrough that would enable dialing
addresses outside of Consul's catalog. However, when this flag is set to
true only _connect_ endpoints for services can be dialed.

This flag is being renamed to signal that non-Connect endpoints can't be
dialed by transparent proxies when the value is set to true.
2021-06-14 20:15:58 +00:00
Freddy
f6e32892dc Relax validation for expose.paths config (#10394)
Previously we would return an error if duplicate paths were specified.
This could lead to problems in cases where a user has the same path,
say /healthz, on two different ports.

This validation was added to signal a potential misconfiguration.
Instead we will only check for duplicate listener ports, since that is
what would lead to ambiguity issues when generating xDS config.

In the future we could look into using a single listener and creating
distinct filter chains for each path/port.
2021-06-14 20:04:50 +00:00
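The relaxed check described in f6e32892dc might look something like the sketch below. The `exposePath` type and `validateExposePaths` function are simplified, hypothetical stand-ins, not the real Consul structs; the only behavior taken from the commit message is that repeated paths are accepted while repeated listener ports are rejected.

```go
package main

import "fmt"

// exposePath is a simplified, hypothetical stand-in for one expose.paths entry.
type exposePath struct {
	Path          string
	LocalPathPort int
	ListenerPort  int
}

// validateExposePaths allows the same path to appear more than once and
// rejects only repeated listener ports.
func validateExposePaths(paths []exposePath) error {
	seen := make(map[int]bool, len(paths))
	for _, p := range paths {
		if seen[p.ListenerPort] {
			return fmt.Errorf("duplicate listener port %d", p.ListenerPort)
		}
		seen[p.ListenerPort] = true
	}
	return nil
}

func main() {
	// Same path on two different ports: accepted.
	sameePath := []exposePath{
		{Path: "/healthz", LocalPathPort: 8080, ListenerPort: 21500},
		{Path: "/healthz", LocalPathPort: 9090, ListenerPort: 21501},
	}
	fmt.Println(validateExposePaths(sameePath))

	// Two entries sharing a listener port: rejected.
	samePort := []exposePath{
		{Path: "/healthz", LocalPathPort: 8080, ListenerPort: 21500},
		{Path: "/metrics", LocalPathPort: 9090, ListenerPort: 21500},
	}
	fmt.Println(validateExposePaths(samePort))
}
```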
Daniel Nephin
a5524f26c0 Merge pull request #10378 from hashicorp/dnephin/agent-self-primary-dc
http: add PrimaryDatacenter to the /v1/agent/self response
2021-06-11 17:45:04 +00:00
hc-github-team-consul-core
d4bfdafff4 update bindata_assetfs.go 2021-06-10 00:14:47 +00:00
Freddy
168073c4dc Add flag for transparent proxies to dial individual instances (#10329) 2021-06-09 20:39:37 +00:00
Daniel Nephin
e5baf32f22 Merge pull request #10367 from hashicorp/dnephin/submatview-store-get-tests
submatview: add test cases for store.Get with timeout and no index
2021-06-09 15:54:22 +00:00
Daniel Nephin
1ed213470c Merge pull request #10364 from hashicorp/dnephin/streaming-e2e-test
submatview: add Store integration test with stream backend
2021-06-08 20:14:25 +00:00
Freddy
f0fe3cf4a6 Revert "Avoid adding original_dst filter when not needed" (#10365) 2021-06-08 19:19:31 +00:00
Daniel Nephin
6327c3fb3f Merge pull request #10348 from hashicorp/dnephin/fix-submatview-store-bug
submatview: fix a bug with Store.Get
2021-06-04 16:06:56 +00:00
Paul Ewing
a9c2f6a741 usagemetrics: add cluster members to metrics API (#10340)
This PR adds cluster members to the metrics API. The number of members per
segment is reported, as well as the total number of members.

Tested by running a multi-node cluster locally and ensuring the numbers were
correct. Also added unit test coverage to add the new expected gauges to
existing test cases.
2021-06-03 15:26:35 +00:00
Daniel Nephin
749a0b01c3 Merge pull request #10334 from hashicorp/dnephin/grpc-fix-resolver-data-race
grpc: fix resolver data race
2021-06-02 17:24:05 +00:00
Daniel Nephin
cab82ca5a0 Merge pull request #9556 from hashicorp/dnephin/add-more-cache-key-completness-tests
structs: Add more cache key completeness tests
2021-06-01 15:28:34 +00:00
Dhia Ayachi
db23df862c debug: remove the CLI check for debug_enabled (#10273)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even when debug_enabled=false, as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fails the debug tests

* Add changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-05-31 18:46:42 +00:00
hc-github-team-consul-core
6efc5751ca update bindata_assetfs.go 2021-05-27 15:01:08 +00:00
Freddy
f1ab78757e Ensure passthrough clusters can be created (#10301) 2021-05-26 21:05:55 +00:00
Freddy
a2dcb9621d Avoid adding original_dst filter when not needed (#10302) 2021-05-26 21:05:24 +00:00
Matt Keeler
f054099e84 Move some things around to allow for license updating via config reload
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
2021-05-25 13:58:35 +00:00
Matt Keeler
c87ed75400 hcs-1936: Prepare for adding license auto-retrieval to auto-config in enterprise 2021-05-24 17:21:08 +00:00
Matt Keeler
c6377111f5 Preparation for changing where license management is done. 2021-05-24 14:22:27 +00:00
Daniel Nephin
4a6b53fa22 Merge pull request #10272 from hashicorp/dnephin/backport-namespace-license-fix
Backport some ent changes for serf tags
2021-05-21 16:35:30 +00:00
Matt Keeler
d80ae8baa8 Add OSS bits for supporting specifying the enterprise license via config 2021-05-20 20:12:05 +00:00
Daniel Nephin
accc5db292 Merge pull request #8812 from jjshanks/GH-8728
GH-8728 add raft default values
2021-05-18 19:33:09 +00:00
R.B. Boyer
4025a6349a xds: emit a labeled gauge of connected xDS streams by version (#10243)
Fixes #10099
2021-05-14 19:00:15 +00:00
R.B. Boyer
e83dc4375d server: ensure that central service config flattening properly resets the state each time (#10239)
The prior solution of calling reply.Reset() aged poorly: newer fields
were added to the reply but not to Reset(), causing serial blocking
query loops on the server to blend replies.

This could manifest as a service-defaults protocol change from
default=>http not reverting back to default after the responsible
config entry was deleted.
2021-05-14 15:22:16 +00:00
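The failure mode described in e83dc4375d can be reproduced in miniature. The sketch below is an illustration under assumptions, not the server's code: `serviceConfigReply`, its fields, and `resetAll` are hypothetical. It contrasts a hand-maintained `Reset()` that misses a field with replacing the whole value, which is one way to avoid this class of bug.

```go
package main

import "fmt"

// serviceConfigReply is a hypothetical, trimmed-down reply type that is
// reused across iterations of a blocking query loop.
type serviceConfigReply struct {
	Index    uint64
	Protocol string // imagine this field was added after Reset was written
}

// Reset is the fragile form: it clears only the fields it knows about, so
// Protocol leaks from one iteration into the next.
func (r *serviceConfigReply) Reset() {
	r.Index = 0
}

// resetAll replaces the whole value with its zero value, so a newly added
// field cannot be forgotten.
func (r *serviceConfigReply) resetAll() {
	*r = serviceConfigReply{}
}

func main() {
	stale := serviceConfigReply{Index: 7, Protocol: "http"}
	stale.Reset()
	fmt.Printf("after Reset:    %+v\n", stale)

	clean := serviceConfigReply{Index: 7, Protocol: "http"}
	clean.resetAll()
	fmt.Printf("after resetAll: %+v\n", clean)
}
```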
R.B. Boyer
54f5b96a5b agent: ensure we hash the non-deprecated upstream fields on ServiceConfigRequest (#10240) 2021-05-14 15:16:27 +00:00
Iryna Shustava
47d8f050d2 Save exposed ports in agent's store and expose them via API (#10173)
* Save exposed HTTP or GRPC ports to the agent's store
* Add those to the health checks API so we can retrieve them from the API
* Change redirect-traffic command to also exclude those ports from inbound traffic redirection when expose.checks is set to true.
2021-05-12 20:56:15 +00:00
Daniel Nephin
50b66ebb63 Merge pull request #10217 from hashicorp/dnephin/test-flakes
testing: attempt to fix some test flakes
2021-05-12 19:39:07 +00:00
R.B. Boyer
88a8656e13 connect: update supported envoy versions to 1.18.3, 1.17.3, 1.16.4, and 1.15.5 (#10231) 2021-05-12 19:06:43 +00:00
Daniel Nephin
ac0697ac48 Merge pull request #10188 from hashicorp/dnephin/dont-persist-agent-tokens
agent/local: do not persist the agent or user token
2021-05-10 19:58:59 +00:00
Daniel Nephin
2b5b54bd37 Merge pull request #10075 from hashicorp/dnephin/handle-raft-apply-errors
rpc: some cleanup of canRetry and ForwardRPC
2021-05-06 21:00:35 +00:00
Freddy
32e013c834 Merge pull request #10187 from hashicorp/fixup/ent-tproxy-test 2021-05-06 20:48:25 +00:00
Daniel Nephin
862d9b9d43 Merge pull request #10047 from hashicorp/dnephin/config-entry-validate
state: reduce arguments to validateProposedConfigEntryInServiceGraph
2021-05-06 18:11:52 +00:00
Daniel Nephin
dd6257e17c Merge pull request #10189 from hashicorp/dnephin/http-api-health-query-meta
http: set consistency header properly for health endpoint
2021-05-06 18:10:12 +00:00
hc-github-team-consul-core
acc171aa38
update bindata_assetfs.go 2021-05-05 23:41:12 +00:00
Mark Anderson
0a6d439dbb Merge pull request #10185 from hashicorp/ma/uds_fixups
Fixup UDS failing tests.
2021-05-05 16:17:32 -04:00
Mark Anderson
42ff449d4f Merge pull request #9981 from hashicorp/ma/uds_upstreams
Unix Domain Socket support for upstreams and downstreams
2021-05-05 16:17:32 -04:00
Daniel Nephin
c1d1be2a4b Merge pull request #10155 from hashicorp/dnephin/config-entry-remove-fields
config-entry: remove Kind and Name field from Mesh config entry
2021-05-04 21:28:26 +00:00
Daniel Nephin
a583415bed Merge pull request #10161 from hashicorp/dnephin/update-deps
Update a couple dependencies
2021-05-04 18:32:22 +00:00
Freddy
2d633ed804 Fixup discovery chain handling in transparent mode (#10168)
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

Previously we would associate the address of a discovery chain target
with the discovery chain's filter chain. This was broken for a few reasons:

- If the upstream is a virtual service, the client proxy has no way of
dialing it because virtual services are not targets of their discovery
chains. The targets are distinct services. This is addressed by watching
the endpoints of all upstream services, not just their discovery chain
targets.

- If multiple discovery chains resolve to the same target, that would
lead to multiple filter chains attempting to match on the target's
virtual IP. This is addressed by only matching on the upstream's virtual
IP.

NOTE: this implementation requires an intention to the redirecting
virtual service and not just to the final destination. This is how
we can know that the virtual service is an upstream to watch.

A later PR will look into traversing discovery chains when computing
upstreams so that intentions are only required to the discovery chain
targets.
2021-05-04 14:46:53 +00:00
Paul Banks
fa1b308c7b Make Raft trailing logs and snapshot timing reloadable (#10129)
* WIP reloadable raft config

* Pre-define new raft gauges

* Update go-metrics to change gauge reset behaviour

* Update raft to pull in new metric and reloadable config

* Add snapshot persistence timing and installSnapshot to our 'protected' list as they can be infrequent but are important

* Update telemetry docs

* Update config and telemetry docs

* Add note to oldestLogAge on when it is visible

* Add changelog entry

* Update website/content/docs/agent/options.mdx

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-05-04 14:40:40 +00:00
Freddy
4a4a1ebff8 Only consider virtual IPs for transparent proxies (#10162)
Initially we were loading every potential upstream address into Envoy
and then routing traffic to the logical upstream service. The downside
of this behavior is that traffic meant to go to a specific instance
would be load balanced across ALL instances.

Traffic to specific instance IPs should be forwarded to the original
destination and if it's a destination in the mesh then we should ensure
the appropriate certificates are used.

This PR makes transparent proxying a Kubernetes-only feature for now
since support for other environments requires generating virtual IPs,
and Consul does not do that at the moment.
2021-05-03 15:06:36 -06:00
Luke Kysow
c816e29ef7 Give descriptive error if auth method not found (#10163)
* Give descriptive error if auth method not found

Previously during a `consul login -method=blah`, if the auth method was not found, the
error returned would be "ACL not found". This is potentially confusing
because there may be many different ACLs involved in a login: the ACL of
the Consul client, perhaps the binding rule or the auth method.

Now the error will be "auth method blah not found", which is much easier
to debug.
2021-05-03 20:39:51 +00:00
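The error change in c816e29ef7 amounts to naming the missing object in the message. A minimal sketch, assuming a hypothetical `lookupAuthMethod` helper and an in-memory set of method names (neither is Consul's real lookup path):

```go
package main

import "fmt"

// lookupAuthMethod (hypothetical) reports a missing auth method by name
// instead of returning a generic "ACL not found" error.
func lookupAuthMethod(methods map[string]bool, name string) error {
	if !methods[name] {
		return fmt.Errorf("auth method %s not found", name)
	}
	return nil
}

func main() {
	methods := map[string]bool{"kubernetes": true}
	fmt.Println(lookupAuthMethod(methods, "kubernetes"))
	fmt.Println(lookupAuthMethod(methods, "blah"))
}
```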
Daniel Nephin
ac2aeb8f44 Merge pull request #10149 from hashicorp/dnephin/config-use-streaming-backend-defualt-true
config: default UseStreamingBackend to true
2021-04-30 20:30:28 +00:00
R.B. Boyer
145a83e436 connect: update supported envoy versions to 1.18.2, 1.17.2, 1.16.3, and 1.15.4 (#10101)
The only thing that needed fixing up pertained to this section of the 1.18.x release notes:

> grpc_stats: the default value for stats_for_all_methods is switched from true to false, in order to avoid possible memory exhaustion due to an untrusted downstream sending a large number of unique method names. The previous default value was deprecated in version 1.14.0. This only changes the behavior when the value is not set. The previous behavior can be used by setting the value to true. This behavior change by be overridden by setting runtime feature envoy.deprecated_features.grpc_stats_filter_enable_stats_for_all_methods_by_default.

For now to maintain status-quo I'm explicitly setting `stats_for_all_methods=true` in all versions to avoid relying upon the default.

Additionally the naming of the emitted metrics for these gRPC requests changed slightly so the integration test assertions for `case-grpc` needed adjusting.
2021-04-29 20:22:41 +00:00
R.B. Boyer
df5e55fc50 xds: ensure that all envoyproxy/go-control-plane protobuf symbols are linked into the final binary (#10131)
This ensures that if someone does include some extension Consul does not currently make use of, that extension is actually usable. Without linking these envoy protobufs into the main binary it can't round trip the escape hatches to send them down to envoy.

Whenever the go-control-plane library is next upgraded, we just have to re-run 'make envoy-library'.
2021-04-29 19:58:58 +00:00
R.B. Boyer
6a39b47448 Support Incremental xDS mode (#9855)
This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged.

Union of all commit messages follows to give an overarching summary:

xds: exclusively support incremental xDS when using xDS v3

- Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail.
- Work around a strange older envoy behavior involving empty CDS responses over incremental xDS.

xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support

- Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead.
- Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client.

xds: pull out checkStreamACLs method in advance of a later commit

xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings

In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty.

This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10.

xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time
2021-04-29 18:54:53 +00:00
Freddy
740613fcf1 Rename cluster config files to mesh as well (#10148) 2021-04-29 00:16:06 +00:00