Commit Graph

304 Commits

Author SHA1 Message Date
Daniel Nephin d81f527be8
Merge pull request #9924 from hashicorp/dnephin/cert-expiration-metric
connect: emit a metric for the seconds until root CA expiry
2021-06-18 14:18:55 -04:00
Nitya Dhanushkodi 49d25448b4
Update .changelog/10423.txt
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2021-06-17 12:06:26 -07:00
Nitya Dhanushkodi 52043830b4 proxycfg: reference to entry in map should not panic 2021-06-17 11:49:04 -07:00
Mike Morris 20921192ee changelog: add notes section to changelog template 2021-06-16 16:58:11 -04:00
Mike Morris 46bd1965e5 changelog: add note about packaging EULA and ToE alongside Enterprise binaries 2021-06-16 16:58:08 -04:00
Freddy ae886136f1
Merge pull request #10404 from hashicorp/ingress-stats 2021-06-15 14:28:07 -06:00
freddygv fb974978da Add changelog entry 2021-06-15 14:15:30 -06:00
Nitya Dhanushkodi b8b44419a0
proxycfg: Ensure that endpoints for explicit upstreams in other datacenters are watched in transparent mode (#10391)
Co-authored-by: Freddy Vallenilla <freddy@hashicorp.com>
2021-06-15 11:00:26 -07:00
Dhia Ayachi c8ba2d40fd
improve monitor performance (#10368)
* remove flush for each write to http response in the agent monitor endpoint

* fix race condition when we stop and start monitor multiple times, the doneCh is closed and never recover.

* start log reading goroutine before adding the sink to avoid filling the log channel before getting a chance of reading from it

* flush every 500ms to optimize log writing in the http server side.

* add changelog file

* add issue url to changelog

* fix changelog url

* Update changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use ticker to flush and avoid race condition when flushing in a different goroutine

* stop the ticker when done

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Revert "fix race condition when we stop and start monitor multiple times, the doneCh is closed and never recover."

This reverts commit 1eeddf7a

* wait for log consumer loop to start before registering the sink

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-06-15 12:05:52 -04:00
R.B. Boyer 848ad8535b
xds: ensure that dependent xDS resources are reconfigured during primary type warming (#10381)
Updates to a cluster will clear the associated endpoints, and updates to
a listener will clear the associated routes. Update the incremental xDS
logic to account for this implicit cleanup so that we can finish warming
the clusters and listeners.

Fixes #10379
2021-06-14 17:20:27 -05:00
Daniel Nephin aec7e798b0 Update metric name
and handle the case where there is no active root CA.
2021-06-14 17:01:16 -04:00
Daniel Nephin 1c980e4700 connect: emit a metric for the number of seconds until root CA expiration 2021-06-14 16:57:01 -04:00
R.B. Boyer a2460eea24
grpc: move gRPC INFO logs to be emitted as TRACE logs from Consul (#10395)
Fixes #10183
2021-06-14 15:13:58 -05:00
Freddy 33bd9b5be8
Relax validation for expose.paths config (#10394)
Previously we would return an error if duplicate paths were specified.
This could lead to problems in cases where a user has the same path,
say /healthz, on two different ports.

This validation was added to signal a potential misconfiguration.
Instead we will only check for duplicate listener ports, since that is
what would lead to ambiguity issues when generating xDS config.

In the future we could look into using a single listener and creating
distinct filter chains for each path/port.
2021-06-14 14:04:11 -06:00
Freddy 429f9d8bb8
Add flag for transparent proxies to dial individual instances (#10329) 2021-06-09 14:34:17 -06:00
Freddy 7577f0e991
Revert "Avoid adding original_dst filter when not needed" (#10365) 2021-06-08 13:18:41 -06:00
Dhia Ayachi 005ad9e46d
generate a single debug file for a long duration capture (#10279)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* extract wait group outside the go routine to avoid a race condition

* capture pprof in a separate go routine

* perform a single capture for pprof data for the whole duration

* add missing vendor dependency

* add a change log and fix documentation to reflect the change

* create function for timestamp dir creation and simplify error handling

* use error groups and ticker to simplify interval capture loop

* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval

* refactor Logs capture routine and add log capture specific test

* improve error reporting when log test fail

* change test duration to 1s

* make time parsing in log line more robust

* refactor log time format in a const

* test on log line empty the earliest possible and return

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* rename function to captureShortLived

* more specific changelog

Co-authored-by: Paul Banks <banks@banksco.de>

* update documentation to reflect current implementation

* add test for behavior when invalid param is passed to the command

* fix argument line in test

* a more detailed description of the new behaviour

Co-authored-by: Paul Banks <banks@banksco.de>

* print success right after the capture is done

* remove an unnecessary error check

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 13:00:51 -04:00
Dhia Ayachi dda3e68791
fix monitor to only start the monitor in json format when requested (#10358)
* fix monitor to only start the monitor in json format when requested

* add release notes

* add test to validate json format when requested
2021-06-07 12:08:48 -04:00
Matt Keeler af3ffdf4c8
Add license inspect command documentation and changelog (#10351)
Also reformatted another changelog entry.
2021-06-04 14:33:13 -04:00
Daniel Nephin f204daf538
Merge pull request #10338 from hashicorp/dnephin/fix-logging-indent
agent: remove leading whitespace from agent log lines
2021-06-03 12:16:45 -04:00
Paul Ewing 42a51b1a2c
usagemetrics: add cluster members to metrics API (#10340)
This PR adds cluster members to the metrics API. The number of members per
segment are reported as well as the total number of members.

Tested by running a multi-node cluster locally and ensuring the numbers were
correct. Also added unit test coverage to add the new expected gauges to
existing test cases.
2021-06-03 08:25:53 -07:00
Daniel Nephin 99956cde37 Add changelog 2021-06-02 17:39:30 -04:00
Daniel Nephin c9bc5f92b7 envoy: fix bootstrap deadlock caused by a full named pipe
Normally the named pipe would buffer up to 64k, but in some cases when a
soft limit is reached, they will start only buffering up to 4k.
In either case, we should not deadlock.

This commit changes the pipe-bootstrap command to first buffer all of
stdin into the process, before trying to write it to the named pipe.
This allows the process memory to act as the buffer, instead of the
named pipe.

Also changed the order of operations in `makeBootstrapPipe`. The new
test added in this PR showed that simply buffering in the process memory
was not enough to fix the issue. We also need to ensure that the
`pipe-bootstrap` process is started before we try to write to its
stdin. Otherwise the write will still block.

Also set stdout/stderr on the subprocess, so that any errors are visible
to the user.
2021-05-31 18:53:17 -04:00
Dhia Ayachi f785c5b332
RPC Timeout/Retries account for blocking requests (#8978) 2021-05-27 17:29:43 -04:00
Matt Keeler 3a0007b158
Bump raft-autopilot version to the latest. (#10306) 2021-05-27 12:59:14 -04:00
Dhia Ayachi 4c7f5f31c7
debug: remove the CLI check for debug_enabled (#10273)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* Add changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-05-27 09:41:53 -04:00
Mike Morris 04e733a284 changelog: add entry for network areas WAN config fix 2021-05-26 17:49:19 -04:00
Freddy 353280660f
Ensure passthrough clusters can be created (#10301) 2021-05-26 15:05:14 -06:00
Freddy 19334e8abf
Avoid adding original_dst filter when not needed (#10302) 2021-05-26 15:04:45 -06:00
John Cowen af72b9e9ab
ui: Unix Domain Socket support (#10287)
This commit adds UI support for Unix Domain Sockets for upstream and downstreams (see #9981 and #10252)
2021-05-26 17:52:25 +01:00
John Cowen 2e4c9f5330
ui: Support Route optional parameters/segments (#10212)
Moves our URLs with 'optional namespace segment' into a separately abstracted 'optional URL segment' feature
2021-05-26 17:43:46 +01:00
Rémi Lapeyre 213524a657
Always set the Content-Type header when a body is present (#10204)
* Always set the Content-Type header when a body is present

Closes https://github.com/hashicorp/consul/issues/10011

* Add Changelog entry

* Add more Content-Type exceptions

* Fix tests
2021-05-25 16:03:48 +01:00
Kenia f3efde3843
ui: Create and use collapsible notices component (#10270)
* Create and use collapsible notices

* Refactor collapsible-notices

* Split up the topology acceptance tests

* Add acceptance tests for tproxy notices

* Add component file

* Adds additional TProxy notices tests

* Adds conditional to only show collapsable if more than 2 notices are present

* Adds changelog

* Refactorting the conditonal for collapsing the notices

* Renaming undefinedIntention to be notDefinedIntention

* Refactor tests
2021-05-25 11:02:38 -04:00
Matt Keeler da31e0449e Move some things around to allow for license updating via config reload
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
2021-05-25 09:57:50 -04:00
Matt Keeler caafc02449 hcs-1936: Prepare for adding license auto-retrieval to auto-config in enterprise 2021-05-24 13:20:30 -04:00
Daniel Nephin d5dec93992
Merge pull request #10272 from hashicorp/dnephin/backport-namespace-license-fix
Backport some ent changes for serf tags
2021-05-21 12:31:34 -04:00
Matt Keeler 08eb600ee5 Deprecate API driven licensing.
The two methods in the API client to Put or Reset a license will now always return an error.
2021-05-21 11:08:50 -04:00
Matt Keeler 50cc9fdd06 Add OSS bits for supporting specifying the enterprise license via config 2021-05-20 16:11:33 -04:00
Daniel Nephin 21a62b4bac Add changelog 2021-05-20 12:57:15 -04:00
John Cowen 39302041e9
ui: Miscellaneous Lock Session fixes (#10225) 2021-05-19 11:05:54 +01:00
Daniel Nephin 2b0700fe39 add changelog 2021-05-18 15:04:12 -04:00
R.B. Boyer ede14b7c54
xds: emit a labeled gauge of connected xDS streams by version (#10243)
Fixes #10099
2021-05-14 13:59:13 -05:00
R.B. Boyer 597448da47
server: ensure that central service config flattening properly resets the state each time (#10239)
The prior solution to call reply.Reset() aged poorly since newer fields
were added to the reply, but not added to Reset() leading serial
blocking query loops on the server to blend replies.

This could manifest as a service-defaults protocol change from
default=>http not reverting back to default after the config entry
reponsible was deleted.
2021-05-14 10:21:44 -05:00
R.B. Boyer 7e1d7803b8
agent: ensure we hash the non-deprecated upstream fields on ServiceConfigRequest (#10240) 2021-05-14 10:15:48 -05:00
Freddy f8c4075953
Add changelog entry for network area timeout updates (#10241) 2021-05-13 15:05:38 -06:00
John Cowen 04bd576179
ui: Serf Health Check warning notice (#10194)
When the Consul serf health check is failing, this means that the health checks registered with the agent may no longer be correct. Therefore we show a notice to the user when we detect that the serf health check is failing both for the health check listing for nodes and for service instances.

There were a few little things we fixed up whilst we were here:

- We use our @replace decorator to replace an empty Type with serf in the model.
- We noticed that ServiceTags can be null, so we replace that with an empty array.
- We added docs for both our Notice component and the Consul::HealthCheck::List component. Notice now defaults to @type=info.
2021-05-13 11:36:51 +01:00
Iryna Shustava d7d44f6ae7
Save exposed ports in agent's store and expose them via API (#10173)
* Save exposed HTTP or GRPC ports to the agent's store
* Add those the health checks API so we can retrieve them from the API
* Change redirect-traffic command to also exclude those ports from inbound traffic redirection when expose.checks is set to true.
2021-05-12 13:51:39 -07:00
R.B. Boyer 3b50a55533
connect: update supported envoy versions to 1.18.3, 1.17.3, 1.16.4, and 1.15.5 (#10231) 2021-05-12 14:06:06 -05:00
Kenia ecbeaa87c1
ui: Add conditionals to Lock Session list items (#10121)
* Add conditionals to Lock Session list items

* Add changelog

* Show ID in details if there is a name to go in title

* Add copy-button if ID is in the title

* Update TTL conditional

* Update .changelog/10121.txt

Co-authored-by: John Cowen <johncowen@users.noreply.github.com>

Co-authored-by: John Cowen <johncowen@users.noreply.github.com>
2021-05-11 11:35:15 -04:00
Daniel Nephin b823cd3994
Merge pull request #10188 from hashicorp/dnephin/dont-persist-agent-tokens
agent/local: do not persist the agent or user token
2021-05-10 15:58:20 -04:00