14675 Commits

Author SHA1 Message Date
Dhia Ayachi
005ad9e46d
generate a single debug file for a long duration capture (#10279)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* extract wait group outside the go routine to avoid a race condition

* capture pprof in a separate go routine

* perform a single capture for pprof data for the whole duration

* add missing vendor dependency

* add a change log and fix documentation to reflect the change

* create function for timestamp dir creation and simplify error handling

* use error groups and ticker to simplify interval capture loop

* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval

* refactor Logs capture routine and add log capture specific test

* improve error reporting when log test fail

* change test duration to 1s

* make time parsing in log line more robust

* refactor log time format in a const

* test on log line empty the earliest possible and return

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* rename function to captureShortLived

* more specific changelog

Co-authored-by: Paul Banks <banks@banksco.de>

* update documentation to reflect current implementation

* add test for behavior when invalid param is passed to the command

* fix argument line in test

* a more detailed description of the new behaviour

Co-authored-by: Paul Banks <banks@banksco.de>

* print success right after the capture is done

* remove an unnecessary error check

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 13:00:51 -04:00
allisaurus
6dbfec50ce
docs: Improve ECS routing example nesting (#10316) 2021-06-07 09:28:06 -07:00
Dhia Ayachi
dda3e68791
fix monitor to only start the monitor in json format when requested (#10358)
* fix monitor to only start the monitor in json format when requested

* add release notes

* add test to validate json format when requested
2021-06-07 12:08:48 -04:00
Mark Anderson
1bf3dc5a5f
Docs for Unix Domain Sockets (#10252)
* Docs for Unix Domain Sockets

There are a number of cases where a user might wish to either 1)
expose a service through a Unix Domain Socket in the filesystem
('downstream') or 2) connect to an upstream service by a local unix
domain socket (upstream).
As of Consul (1.10-beta2) we've added new syntax and support to configure
the Envoy proxy to support this
To connect to a service via local Unix Domain Socket instead of a
port, add local_bind_socket_path and optionally local_bind_socket_mode
to the upstream config for a service:
    upstreams = [
      {
         destination_name = "service-1"
         local_bind_socket_path = "/tmp/socket_service_1"
         local_bind_socket_mode = "0700"
	 ...
      }
      ...
    ]
This will cause Envoy to create a socket with the path and mode
provided, and connect that to service-1
The mode field is optional, and if omitted will use the default mode
for Envoy. This is not applicable for abstract sockets. See
https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/address.proto#envoy-v3-api-msg-config-core-v3-pipe
for details
NOTE: These options conflict the local_bind_socket_port and
local_bind_socket_address options. We can bind to an port or we can
bind to a socket, but not both.
To expose a service listening on a Unix Domain socket to the service
mesh use either the 'socket_path' field in the service definition or the
'local_service_socket_path' field in the proxy definition. These
fields are analogous to the 'port' and 'service_port' fields in their
respective locations.
    services {
      name = "service-2"
      socket_path = "/tmp/socket_service_2"
      ...
    }
OR
    proxy {
      local_service_socket_path = "/tmp/socket_service_2"
      ...
    }
There is no mode field since the service is expected to create the
socket it is listening on, not the Envoy proxy.
Again, the socket_path and local_service_socket_path fields conflict
with address/port and local_service_address/local_service_port
configuration entries.
Set up a simple service mesh with dummy services:
socat -d UNIX-LISTEN:/tmp/downstream.sock,fork UNIX-CONNECT:/tmp/upstream.sock
socat -v tcp-l:4444,fork exec:/bin/cat
services {
  name = "sock_forwarder"
  id = "sock_forwarder.1"
  socket_path = "/tmp/downstream.sock"
  connect {
    sidecar_service {
      proxy {
	upstreams = [
	  {
	    destination_name = "echo-service"
	    local_bind_socket_path = "/tmp/upstream.sock"
	    config {
	      passive_health_check {
		interval = "10s"
		max_failures = 42
	      }
	    }
	  }
	]
      }
    }
  }
}
services {
  name = "echo-service"
  port = 4444
  connect = { sidecar_service {} }
Kind = "ingress-gateway"
Name = "ingress-service"
Listeners = [
 {
   Port = 8080
   Protocol = "tcp"
   Services = [
     {
       Name = "sock_forwarder"
     }
   ]
 }
]
consul agent -dev -enable-script-checks -config-dir=./consul.d
consul connect envoy -sidecar-for sock_forwarder.1
consul connect envoy -sidecar-for echo-service -admin-bind localhost:19001
consul config write ingress-gateway.hcl
consul connect envoy -gateway=ingress -register -service ingress-service -address '{{ GetInterfaceIP "eth0" }}:8888' -admin-bind localhost:19002
netcat 127.0.0.1 4444
netcat 127.0.0.1 8080

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* fixup Unix capitalization

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Update website/content/docs/connect/registration/service-registration.mdx

Co-authored-by: Blake Covarrubias <blake@covarrubi.as>

* Provide examples in hcl and json

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

* Apply suggestions from code review

Co-authored-by: Blake Covarrubias <blake@covarrubi.as>

* One more fixup for docs

Signed-off-by: Mark Anderson <manderson@hashicorp.com>

Co-authored-by: Blake Covarrubias <blake@covarrubi.as>
2021-06-04 18:54:31 -07:00
Jeff Escalante
35e524fdde
rotate algolia api key (#10297) 2021-06-04 19:54:05 -04:00
Daniel Nephin
5ef8a045f3 stream: fix a bug with creating a snapshot
The head of the topic buffer was being ignored when creating a snapshot. This commit fixes
the bug by ensuring that the head of the topic buffer is included in the snapshot
before handing it off to the subscription.
2021-06-04 18:33:04 -04:00
Matt Keeler
af3ffdf4c8
Add license inspect command documentation and changelog (#10351)
Also reformatted another changelog entry.
2021-06-04 14:33:13 -04:00
Daniel Nephin
3ebb65ea60
Merge pull request #10348 from hashicorp/dnephin/fix-submatview-store-bug
submatview: fix a bug with Store.Get
2021-06-04 12:04:29 -04:00
Daniel Nephin
59d201e148 submatview: fix a bug with Store.Get
When info.Timeout is 0, it should have no timeout. Previously it was using a 0 duration timeout
which caused it to return without waiting.

This bug was masked by using a timeout in the tests. Removing the timeout caused the tests to fail.
2021-06-03 17:48:44 -04:00
Matt Keeler
c5dc729dda
Follow on to PR 10336 (#10343)
There was some PR feedback that came in just after I merged that other PR. This addresses that feedback.
2021-06-03 12:29:41 -04:00
Daniel Nephin
f204daf538
Merge pull request #10338 from hashicorp/dnephin/fix-logging-indent
agent: remove leading whitespace from agent log lines
2021-06-03 12:16:45 -04:00
Paul Ewing
42a51b1a2c
usagemetrics: add cluster members to metrics API (#10340)
This PR adds cluster members to the metrics API. The number of members per
segment are reported as well as the total number of members.

Tested by running a multi-node cluster locally and ensuring the numbers were
correct. Also added unit test coverage to add the new expected gauges to
existing test cases.
2021-06-03 08:25:53 -07:00
Matt Keeler
7f47603c1a
Merge pull request #10336 from hashicorp/docs/licensing-updates
[Docs] Update documentation with information about v1.10 licensing changes.
2021-06-03 10:50:55 -04:00
Matt Keeler
ca423c80b9 Add enterprise v1.10 specific upgrade notes. 2021-06-03 10:48:16 -04:00
Matt Keeler
4222242f1c Add licensing information to snapshot agent docs. 2021-06-03 10:48:16 -04:00
Matt Keeler
aeaeec15e8 Add deprecation/removal notices regarding the APIs/CLI commands for licensing that are going away.
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2021-06-03 10:48:16 -04:00
Matt Keeler
f3595f5394 Update licensing docs for 1.10 licensing 2021-06-03 10:47:33 -04:00
Matt Keeler
6b7ca99a69 Add licensing telemetry docs. 2021-06-03 10:47:33 -04:00
Daniel Nephin
99956cde37 Add changelog 2021-06-02 17:39:30 -04:00
Daniel Nephin
cec8bc88a9 cmd: remove unnecessary GatedUi
The intent of this struct was to prevent non-json output to stdout. With
the previous cleanup, this can now be done by simply changing the stdout
stream to io.Discard.

This is just one example of why passing around io.Writers for the
streams is better than the UI interface.
2021-06-02 17:33:20 -04:00
Daniel Nephin
2261a469e3 cmd: move agent running message to logs
Previously this line was mixed up with logging, which made the output
quite ugly. Use the logger to output this message, instead of printing
directly to stdout.

This has the advantage that the message will be visible when json logs
are enabled.
2021-06-02 17:17:43 -04:00
Daniel Nephin
b4b85bd83a agent: fix agent logging
Remove the leading whitespace on every log line. This was causing problems for
a customer because their logging system was interpretting the logs as a single
multi-line log.
2021-06-02 17:15:12 -04:00
Daniel Nephin
2fc988d51d cmd: introduce a shim to expose Stdout/Stderr writers
This will allow commands to do the right thing, and write to the proper
output stream.
2021-06-02 16:51:34 -04:00
Daniel Nephin
e573641995 cmd: remove unnecessary args to agent.New
The version args are static and passed in from the caller. Instead read
the static values in New.

The shutdownCh was never closed, so did nothing. Remove it as a field
and an arg.
2021-06-02 16:29:29 -04:00
Daniel Nephin
071007e3fd
Merge pull request #10334 from hashicorp/dnephin/grpc-fix-resolver-data-race
grpc: fix resolver data race
2021-06-02 13:23:27 -04:00
Daniel Nephin
29e93f6338 grpc: fix a data race by using a static resolver
We have seen test flakes caused by 'concurrent map read and map write', and the race detector
reports the problem as well (prevent us from running some tests with -race).

The root of the problem is the grpc expects resolvers to be registered at init time
before any requests are made, but we were using a separate resolver for each test.

This commit introduces a resolver registry. The registry is registered as the single
resolver for the consul scheme. Each test uses the Authority section of the target
(instead of the scheme) to identify the resolver that should be used for the test.
The scheme is used for lookup, which is why it can no longer be used as the unique
key.

This allows us to use a lock around the map of resolvers, preventing the data race.
2021-06-02 11:35:38 -04:00
Jimmy Merritello
faf05bcb85
Fix broken link (#10335) 2021-06-02 09:33:12 -05:00
Jimmy Merritello
5f8797dad3
[Website] WIP - Update Homepage (#10314)
* Initial structure for updated homepage

* Bring back <UseCases />

* Add section stubs

* Add ecosystem section

* Add features section

* Iron out features section

* Add Learn Callout section

* Copy updates

* Better together copy

* Add updated copy & swap assets

* Remove comment & just add existing icon for now

* Copy and asset tweaks

* Remove unwanted copy

* Process the codeblock

* Add transparent img

* Swap for transparent img

* More transparent img

* Use Learn cards pattern

* Rearrange img and finishing padding touches
2021-06-02 09:22:52 -05:00
Daniel Nephin
c94eaa4957 submatview: improve a couple comments 2021-06-01 17:49:31 -04:00
Kendall Strautman
3df2cb7ea1 chore: updates alert banner — hashiconf 2021-06-01 14:28:08 -07:00
Daniel Nephin
a2f3d0b139
Merge pull request #10325 from hashicorp/docs/clarify-acl-set-agent-token-persistence
docs: Clarify set-agent-token token persistence behavior
2021-06-01 16:47:45 -04:00
Daniel Nephin
7d324b4795
Merge pull request #10315 from hashicorp/ma/haxandmat-replace
Fix strings.Replace->strings.ReplaceAll
2021-06-01 15:05:47 -04:00
Daniel Nephin
eb4f8b17e9
Merge pull request #10324 from hashicorp/dnephin/fix-envoy-bootstrap-exec
envoy: fix deadlock when input is larger than named pipe buffer size
2021-06-01 13:02:51 -04:00
Dhia Ayachi
15dddc9edb
make tests use a dummy node_name to avoid environment related failures (#10262)
* fix tests to use a dummy nodeName and not fail when hostname is not a valid nodeName

* remove conditional testing

* add test when node name is invalid
2021-06-01 11:58:03 -04:00
Daniel Nephin
2054402a53 envoy: improve comments 2021-06-01 11:35:32 -04:00
Daniel Nephin
d3ca202a32
Merge pull request #9556 from hashicorp/dnephin/add-more-cache-key-completness-tests
structs: Add more cache key completeness tests
2021-06-01 11:24:02 -04:00
Blake Covarrubias
2196262eab docs: Clarify set-agent-token token persistence behavior
Clarify that tokens configured via `set-agent-token` will not be
persisted if `acl.enable_token_persistence` is `false`.
2021-05-31 16:08:43 -07:00
Daniel Nephin
c9bc5f92b7 envoy: fix bootstrap deadlock caused by a full named pipe
Normally the named pipe would buffer up to 64k, but in some cases when a
soft limit is reached, they will start only buffering up to 4k.
In either case, we should not deadlock.

This commit changes the pipe-bootstrap command to first buffer all of
stdin into the process, before trying to write it to the named pipe.
This allows the process memory to act as the buffer, instead of the
named pipe.

Also changed the order of operations in `makeBootstrapPipe`. The new
test added in this PR showed that simply buffering in the process memory
was not enough to fix the issue. We also need to ensure that the
`pipe-bootstrap` process is started before we try to write to its
stdin. Otherwise the write will still block.

Also set stdout/stderr on the subprocess, so that any errors are visible
to the user.
2021-05-31 18:53:17 -04:00
Daniel Nephin
e1b1ab7ef6 envoy: start timeout func after validation
This removes the need to check arg length in the timeout function.
2021-05-31 17:37:58 -04:00
Daniel Nephin
ba15f92a8a structs: fix cache keys
So that requests are cached properly, and the cache does not return the wrong data for a
request.
2021-05-31 17:22:16 -04:00
Daniel Nephin
920ae31598 structs: add two cache completeness tests types that implement cache.Request 2021-05-31 16:54:41 -04:00
Daniel Nephin
46dfdb611f structs: improve the interface of assertCacheInfoKeyIsComplete 2021-05-31 16:54:41 -04:00
Daniel Nephin
7c2957e24d structs: Add more cache key tests 2021-05-31 16:54:40 -04:00
Blake Covarrubias
665e052e96 docs: Fix agent token name under ACL Agent Token
Reference the correct name of the agent token in the ACL Agent Token
section for the ACL System docs.
2021-05-31 10:52:15 -07:00
Konstantin Albutov
616cec4b45 Fix strings.Replace->strings.ReplaceAll 2021-05-27 16:57:10 -07:00
Dhia Ayachi
f785c5b332
RPC Timeout/Retries account for blocking requests (#8978) 2021-05-27 17:29:43 -04:00
Stanko
b130f355ee ui: Fix broken link format in ECS install page 2021-05-27 14:11:04 -07:00
Bryce Kalow
8abc38658a
fix(website): update node version to latest LTS for website (#10312) 2021-05-27 15:35:06 -05:00
allisaurus
0b52545c01
Add note about new ECS ARN format to ECS docs (#10304)
* docs: Add note about ECS task ARN format to ECS docs
2021-05-27 10:59:28 -07:00
Matt Keeler
3a0007b158
Bump raft-autopilot version to the latest. (#10306) 2021-05-27 12:59:14 -04:00