Commit Graph

1558 Commits

Author SHA1 Message Date
Kyle Havlovitz 106b8b0b33 Kill check processes after the timeout is reached (#3567)
* Kill check processes after the timeout is reached

Kill the subprocess spawned by a script check once the timeout is reached. Previously Consul just marked the check critical and left the subprocess around.

Fixes #3565.

* Set err to non-nil when timeout occurs

* Fix check timeout test

* Kill entire process subtree on check timeout

* Add a docs note about windows subprocess termination
2017-10-11 11:57:39 -07:00
Frank Schröder 94f58199b1 agent: add option to discard health output (#3562)
* agent: add option to discard health output

In high volatile environments consul will have checks with "noisy"
output which changes every time even though the status does not change.
Since the output is stored in the raft log every health check update
unblocks a blocking call on health checks since the raft index has
changed even though the status of the health checks may not have changed
at all. By discarding the output of the health checks the users can
choose a different tradeoff. Less visibility on why a check failed in
exchange for a reduced change rate on the raft log.

* agent: discard output also when adding a check

* agent: add test for discard check output

* agent: update docs

* go vet

* Adds discard_check_output to reloadable config table.

* Updates the change log.
2017-10-10 17:04:52 -07:00
Frank Schröder 759ef8a1d4 config: add generic method to translate between CamelCase and snake_case (#3557)
* doc: document discrepancy between id and CheckID

* doc: document enable_tag_override change

* config: add TranslateKeys helper

TranslateKeys makes it easier to map between different representations
of internal structures. It allows to recursively map alias keys to
canonical keys in structured maps.

* config: use TranslateKeys for config file

This also adds support for 'enabletagoverride' and removes
the need for a separate CheckID alias field.

* config: remove dead code

* agent: use TranslateKeys for FixupCheckType

* agent: translate enable_tag_override during service registration

* doc: add '.hcl' as valid extension

* config: map ScriptArgs to args

* config: add comment for TranslateKeys
2017-10-10 16:40:59 -07:00
James Phillips d3cebace7c
Updates documentation to s/script/args/ in API docs. 2017-10-10 16:37:08 -07:00
James Phillips bb12368eac Makes RPC handling more robust when rolling servers. (#3561)
* Adds client-side retry for no leader errors.

This paves over the case where the client was connected to the leader
when it loses leadership.

* Adds a configurable server RPC drain time and a fail-fast path for RPCs.

When a server leaves it gets removed from the Raft configuration, so it will
never know who the new leader server ends up being. Without this we'd be
doomed to wait out the RPC hold timeout and then fail. This makes things fail
a little quicker while a sever is draining, and since we added a client retry
AND since the server doing this has already shut down and left the Serf LAN,
clients should retry against some other server.

* Makes the RPC hold timeout configurable.

* Reorders struct members.

* Sets the RPC hold timeout default for test servers.

* Bumps the leave drain time up to 5 seconds.

* Robustifies retries with a simpler client-side RPC hold.

* Reverts untended delete.
2017-10-10 15:19:50 -07:00
James Phillips d8a1ec70f8 Update compatibility.html.md 2017-10-09 14:18:37 -07:00
James Phillips 486e1a277e Update compatibility.html.md 2017-10-09 14:18:23 -07:00
Radek Simko 0075421b1a docs: agent/options gcp's project_name is optional
Per https://github.com/hashicorp/go-discover/blob/master/provider/gce/gce_discover.go#L53-L61
2017-10-08 13:08:50 +02:00
James Phillips 5f5e7529df Update raft.html.md 2017-10-06 14:38:21 -07:00
Preetha Appan 194b09a86b Update documentation with subcommand example 2017-10-06 15:23:43 -05:00
Preetha Appan 84ee8b8aff Autocomplete support 2017-10-06 15:05:45 -05:00
Kyle Havlovitz adf29675f3 Merge pull request #3535 from hashicorp/metric-docs
Update metric names and add a legacy config flag
2017-10-04 17:39:16 -07:00
Kyle Havlovitz 766d1259d8
Move http request metric to the agent section 2017-10-04 17:36:10 -07:00
Kyle Havlovitz a3e9ac5840
Add a test for legacy metrics with a whitelist filter 2017-10-04 17:27:57 -07:00
Kyle Havlovitz 198ed6076d Clean up subprocess handling and make shell use optional (#3509)
* Clean up handling of subprocesses and make using a shell optional

* Update docs for subprocess changes

* Fix tests for new subprocess behavior

* More cleanup of subprocesses

* Minor adjustments and cleanup for subprocess logic

* Makes the watch handler reload test use the new path.

* Adds check tests for new args path, and updates existing tests to use new path.

* Adds support for script args in Docker checks.

* Fixes the sanitize unit test.

* Adds panic for unknown watch type, and reverts back to Run().

* Adds shell option back to consul lock command.

* Adds shell option back to consul exec command.

* Adds shell back into consul watch command.

* Refactors signal forwarding and makes Windows-friendly.

* Adds a clarifying comment.

* Changes error wording to a warning.

* Scopes signals to interrupt and kill.

This avoids us trying to send SIGCHILD to the dead process.

* Adds an error for shell=false for consul exec.

* Adds notes about the deprecated script and handler fields.

* De-nests an if statement.
2017-10-04 16:48:00 -07:00
Kyle Havlovitz c728564994
Update metric names and add a legacy config flag 2017-10-04 16:43:27 -07:00
Frank Schröder ce887a0c45 Provide stable config for agent/self (#3532)
* config: provide stable config for /v1/agent/self (#3530)

This patch adds a stable subset of the previous Config struct to the
agent/self response. The actual runtime configuration is moved into
DebugConfig and will be documented to change.

Fixes #3530

* config: fix tests

* doc: update api documentation for /v1/agent/self
2017-10-04 10:43:17 -07:00
Frank Schroeder 2191511e9f doc: drop last references to -retry-join-* options 2017-10-04 19:12:28 +02:00
Frank Schroeder 012ec7876e doc: document go-discover format change 2017-10-04 19:12:28 +02:00
Preetha Appan 41ec69f71a Update ACL guide to describe the new list policy for Keys 2017-10-04 06:19:20 -05:00
Kyle Havlovitz 93bf819c36
Update snapshot agent docs 2017-09-29 12:28:04 -07:00
Frank Schroeder b0b84604fc update docs 2017-09-29 20:26:43 +02:00
Preetha Appan 9b3481265a Fix grammar in containers guide. 2017-09-29 10:37:04 -05:00
Preetha Appan 2c684ebb51 Update containers guide to mention that Consul now handles nodes changing IP addresses. 2017-09-29 10:20:33 -05:00
James Phillips 1181ab0d11 Clarifies server requirement for bootstrap-expect.
Fixes #3510.
2017-09-28 22:02:37 -07:00
Preetha Appan 54bb478372 Update sentinel documentation to remove features that are coming in a future release 2017-09-28 21:00:00 -05:00
Patrick Sodré fa67334361
Update docs on RFC1464 vs RFC1035 options 2017-09-28 12:32:46 +02:00
Patrick Sodré 8e14b527e8
Update docs to include support for TXT records
- Add explanation to the difference between RFC1035
    and RFC1464 queries.
2017-09-28 12:32:42 +02:00
James Phillips 38b2d76d39 Update options.html.md 2017-09-27 15:55:46 -07:00
James Phillips 5fa5f6ef01 Update options.html.md 2017-09-27 15:40:00 -07:00
James Phillips 7deed7162f Cleans up some docs for the 1.0 release. (#3508)
* Cleans up information about file extensions, now that they are required.

* Removes references to deprecated configuration options.

* Adds docs for multiple bind address support.
2017-09-27 15:30:30 -07:00
Alex Dadgar 26dfd4cf16 Fix mispelled words 2017-09-27 11:20:01 -07:00
James Phillips a8f228c2ae Adds a "required" note for the port in the network segments configuration. 2017-09-26 17:57:34 -07:00
Frank Schröder e84c2b2edd Metrics service prefix (#3498)
* metrics: replace statsite_prefix with service_prefix

The metrics prefix isn't statsite specific and is in fact used
for all metrics providers. Since we are deprecating fields
anyway we should fix this one as well.

Fixes #3293

* Updates docs and sorts telemetry section.

* Renames to "metrics_prefix" to disambiguate with Consul services.

* Updates the change log.
2017-09-26 17:49:55 -07:00
Frank Schröder 1e461110e6 agent: consolidate handling of 405 Method Not Allowed (#3405)
* agent: consolidate http method not allowed checks

This patch uses the error handling of the http handlers to handle HTTP
method not allowed errors across all available endpoints. It also adds a
test for testing whether the endpoints respond with the correct status
code.

* agent: do not panic on metrics tests

* agent: drop other tests for MethodNotAllowed

* agent: align /agent/join with reality

/agent/join uses PUT instead of GET as documented.

* agent: align /agent/check/{fail,warn,pass} with reality

/agent/check/{fail,warn,pass} uses PUT instead of GET as documented.

* fix some tests

* Drop more tests for method not allowed

* Align TestAgent_RegisterService_InvalidAddress with reality

* Changes API client join to use PUT instead of GET.

* Fixes agent endpoint verbs and removes obsolete tests.

* Updates the change log.
2017-09-25 23:11:19 -07:00
James Phillips 45646ac3f4 Bumps default Raft protocol to version 3. (#3477)
* Changes default Raft protocol to 3.

* Changes numPeers() to report only voters.

This should have been there before, but it's more obvious that this
is incorrect now that we default the Raft protocol to 3, which puts
new servers in a read-only state while Autopilot waits for them to
become healthy.

* Fixes TestLeader_RollRaftServer.

* Fixes TestOperator_RaftRemovePeerByAddress.

* Fixes TestServer_*.

Relaxed the check for a given number of voter peers and instead do
a thorough check that all servers see each other in their Raft
configurations.

* Fixes TestACL_*.

These now just check for Raft replication to be set up, and don't
care about the number of voter peers.

* Fixes TestOperator_Raft_ListPeers.

* Fixes TestAutopilot_CleanupDeadServerPeriodic.

* Fixes TestCatalog_ListNodes_ConsistentRead_Fail.

* Fixes TestLeader_ChangeServerID and adjusts the conn pool to throw away
sockets when it sees io.EOF.

* Changes version to 1.0.0 in the options doc.

* Makes metrics test more deterministic with autopilot metrics possible.
2017-09-25 15:27:04 -07:00
Frank Schröder 12216583a1 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condidtion until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reoders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitiziation test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00
Frank Schroeder 58c0a3f16d
Update docs for addr_type option for AWS Cloud auto-join (#3471)
Fixes #3471
2017-09-25 09:54:58 +02:00
Frank Schroeder 511dc3e95a
Fix Azure cloud auto-join docs (#3466)
Fixes #3466
2017-09-25 02:44:04 +02:00
James Phillips 0ebae85bb9 Merge pull request #3490 from ruslansennov/javadoc-fix
minor doc fix
2017-09-21 19:38:59 -05:00
Ruslan Sennov 18d2556e3c
minor doc fix 2017-09-21 22:28:49 +03:00
James Phillips 3a78fd0030 Merge pull request #3485 from mstewa34/master
Fix docs/guides/segements sidebar selection.
2017-09-20 09:38:56 -05:00
Frank Schroeder f264bd9294
Fix health endpoint docs (#3483)
Fixes #3483
2017-09-20 09:05:23 +02:00
Michael Stewart 30106fc421 Fix docs/guides/segements sidebar selection. 2017-09-19 16:45:39 -05:00
Preetha Appan 7ca8b3ad8b
Adds documentation for Sentinel integration in Consul Enterprise. 2017-09-19 09:02:53 -05:00
Frank Schroeder 552681b2a5
Update example 2017-09-11 13:01:56 +02:00
Mitsunori Komatsu 2282f412d7 Fix wrong field name: Meta -> NodeMeta 2017-09-11 19:14:47 +09:00
James Phillips 17681f04f9 Merge pull request #3456 from hashicorp/gossip-fix
Adds gossip keys to network segment memberlist configs.
2017-09-07 12:27:34 -07:00
James Phillips 00605c0214
Shows the segment name in the keyring API and command output. 2017-09-07 12:17:39 -07:00
James Phillips 5888d1884f Update outage.html.md 2017-09-06 21:19:46 -07:00