Commit Graph

1279 Commits

Author SHA1 Message Date
Alex Dadgar 18bf9647d5 Test autopilots start/stop idempotency 2018-02-21 10:19:30 -08:00
Alex Dadgar 33c5afdb31 Improve autopilot shutdown to be idempotent 2018-02-20 15:51:59 -08:00
Pierre Souchay a8d3745104 Fixed comments for function maxIndexForService 2018-02-20 23:57:28 +01:00
Pierre Souchay 09351ba9a6 [Revert] Only update services if tags are different
This patch did give some better results, but break watches on
the services of a node.

It is possible to apply the same optimization for nodes than
to services (one index per instance), but it would complicate
further the patch.

Let's do it in another PR.
2018-02-20 23:34:42 +01:00
Pierre Souchay 60454b570a Only update services if tags are different 2018-02-20 23:08:04 +01:00
Pierre Souchay a05d38737c Enable Raft index optimization per service name on health endpoint
Had to fix unit test in order to check properly indexes.
2018-02-20 01:35:50 +01:00
Pierre Souchay 4f10fae3c3 Get only first service to test whether we have to cleanup index of a service 2018-02-19 22:44:49 +01:00
Pierre Souchay bac8fb046f Fixed comment about raftIndex + use test.Helper() 2018-02-19 19:30:25 +01:00
Pierre Souchay 73127ef407 Services Indexes modified per service instead of using a global Index
This patch improves the watches for services on large cluster:
each service has now its own index, such watches on a specific service
are not modified by changes in the global catalog.

It should improve a lot the performance of tools such as consul-template
or libraries performing watches on very large clusters with many
services/watches.
2018-02-19 18:29:22 +01:00
Veselkov Konstantin 7de57ba4de remove golint warnings 2018-01-28 22:40:13 +04:00
Kyle Havlovitz bfeb09983b
Reset clusterHealth when autopilot starts 2018-01-23 12:52:28 -08:00
Kyle Havlovitz 17805e4634
Move autopilot health loop into leader operations 2018-01-23 11:17:41 -08:00
James Phillips da6a4635b0
Fixes a `go fmt` cleanup. 2017-12-20 13:43:38 -08:00
Kyle Havlovitz 11a0c9cc58
Fix vet error 2017-12-18 18:04:42 -08:00
Kyle Havlovitz 77dc52f430
Move autopilot initializing to oss file 2017-12-18 18:02:44 -08:00
Kyle Havlovitz 039e7f1880
Move autopilot setup to a separate file 2017-12-18 16:55:51 -08:00
Kyle Havlovitz d08ab9fd19
Make some final tweaks to autopilot package 2017-12-18 12:26:47 -08:00
Kyle Havlovitz a86d11ec0a
Merge pull request #3737 from hashicorp/autopilot-refactor
Move autopilot to a standalone package
2017-12-15 14:09:40 -08:00
James Phillips 06f980061e
Merge pull request #3728 from weiwei04/fix_globalRPC_goroutine_leak
fix globalRPC goroutine leak
2017-12-14 17:54:19 -08:00
Kyle Havlovitz 324c2ecb53
Expose IsPotentialVoter for advanced autopilot logic 2017-12-13 17:53:51 -08:00
Kyle Havlovitz 12bf61c851
Merge branch 'master' into autopilot-refactor 2017-12-13 11:54:32 -08:00
Kyle Havlovitz d6b266c045
A few last autopilot adjustments 2017-12-13 11:19:17 -08:00
Kyle Havlovitz 2310687c1d
More autopilot reorganizing 2017-12-13 10:57:37 -08:00
James Phillips 46742a5041
Adds TODOs referencing #3744. 2017-12-13 10:52:06 -08:00
Kyle Havlovitz b92f895c23
More refactoring to make autopilot consul-agnostic 2017-12-12 17:46:28 -08:00
Kyle Havlovitz de28555671
Move autopilot to a standalone package 2017-12-11 16:45:33 -08:00
James Phillips d12e81860f
Moves Serf helper into lib to fix import cycle in consul-enterprise. 2017-12-07 16:57:58 -08:00
James Phillips 5065f3d82e
Turns of intent queue warnings and enables dynamic queue sizing. 2017-12-07 16:27:06 -08:00
Wei Wei cc9648c957 fix globalRPC goroutine leak
Signed-off-by: Wei Wei <weiwei.inf@gmail.com>
2017-12-05 11:53:30 +08:00
James Phillips 3e46544085
Creates a registration mechanism for snapshot and restore. 2017-11-29 18:36:53 -08:00
James Phillips f53f521072
Begins split out of snapshots from the main FSM class. 2017-11-29 18:36:53 -08:00
James Phillips c8e763667f
Creates a registration mechanism for FSM commands. 2017-11-29 18:36:53 -08:00
James Phillips 78292662d7
Moves the FSM into its own package.
This will help make it clearer what happens when we add some registration
plumbing for the different operations and snapshots.
2017-11-29 18:36:53 -08:00
James Phillips e810697e06
Resolves an FSM snapshot TODO.
This adds checks for sink write calls before we continue the refactor, which
will resolve the other TODO comment we deleted as part of this change.
2017-11-29 18:36:53 -08:00
James Phillips aa61159b74
Creates a registration mechanism for schemas.
This also splits out the registration into the table-specific source
files.
2017-11-29 18:36:52 -08:00
James Phillips 93ff33b1be
Creates a registration mechanism for RPC endpoints. 2017-11-29 18:36:52 -08:00
James Phillips 8bf1f57737
Renames stubs to be more consistent. 2017-11-29 18:36:52 -08:00
James Phillips 8abd2050fa
Sheds monotonic time info so tombstone GC bins work properly. 2017-11-29 10:34:24 -08:00
James Phillips de57a9ef51
Gives back the lock before writing to the expire channel.
The lock isn't needed after we clean up the expire bin, and as seen
in #3700 we can get into a deadlock waiting to place the expire index
into the channel while holding this lock.

Fixes #3700
2017-11-19 16:24:16 -08:00
James Phillips f19ba41144
Moves the LAN event handler after the router is created.
Fixes #3680
2017-11-10 12:26:48 -08:00
James Phillips 17737ee030
Revert "Adds a small sleep to make sure we are in the next GC bucket." 2017-11-08 22:18:37 -08:00
James Phillips 24475048e2
Adds a sleep to make sure we are in the next GC bucket, ups time.
Fixes #3670
2017-11-08 22:02:40 -08:00
James Phillips c57884fffe
Skips the tombstone GC test in Travis for now.
Related to #3670
2017-11-08 20:14:20 -08:00
James Phillips f6b7dcbcf6
Removes bogus getPort() in favor of freeport. 2017-11-08 19:55:50 -08:00
James Phillips 7b966e2d26
Tightens timing up and reorders GC test to be less flaky. 2017-11-08 15:09:29 -08:00
James Phillips 7c6ab5e783
Doubles the GC timing. 2017-11-08 15:01:11 -08:00
James Phillips 8de7c77482
Opens up test timing a little more. 2017-11-08 14:01:19 -08:00
James Phillips c46612f691
Shifts off a gran boundary to help make test less flaky. 2017-11-08 13:57:17 -08:00
James Phillips f31856c1b7
Opens up the tombstone GC test timing. 2017-11-08 13:43:39 -08:00
Kyle Havlovitz d3dd2b1402
Move check definition to a sub-struct 2017-11-01 14:54:46 -07:00
Kyle Havlovitz dbab3cd5f6
Merge branch 'master' into esm-changes 2017-11-01 11:37:48 -07:00
Kyle Havlovitz c4375d5a47
Merge pull request #3622 from hashicorp/coordinate-node-endpoint
agent: add /v1/coordianate/node/:node endpoint
2017-11-01 11:35:50 -07:00
Kyle Havlovitz b0536a96cc
Fill out the tests around coordinate/node functionality 2017-10-31 15:36:44 -07:00
Kyle Havlovitz 1e3b0d441b
Factor out registerNodes function 2017-10-31 13:34:49 -07:00
James Phillips 6bf55d16a2
Relaxes Autopilot promotion logic. (#3623)
* Relaxes Autopilot promotion logic.

When we defaulted the Raft protocol version to 3 in #3477 we made
the numPeers() routine more strict to only count voters (this is
more conservative and more correct). This had the side effect of
breaking rolling updates because it's at odds with the Autopilot
non-voter promotion logic.

That logic used to wait to only promote to maintain an odd quorum
of servers. During a rolling update (add one new server, wait, and
then kill an old server) the dead server cleanup would still count
the old server as a peer, which is conservative and the right thing
to do, and no longer count the non-voter. This would wait to promote,
so you could get into a stalemate. It is safer to promote early than
remove early, so by promoting as soon as possible we have chosen
that as the solution here.

Fixes #3611

* Gets rid of unnecessary extra not-a-voter check.
2017-10-31 15:16:56 -05:00
Kyle Havlovitz 2392545adc
Merge branch 'coordinate-node-endpoint' of github.com:hashicorp/consul into esm-changes 2017-10-26 19:20:24 -07:00
Kyle Havlovitz 5589eadcf5
Added Coordinate.Node rpc endpoint and client api method 2017-10-26 19:16:40 -07:00
Kyle Havlovitz a7c42a6c2a
Expose SkipNodeUpdate field and some health check info in the http api 2017-10-25 19:37:30 +02:00
Frank Schroeder c94751ad43 test: replace porter tool with freeport lib
This patch removes the porter tool which hands out free ports from a
given range with a library which does the same thing. The challenge for
acquiring free ports in concurrent go test runs is that go packages are
tested concurrently and run in separate processes. There has to be some
inter-process synchronization in preventing processes allocating the
same ports.

freeport allocates blocks of ports from a range expected to be not in
heavy use and implements a system-wide mutex by binding to the first
port of that block for the lifetime of the application. Ports are then
provided sequentially from that block and are tested on localhost before
being returned as available.
2017-10-21 22:01:09 +02:00
Ryan Slade 85e4aea9d1 Replace time.Now().Sub(x) with time.Since(x) 2017-10-17 20:38:24 +02:00
James Phillips 575d70aaa7
Cleans up some drift between the OSS and Enterprise trees. 2017-10-11 15:53:07 -07:00
James Phillips bb12368eac Makes RPC handling more robust when rolling servers. (#3561)
* Adds client-side retry for no leader errors.

This paves over the case where the client was connected to the leader
when it loses leadership.

* Adds a configurable server RPC drain time and a fail-fast path for RPCs.

When a server leaves it gets removed from the Raft configuration, so it will
never know who the new leader server ends up being. Without this we'd be
doomed to wait out the RPC hold timeout and then fail. This makes things fail
a little quicker while a sever is draining, and since we added a client retry
AND since the server doing this has already shut down and left the Serf LAN,
clients should retry against some other server.

* Makes the RPC hold timeout configurable.

* Reorders struct members.

* Sets the RPC hold timeout default for test servers.

* Bumps the leave drain time up to 5 seconds.

* Robustifies retries with a simpler client-side RPC hold.

* Reverts untended delete.
2017-10-10 15:19:50 -07:00
James Phillips 4dab70cb93 Fixes handling of stop channel and failed barrier attempts. (#3546)
* Fixes handling of stop channel and failed barrier attempts.

There were two issues here. First, we needed to not exit when there
was a timeout trying to write the barrier, because Raft might not
step down, so we'd be left as the leader but having run all the step
down actions.

Second, we didn't close over the stopCh correctly, so it was possible
to nil that out and have the leaderLoop never exit. We close over it
properly AND sequence the nil-ing of it AFTER the leaderLoop exits for
good measure, so the code is more robust.

Fixes #3545

* Cleans up based on code review feedback.

* Tweaks comments.

* Renames variables and removes comments.
2017-10-06 07:54:49 -07:00
Kyle Havlovitz c728564994
Update metric names and add a legacy config flag 2017-10-04 16:43:27 -07:00
Preetha Appan 8dcd7e700c Remove extra newline 2017-10-03 15:19:31 -05:00
Preetha Appan 26accb3b8a Only allow 'list' policies within 'key' policy definitions. Consolidated two similar tests into one and fixed alignment. 2017-10-03 15:15:56 -05:00
Preetha Appan 51a04ec87d Introduces new 'list' permission that applies to KV store recursive reads, and enforced only when opted in. 2017-10-02 17:10:21 -05:00
James Phillips 0190c4a081
Gets rid of flaky clause in stats fetcher unit test.
Given how the rutine is coded we can still get data so this wasn't
a reliable thing to check.
2017-09-26 20:53:06 -07:00
preetapan 4d9fc638b4 Issue 3452 (#3500)
* Make sure that id and address are set in member created during reaping of catalog nodes that have been removed from serf

* Get address from node table in the state store rather than from service address

* Fix incorrect lookup by checkname instead of node name

* Make sure that serverlookup is called with the right address format, added unit test.

* Address code review comments

* Tweaks style stuff.
2017-09-26 20:49:41 -07:00
James Phillips 5fa2322e0b
Cleans up some edge cases in TestSnapshot_Forward_Leader.
These could cause the tests to hang.
2017-09-26 14:07:28 -07:00
Preetha Appan 3c4a108769 Move Raft protocol version for list peers end point to server side, fix unit tests. This fixes #3449 2017-09-26 09:35:39 -05:00
James Phillips 45646ac3f4 Bumps default Raft protocol to version 3. (#3477)
* Changes default Raft protocol to 3.

* Changes numPeers() to report only voters.

This should have been there before, but it's more obvious that this
is incorrect now that we default the Raft protocol to 3, which puts
new servers in a read-only state while Autopilot waits for them to
become healthy.

* Fixes TestLeader_RollRaftServer.

* Fixes TestOperator_RaftRemovePeerByAddress.

* Fixes TestServer_*.

Relaxed the check for a given number of voter peers and instead do
a thorough check that all servers see each other in their Raft
configurations.

* Fixes TestACL_*.

These now just check for Raft replication to be set up, and don't
care about the number of voter peers.

* Fixes TestOperator_Raft_ListPeers.

* Fixes TestAutopilot_CleanupDeadServerPeriodic.

* Fixes TestCatalog_ListNodes_ConsistentRead_Fail.

* Fixes TestLeader_ChangeServerID and adjusts the conn pool to throw away
sockets when it sees io.EOF.

* Changes version to 1.0.0 in the options doc.

* Makes metrics test more deterministic with autopilot metrics possible.
2017-09-25 15:27:04 -07:00
Preetha Appan d7e27e67c1 Introduce Code Policy validation via sentinel, with a noop implementation 2017-09-25 13:44:55 -05:00
Frank Schröder 12216583a1 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condidtion until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reoders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitiziation test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00
James Phillips d84c0b1a01
Robustifies check in TestCatalog_ListNodes_ConsistentRead_Fail test.
Fixes #3469
2017-09-13 21:22:53 -07:00
James Phillips 828be5771a
Revert "Manages segments list via a pointer."
This reverts commit c277a42504.
2017-09-07 16:37:11 -07:00
James Phillips c277a42504
Manages segments list via a pointer. 2017-09-07 16:21:07 -07:00
James Phillips 96a89a3381
Cleans up formatting. 2017-09-07 12:26:58 -07:00
James Phillips 00605c0214
Shows the segment name in the keyring API and command output. 2017-09-07 12:17:39 -07:00
James Phillips 88a150cee1
Moves reconcile loop into segment stub. 2017-09-06 18:01:53 -07:00
James Phillips 5c03cb571d
Takes the skip out of the client check.
Without this the merge delegate won't check the segment for non-servers
a little below here.
2017-09-06 17:05:40 -07:00
James Phillips 3418c7ff93 Merge pull request #3447 from hashicorp/issue-3070
Skips unique node ID check for old versions of Consul.
2017-09-06 13:24:15 -07:00
James Phillips 520060e138
Fixes incorrect comment. 2017-09-06 13:23:19 -07:00
James Phillips 084679ab65
Pulls down some code for the check loop. 2017-09-06 13:07:42 -07:00
James Phillips 3535652595
Uses the Raft configuration for the self-add skip check. 2017-09-06 13:05:51 -07:00
Preetha Appan 5f2e1c9b07 Change member join reconcile step to process joining itself, to handle node IP address changes correctly when number of servers < 3 2017-09-06 13:53:01 -05:00
James Phillips 1333fa57a1
Skips unique node ID check for old versions of Consul.
Fixes #3070.
2017-09-05 22:57:29 -07:00
James Phillips 1a117ba0a8
Makes the all segments query explict, and the default for `consul members`. 2017-09-05 12:22:20 -07:00
James Phillips 9258506dab Adds simple rate limiting for client agent RPC calls to Consul servers. (#3440)
* Added rate limiting for agent RPC calls.
* Initializes the rate limiter based on the config.
* Adds the rate limiter into the snapshot RPC path.
* Adds unit tests for the RPC rate limiter.
* Groups the RPC limit parameters under "limits" in the config.
* Adds some documentation about the RPC limiter.
* Sends a 429 response when the rate limiter kicks in.
* Adds docs for new telemetry.
* Makes snapshot telemetry look like RPC telemetry and cleans up comments.
2017-09-01 15:02:50 -07:00
Kyle Havlovitz 220db48aa7 Merge pull request #3431 from hashicorp/network-segments-oss 2017-09-01 10:24:58 -07:00
Kyle Havlovitz 0e33e2ecab
Pass listeners into setupSegments 2017-08-31 17:56:43 -07:00
Kyle Havlovitz 62102a537e
Organize segments for a cleaner split between enterprise and OSS 2017-08-31 17:39:46 -07:00
Kyle Havlovitz 7e565d7338
Fix some inconsistencies with segment logic and comments 2017-08-30 17:43:46 -07:00
Preetha Appan 2386214655 Wire server provider for raft layer only on protocol version 3 and above, and update changelog 2017-08-30 14:36:47 -05:00
Kyle Havlovitz 14b027a3c2
Add segment addr field to tags for LAN flood joiner 2017-08-30 11:58:29 -07:00
Kyle Havlovitz d129767657
Add agent.segment interpolation to prepared queries 2017-08-30 11:58:29 -07:00
Kyle Havlovitz 2ada0439d4
Add rpc_listener option to segment config 2017-08-30 11:58:29 -07:00
James Phillips b1a15e0c3d
Adds open source side of network segments (feature is Enterprise-only). 2017-08-30 11:58:29 -07:00
Preetha Appan a231eea0e7 More cleanup from code review 2017-08-30 12:31:36 -05:00
Preetha Appan c6ee9bfa69 Remove copy pasted duplicate line, update documentation. 2017-08-30 10:02:10 -05:00
Preetha Appan 0f4e24f72c Consolidate server lookup into one place and replace usages of localConsuls. 2017-08-30 09:30:33 -05:00
Preetha Appan e639154abd Remove stray commented line 2017-08-30 09:30:33 -05:00
Preetha Appan 00836a6aab Remove server address tracking logic from manager/router and maintain it as part of lan event listener instead. Used sync.Map to track this, and added unit tests 2017-08-30 09:30:33 -05:00
Preetha Appan 830aca958a ServerAddressProvider interface also returns an error now 2017-08-30 09:30:33 -05:00
Preetha Appan c68fce89b5 Use config struct to create NetworkTransport layer when setting up raft 2017-08-30 09:30:33 -05:00
Preetha Appan 393ce1581b Implement AddressProvider and wire that up to raft transport layer to support server nodes changing their IP addresses in containerized environments 2017-08-30 09:30:33 -05:00
Frank Schroeder 831d84c940 build: make tests independent of build tags
When the metadata server is scanning the agents for potential servers
it is parsing the version number which the agent provided when it
joined. This version number has to conform to a certain format, i.e.
'n.n.n'. Without this version number properly set some tests fail with
error messages that disguise the root cause.

The default version number is currently set to 'unknown' in
version/version.go which does not parse and triggers the tests to fail.
The work around is to use a build tag 'consul' which will use the
version number set in version_base.go instead which has the correct
format and is set to the current release version.

In addition, some parts of the code also require the version number to
be of a certain value. Setting it to '0.0.0' for example makes some
tests pass and others fail since they don't pass the semantic check.

When using go build/install/test one has to remember to use '-tags
consul' or tests will fail with non-obvious error messages.

Using build tags makes the build process more complex and error prone
since it prevents the use of the plain go toolchain and - at least in
its current form - introduces subtle build and test issues. We should
try to eliminate build tags for anything else but platform specific
code.

This patch removes all references to specific version numbers in the
code and tests and sets the default version to '9.9.9' which is
syntactically correct and passes the semantic check. This solves the
issue of running go build/install/test without tags for the OSS build.
2017-08-30 13:40:18 +02:00
Frank Schröder a3934c263c acl: consolidate error handling (#3401)
The error handling of the ACL code relies on the presence of certain
magic error messages. Since the error values are sent via RPC between
older and newer consul agents we cannot just replace the magic values
with typed errors and switch to type checks since this would break
compatibility with older clients.

Therefore, this patch moves all magic ACL error messages into the acl
package and provides default error values and helper functions which
determine the type of error.
2017-08-23 16:52:48 +02:00
Frank Schroeder 16c58da27d agent: drop unused code
This code from http://github.com/hashicorp/consul/pull/3353 is no longer
required.
2017-08-22 00:02:46 +02:00
James Phillips e8a83bb463 Revert "Return 403 rather than a 404 when acls cause all results to be filter…" 2017-08-09 15:06:57 -07:00
James Phillips 02a87df044 Revert "Ensure that we return a permission denied only if the list of keys/en…" 2017-08-09 15:06:20 -07:00
Preetha Appan 42fb49c00b Added unit test case to kvs_endpointtest 2017-08-09 15:50:22 -05:00
Preetha Appan 3276891142 Ensure that we return a permission denied only if the list of keys/entries prior to filtering by ACL is non empty 2017-08-09 15:32:18 -05:00
Frank Schroeder 7cff50a4df
agent: move agent/consul/agent to agent/metadata 2017-08-09 14:36:52 +02:00
Frank Schroeder c395599cea
agent: move agent/consul/servers to agent/router 2017-08-09 14:36:37 +02:00
Frank Schroeder 1acff3533e
agent: move agent/consul/structs to agent/structs 2017-08-09 14:32:12 +02:00
Kyle Havlovitz cf02e3bc22 Merge pull request #3369 from hashicorp/metrics-enhancements
Add support for labels/filters from go-metrics
2017-08-08 13:55:30 -07:00
Kyle Havlovitz d5634fe2a8
Add support for labels/filters from go-metrics 2017-08-08 01:45:10 -07:00
Preetha Appan 37f75a393e
Use sanitized version of node name of server in NS record, and start with "server" rather than "ns" 2017-08-07 11:11:55 +02:00
Preetha Appan 794d1afe44
Removed a copy pasted irrelevant comment, and other code review feedback 2017-08-07 11:11:54 +02:00
Preetha Appan f9db387097
Add NS records and A records for each server. Constructs ns host names using the advertise address of the server. 2017-08-07 11:11:54 +02:00
James Phillips 4bee2e49f5 Adds secure introduction for the ACL replication token. (#3357)
Adds secure introduction for the ACL replication token, as well as a separate enable config for ACL replication.
2017-08-03 15:39:31 -07:00
James Phillips c0a5ad7903 Adds a new /v1/acl/bootstrap API (#3349) 2017-08-02 17:05:18 -07:00
Preetha Appan 4076c0d741 Return nil instead of empty list when returning a PermissionDenied error, updated unit test 2017-07-31 17:23:20 -05:00
Preetha Appan 6336014a86 Return 403 rather than a 404 when acls cause all results to be filtered out. This fixes #2637 2017-07-31 13:50:29 -05:00
James Phillips 10b660d77a Adds missing autopilot snapshot test and avoids snapshotting nil. (#3333) 2017-07-28 15:48:42 -07:00
James Phillips 6250cd70f5 Adds option to prepared queries to remove empty tags. (#3330) 2017-07-26 22:46:43 -07:00
James Phillips 496b0bcf07 Adds support for agent-side ACL token management via API instead of config files. (#3324)
* Adds token store and removes all runtime use of config for ACL tokens.
* Adds a new API for changing agent tokens on the fly.
2017-07-26 11:03:43 -07:00
Preetha Appan b94617b281 Add extra test case for deleting entire tree with empty prefix 2017-07-26 09:42:07 -05:00
Preetha Appan 4498814843 Don't insert tombstone for empty prefix delete. Other minor unit test fixes 2017-07-25 21:54:11 -05:00
Preetha Appan fee418d378 Removed redundant comments and unit test 2017-07-25 20:39:33 -05:00
Preetha Appan b772c477c2 Removed redundant call to reap tombstone from unit test 2017-07-25 19:39:05 -05:00
Preetha Appan ae443e21d6 Improved unit test per code review 2017-07-25 19:17:40 -05:00
Preetha Appan 36acf8d6a4 Use new DeletePrefixMethod for implementing KVSDeleteTree operation. This makes deletes on sub trees larger than one million nodes about 100 times faster. Added unit tests. 2017-07-25 17:21:18 -05:00
Frank Schroeder 0047b7d3f0
fix spelling in filenames
Fixes #3301
2017-07-19 13:16:38 +02:00
Kyle Havlovitz 19eae3d14b
Add UpgradeVersionTag to autopilot config 2017-07-18 13:35:41 -07:00
James Phillips 1791d99a10 Adds new config to make script checks opt-in, updates documentation. (#3284) 2017-07-17 11:20:35 -07:00
Kyle Havlovitz 78c3a86405
Add TLS setting to router areas 2017-07-14 17:38:08 -07:00
James Phillips 0881e46111 Cleans up version 8 ACLs in the agent and the docs. (#3248)
* Moves magic check and service constants into shared structs package.

* Removes the "consul" service from local state.

Since this service is added by the leader, it doesn't really make sense to
also keep it in local state (which requires special ACLs to configure), and
requires a bunch of special cases in the local state logic. This requires
fewer special cases and makes ACL bootstrapping cleaner.

* Makes coordinate update ACL log message a warning, similar to other AE warnings.

* Adds much more detailed examples for bootstrapping ACLs.

This can hopefully replace https://gist.github.com/slackpad/d89ce0e1cc0802c3c4f2d84932fa3234.
2017-07-13 22:33:47 -07:00
Frank Schroeder 1781fd311f address review comments 2017-07-07 09:22:34 +02:00
Frank Schroeder e4b40acc7e agent: remove unused code 2017-07-07 09:22:34 +02:00
Frank Schroeder 8c792ad57d agent: make TestClient_RPC_ConsulServerPing more robust 2017-07-07 09:22:34 +02:00
James Phillips a855d31f84 Adds a comment about flood joining. 2017-07-07 09:22:34 +02:00
James Phillips 5b5217528a Simplifies Serf dynamic port selection code.
This isn't racy, it's just a little dirty. The listen will happen and a port
will be selected and injected into the config once the Serf instance is
created, so we don't need the retry loop here.
2017-07-07 09:22:34 +02:00
James Phillips d8db4bc086 test: Changes WAN/LAN join confirmer to use port number vs. address.
This fixes TestServer_JoinSeparateLanAndWanAddresses which sets bogus
advertise addresses as part of the test. Port numbers uniquely identify
members since everything is running on localhost.
2017-07-07 09:22:34 +02:00
Frank Schroeder d92f70f313 test: make joinLAN/WAN reliable
only return if the members can see each other
2017-07-07 09:22:34 +02:00
Frank Schroeder 112bc19cd5 rpc: make TestServer_JoinSeparateLanAndWanAddresses more robust 2017-07-07 09:22:34 +02:00
Frank Schroeder ffd45f5da5 rpc: make TestClient_SnapshotRPC_TLS more robust 2017-07-07 09:22:34 +02:00
Frank Schroeder 2159d499e3 rpc: try shutting down leader first to avoid hang in TestLeader_LeftServer 2017-07-07 09:22:34 +02:00
Frank Schroeder f12fac278e rpc: fix logging and try quicker timing of TestServer_JoinSeparateLanAndWanAddresses 2017-07-07 09:22:34 +02:00
Frank Schroeder bae4b1d045 rpc: less agressive raft timeouts
Allowing more time for raft to consolidate should
drop the number of leader elections.
2017-07-07 09:22:34 +02:00
Frank Schroeder 457b98a099 rpc: run agent/consul tests in parallel 2017-07-07 09:22:34 +02:00
Frank Schroeder 13eeeb720d rpc: refactor sessionTimers and fix racy tests
The sessionTimers map was secured by a lock which wasn't used
properly in the tests. This lead to data races and failing tests
when accessing the length or the members of the map.

This patch adds a separate SessionTimers struct which is safe
for concurrent use and which ecapsulates the behavior of the
sessionTimers map.
2017-07-07 09:22:34 +02:00
Frank Schroeder 05f756853e rpc: fix TestServer_Leave
wait for the leader election.
2017-07-07 09:22:34 +02:00
Frank Schroeder 583959392b rpc: fix TestSession_Renew
make the timing less tight
2017-07-07 09:22:34 +02:00
Frank Schroeder ff2c29c0be rpc: fix TestReadyForConsistentRead
timing was too tight. Standardized name.
2017-07-07 09:22:34 +02:00
Frank Schroeder fcab525053 rpc: fix for 'no leader' in TLS tests
Ensure both servers know about each other before looking
for a leader.
2017-07-07 09:22:34 +02:00
Frank Schroeder b2a71fd8b0 rpc: fix TestServer_JoinWAN_Flood
The second server in the first data center should not be
in bootstrap mode.
2017-07-07 09:22:34 +02:00
Frank Schroeder 8369b6cb9d rpc: provide unique node names for server and client 2017-07-07 09:22:34 +02:00
Frank Schroeder 534977239b rpc: prefix log output with test name 2017-07-07 09:22:34 +02:00
Frank Schroeder c8ef588d8d rpc: discover serf wan port before starting serf lan
When using dynamic ports for the serf clusters then
the actual bind port of the serf WAN cluster needs to
be discovered before the serf LAN cluster is started
since the serf LAN cluster announces the port of the WAN
cluster.
2017-07-07 09:22:34 +02:00
Frank Schroeder 53eab7e970 rpc: bind rpc test server to port 0 2017-07-07 09:22:34 +02:00
Frank Schroeder e9e2c599db rpc: refactor: unify test server setup 2017-07-07 09:22:34 +02:00
Frank Schroeder c803146550 rpc: fix typos 2017-07-07 09:22:34 +02:00
Frank Schroeder a0368e3827 agent: refactor: log to stderr during tests 2017-07-07 09:22:34 +02:00
Preetha Appan f549c06764 Rename to raftNotifyCh, fix typo 2017-07-06 09:10:36 -05:00
Preetha Appan f2171a6720 Fixes deadlock between barrier write and leader notify channel read . Fixes #3230 2017-07-05 17:09:18 -05:00
James Phillips e4b11682bc Fixes broken HTTP header and method for health checks. (#3178)
* Fixes broken HTTP header and method for health checks.
* Adds a fuzz utility and test to make sure copy is complete.
2017-06-23 01:15:48 -07:00
Frank Schroeder 2b41f2e3a3 agent: make the RPC endpoint overwrite mechanism more transparent
This patch hides the RPC handler overwrite mechanism from the
rest of the code so that it works in all cases and that there
is no cooperation required from the tested code, i.e. we can
drop a.getEndpoint().
2017-06-21 05:42:39 +02:00
Frank Schroeder c49a15d0f3 agent: move structs into consul/structs pkg
* CheckDefinition
 * ServiceDefinition
 * CheckType
2017-06-21 05:42:39 +02:00
Frank Schroeder 4273fb8444 agent: move NotifyGroup into the agent pkg 2017-06-21 05:42:39 +02:00
Frank Schroeder 82a132da60 agent: move conn pool for muxed connections into separate pkg 2017-06-21 05:42:39 +02:00
Frank Schroeder 80971c8a85 agent: move the SnapshotReplyFn out of the way
When splitting up the consul package into server and client
the SnapshotReplyFn needs to be in a separate package to avoid
a circular dependency.
2017-06-21 05:42:39 +02:00
Frank Schroeder 04b9392b00 agent: use the delegate interface for local state 2017-06-21 05:42:39 +02:00
Preetha Appan f658231ab9 Minor fixes per code review 2017-06-20 19:43:07 -05:00
Preetha Appan b3b2e9dcb4 Added unit test to verify consistentRead method behavior 2017-06-16 11:58:12 -05:00
Preetha Appan 44f5086873 Code review feedback, fixed major logic bug 2017-06-16 10:49:54 -05:00
Preetha Appan 72af7b9bc4 Redo bug fix for stale reads on server startup, leveraging RPCHOldtimeout instead of maxQueryTime, plus tests 2017-06-15 22:41:30 -05:00
Frank Schroeder 1c75cf1af5 pkg refactor
command/agent/*                  -> agent/*
    command/consul/*                 -> agent/consul/*
    command/agent/command{,_test}.go -> command/agent{,_test}.go
    command/base/command.go          -> command/base.go
    command/base/*                   -> command/*
    commands.go                      -> command/commands.go

The script which did the refactor is:

(
	cd $GOPATH/src/github.com/hashicorp/consul
	git mv command/agent/command.go command/agent.go
	git mv command/agent/command_test.go command/agent_test.go
	git mv command/agent/flag_slice_value{,_test}.go command/
	git mv command/agent .
	git mv command/base/command.go command/base.go
	git mv command/base/config_util{,_test}.go command/
	git mv commands.go command/
	git mv consul agent
	rmdir command/base/

	gsed -i -e 's|package agent|package command|' command/agent{,_test}.go
	gsed -i -e 's|package agent|package command|' command/flag_slice_value{,_test}.go
	gsed -i -e 's|package base|package command|' command/base.go command/config_util{,_test}.go
	gsed -i -e 's|package main|package command|' command/commands.go

	gsed -i -e 's|base.Command|BaseCommand|' command/commands.go
	gsed -i -e 's|agent.Command|AgentCommand|' command/commands.go
	gsed -i -e 's|\tCommand:|\tBaseCommand:|' command/commands.go
	gsed -i -e 's|base\.||' command/commands.go
	gsed -i -e 's|command\.||' command/commands.go

	gsed -i -e 's|command|c|' main.go
	gsed -i -e 's|range Commands|range command.Commands|' main.go
	gsed -i -e 's|Commands: Commands|Commands: command.Commands|' main.go

	gsed -i -e 's|base\.BoolValue|BoolValue|' command/operator_autopilot_set.go
	gsed -i -e 's|base\.DurationValue|DurationValue|' command/operator_autopilot_set.go
	gsed -i -e 's|base\.StringValue|StringValue|' command/operator_autopilot_set.go
	gsed -i -e 's|base\.UintValue|UintValue|' command/operator_autopilot_set.go

	gsed -i -e 's|\bCommand\b|BaseCommand|' command/base.go
	gsed -i -e 's|BaseCommand Options|Command Options|' command/base.go
	gsed -i -e 's|base.Command|BaseCommand|' command/*.go
	gsed -i -e 's|c\.Command|c.BaseCommand|g' command/*.go
	gsed -i -e 's|\tCommand:|\tBaseCommand:|' command/*_test.go
	gsed -i -e 's|base\.||' command/*_test.go

	gsed -i -e 's|\bCommand\b|AgentCommand|' command/agent{,_test}.go
	gsed -i -e 's|cmd.AgentCommand|cmd.BaseCommand|' command/agent.go

	gsed -i -e 's|cli.AgentCommand = new(Command)|cli.Command = new(AgentCommand)|' command/agent_test.go
	gsed -i -e 's|exec.AgentCommand|exec.Command|' command/agent_test.go
	gsed -i -e 's|exec.BaseCommand|exec.Command|' command/agent_test.go
	gsed -i -e 's|NewTestAgent|agent.NewTestAgent|' command/agent_test.go
	gsed -i -e 's|= TestConfig|= agent.TestConfig|' command/agent_test.go
	gsed -i -e 's|: RetryJoin|: agent.RetryJoin|' command/agent_test.go

	gsed -i -e 's|\.\./\.\./|../|' command/config_util_test.go

	gsed -i -e 's|\bverifyUniqueListeners|VerifyUniqueListeners|' agent/config{,_test}.go command/agent.go
	gsed -i -e 's|\bserfLANKeyring\b|SerfLANKeyring|g' agent/{agent,keyring,testagent}.go command/agent.go
	gsed -i -e 's|\bserfWANKeyring\b|SerfWANKeyring|g' agent/{agent,keyring,testagent}.go command/agent.go
	gsed -i -e 's|\bNewAgent\b|agent.New|g' command/agent{,_test}.go
	gsed -i -e 's|\bNewAgent|New|' agent/{acl_test,agent,testagent}.go

	gsed -i -e 's|\bAgent\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bBool\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bDefaultConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bDevConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bMergeConfig\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bReadConfigPaths\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bParseMetaPair\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bSerfLANKeyring\b|agent.&|g' command/agent{,_test}.go
	gsed -i -e 's|\bSerfWANKeyring\b|agent.&|g' command/agent{,_test}.go

	gsed -i -e 's|circonus\.agent|circonus|g' command/agent{,_test}.go
	gsed -i -e 's|logger\.agent|logger|g' command/agent{,_test}.go
	gsed -i -e 's|metrics\.agent|metrics|g' command/agent{,_test}.go
	gsed -i -e 's|// agent.Agent|// agent|' command/agent{,_test}.go
	gsed -i -e 's|a\.agent\.Config|a.Config|' command/agent{,_test}.go

	gsed -i -e 's|agent\.AppendSliceValue|AppendSliceValue|' command/{configtest,validate}.go

	gsed -i -e 's|consul/consul|agent/consul|' GNUmakefile

	gsed -i -e 's|\.\./test|../../test|' agent/consul/server_test.go

	# fix imports
	f=$(grep -rl 'github.com/hashicorp/consul/command/agent' * | grep '\.go')
	gsed -i -e 's|github.com/hashicorp/consul/command/agent|github.com/hashicorp/consul/agent|' $f
	goimports -w $f

	f=$(grep -rl 'github.com/hashicorp/consul/consul' * | grep '\.go')
	gsed -i -e 's|github.com/hashicorp/consul/consul|github.com/hashicorp/consul/agent/consul|' $f
	goimports -w $f

	goimports -w command/*.go main.go
)
2017-06-10 18:52:45 +02:00